Volume 19, Issue 1 p. 71-92

The Development of Dynamic Facial Expression Recognition at Different Intensities in 4- to 18-Year-Olds

Rosario Montirosso
Department of Child and Adolescent Neurology and Psychiatry, Scientific Institute ‘E. Medea’, Bosisio Parini, Lecco

Milena Peverelli
Department of Child and Adolescent Neurology and Psychiatry, Scientific Institute ‘E. Medea’, Bosisio Parini, Lecco

Elisa Frigerio
Institute of Psychology, School of Medicine, Milan

Monica Crespi
Psychology Department, Catholic University, Milan

Renato Borgatti
Department of Child and Adolescent Neurology and Psychiatry, Scientific Institute ‘E. Medea’, Bosisio Parini, Lecco
Psychology Department, Catholic University, Milan
First published: 06 January 2010
Rosario Montirosso, Scientific Institute ‘E. Medea’, Via don Luigi Monza, 20-23842 Bosisio Parini (Lecco), Italy. Email: [email protected]

Abstract

The primary purpose of this study was to examine the effect of the intensity of emotion expression on children's developing ability to label emotion during a dynamic presentation of five facial expressions (anger, disgust, fear, happiness, and sadness). A computerized task (AFFECT—animated full facial expression comprehension test) was used to display facial emotion expressions as animations with four levels of intensity (35, 50, 75, and 100 percent). In this study, which employed a cross-sectional design, 240 participants from 4 to 18 years completed the AFFECT. Results indicated that recognition ability developed for each of the emotions, with the exception of disgust, over the age range tested. Girls were more accurate than boys, especially for anger and disgust expressions. Recognition accuracy was found to increase as a function of the intensity of emotional expressions.

Facial expressions are displayed dynamically and can vary in intensity and speed. The intensity of an emotion expression is thought to serve as a cue to the emotional state being communicated between people (Ekman, Friesen, & Ancoli, 1980). Consequently, correctly interpreting the intensity of a facial expression allows the receiver to adjust his or her behavior to the sender's behavior or emotional intent (Gosselin & Pelissier, 1996). Although considerable attention has been devoted to the categorical recognition of facial expressions of emotion, little is known about the role intensity plays in emotion recognition or how its effects change with development. Thus, the main purpose of the current study was to provide evidence of the effects of different levels of intensity on labeling accuracy for animated facial expressions in 4- to 18-year-olds.

Recognition is related to age, the emotion expressed, task demands, encoding abilities, and stimulus properties. Behavioral data suggest that infants are able to categorize some emotions in their first year of life (Caron, Caron, & Myers, 1982; Izard, 1971; Nelson, 1987). During their second year of life, infants begin to interpret adults' emotional reactions (Pons, Lawson, Harris, & de Rosnay, 2003), whereas a more accurate interpretation of emotions appears between three and six years of age (Boyatzis, Chazan, & Ting, 1993; MacDonald, Kirkpatrick, & Sullivan, 1995; Widen & Russell, 2003). Recognition of facial emotions continues to improve between 6 and 15 years of age and into adulthood (Gosselin, Roberge, & Lavallee, 1995; Kolb, Wilson, & Taylor, 1992; Vicari, Reilly, Pasqualetti, Vizzotto, & Caltagirone, 2000).

There is also evidence that the developmental pattern is not uniform across emotions (Lenti, Lenti Boero, & Giacobbe, 1999). Expressions of happiness and sadness seem to be correctly categorized earlier than those of fear and disgust (Boyatzis et al., 1993; Camras & Allison, 1985; Gosselin & Larocque, 2000). The developmental pattern for anger is less clear, with some results indicating a pattern similar to that for happiness and sadness (Gosselin, 1995), and others indicating that anger is categorized less accurately not only than happiness and sadness but also than fear and disgust (Boyatzis et al., 1993). It is possible that this variation in the development of recognition is partly due to task demands and/or to the types of response required from participants (Bruce et al., 2000; Markham & Adams, 1992). For example, a discrimination paradigm (i.e., habituation followed by preference) requires attentional and perceptual abilities (Walker-Andrews, 1997), and a matching procedure relies more on visual and spatial abilities (Herba, Landau, Russell, Ecker, & Phillips, 2006), whereas free labeling requires the ability to verbally identify facial expressions. As a free labeling task may be more difficult for younger children (Brody & Harrison, 1987), a common method in emotion research is the forced-choice response format, which minimizes verbal demands but still provides evidence of emotion recognition on the basis of semantic categories (Camras & Allison, 1985).

The development of facial expression recognition is also related to the efficiency of encoding faces (Chung & Thomson, 1995). Diamond and Carey (1986) posited that three types of information are involved in face processing: featural (facial parts or discrete components, such as eyes and nose), first-order relations (relative position of features, such as eyes above the nose), and second-order relations (fine spatial information, such as the distance between features). Recognition of faces and the identification of most facial expressions are based on the analysis of both configuration (relations among facial features) and features (Calder, Young, Keane, & Dean, 2000). It has been suggested that adults' expertise at processing faces derives primarily from their ability to use configural information (Diamond & Carey, 1986). By using manipulations that disrupt configural information, Carey and Diamond (1994) found that the effect of face inversion was not present in children younger than 10 years of age. Other studies have found inversion effects at younger ages, suggesting that there is configural processing of faces at these ages, or increasing inversion effects with age (Brace, Hole, Kemp, Pike, Van Duuren, & Norgate, 2001; Mondloch, Le Grand, & Maurer, 2002; Schwarzer, 2000). Configural information also plays an important role in children's recognition of facial emotions. Using a composite face paradigm (the emotion display represented in the top half of an upright or upside-down face is either aligned, or not, with a bottom half that displays another emotion), Durand, Gallay, Seigneuric, Robichon, and Baudouin (2007) found that children from 5 to 11 years old processed facial emotion in a configural/holistic way. Thus, the development of the ability to process facial emotions could be due to the underlying development of the ability to process the configural properties of the face (De Sonneville, Verschoor, Njiokiktjien, Op het Veld, Toorenaar, & Vranken, 2002). However, this conclusion derives primarily from studies that have not taken other properties of facial expression into account. A comprehensive understanding of how children develop this expertise requires examination of the effects of intensity and motion on their developing emotion-processing abilities.

Dynamic portrayals of emotion have greater ecological validity, yet relatively few studies have employed them (Herba & Phillips, 2004) or examined the role of intensity (see below). There is some evidence that the judgment of facial expressions is affected by the motion of the display (Bassili, 1978, 1979; Berry, 1990; Richardson, Bowers, Bauer, Heilman, & Leonard, 2000). In the majority of facial emotion studies, researchers have used static photographs, such as Ekman and Friesen's (1976) pictures of facial affect. Russell (1994; but see: Ekman, 1999) highlighted considerable theoretical and methodological difficulties regarding studies on emotion based on posed photographs. These artificial materials are largely limited to a small number of facial configurations relative to all the possible combinations of muscle contractions that the face is capable of making (Carroll & Russell, 1997). Thus, still photographs and other static material, such as cartoon drawings, do not capture the liveliness and true form of the facial expressions that occur in day-to-day interactions. Findings on adults suggest that a dynamic presentation increases overall recognition accuracy (Ambadar, Schooler, & Cohn, 2005; Biele & Grabowska, 2006). Research on the development of the recognition of dynamic facial expressions is extremely limited. De Sonneville et al. (2002) examined the contribution of different face-processing skills (configural processing vs. featural processing) rather than assessing the role of motion in developing patterns of expression recognition as such. More recently, the impact of familiarity on developing emotion recognition was examined in children aged 4 to 15 years (Herba et al., 2008). Regardless of familiarity (expressions displayed by familiar compared with unfamiliar faces), the findings indicated that accuracy for sad and fearful expressions, but not anger, improved with increasing age. However, it should be noted that the expressions increased in intensity by 10 percent per second rather than continuously, as happens in ‘real life’ facial movements. Thus, the dynamic stimulus presentation used by Herba et al. (2008) may have had reduced ecological validity.

In ‘real life’, facial expression movements can be displayed at several intensities, suggesting that the decoding of emotional expressions does not follow an ‘all or nothing’ rule and that individuals are sensitive to intensity changes (e.g., exaggeration; Calder, Rowland, Young, Nimmo-Smith, Keane, & Perrett, 2000; Montagne, Kessels, De Haan, & Perrett, 2007; Nowicki & Carton, 1993). Infants only a few months old have been shown to discriminate between different intensities (i.e., mild vs. intense happy expressions). For example, Kuchuk, Vibbert, and Bornstein (1986) and Nelson (1987) documented that four- to seven-month-olds can discriminate between two fearful faces that vary in intensity, and that older infants can discriminate between two happy faces that vary in intensity. Gosselin and Pelissier (1996) found that accuracy increased with the intensity of the action units making up the facial expressions of happiness and disgust in 9- to 10-year-old children. However, these findings have not always been confirmed. Gosselin et al. (1995) presented four emotions, each varying in intensity (happiness, anger, surprise, disgust), to a group of 5- to 10-year-olds. Happiness, but not anger, was more readily recognized at a higher intensity. Thus, exaggerating an expression does not necessarily facilitate recognition. Herba et al. (2006) employed two emotion-matching tasks requiring children between 4 and 15 years old to match a target face stimulus on the basis of emotion category at four intensities (25, 50, 75, and 100 percent). Children's accuracy for emotion category improved with increased intensity, particularly from 25 percent to the higher intensities, and also for fearful and happy expressions.

Other factors, such as gender, intelligence, and socioeconomic status (SES), may affect the development of emotion processing (Herba & Phillips, 2004). Evidence regarding gender differences in facial emotion processing in preschool and school-age children is inconsistent (Brody & Hall, 1993; Gross & Ballif, 1991). Many studies fail to report gender effects in childhood (De Sonneville et al., 2002; Lenti et al., 1999; Vicari et al., 2000); however, in a meta-analysis, McClure (2000) found evidence of a small but robust advantage for girls in processing facial expressions. Few studies have explored the relations between the development of emotion recognition and SES. Smith and Walden (1998) found that family income was positively correlated with children's accuracy scores. On the other hand, disadvantaged children were more accurate in the recognition of fearful facial expressions, possibly due to their exposure to highly stressful environments. However, this association was only investigated in preschool children, so evidence on the relation between the development of emotion recognition accuracy and SES remains limited. It has also been suggested that intelligence (or highly related factors such as verbal ability) may affect children's emotion processing (Herba & Phillips, 2004). Bennett and colleagues found that four-year-olds with higher cognitive abilities tend to have higher emotion knowledge scores, including recognition of expressions, labeling of expressions, and situational knowledge (Bennett, Bendersky, & Lewis, 2005). In children from economically disadvantaged families, the ability to recognize and label expressions mediated the effect of verbal ability on academic competence (Izard, Fine, Schultz, Mostow, Ackerman, & Youngstrom, 2001). In a group of preschool children, Field and Walden (1982) found that the ability to read facial expressions was slightly, but significantly, related to intelligence quotient (IQ). More recently, Herba et al. (2006) reported that verbal ability did not affect emotion-processing accuracy in children between 4 and 15 years. These authors suggested that this lack of a significant effect could be related to the use of matching tasks, which rely on visuospatial rather than verbal ability.

The primary aim of this study was to examine the developmental effects of intensity on emotion recognition as a step toward gaining a comprehensive understanding of how children become experts in processing facial expressions. We employed a cross-sectional design to examine the influence of age from 4 to 18 years and the moderating factors of gender, SES, and IQ on the development of emotion recognition. We examined children's emotion recognition of five basic emotions (anger, disgust, fear, happiness, and sadness) using the animated full facial expression comprehension test (AFFECT—Gagliardi, Frigerio, Burt, Cazzaniga, Perrett, & Borgatti, 2003).

We explored several hypotheses based on previous developmental studies. First, we expected a progressive improvement in accuracy with the intensity of the display. However, we did not expect the intensity effect to be the same for the different emotions; in particular, we assumed that, among the negative emotions, fear would be better recognized at lower intensities (consistent with Herba et al., 2008). Second, we expected a gradual improvement with age, although there may be distinct developmental patterns for each of the five emotions. Similarly to Herba et al. (2006), we predicted that with increasing age, children would be more accurate in labeling lower intensities of expression. Third, we assumed that girls would be slightly more accurate than boys, but we did not expect to find this difference for all the emotions (Biele & Grabowska, 2006). Lastly, based on earlier research (Herba et al., 2006, 2008), we expected that neither SES nor IQ would be associated with the recognition of facial expressions across childhood.

Method

Participants

The sample consisted of children between 4 and 18 years, drawn from 15 kindergartens and schools randomly selected in three provinces of Northern Italy with comparable urbanization levels (Milano, Como, and Lecco). The purpose of the study was presented to school directors, either by phone or by fax. Of the 15 schools contacted, 12 gave their consent for their students to participate in the study. In every school, one class from each grade level was randomly selected, and the same number of boys and girls was identified at random using a preprepared list of random numbers. This ensured an equal distribution of participants according to gender and age. If a selected child was absent or had a diagnosis of mental retardation or psychological disability certified by the public mental health service, the next pupil on the list was selected. The parents of each selected child were given an envelope containing a letter explaining the purpose of the research, a request for permission to include their child in the study, and a questionnaire about family sociodemographic characteristics.

Participation was voluntary. Contact was made with 312 participants who fulfilled the inclusion criteria. The parents of 69 (22 percent) children declined to participate. Participants taking part in the study and participants whose parents declined to participate were not significantly different in gender or age. Three other boys were excluded because their IQ score was below the normal range (IQ < 85). The final group consisted of 240 participants (120 boys and 120 girls). Table 1 shows the participants' sociodemographic characteristics and IQ. The study was approved by the Ethics Committee of the ‘E. Medea’ Scientific Institute, and written informed consent was obtained from all parents.

Table 1. Demographic Characteristics, Socioeconomic Status (SES), and Intelligence Quotient (IQ) of Participants
Variable            AFFECT, N (%)   IQ, mean (SD)
Gender
 Boys               120 (50)        111.7 (11.7)
 Girls              120 (50)        110.8 (11.7)
Age
 4–6 years          48 (20)         110.8 (10.6)
 7–9 years          48 (20)         113.6 (11.5)
 10–12 years        48 (20)         108.4 (12.6)
 13–15 years        48 (20)         111.9 (12.9)
 16–18 years        48 (20)         111.5 (11.4)
SES
 Unscorable         4 (2)           —
 Lower (10–35)      42 (18)         109.3 (10.8)
 Medium (40–65)     121 (50)        110.3 (11.5)
 Higher (70–90)     73 (30)         114.7 (11.9)
  • AFFECT = animated full facial expression comprehension test.

Procedure

Testing took place over two days. On the first day, participants completed a standardized intelligence scale. On the next day, each participant completed the AFFECT individually in quiet conditions in a room within their school building. Participants completed the task while sitting 50 cm away from the 17-inch computer screen on which the test stimuli were presented.

Measures

Socioeconomic Status. SES was coded according to the information provided by caregivers on the basis of Hollingshead's (1975) classification for parental occupation. Scores ranging from 70 to 90 correspond to the upper status, scores ranging from 40 to 65 correspond to the middle status, and scores ranging from 10 to 35 correspond to the lower status.
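As a minimal illustration of this banding, the sketch below maps a Hollingshead occupation score onto the three status levels reported above; the function name and the treatment of out-of-range or missing scores are our own assumptions, not part of Hollingshead's (1975) scheme.

```python
def ses_band(hollingshead_score: int) -> str:
    """Map a Hollingshead (1975) occupation score to the SES band used here.

    Band boundaries follow the ranges reported in the text; scores outside
    those ranges are treated as unscorable (an assumed convention).
    """
    if 70 <= hollingshead_score <= 90:
        return "upper"
    if 40 <= hollingshead_score <= 65:
        return "middle"
    if 10 <= hollingshead_score <= 35:
        return "lower"
    return "unscorable"
```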

Intelligence Quotient. Although it would have been preferable to use a single test, two intelligence tests were administered at different age ranges due to organizational constraints. Participants were examined with the Stanford–Binet intelligence scale (Bozzo & Mansueto, 1969) up to age 13 years and with the Cattell and Cattell (1963) culture fair intelligence test (scale 3, form A) from the age of 14 onwards. The culture fair test is a measure of general mental ability consisting of a series of perceptual tasks, and its scores represent two identifiable types of intelligence (i.e., fluid and crystallized; see Cattell, 1963). The Stanford–Binet scale can be regarded as providing a measure of crystallized intelligence. Despite their different approaches to the measurement of intelligence (i.e., a culture-fair and a culturally biased test), the two scales show significant, although moderate, concurrent validity (r = .66; Cattell & Cattell, 1963) and both provide a measure of crystallized intelligence (Kellaghan, 1977). Furthermore, both scales are widely used measures of intelligence with well-established reliability and validity, and both provide norm-referenced scores with a mean of 100 and a defined standard deviation, allowing the distributions of scores to be compared.

Animated Full Facial Expression Comprehension Test. The AFFECT is a computerized task that displays facial emotion expressions as animations at four levels of intensity (35, 50, 75, and 100 percent). The stimuli for the AFFECT (Gagliardi et al., 2003) were derived from prototype expressions from Ekman and Friesen (1976) that are available in the Facial Expressions of Emotion: Stimuli and Tests (FEEST) database (Young, Perrett, Calder, Sprengelmeyer, & Ekman, 2002). As suggested by Calder, Burton, Miller, Young, and Akamatsu (2001), there are at least three advantages to using these pictures. First, each of seven facial expressions (happiness, sadness, anger, fear, disgust, surprise, and neutral) is associated with a distinct facial musculature that is recognized across many cultures throughout the world (Ekman & Friesen, 1978). Second, these stimuli have been used in several studies, confirming that the expressions are recognized as the intended emotions. Third, principal component analysis of these stimuli has found them to be consistent with psychological accounts of facial expression recognition developed by social psychologists. Grayscale photographs of four individuals (two males) posing neutral, angry, disgusted, fearful, happy, and sad expressions were delineated (Benson & Perrett, 1991). As surprise has been found to be one of the most difficult emotions to recognize (Gosselin et al., 1995; Vicari et al., 2000; Wiggers & van Lieshout, 1985), it was not included, to avoid making the test too difficult. For each of the five emotions, an image morphing technique (Benson & Perrett, 1991; Calder et al., 2001) was used to generate sequences of 21 frames running from neutral (0 percent, frame 1), through intermediate frames (35 percent, frame 8; 50 percent, frame 11; 75 percent, frame 16), to the full-blown facial expression (100 percent, frame 21). As in previous studies (LaBar, Crupain, Voyvodic, & McCarthy, 2003), emotion morphs were used instead of videotaped expressions to allow experimental control over the rate and duration of the changes. During animation, each frame in the sequence was shown for approximately .05 seconds, so that the duration approximated real-time changes of facial emotional expression. The resulting effect is similar to a natural expression, and all the expressions were posed in full frontal orientation. Examples of the graded intensities of the five emotional expressions are shown in Figure 1.

Figure 1. Examples of the Five Expressions Illustrating Different Intensities.
In each trial, each emotion was animated with increasing intensity of the facial expression from a neutral face (a) to 35 (b), 50 (c), 75 (d), or 100 percent (e).
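The frame-to-intensity mapping and approximate playback timing described above can be sketched as follows. This is a minimal illustration only, assuming a generic display callback (`show_frame`) and an in-memory list of the 21 morph frames; neither of these corresponds to the actual AFFECT implementation, which is not described at that level of detail.

```python
import time

# 1-based frame numbers (from the text) at which each intensity level is reached
# in the 21-frame neutral-to-full morph sequence.
INTENSITY_FRAME = {35: 8, 50: 11, 75: 16, 100: 21}
FRAME_DURATION_S = 0.05  # approximate per-frame duration reported for the AFFECT


def play_morph(frames, intensity_percent, show_frame):
    """Animate a morph sequence up to the requested intensity.

    `frames` is a list of 21 morph images (index 0 = neutral) and `show_frame`
    is a hypothetical display callback supplied by whatever graphics toolkit
    drives the presentation. The final frame is left on screen; waiting for the
    participant's response is handled by the calling experiment loop.
    """
    last_frame = INTENSITY_FRAME[intensity_percent]
    for frame in frames[:last_frame]:
        show_frame(frame)
        time.sleep(FRAME_DURATION_S)
    # A full-intensity animation therefore lasts roughly 21 * 0.05 s, about 1 second.
```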

Training Task. Participants began with a ‘game’ to familiarize them with the recognition of emotional expressions and to ensure that the children could label and understand facial expressions. The participants were asked to label the experimenter's facial expression and to produce on their own face each of the five facial expressions. Before the real testing began, five practice trials were performed with faces different from those used in the experiment. In both the training task and the test (see below), the experimenter read out a list of five emotional labels from which the participant had to make a choice in order to label the expression. As in a previous study including younger children (cf. Nowicki & Mitchell, 1998), the instructions given by the experimenter to the participant were: ‘I would like you to guess how these persons are feeling. I would like you to guess if they are feeling anger, disgust, fear, happiness, or sadness’.

Task. Participants were presented with four test blocks (data 1, data 2, data 3, data 4), each containing 20 trials, for a total of 80 trials: 5 expressions × 4 identities × 4 intensities. Trials were counterbalanced between blocks so that each block contained five expressions animated to the four intensities, and each intensity level of an expression within a block was presented by a different individual. The order of items within each block was randomized with one restriction: the same face could not be shown consecutively. The test took up to 20 minutes. To facilitate the participant's response, the last frame in the animation (i.e., the 8th, 11th, 16th or 21st frame) remained on screen until participants responded. If the participant did not give an answer within 30 seconds, the experimenter read out the list of emotional labels again. This was done for about 1 percent of all responses. A participant's answer was only recorded if it was on the list of emotional labels or it was a close synonym of one of the words on the list (e.g., ‘joyful’ for happiness, ‘scared’ for fearful). If the participant's answer did not meet these criteria or was a paraphrase (e.g., ‘he hurts himself’, ‘it is when his mother scolds him’), the experimenter asked for a single emotion label from the list.
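One way to build such a trial list is sketched below. The block structure (each block containing every expression at every intensity, with the four intensities of an expression posed by different identities) and the no-consecutive-identity constraint come from the description above; the Latin-square-style rotation of identities across blocks and the identity labels are assumptions, since the exact counterbalancing scheme is not specified.

```python
import random

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness"]
IDENTITIES = ["id1", "id2", "id3", "id4"]   # four posers (two male); labels illustrative
INTENSITIES = [35, 50, 75, 100]


def build_blocks(seed=0):
    """Build 4 blocks of 20 trials (80 total): 5 emotions x 4 identities x 4 intensities.

    Within a block, each emotion appears once at each intensity, and the four
    intensities of an emotion are posed by four different identities (rotated
    across blocks in a Latin-square-like fashion, an assumed scheme). Trial
    order is shuffled with the constraint that the same identity never appears
    on two consecutive trials.
    """
    rng = random.Random(seed)
    blocks = []
    for b in range(4):
        trials = []
        for e, emotion in enumerate(EMOTIONS):
            for i, intensity in enumerate(INTENSITIES):
                identity = IDENTITIES[(b + e + i) % 4]  # rotate identity assignment
                trials.append({"emotion": emotion, "intensity": intensity,
                               "identity": identity})
        # Reshuffle until no identity is shown on two consecutive trials.
        while True:
            rng.shuffle(trials)
            if all(trials[k]["identity"] != trials[k + 1]["identity"]
                   for k in range(len(trials) - 1)):
                break
        blocks.append(trials)
    return blocks
```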

Recognition Accuracy Score and Data Analyses

Accuracy scores were calculated for each of the five expressions and for each of the four levels of intensity using the unbiased hit rate (Hu) proposed by Wagner (1993), which takes response biases into account. Hu is ‘an estimate of the joint probability both that a stimulus is correctly identified (given that it is presented) and that a response is correctly used (given that it is used)’ (Wagner, 1993, p. 16). Hu is calculated by multiplying the hit rate for an expression (the number of correct uses of the expression divided by the number of times that type of expression was presented) by the differential accuracy (the number of correct uses of the expression divided by the total number of uses of that expression). Values are proportions ranging from 0 to 1. To explore the accuracy scores by age, participants were divided into five age ranges roughly reflecting developmental groupings: age 4–6 = preschoolers, age 7–9 = schoolchildren, age 10–12 = preadolescents, age 13–15 = adolescents, and age 16–18 = late adolescents.
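As a worked illustration of Wagner's (1993) unbiased hit rate, the sketch below computes Hu from a stimulus-by-response confusion matrix. The example matrix is hypothetical and is not taken from the study data.

```python
import numpy as np


def unbiased_hit_rate(confusion):
    """Compute Wagner's (1993) unbiased hit rate (Hu) for each expression.

    `confusion[s, r]` counts how often stimulus s received response r
    (rows = presented expressions, columns = response labels, same order).
    Hu = hit rate (correct / times presented) * differential accuracy
    (correct / times the label was used). Values range from 0 to 1.
    """
    confusion = np.asarray(confusion, dtype=float)
    correct = np.diag(confusion)
    presented = confusion.sum(axis=1)   # times each expression was shown
    used = confusion.sum(axis=0)        # times each label was given
    with np.errstate(divide="ignore", invalid="ignore"):
        hu = np.where((presented > 0) & (used > 0),
                      (correct / presented) * (correct / used), 0.0)
    return hu


# Hypothetical example: 5 expressions, 4 presentations each (rows/columns in the
# order anger, disgust, fear, happiness, sadness). For anger: hit rate = 3/4 and
# differential accuracy = 3/4, so Hu = .56.
example = [[3, 0, 0, 0, 1],
           [0, 3, 1, 0, 0],
           [0, 0, 4, 0, 0],
           [0, 0, 0, 4, 0],
           [1, 0, 0, 0, 3]]
print(unbiased_hit_rate(example).round(2))
```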

Although the AFFECT has already been used in a clinical population (Gagliardi et al., 2003), no analysis has been made of its psychometric properties. Therefore, in addition to the principal objectives of the current study, we also investigated the internal consistency and the test–retest reliability of AFFECT. The results are reported in Appendix 1.

Due to the number of comparisons, a significance level of p < .01 was set for all analyses.

Results

Effects of Intensity, Age, Emotion, and Gender

Means and SDs of Hu for intensity and age are reported in Table 2. The Hus for each emotion at each level of intensity were analyzed in a repeated-measures analysis of variance (ANOVA), with gender and age (age 4–6, age 7–9, age 10–12, age 13–15, age 16–18) as between-subjects factors and emotion (anger, disgust, fear, happiness, sadness) and intensity (35, 50, 75, and 100 percent) as within-subjects factors. Because emotion and intensity were within-subjects factors with more than 1 df, the Greenhouse–Geisser epsilon adjustment was applied to the repeated-measures omnibus tests. The main effects were: intensity, F(2.86, 658.35) = 289.68, p < .001, η2p = .56; age, F(4, 230) = 21.11, p < .001, η2p = .27; emotion, F(3.25, 747.83) = 211.17, p < .001, η2p = .48; and gender, F(1, 230) = 7.99, p < .01, η2p = .03. The main effect of intensity indicated greater accuracy with increasing intensity. Pairwise comparisons (Bonferroni corrected) showed a gradual improvement with greater intensities: 35 percent (.52), 50 percent (.67), 75 percent (.74), and 100 percent (.77). The main effect of age reflected improved accuracy with age. Tukey's post hoc tests showed that preschoolers (.55) were significantly less accurate than all other age groups, and schoolchildren (.65) were less accurate than adolescents (.73) and late adolescents (.74), with no difference between adolescents and late adolescents. Preadolescents (.71) differed neither from schoolchildren nor from the oldest age groups. The main effect of emotion indicated differences in accuracy across the five expressions (see Table 3). Pairwise comparisons (Bonferroni corrected) indicated that anger was the least accurately recognized (.54), followed by sadness (.59) and disgust (.59), fear (.67), and happiness (.90).1 The main effect of gender indicated greater accuracy for girls (.70) than boys (.66).

Table 2. Means of Unbiased Hit Rates, Standard Deviations (SDs), F Values, and Post Hoc Analyses for the Four Intensities
Intensity (%) Age 4–6 Age 7–9 Age 10–12 Age 13–15 Age 16–18 All years F Tukey post hoc
Mean SD Mean SD Mean SD Mean SD Mean SD Mean SD Age
Boys Girls Total Boys Girls Total Boys Girls Total Boys Girls Total Boys Girls Total Boys Girls Total
35 .36 .35 .36 .45 .50 .48 .54 .59 .57 .59 .64 .61 .58 .63 .60 .50 .54 .52 25.54* Age 4–6 < age 7–9, age 10–12, age 13–15, age 16–18
.11 .14 .12 .16 .16 .16 .14 .14 .14 .18 .18 .18 .15 .10 .13 .17 .18 .18 [.30] Age 7–9 < age 13–15, age 16–18
50 .54 .53 .54 .63 .64 .63 .67 .74 .71 .69 .80 .74 .68 .78 .73 .64 .70 .67 14.75* Age 4–6 < age 10–12, age 13–15, age 16–18
.16 .15 .15 .17 .17 .17 .15 .15 .15 .16 .15 .16 .15 .10 .13 .17 .17 .17 [.20] Age 7–9 < age 13–15
75 .65 .64 .65 .73 .72 .72 .73 .81 .77 .74 .80 .77 .78 .84 .81 .73 .76 .74 8.25* Age 4–6 < age 10–12, age 13–15, age 16–18
.14 .19 .16 .16 .18 .17 .15 .16 .16 .16 .15 .16 .12 .10 .11 .15 .17 .16 [.12]
100 .68 .66 .67 .77 .78 .78 .77 .82 .80 .78 .84 .81 .75 .86 .80 .75 .79 .77 8.77* Age 4–6 < age 7–9, age 10–12, age 13–15, age 16–18
.12 .17 .14 .17 .16 .16 .14 .13 .13 .10 .11 .11 .13 .07 .11 .13 .15 .14 [.13]
  • Note: Effect sizes (η2p) are shown in square brackets. The value of η2p (partial eta squared) represents the proportion of variance accounted for by each variable, with higher values indicating a stronger effect.
  • *  p < .001.
  • Age 4–6 = preschoolers, age 4–6 years; age 7–9 = schoolchildren, age 7–9 years; age 10–12 = preadolescents, age 10–12 years; age 13–15 = adolescents, age 13–15 years; age 16–18 = late adolescents, age 16–18 years.
Table 3. Means of Unbiased Hit Rates, Standard Deviations (SDs), F Values, and Post Hoc Analyses for Each Emotion and for the Total
Emotions Age 4–6 Age 7–9 Age 10–12 Age 13–15 Age 16–18 All years F Tukey post hoc
Mean SD Mean SD Mean SD Mean SD Mean SD Mean SD Gender Age
Boys Girls Total Boys Girls Total Boys Girls Total Boys Girls Total Boys Girls Total Boys Girls Total
Anger .40 .44 .42 .50 .56 .53 .57 .61 .59 .52 .64 .58 .54 .63 .58 .51 .58 .54 8.60* 7.39** Age 4–6 < age 10–12, age 13–15, age 16–18
.13 .15 .14 .20 .22 .21 .18 .20 .19 .22 .20 .22 .16 .13 .15 .19 .19 .19 [.04] [.11]
Disgust .46 .54 .50 .54 .62 .58 .58 .68 .63 .58 .69 .64 .52 .68 .60 .54 .64 .59 13.15** 2.55
.22 .24 .23 .30 .22 .26 .23 .23 .23 .23 .23 .24 .23 .18 .22 .24 .22 .24 [.05] [.04]
Fear .50 .42 .46 .63 .62 .62 .70 .77 .74 .72 .76 .74 .76 .77 .76 .66 .67 .67 .04 22.50** Age 4–6 < age 7–9, age 10–12, age 13–15, age 16–18
.25 .21 .23 .20 .22 .21 .16 .15 .16 .17 .15 .16 .15 .13 .14 .21 .22 .21 [.00] [.28] Age 7–9 < age 16–18
Happiness .82 .75 .78 .86 .91 .89 .94 .94 .94 .96 .97 .96 .93 .95 .94 .90 .90 .90 .03 28.72** Age 4–6 < age 7–9, age 10–12, age 13–15, age 16–18
.13 .15 .14 .10 .09 .10 .07 .09 .08 .05 .04 .04 .10 .05 .08 .11 .12 .11 [.00] [.33] Age 7–9 < age 13–15
Sadness .46 .44 .45 .55 .50 .53 .52 .62 .57 .63 .70 .66 .68 .78 .73 .57 .61 .59 2.40 17.19** Age 4–6 < age 13–15, age 16–18
.16 .20 .18 .18 .22 .20 .24 .20 .22 .21 .15 .19 .16 .13 .16 .21 .22 .21 [.01] [.23] Age 7–9, age 10–12 < age 16–18
Total .56 .55 .55 .64 .66 .65 .68 .74 .71 .70 .77 .73 .70 .78 .74 .66 .70 .68 7.99* 21.11** Age 4–6 < age 7–9, age 10–12, age 13–15, age 16–18; age 7–9 < age 13–15, age 16–18
.10 .13 .12 .14 .14 .14 .11 .12 .12 .13 .11 .12 .10 .06 .09 .13 .14 .14 [.03] [.27]
  • Note: Effect sizes (η2p) are shown in square brackets. The value of η2p (partial eta squared) represents the proportion of variance accounted for by each variable, with higher values indicating a stronger effect.
  • *  p < .01;
  • **  p < .001.
  • Age 4–6 = preschoolers, 4–6 years; age 7–9 = schoolchildren, 7–9 years; age 10–12 = preadolescents, 10–12 years; age 13–15 = adolescents, 13–15 years; age 16–18 = late adolescents, 16–18 years.

The main effects were qualified by interactions. The significant intensity × age interaction, F(11.45, 658.35) = 3.43, p < .001, η2p = .06, indicated that recognition accuracy at the four intensities differed with age. This interaction was further evaluated using a multivariate ANOVA (MANOVA), with accuracy at the different intensities as the dependent variables and age as the between-subjects factor. As can be seen in Table 2, there were significant age effects at all intensities. Tukey's post hoc tests revealed that preschoolers were significantly less accurate than all the other age groups, except that they did not differ from 7–9-year-olds at 50 percent and 75 percent intensity. Schoolchildren were less accurate than adolescents at 35 percent intensity. At 50 percent intensity, preschoolers and schoolchildren showed similar levels of accuracy, but schoolchildren were less accurate than adolescents. A significant intensity × emotion interaction, F(10.16, 2337.97) = 9.23, p < .001, η2p = .04, indicated that the recognition accuracy of the emotions differed at specific intensity levels. Simple main-effect analyses showed an effect of intensity for all emotion categories (see Table 4). The Tukey test revealed that anger and sadness were the most difficult to recognize at 35 percent intensity. Across the four levels of intensity, sadness was recognized as accurately as disgust at 50 percent intensity, anger was recognized only as accurately as disgust and sadness at 75 percent intensity, and fear was recognized as accurately as disgust at 100 percent intensity. The significant age × emotion interaction, F(13.00, 747.83) = 3.52, p < .001, η2p = .06, and emotion × gender interaction, F(3.25, 747.83) = 5.61, p < .001, η2p = .02, indicated that the recognition accuracy of the five emotions differed across age and gender. This was confirmed by a further MANOVA in which the emotions were used as dependent variables, and age and gender as between-subjects factors. As shown in Table 3, gender significantly affected anger and disgust, but not the other emotions. Furthermore, age affected all the emotions except disgust. However, there were no consistent age-related trends across emotions. Tukey's post hoc tests revealed that accuracy for anger improved from school age to the older age ranges. Accuracy for fear and happiness was poorest for preschoolers compared with all the other age ranges, which did not differ from one another. Accuracy for sadness was poorer for preschoolers than for adolescents and late adolescents, but did not differ from that of schoolchildren and preadolescents.

Table 4. Means of Unbiased Hit Rates, Standard Deviations (SDs), F Values, and Post Hoc Analyses for Each Emotion Subdivided for Each Intensity
Levels of intensity (%) Anger Disgust Fear Happiness Sadness F Tukey post hoc
Mean SD Mean SD Mean SD Mean SD Mean SD
35 .37 .46 .55 .81 .43 105.14* Anger, sadness < disgust < fear < happiness
.26 .29 .29 .21 .24 [.26]
50 .52 .62 .69 .90 .61 74.91* Anger < sadness, disgust < fear < happiness
.25 .28 .26 .16 .29 [.20]
75 .66 .69 .74 .96 .73 62.75* Anger, sadness, disgust < fear < happiness
.25 .28 .25 .10 .29 [.17]
100 .70 .69 .76 .98 .73 69.33* Anger, sadness, disgust, fear < happiness
.23 .27 .24 .07 .25 [.19]
  • Effect sizes (η2p) are shown in square brackets. The value of η2p (partial eta squared) represents the proportion of variance accounted for by each variable, with higher values indicating a stronger effect.
  • *  p < .001.

Effects of Socioeconomic Status and Intelligence Quotient

A chi-square test evaluating the distribution of SES levels across the five age ranges showed no differences, χ2(8) = 9.70, p = .29. Furthermore, correlations between SES and recognition accuracy, calculated separately for intensity and for type of emotion, were not significant (all ps > .01). A preliminary analysis of the distribution showed that 80 percent of the sample had obtained mid-range to higher SES scores. To investigate a possible SES-related bias, an additional ANOVA was run with the three SES levels (upper, middle, and lower) as the between-subjects factor. There were no differences among the three SES levels, either for intensity or for type of emotion (all ps > .01).

To evaluate the relation between full-scale IQ and accuracy, Pearson correlations were calculated separately for intensity and for type of emotion, and independently for the two IQ tests. No significant correlations were found (all ps > .01). A preliminary analysis of the distribution showed that 93 participants (38.7 percent) scored above the normally expected range (85–115), obtaining high scores (> 115). To examine possible differences in accuracy between children with normal and higher scores, t-tests were run separately for intensity and for type of emotion. No differences in accuracy were found between participants with normal and higher IQ scores (all ps > .01).

Discussion

The study demonstrates that the recognition of emotions is related to the specific emotion, its intensity, and the age and gender of the observer. Accuracy of recognition improved for all emotions at higher levels of intensity compared with the lowest intensity level (35 percent). These results are consistent with Herba et al.'s (2006) findings, although in their study children's accuracy improved the most from 25 percent to the higher intensities. Poorer accuracy at low intensity may be due to the fact that some of the features that make an expression distinct are not expressed clearly enough to be encoded effectively. Our results also show that intensity effects were not uniform across the expressions; for example, happiness was unaffected by intensity. The negative emotions were equally well recognized when fully expressed (100 percent), but at the lowest intensity (35 percent) there was greater accuracy for fear than for sadness and anger. Despite different experimental procedures, Herba et al. (2008) reported correct recognition of fear at a higher expression intensity than ours (66 vs. 35 percent) and found that, among negative emotions, fear was better recognized at lower intensities. From an evolutionary perspective, fear may be the most critical of the negative emotions because it signals danger, and the brain may be more tuned to detect it (Plutchik, 1980). However, this explanation is not fully satisfying because it does not explain why anger, which might signal interpersonal threat, or sadness, which might signal unavailability, would be so affected by the intensity of the display. Perceptual analysis would suggest that, although the three negative emotions require the integration of both the upper and lower face, the action units of fear are more distinctive than those of anger and sadness. In particular, at the initial stage of expression, both anger and sadness involve brow contraction, whereas fear involves raised eyebrows, wide-open eyes with tense lower eyelids, and nostril dilation (Ekman & Friesen, 1978). Thus, with a low-intensity dynamic display, anger and sadness overlap more, interfering with recognition accuracy. Using several expression recognition tasks, De Sonneville et al. (2002) documented that searching for patterns to distinguish between similar features requires highly demanding, effortful, and controlled information processing, suggesting the use of a featural processing strategy. A low-intensity dynamic display therefore does not provide the clear configural information that would normally emerge from motion (Ambadar et al., 2005). Specifically, it requires the participant to consider each feature separately (i.e., piecemeal processing) because of the number of motion-related changes in the positioning of the mouth, eyes, nose, eyebrows, and so on. However, caution is required because of a possible confounding of intensity and motion. It would therefore be useful to compare the perception of dynamic sequences with the perception of static ‘sequences’ (Ambadar et al., 2005).

Our hypothesis that accuracy would improve with age was supported. Overall, consistent with previous research (Gosselin & Larocque, 2000; Gosselin et al., 1995; Herba et al., 2006), we found a stepwise improvement in the ability to recognize facial expressions between the 4–6 and 7–9 years age groups, and a further improvement between the 7–9-year-old and the adolescent/late adolescent age groups. The 10–12-year-old age group differed neither from schoolchildren nor from the older participants.

Such improvement with age, however, should be considered in terms of intensity and of specific emotions. With regard to intensity, our findings partially support the hypothesis that increasing age is associated with more accurate recognition at lower intensity levels. The results suggest a slight improvement in recognition at low and medium intensities (i.e., 35 and 50 percent) in older participants (13–15 and 16–18 years old), but not in younger participants (7–9 years old). Thus, accuracy for lower intensity expressions progresses until early adolescence, probably due to a more sophisticated perceptual ability in detecting facial emotion changes (cf. Gosselin & Pelissier, 1996; Soppe, 1988). Second, accuracy with increasing age was related to specific emotions, with the exception of disgust, for which there was little change from preschool through late adolescence. This finding contrasts with that of Vicari et al. (2000), who found a marked improvement between 5 and 10 years, whereas Herba et al. (2008), like us, found no age-related improvement for disgust. The facial configuration for disgust is very distinctive (Ekman & Friesen, 1978), probably because it communicates information about the need to avoid offensive, contaminating stimuli (e.g., decayed food). Thus, the finding of minimal change with age in this study is consistent with Rozin and Fallon's (1987) suggestion that the recognition of disgust would be expected to develop at a young age as a response serving to avoid the ingestion of potentially harmful substances.

Accuracy improved with age for fear, happiness, anger, and sadness, but not in identical ways. Accuracy for the recognition of happiness improved between 4–6 years and 7–9 years, for anger between 4–6 years and 10–12 years, and for sadness between 4–6 years and 13–15 years. These findings are only partly consistent with prior research (Herba et al., 2006; Lenti et al., 1999; Vicari et al., 2000), suggesting a slower improvement in accuracy for anger and sadness than for happiness and fear. Within the negative emotions, our findings indicate that children were least accurate at decoding anger, followed by sadness, disgust, and fear. Previous findings on emotion recognition are inconsistent at best, so comparisons are difficult. Boyatzis et al. (1993) found that anger was the most difficult to recognize, whereas others report that sadness was the most difficult (Holder & Kirkpatrick, 1991; Lenti et al., 1999; Philippot & Feldman, 1990). In contrast, Vicari et al. (2000) found that sadness was more easily recognized than fear, anger, and disgust. These differences in findings on age-related changes in the recognition of emotions suggest that some negative emotions are recognized better than others, and that the ability to recognize different emotional expressions does not emerge as a complete package. It is quite possible that this discrepancy is due to the complexity of the expressions, to socialization factors, or, as is most likely, to a combination of these causes.

Consistent with findings from previous research in preschoolers (Boyatzis et al., 1993; Philippot & Feldman, 1990) and late childhood (Hall, 1984), girls were more accurate than boys in facial emotion recognition. As expected, however, this advantage was not uniform, being specific to particular emotions: girls were more accurate for anger and disgust, but not for the other emotions. This finding is consistent with recent investigations demonstrating specific gender differences in adults for the recognition of disgust and anger with both static stimuli (Campbell et al., 2002) and dynamic stimuli (Montagne, Kessels, Frigerio, de Haan, & Perrett, 2005). Thus, our results suggest a specific, rather than a general, effect of gender on the perception of facial expressions across childhood and adolescence.

Low SES has often been found to be associated with lower socioemotional competence (McLoyd, 1990), yet we found no effects of SES on accuracy in our study. Perhaps the discrepancy with prior research is due to the use of different SES measures. On the other hand, our finding is consistent with Herba et al.'s (2008) study, in which no SES effect was found for the recognition of the same emotion categories used in our study. Thus, it is quite possible that low socioeconomic status is only indirectly linked to less optimal emotional development (e.g., through parental conflict, maternal depression, or large family size; Bradley & Corwyn, 2002; Garner, Jones, & Miner, 1994; Garner & Spears, 2000; Shaw, Keenan, Vondra, Delliquadri, & Giovannelli, 1997).

It has been suggested that emotional competence is positively related to general cognitive ability (Field & Walden, 1982) or to verbal skills (Pons et al., 2003). However, these correlations were generally moderate at best, and many prior studies included only preschoolers (Smith & Walden, 1998) or children from disadvantaged families (Izard et al., 2001). Consistent with Herba et al. (2006, 2008), our findings suggest that IQ was unrelated to emotion processing across development. However, caution is needed because the sample in this study was not representative of the general population. Nevertheless, the lack of association between full-scale IQ and accuracy is not so surprising, given that facial emotion recognition could be related to factors other than cognitive or linguistic abilities (e.g., atypical development, environmental risk, maternal depressive symptoms; cf. Pollak, Cicchetti, Hornung, & Reed, 2000; Pollak & Sinha, 2002).

There were a number of limitations to the study. Firstly, although a procedure that minimized verbal demands was employed, it was not possible to rule out that younger children made more errors than older children not only because of limitations in attention or memory, but also because different mental categories for emotion underlie their emotional labeling (Russell & Widen, 2002). The use of a forced-choice response format could also lead to artifactual results, as it may force younger children (age 2–5) to make choices that they would not otherwise make (Widen & Russell, 2003).

Secondly, research demonstrates that surprise is the emotion that children most commonly confuse with fear (Gosselin & Simard, 1999), and it is possible that, because surprise was not included as a response alternative, fear was identified better than the other negative emotions.

Thirdly, although stimulus intensity was associated with accuracy, its effects may have been partly influenced by the use of dynamic rather than static stimuli. Better performance at higher intensities could therefore be an effect of motion: higher intensity animations contain a larger sample of the expression (a greater number of frames) as well as unique temporal information (a longer ‘exposure time’), both of which could increase accuracy. In short, the spatial manipulation of the expressions could mask the effects of motion, preventing us from unequivocally demonstrating the effect of intensity.

A final limitation of this study is that the intensity levels used in the AFFECT are not directly measurable values in terms of real emotion expressions, except for the 100 percent intensity. The intensities used in the AFFECT were chosen arbitrarily, and the animation was obtained through morphing. The possibility that similar data would have been found with the same, non-animated stimuli cannot be ruled out. In the AFFECT, intensity and motion speed are the same for all the emotions under study, whereas in real life they can vary across emotion expressions. For example, Kamachi, Bruce, Mukaida, Gyoba, Yoshikawa, and Akamatsu (2001) found that happiness is associated with rapid movements and sadness with slow movements. Thus, in future work, it would be useful to determine whether or not intensity judgments are monotonically related to the degree of intensity displayed by the AFFECT.

Conclusions

The current study demonstrates that greater intensity of facial expression facilitates emotion recognition. This intensity advantage generalized across different emotions and over a wide age range. These results are important because previous developmental research using emotions of varying intensity examined only non-dynamic stimuli. It is possible that the intensity advantage reflects a general perceptual–configural processing, as it was found for all the emotions taken into consideration. However, because the results suggest a slight but significant trend for children to become more accurate with age in using early, partial information, it is possible that, at least at the lower intensities, the development of facial emotion processing reflects the underlying development of the ability to use featural-based processing. These results are also important in light of prior research documenting that, by preschool age, the ability to decode emotions of varying intensity is positively associated with social competence (Nowicki & Mitchell, 1998). From this perspective, it would be relevant to investigate further the relationship between the development of facial emotion recognition and social adjustment. At the same time, the findings on the gender effect suggest that the differences, as well as the similarities, in recognition between the genders are likely to have effects on social development and should be considered a potential mechanism generating some of the differences in the reactivity and social behavior of boys and girls.

Although the levels of intensity used in the AFFECT are non-veridical, in the sense that they are derived from a prototype image, it is important to note that they do not have an ‘artificial’ appearance. Thus, the results of the present study contribute, at the very least, to raising questions about the role of facial motion and the intensity of expression in the development of emotion recognition. Studies using static facial expressions are even more artificial and distant from the facial expressions that children encode in everyday life. The adoption of this approach brings the experimental condition closer to the real-time changes of facial emotion expression that are relevant for children's social communication of emotions.

Appendix

Appendix 1. Analyses of the Psychometric Properties of the Animated Full Facial Expression Comprehension Test

The animated full facial expression comprehension test (AFFECT) has already been used in a clinical setting (Gagliardi et al., 2003). However, as no attempt had yet been made to test its psychometric properties, we also investigated its internal consistency and test–retest reliability. The data reported in the main study were used to evaluate internal consistency, whereas test–retest reliability was explored in a different group of healthy participants (N = 30), matched to the main sample for age, gender, socioeconomic status, and intelligence quotient, who were administered the AFFECT in two separate sessions one month apart. Internal consistency was calculated for each emotion on the basis of the four test blocks, each emotion comprising a total of 16 trials: 4 identities × 4 intensities. Test–retest reliability was calculated with Pearson correlation coefficients on the accuracy for each emotion, regardless of intensity. Anger, disgust, fear, and sadness showed a high degree of internal consistency, with alpha coefficients ranging from .70 to .85, whereas the internal consistency for happiness was less than optimal (α = .54) because the variance was 0 for 5 of the 16 happiness items: these stimuli were responded to correctly by all 240 participants and were not included in the calculation of the alpha coefficient. In line with Miller (1995), this difference in the number of items (11 vs. 16) accounts for the lower alpha coefficient for happiness compared with the other emotions, and is consistent with the results of psychometric analyses performed on other emotional expression recognition tests (Rojahn, Gerhards, Matlock, & Kroeger, 2000). Although somewhat counterintuitive, this finding suggests a reduced efficiency of the happiness-related items precisely because this emotion is easier to recognize. Nevertheless, the alpha coefficients (≥ .70 for all emotions except happiness) suggest that the AFFECT has good internal consistency. For test–retest reliability, disgust, fear, happiness, and sadness showed high correlations, ranging from .73 to .84, whereas anger showed a moderate correlation (r = .65); significance was p < .001 for all emotions. Good reliability was thus obtained for all emotions except anger, which was the most difficult emotion to recognize, and it can be assumed that this more modest test–retest correlation is due to its greater recognition difficulty. In conclusion, the AFFECT appears to have satisfactory psychometric properties, although additional information about its validity (e.g., construct validity) should be obtained.
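For readers who wish to compute comparable reliability estimates, the sketch below implements Cronbach's alpha (dropping zero-variance items, as was done for the happiness items) and a Pearson test–retest correlation. Array shapes and variable names are illustrative assumptions, not those of the original analysis.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for an (n_participants x n_items) array of item scores.

    Items with zero variance (e.g., happiness items answered correctly by
    everyone) carry no information and are dropped before computing alpha,
    as described in the appendix.
    """
    items = np.asarray(items, dtype=float)
    items = items[:, items.var(axis=0, ddof=1) > 0]   # drop zero-variance items
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)


def test_retest_r(session1, session2):
    """Pearson correlation between accuracy scores from two sessions."""
    return np.corrcoef(session1, session2)[0, 1]
```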

Acknowledgments

Thanks are due to Michael Burt for his comments on an earlier version of this manuscript. We are deeply indebted to Professor Ed Tronick for his useful comments and suggestions on all aspects of an earlier draft, and grateful to Professor Lynne Murray for her remarks improving the English language. We also wish to thank the schools' directors and teachers for their support, and Debora Castellano, Elisa Ceppi, Sara Colombo, and Chiara Vago, who at the time of the study were graduate psychology students at the Psychology Department, Catholic University of Milan, for their help in data collection. Finally, we would like to thank the participating children and their parents.

This work was partially supported by the Italian Health Ministry (RC 2001).

Note
1. The fact that happiness is the most accurately recognized emotion could be due to a ‘ceiling effect’ or a ‘happy advantage’ phenomenon. To control for this effect, the statistics were repeated with the happiness-related data excluded. No differences were found between the two sets of statistics (including and excluding happiness).