Volume 15, Issue 2 p. 292-298
PAPER

Infants generate goal-based action predictions

Erin N. Cannon

Department of Human Development, University of Maryland, College Park, USA

Amanda L. Woodward

Department of Psychology, University of Chicago, USA

First published: 03 December 2011
Erin N. Cannon, Department of Human Development, University of Maryland, College Park, MD 20742, USA; e-mail: ecannon@umd.edu

Abstract

Predicting the actions of others is critical to smooth social interactions. Prior work suggests that both understanding and anticipation of goal-directed actions appear early in development. In this study, on-line goal prediction was tested explicitly using an adaptation of Woodward’s (1998) paradigm for an eye-tracking task. Twenty 11-month-olds were familiarized to movie clips of a hand reaching to grasp one of two objects. Then object locations were swapped, and the hand made an incomplete reach between the objects. Here, infants reliably made their first look from the hand to the familiarized goal object, now in a new location. A separate control condition of 20 infants familiarized to the same movements of an unfamiliar claw revealed the opposite pattern: reliable prediction to the familiarized location, rather than the familiarized object. This study suggests that by 11 months infants actively use goal analysis to generate on-line predictions of an agent’s next action.

Introduction

Predicting the actions of others is integral to social functioning. Crossing a busy street, minding an active toddler, winning a tennis match, and engaging in a conversation all require on-line, moment-to-moment predictions about one’s partner’s actions. Often the most useful predictions reflect an analysis of one’s partner’s intentional states. A driver’s focus of attention can help one predict whether he is likely to pull into the intersection and a toddler’s expression of curiosity can help one predict where she will head next. Indeed, it is generally assumed that a central function of folk psychology is to generate predictions about others’ actions both in the moment and over longer time scales. In this way, social cognition is inherently prospective. The current study explored the developmental emergence of this aspect of social information processing.

By 3 years of age children can provide explicit predictions about a person’s next actions based on her intentions, desires and knowledge states. For example, knowing that Sally wants her puppy and believes it to be in the garage, young children predict that she will go to the garage, even if, in fact, the puppy is elsewhere (Wellman, 1992; Wimmer & Perner, 1983). Recent findings from Southgate and colleagues (Southgate, Senju & Csibra, 2007) demonstrate similar nonverbal predictions in 24-month-old children. When they saw a protagonist approach two containers, children in these studies looked predictively toward the one in which the protagonist had previously seen a toy hidden, even if the toy was no longer there. Thus, by 2 to 3 years of age, children recruit their understanding of others’ intentions in service of generating action predictions.

Do younger infants generate action predictions? Recent research has shown that infants in the first year of life analyze others’ actions in terms of their intentional structure. To illustrate, when infants are habituated to a goal-directed action, they subsequently show a stronger novelty response (longer looking) to test events which alter the goal of the action than to test events that preserve the goal while varying the physical properties of the action (Biro & Leslie, 2007; Woodward, 1998, 2009). Infants show this sensitivity to goals even when actions are not completed. For example, infants understand that reaches that strain toward but fail to contact distant objects are goal-directed (Brandone & Wellman, 2009; Hamlin, Hallinan & Woodward, 2008).

Although these findings demonstrate that infants are adept at analyzing the goal structure of others’ actions, they do not clarify whether infants generate action predictions based on this analysis. One interpretation of these experiments is that infants expect the actor to continue to act on the same object, and thus their longer looking on goal-change trials indicates surprise when this expectation is violated. However, an equally viable alternative is that infants respond to the novelty of the goal change without first generating a prediction about the agent’s subsequent actions. Similar questions have arisen in studies of physical reasoning: In some cases toddlers seem unable to generate predictions based on physical knowledge that appears to be present in younger infants (see discussions by Hood, Carey & Prasada, 2000; Keen, 2003; and Keil, 2006).

Another body of work shows that infants visually anticipate others’ actions, but this work leaves open the question of whether infants’ anticipatory responses are based on an analysis of the agent’s goal. Falck-Ytter, Gredebäck and von Hofsten (2006) tested infants and adults in an eye-tracking paradigm using video sequences in which a person iteratively put balls into a container. Adults and 12-month-old infants looked to the container reliably before the ball arrived there, and showed a stronger anticipatory response for these events than for matched events that did not involve a human agent. Further studies showed that infants anticipated more robustly when they viewed actions directed at a clearly marked endpoint (Gredebäck, Stasiewicz, Falck-Ytter, Rosander & von Hofsten, 2009). Six-month-old infants tested in Falck-Ytter and colleagues’ (2006) procedure failed to reliably anticipate the actions, but two recent studies indicate that infants at this age anticipate simpler actions that are common in their own experience. Kochukhova and Gredebäck (2010) found that 6-month-old infants who viewed feeding actions anticipated the arrival of the spoon at the mouth, but did not show similar anticipation for a less familiar combing action. Similarly, Kanakogi and Itakura (2011) found that 6-month-old infants anticipate object-directed grasping actions, and that infants’ tendency to do so is related to their own skill at reaching for objects (see Daum & Gredebäck, 2011, for related findings).

These findings show that infants attend prospectively to others’ actions from early in life. However, they do not clarify whether infants anticipated the goal per se because the goal and pattern of movement were confounded. That is, for events like putting several objects into a container, or moving a spoon repeatedly to the mouth, saccades to the goal are also saccades to the endpoint of a familiar trajectory. Infants may anticipate regularities in familiar patterns of movement (knowing that spoons go to mouths, for example), but not predict that an agent will maintain the same goal in subsequent actions. A stronger test of whether infants can generate action predictions based on an agent’s prior goal would assess infants’ predictions when the context has changed so that the same movement will not realize the prior goal.

The current study sought to address this issue. To distinguish goal prediction from movement anticipation, we adapted a strategy from earlier looking time studies (Woodward, 1998). Infants were familiarized to a repeated reaching action directed to one of two objects. The locations of the objects were then swapped, and infants’ anticipatory looks to the prior location versus the prior goal were assessed as the agent made an incomplete reach between the objects (see Figure 1a). If infants generate goal-based predictions, then we expected that they would look predictively toward the prior goal, rather than to the prior location. To evaluate whether infants’ patterns of predictive looking depended on their analysis of a goal-directed action, we included a control condition in which a claw moved in a similar manner to the hand (see Figure 1b). Prior studies have shown that infants do not readily encode the movements of a claw as goal-directed (Cannon & Woodward, 2010; Jovanovic, Király, Elsner, Gergely, Prinz & Aschersleben, 2007; Woodward, 1998), although they may do so when given more information about the source of the claw’s movements (Hofer, Hauf & Aschersleben, 2005).

Figure 1. Depiction, from left to right, of familiarization phase, swap trial, and test probe in the (a) hand and (b) claw conditions. In familiarization (left panels), the hand/claw made a straight midline reach, followed by a slight vertical movement and then a movement to the right along a curvilinear path until contact was made with the toy. In the swap trial (middle panels), no movement was present. Toys were shown in swapped locations. In the test probe (right panels), toys remained in the swapped locations and the hand/claw made only a straight midline reach.

Method

Participants

Forty 11-month-old infants were tested (M = 10 months; 28 days, range: 10;15–11;16). Half were randomly assigned to a human agent (hand) condition (10 males, 10 females; M = 10;25) and half to the less familiar claw condition (11 males, 9 females; M = 11;1). All infants had a gestation of at least 37 weeks. An additional two infants were tested in the claw condition and excluded from the analysis due to insufficient data.

Procedure

Data were collected via corneal reflection on a Tobii 1750 eye tracker with infant add-on, a remote 17′′ monitor with integrated eye tracking technology and a sampling rate of 50 Hz. The monitor was attached to a movable arm so viewing position could be adjusted to the optimal distance and height of the infant on the parent’s lap (approximately 60 cm from the screen). Clearview 2.5.1 software (Tobii Technology, Sweden) was used to run and record calibration, to present the stimuli, and to integrate gaze data with the viewed events. Prior to testing, each infant received a 9-point calibration to Clearview’s pre-set locations. Individual calibration points highlighted as unreliable by the program were repeated until reliability was obtained.

On each trial infants were shown an 18.5 second (s) video (resolution 720 × 480) presented in the center of the monitor, containing a rubber toy frog in one corner, and a rubber ball in the opposite corner of the scene (see Figure 1). The video comprised three familiarization events (3.5 s each), each followed by 500 ms of black screen, one swap event in which the toys were shown in new positions (3.5 s) also followed by 500 ms of black screen, and finally, one test probe event (2.5 s). During the familiarization events, infants viewed either an experimenter’s arm and hand, or a plastic rod with claw, as it moved across the scene to contact and grasp one of the toys. The hand or claw entered from the left side of the screen and moved straight across the scene, until just past midline and equidistant from the two toys (1500 ms). The hand or claw then deflected on a curvilinear path toward one of the toys, making contact after an additional 1500 ms. The hand or claw then grasped and held the toy for 500 ms (Figure 1, left panel). On each familiarization event, the entry of the hand/claw was accompanied by a brief double ‘boatbell’ sound, and the grasp of the toy was accompanied by a squeak sound. Following the three familiarization events, infants viewed the swap event in which the two objects were shown in reversed positions for 3.5 s accompanied by a jingle sound for the first 1800 ms (Figure 1, middle panel). Finally, on the test probe, infants saw the hand/claw enter from the left and move just past midline (1500 ms) and pause in this position for 500 ms before the screen went black for 500 ms (Figure 1, right panel). This sequence was repeated for four trials. Each trial was preceded by an attention-getting animation in the center of the screen.
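The timing of a single trial video can be checked with a short sketch. The durations below are taken from the description above; the function name, segment labels, and tuple layout are illustrative assumptions and are not part of the original stimulus software.

```python
# Event durations in milliseconds, as described in the Procedure.
FAMILIARIZATION_MS = 3500   # 1500 approach + 1500 deflection + 500 grasp and hold
SWAP_MS = 3500              # toys shown in reversed positions, no movement
TEST_PROBE_MS = 2500        # 1500 approach + 500 pause + 500 black screen
BLACK_MS = 500              # black screen between events


def build_trial_timeline():
    """Return a list of (label, onset_ms, offset_ms) tuples for one trial video."""
    segments = []
    for i in range(1, 4):
        segments.append((f"familiarization_{i}", FAMILIARIZATION_MS))
        segments.append(("black", BLACK_MS))
    segments.append(("swap", SWAP_MS))
    segments.append(("black", BLACK_MS))
    segments.append(("test_probe", TEST_PROBE_MS))

    timeline = []
    onset = 0
    for label, duration in segments:
        timeline.append((label, onset, onset + duration))
        onset += duration
    return timeline


timeline = build_trial_timeline()
total_ms = timeline[-1][2]  # offset of the final segment
```

Summing the segments in this way reproduces the stated 18.5 s video length, with the test probe occupying the final 2.5 s.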

The initial positions of the objects and the object grasped during familiarization were counterbalanced across infants in each condition. Because hands are asymmetric, the configuration of the hand could have served as a cue to reach direction. For example, on seeing the hand reach for the object nearest the thumb in familiarization, infants might assume thumbward movement in test. To prevent infants from extracting or using rules like this, we constructed the hand stimuli so that hand orientation could not be used to predict the goal on probe trials: The four test trials presented to each infant crossed the hand orientation (thumb toward versus away from the goal object) during familiarization with hand orientation on the test probe, thus presenting each of the four possible pairings. The order in which these four trials were presented was randomly assigned for each infant.¹
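The crossing of hand orientation at familiarization with hand orientation at test can be illustrated as follows. This is a minimal sketch, assuming two orientation labels and a seeded random generator; the label strings and function name are hypothetical and do not come from the original study materials.

```python
import itertools
import random

# Hypothetical labels for the two hand orientations relative to the goal object.
ORIENTATIONS = ("thumb_toward_goal", "thumb_away_from_goal")


def hand_orientation_trials(rng):
    """Return the four (familiarization, test probe) orientation pairings
    in a random order for one infant."""
    trials = list(itertools.product(ORIENTATIONS, repeat=2))
    rng.shuffle(trials)
    return trials


# Example: one infant's trial order, seeded only to make the sketch repeatable.
trials = hand_orientation_trials(random.Random(0))
```

Because every infant receives all four pairings, hand orientation at familiarization carries no information about hand orientation, or reach direction, at test.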

Coding

A recording of each infant’s gaze coordinates was integrated and overlaid on the stimulus video for fixations spanning 200 ms within a 50 pixel radius (approximately 1.6° visual angle). Each recording was exported at 15 fps from Clearview with a 500 ms gaze trace visible per frame for coding. Coders scored the test probe events only, without viewing familiarization trials so they were unaware of the target object for each infant. Coding of the test probe spanned the entire 2.5 s period, including the 500 ms black screen at the end. Trials were included in the analysis if the gaze trace revealed that the infant first looked at the hand/claw region (start AOI, see Figure 2) and then to one of the two objects (similar to the coding scheme used by Southgate et al., 2007). The saccade had to end in a visible fixation in the region of the toy in order to be coded, as opposed to cases in which a gaze was moving off screen but passed through the region where the toy was located. The toy regions of interest are displayed in Figure 2. The coders were trained to use these static regions of interest, but the AOIs were not visible in the gaze trace during coding. For this reason, responses on the test trials were coded by two independent observers (who were unaware of the familiarization goal) with 100% overlap. Agreement was reached on 94% of trials, confirming that coders could implement the coding criterion reliably. Infants were excluded from the analysis if two or more trials could not be coded using the defined criteria (n =2). Thus, all of the infants included in the study produced either three or four scoreable trials. Thirteen infants in the hand condition and 10 infants in the claw condition produced scoreable responses on all four trials.
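The coding criterion can be expressed as a simple rule over the sequence of fixated regions. The sketch below is an automated illustration of that rule only; the coding reported here was done by human observers from the gaze trace. The AOI labels, function names, and the choice to compute the proportion over scoreable trials are assumptions made for illustration.

```python
from typing import Optional, Sequence


def first_predictive_look(aoi_sequence: Sequence[Optional[str]]) -> Optional[str]:
    """Return 'target' or 'distractor' for the first toy fixation that follows
    a fixation on the start AOI, or None if the trial is not scoreable."""
    seen_start = False
    for aoi in aoi_sequence:
        if aoi == "start":
            seen_start = True
        elif seen_start and aoi in ("target", "distractor"):
            return aoi
    return None


def goal_prediction_proportion(trials, min_scoreable=3):
    """Proportion of scoreable trials whose first look went to the target.
    Returns None when fewer than min_scoreable trials can be scored,
    mirroring the exclusion of infants with two or more uncodable trials."""
    looks = [first_predictive_look(t) for t in trials]
    scored = [look for look in looks if look is not None]
    if len(scored) < min_scoreable:
        return None
    return sum(look == "target" for look in scored) / len(scored)


# Hypothetical fixation sequences for one infant's four test probes.
example_trials = [
    ["start", "target"],
    [None, "start", "distractor", "target"],
    ["target", "start", None, "target"],
    ["distractor"],  # never fixated the start AOI: not scoreable
]
proportion = goal_prediction_proportion(example_trials)
```

In this hypothetical example three trials are scoreable and two of them begin with a look to the target, so the infant would be retained with a goal-prediction proportion of two thirds.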

Figure 2. Areas of interest (AOIs) used in all analyses. The Start AOI (highlighted in green) encompassed the entire length of the arm or claw extended during familiarization and test probes. The target and distractor AOIs encompass the region surrounding the toys (highlighted in yellow and blue). For half of the infants, the frog was the target (goal) object, and half received the ball as the target (goal) object.

Results

Figure 3 summarizes the proportion of trials on which infants looked predictively to the prior goal versus the prior location on test probe trials. Initial analyses revealed no reliable effects of the sex of the infant, the identity of the goal object, or the position of the goal during familiarization, and therefore these factors were not included in subsequent analyses. Further, an analysis of variance with Hand Orientation (concordant: goal location concordant with hand orientation during familiarization and test vs. discordant: goal location discordant with hand orientation during familiarization and test) as a repeated measures factor in the hand condition yielded no reliable effect, F(1, 19) = .38, p > .54, thus showing that the orientation of the hand in the familiarization and test phase did not influence infants’ responses in this condition. Therefore, this counterbalanced factor was not included in subsequent analyses.

Figure 3. Mean proportion of predictive looks to the goal object during test probes in the hand and claw conditions. Error bars represent standard errors. Chance = .50.

The focal analyses evaluated infants’ predictive looks during the test phase in the two conditions. A one-way ANOVA with Condition (hand versus claw) as a factor on the proportion of goal object predictions was significant, F(1, 38) = 14.34, p < .01, ηp² = .27. Infants were more likely to predict the goal object in the hand condition than in the claw condition. Planned comparisons against chance (.50) indicated that infants in the hand condition systematically generated predictive looks to the goal object (M = .65, SD = .28, t(19) = 2.50, p < .05), whereas infants in the claw condition systematically generated predictive looks to the location of the familiarization movements (M = .29, SD = .33, t(19) = 2.85, p = .01). Moreover, as indicated by the individual trial means in Figure 4, these trends were evident from the first trial onward.
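The comparisons against chance and the effect size follow standard formulas, which can be sketched from the reported summary values. The helper names below are illustrative assumptions, and because the inputs are rounded means and standard deviations, the recomputed statistics can only approximate the reported ones.

```python
import math


def one_sample_t(mean, sd, n, mu0=0.5):
    """t statistic for a sample mean tested against a fixed value mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Rounded summary values from the text (n = 20 per condition, chance = .50).
t_hand = one_sample_t(0.65, 0.28, 20)
t_claw = one_sample_t(0.29, 0.33, 20)   # negative: below chance, i.e. toward the prior location
eta_p2 = partial_eta_squared(14.34, 1, 38)
```

With these rounded inputs the claw-condition statistic has a magnitude close to the reported 2.85 and the effect size rounds to .27. The hand-condition statistic comes out slightly below the reported 2.50, which is plausible given that the published mean and standard deviation are rounded to two decimals.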

Figure 4. Mean proportion of predictive looks to the goal object across trials. Error bars represent standard errors. Chance = .50.

Follow-up analyses were conducted to test whether infants were equally attentive to the hand and claw events during the familiarization phase. First, to evaluate whether the two conditions led infants to attend in similar ways to the target and distractor objects, areas of interest (AOIs) were created encompassing the areas surrounding the target (goal) and distractor objects (see Figure 2), and infants’ duration of attention to these regions during the familiarization phase was calculated using the fixation durations exported from Clearview. The average proportion of the trial infants fixated the target (goal) object was .33 (SD = .16) and .30 (SD = .12) for the hand and claw conditions, respectively, whereas the average proportion of the trials infants fixated the distractor during familiarization was .11 (SD = .10) and .07 (SD = .06) for hand and claw conditions, respectively. A 2 (Condition: hand, claw) × 2 (Region: target, distractor) mixed ANOVA on the average proportion of the trial indicated a significant main effect of Region, F(1, 38) = 71.46, p < .001, ηp² = .65, and critically, no effect or interaction of Condition (ps > .14). Thus, infants in both conditions looked more to the target object than the distractor during familiarization, and they did not differ in this regard.

We next asked whether infants in the hand and claw conditions attended similarly to the movements of the hand or claw toward the target during familiarization. To do this, we calculated infants’ latency to reach the target object, following an initial fixation to the start AOI (see Figure 2) during the familiarization phase in each condition. On average, infants shifted gaze to the target 1863 ms (SD = 438 ms) into the event in the hand condition, and they shifted to the target object at 1963 ms (SD = 416 ms) in the claw condition. Given that it takes up to 200 ms to launch a saccade (e.g. Engel, Anderson & Soechting, 1999), it is clear that in both conditions saccades were launched, on average, after the hand or claw began deflecting towards the object (1500 ms into the event), yet before the agent clearly arrived in the target region (approximately 2300 ms into the event). Latency to fixate the target did not differ between the two conditions, t(38) = .74, p > .46. Thus infants in the two conditions were equally attentive to the movements during familiarization, and anticipated the endpoints of the familiarization movements to the same extent in the two conditions.
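The timing argument above can be restated as a small check. This is a sketch only: the constants are the values given in the text, while the function name and the decision to subtract the full 200 ms launch time from the mean arrival latency are illustrative assumptions.

```python
SACCADE_LAUNCH_MS = 200      # upper bound on saccade launch time cited in the text
DEFLECTION_ONSET_MS = 1500   # hand/claw begins deflecting toward the toy
ARRIVAL_MS = 2300            # approximate arrival of the hand/claw in the target region


def launched_during_deflection(gaze_arrival_ms):
    """True if a gaze shift landing on the target at gaze_arrival_ms was launched
    after the deflection began and landed before the hand/claw arrived."""
    launch_ms = gaze_arrival_ms - SACCADE_LAUNCH_MS
    return launch_ms > DEFLECTION_ONSET_MS and gaze_arrival_ms < ARRIVAL_MS


hand_ok = launched_during_deflection(1863)  # mean latency, hand condition
claw_ok = launched_during_deflection(1963)  # mean latency, claw condition
```

Both mean latencies satisfy the check, which is the sense in which gaze shifts in both conditions were anticipatory with respect to the endpoint yet followed the onset of the deflection.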

Thus, these analyses indicate that differences in infants’ attention to the hand and claw familiarization events did not drive their differential predictions during the test phase. These results are consistent with analyses of infants’ attention during looking time and imitation studies, which have consistently shown that inanimate movement toward a target can entrain infants’ attention and lead them to attend to the target as effectively as goal-directed actions do, even when infants do not perceive the inanimate movement as goal-directed (e.g. Gerson & Woodward, in press; Mahajan & Woodward, 2009; Woodward, 1998).

Discussion

Research has shown that infants represent others’ actions as goal-directed. The current findings provide the first evidence that infants in the first postnatal year use this analysis to generate rapid, on-line predictions about others’ next actions when the context has changed. When 11-month-old infants viewed a person reaching for one of two objects, and then saw that the objects’ locations had changed, they predicted that her subsequent reaches would be directed to the prior goal. When infants viewed similar events involving a claw instead of a person, they generated the opposite prediction – looking systematically to the prior location to which the claw had moved. Infants never viewed completed actions during probe trials in either condition, and thus their predictive gaze shifts could not have reflected learning about where the hand or claw would go once the targets had moved. Rather, infants generated differential predictions about these two kinds of events based on what they know about agents and objects.

These findings raise several questions for further research. To start, looking time studies indicate that infants perceive action as goal-directed by 3 to 5 months of age (Sommerville, Woodward & Needham, 2005; Woodward, 1998). Are infants able to use this knowledge to generate on-line predictions from the start, or does this ability emerge later in development? As reviewed above, infants attend prospectively to movement patterns in others’ actions by 6 months of age. It is possible that this early prospectivity reflects the ability to generate goal-based predictions, although further research is needed to evaluate whether this is the case. It is also possible, however, that the ability to generate goal-based action predictions depends on later developments in general cognitive capacities, such as working memory, or in domain-specific abilities, such as the nature or robustness of action knowledge.

Further, looking time studies indicate that infants’ analysis of action goals is influenced both by first-person action experience (Gerson & Woodward, under review; Sommerville et al., 2005; Sommerville, Hildebrand & Crane, 2008) and by abstract movement cues (Biro & Leslie, 2007; Gergely & Csibra, 2003; Luo, 2011). Is infants’ on-line goal prediction informed by both sources of information or does it rely, at least initially, on only one of them? One approach to this question is to assess whether infants generate goal-based predictions equally readily for abstract events as for human actions. That is, when cues are present that have been shown to support infants’ analysis of abstract events as goal-directed, do infants also generate goal-based action predictions? A recent study by Daum and colleagues (Daum, Attig, Gunawan, Prinz & Gredebäck, under review) suggests not. In their study, infants and young children viewed animated events in which a novel, self-propelled agent approached one of two targets. Although infants responded systematically to goal changes in the looking time procedure, it was not until 2 to 3 years of age that children produced goal-based action predictions in response to the events. Likewise, the claw events we used here included two cues during familiarization thought to be important to animacy detection – movement along a non-linear trajectory and the presence of an end-effect (the sound that accompanied contact with the object). Further research is needed, however, to evaluate whether including different or more numerous animacy cues would support goal-based predictions in infants.

A second approach to this question is to investigate the role that action experience plays in supporting on-line goal prediction. It has been suggested that shared action production-perception systems support developments in action understanding (Falck-Ytter et al., 2006; Gerson & Woodward, 2010) as well as mature action anticipation (Johansson, Westling, Bäckström & Flanagan, 2001). Given the prospective nature of action control, if this suggestion is correct then infants’ goal prediction may be strongly influenced by action experience. Recent findings indicate correlations between infants’ own actions and their anticipation of others’ movements (Cannon, Woodward, Gredebäck, von Hofsten & Turek, 2011; Gredebäck & Kochukhova, 2010), and between motor system activity at the neural level and action anticipation (Southgate, Johnson, El Karoui & Csibra, 2010), but as yet connections between motor experience and goal-based action predictions, as assessed in the current study, have not been investigated.

These open questions aside, the current findings contribute to growing evidence that infants’ understanding of goal-directed action is robust and generative, even in the first year. This understanding drives infants’ responses in looking time experiments (Woodward, 1998), imitative behavior (Gerson & Woodward, in press; Hamlin et al., 2008; Mahajan & Woodward, 2009), and behavior during social interactions (Behne, Carpenter, Call & Tomasello, 2005). The current findings show that infants’ understanding of others’ goals also shapes their on-line predictions about others’ next actions.

Footnotes

1. A pilot study, which did not counterbalance the orientation of the hand, obtained similar results to those reported here. Forty 11-month-old infants were tested (M = 10 months; 29 days, range: 10;15–11;15). Half were randomly assigned to the human agent condition (8 males, 12 females; M = 10;26) and half to the mechanical claw condition (14 males, 6 females; M = 11;2). Infants were tested in a similar procedure to the one reported here except that hand orientation with respect to the target was consistent across the familiarization and test phases. For example, if the right hand reached for the target object during the familiarization phase, the left hand made the reach at test. Because both the location of the objects and the hands were switched, the orientation of the hand could be used to predict the direction it would take. Infants were given nine trials, but because attention greatly declined in the second half of the session, analyses were conducted only on the first four trials. A one-way ANOVA of Condition (hand or claw) on the proportion of goal object predictions revealed that there were reliably more goal predictions in the hand condition than in the claw condition, F(1, 38) = 10.00, p < .01, ηp² = .21. Infants in the hand condition generated systematic goal predictions (M = .69, SD = .36, t(19) = 2.35, p < .05), whereas infants in the claw condition generated systematic location predictions (M = .31, SD = .39, t(19) = 2.14, p < .05).

Acknowledgements

This study was supported by a Young Scholars Award from the Jacobs Foundation to E. Cannon, and by grants from the National Science Foundation (#0951489), the Office of Naval Research (#N000140910126) and NICHD (P01 HD064653) to A. Woodward. We thank Courtney Keeler for assistance in coding these data, and all of the families who participated.
