The impact and implications of virtual character expressiveness on learning and agent–learner interactions
Abstract
The possible benefits of agent expressiveness have been highlighted in previous literature; yet, the issue of verbal expressiveness has been left unexplored. I hypothesize that agent verbal expressiveness may improve the interaction between pedagogical agents and learners, ultimately enhancing learning outcomes. Evidence from a quasi-experimental investigation indicates that learners who interacted with an expressive agent 1) scored higher on a post-task exam; and 2) rated the agent's ability to interact higher than learners who interacted with a nonexpressive agent. Qualitative results provided insight into this finding, while indicating the complexity of deploying pedagogical agents in educational settings.
Introduction
Computer-mediated interactions often lack visual and emotional cues (Burgoon & Le Poire 1999). While humans rely on nonverbal cues to supplement verbal interactions and infer emotions and intentions in face-to-face interactions, such capabilities are partly lost when interaction is mediated by technology. Although the lack of nonverbal cues is prominent in text-based interactions (e.g. discussion boards), such cues may also be absent in audio-based interactions. Most notably, when software applications translate text to speech, oral cues, such as verbal intonations, may be lost. The absence of such cues may impede interaction, usefulness, and usability, while the presence of cues enhances people's ability to understand speech (Killion 1993). For instance, when engaging with an online module to learn a foreign language, the absence of proper pauses, enunciation, and emphasis may hinder engagement with the task. Conversely, the presence of verbal cues may enhance comprehension (Jones et al. 2007) and learning activity (cf. Peña et al. 2002).
Pedagogical agents, virtual characters employed in educational settings for instructional purposes, interact with learners using a range of text-based and audio-based communication. With regard to audio-based communication, agents are often paired with text-to-speech software that allows them to respond to learners dynamically, translating text-based information into its equivalent audio form (Song et al. 2004). The expressive shortcomings inherent in text-to-speech algorithms and technology, however, may impede interaction (Nass et al. 2006), especially since computer-synthesized voices are perceived less favorably than human voices (Stern et al. 1999; Mayer et al. 2003). More specifically, the lack of pedagogical agent verbal expressiveness may hinder communication and learning (Veletsianos et al. 2009). To investigate this hypothesis, I first review the literature with regard to instructor and pedagogical agent expressiveness and note the theoretical propositions regarding the ways learners and agents interact. Next, I present my research questions, method, and results. I conclude by discussing the findings of this study and making recommendations for future research and practice.
Instructor expressiveness
A large body of literature has examined the issue of instructor expressiveness, particularly as it relates to teaching evaluations. While learners may use multiple sources of information to rate instructors, educator personality appears to heavily influence instructional ratings (Feldman 1986). Equally important, while instructor expressiveness appears to heavily influence ratings, lecture content seems to relate minimally to instructional ratings (Abrami et al. 1982). Anderson (1979), Perry (1985), Cashin (1995), and Murray (1997) highlight the importance of expressiveness in relation to cognitive and affective outcomes in what Brown and Atkins (1988, p. 15) call ‘an essential ingredient of lecturing’. More specifically, these authors argue that instructor expressiveness may be related to learning outcomes because expressiveness can:
- be interpreted as instructor enthusiasm;
- attract and keep student attention;
- facilitate communication and learner understanding;
- promote learner engagement;
- intrigue learners, making class more interesting;
- encourage active participation;
- smooth interactions; and
- reduce psychological distance, thus promoting immediacy.
The influence of instructor expressiveness has been termed the ‘Dr. Fox’ effect. The original ‘Dr. Fox’ study found that an actor that presented an animated version of a lecture void of educational content received high instructional ratings (Naftulin et al. 1973). Due to its surprising results, the original study was replicated and extended. A meta-analysis of the ‘Dr. Fox’ studies found that instructor expressiveness heavily impacts student ratings but has a small influence on learner attainment (Abrami et al. 1982): students may be intrigued and engaged by expressive instructors, but not ‘seduced’ into learning more. It is therefore important to highlight the value of expressiveness as a means to enhance communication, smooth interaction, and orchestrate class enthusiasm, rather than as a goal of teaching.
Pedagogical agent expressiveness
Prior to introducing the pedagogical agent expressiveness literature, it may be helpful to situate pedagogical agents within a theoretical learning framework. Specifically, pedagogical agents are often viewed in a sociocultural context (Gulz 2004). Sociocultural theorists note that social interaction plays a central part in cognition (Vygotsky 1962). Individuals learn by interacting, communicating, collaborating, and negotiating meaning with each other in a social context. Situating pedagogical agents in sociocultural views of learning means viewing students and agents as participants in a learning relationship where learners are scaffolded to higher levels in their Zone of Proximal Development (Vygotsky 1978), and where socially shared activities between learners and agents are transformed and internalized (cf. John-Steiner & Mahn 1996). The roles ascribed to pedagogical agents (e.g. digital teachers, tutors, and learning companions (Payr 2003)) reinforce the notion that social interaction is central to learning with agents.
To date, extensive multidisciplinary work has focused on enhancing the expressive qualities of pedagogical agents (Johnson et al. 2002b). Nevertheless, the term ‘expressiveness’ is often used with varied meanings. For instance, researchers have investigated the use and impact of agents' affective expressions (Lester et al. 1999; Baylor et al. 2005), emotional expressiveness (Bickmore & Picard 2005), and communicative expressiveness (Lester et al. 1997). In other instances, the term expressiveness has been used more loosely to refer to pedagogical agents that are animated as opposed to static (Baylor & Ryu 2003). Importantly, our investigation of the literature has revealed minimal work with respect to pedagogical agents' verbal expressiveness in formal higher education contexts.
Prior work in verbal expressiveness has focused on expressivity as a way to instill human-like emotions to virtual characters, with diverse domains requiring different expressive styles (Theune et al. 2006). Johnson et al. (2002a) note that although natural sounding and expressive synthetic voices are important to match the lifelike appearance of virtual characters, domain-dependent factors necessitate further voice granulation. These authors describe efforts to develop synthetic voices for military training applications that encompass pedagogical agents noting that, ‘the style of speech used by military officers to give orders … is spoken to be understood clearly and to convey authority’ (p. 164). The importance of voice is also highlighted by Atkinson et al. (2005) who present evidence indicating that agents with a human voice are preferred to agents with a synthetic voice.
The [EnALI] framework for agent–learner interactions highlights the importance of pedagogical agent expressiveness (Veletsianos et al. 2009). Specifically, to enhance learning outcomes and student experiences when learners interact with pedagogical agents, it is recommended that agents should display socially appropriate demeanor, posture, and representation by being expressive. Expressiveness may allow the sharing of important social and verbal cues, reinforcing agent–learner interaction, and relationships. In addition, expressive clarity may enhance understanding, reduce frustration, and improve the ease of use of agent-based systems. Enriched clarity and agent–learner relationships may then improve comprehension, cooperation, attention, and ultimately learning.
Interacting with expressive agents: the media equation link
An important facet of mediated communication and interaction that applies to agent–learner interactions and needs to be illuminated is what researchers have called the media equation (Reeves & Nass 1996). Evidence by Reeves and Nass showed that humans treat computers and media in fundamentally social, natural, and human-like ways. By replicating experiments designed to examine social interactions between humans and applying them to interactions between humans and media, Reeves and Nass showed that users ascribe social rules to their interactions with media. For example, humans trust ‘expert’ computers more than ‘nonexpert’ computers (akin to humans trusting human experts more than nonexperts) even though computers lack any notion of ‘expertise’. The media equation explains how learners may interact with pedagogical agents: If humans treat media in inherently social ways, then, correspondingly, learners will treat pedagogical agents in a human-like fashion. For instance, learners stereotype pedagogical agents according to agents' visible outer characteristics, just like humans stereotype others (Veletsianos 2007; Haake 2009).
Research questions
For the purpose of this study the following research questions were specified:
- What is the impact of pedagogical agent expressiveness on learning?
- What is the impact of pedagogical agent expressiveness on student perceptions of the agent's ability to interact with them?
I hypothesize that learners interacting with expressive agents will learn more than learners interacting with nonexpressive agents. In addition, expressive agents will be rated higher in terms of perceived interaction ability than nonexpressive agents.
Method
Participants
Invited participants were enrolled in one early childhood education technology course and two elementary/special education technology courses. The courses were content- and cohort-specific, part of a 15-month post-baccalaureate master's programme in education, and the first and only required educational technology course taken by these students as part of their programme of study. Eighty-one students were invited to participate, and 59 chose to do so. Of these 59 participants, 54 were female and five were male, and their average age was 21.63 years (sd = 2.05).
Materials
The materials used in this study consisted of one tutorial lesson, two pedagogical agents, a post-test survey, a post-test exam, and an open-ended interview protocol.
Tutorial lesson
Two versions of a tutorial lesson were developed. The two lessons were presented in an informal and conversational tone and were identical in content. The lessons introduced participants to the use of technology in the classroom and raised multiple issues that teachers need to consider when integrating technology in their classrooms. The content of the tutorials was authentic and relevant to teacher practice. Importantly, the issues raised in the tutorial lessons were going to be explored in class but were unfamiliar to the students, as this study took place the first time the class met.
The two versions of the tutorial lesson differed in terms of verbal expressiveness. In particular, the second version of the lesson was an exact replica of the first version of the lesson, except that it emphasized certain parts of speech by including 13 additional pauses, eight instances where the content was delivered in a louder voice, and six instances where words were better enunciated. These modifications improve acoustic parameters and represent one step towards enhancing agents' expressive abilities. While expressivity also serves communicative functions, such functions do not appear to be served by the modifications implemented in the second tutorial. These specific parts of speech were investigated due to ease of implementation and manipulation. It is important to note that discrepancies between experimental groups may arise when manipulating agents' expressive features that vary with nonverbal behavior. Such discrepancies were not evident in our study due to the nature of the specific features being investigated.
Pedagogical agent
One female pedagogical agent was used in this study (Fig 1). The same pedagogical agent delivered two versions of the tutorial lesson described above. In each case, the pedagogical agent was identical in body image, clothing, animation, dimensions, voice, and facial expressiveness. Using lip synchronization software and text-to-speech software, the agent was able to present the tutorial lesson described above in a spoken voice. In terms of the agent's nonverbal behaviors, gaze was predetermined, eye and eyebrow movement were coordinated, and advanced facial expressions were lacking. Even though the tools used to develop the pedagogical agent did not provide the capability for more advanced facial animations, these low-fidelity behaviors were perceived to match the agent's visual realism. Finally, although the agent's behavioral fidelity was simple, most real-world pedagogical agent deployments do not seem to employ behavioral fidelity beyond gaze and eye blinking. Using an agent with low behavioral fidelity therefore may yield results that would be comparable to what can be seen in real-world implementations.

Fig 1. The pedagogical agent used in this study (designed using http://Oddcast.com).
Post-task survey
Participants were asked to complete a survey that collected 1) demographic information (gender, age, and grade point average); 2) information on computer knowledge and skills; 3) information regarding knowledge of technology use in education; and 4) information regarding perceptions of the agent's ability to communicate and interact with participants. Survey responses were combined to form one index measuring computer knowledge and skills and another measuring knowledge of technology use in education. Cronbach's alpha, a coefficient of reliability, was used to assess the internal consistency of the items in each index and thereby to justify combining them. In the social sciences, values above 0.70 are considered satisfactory. Cronbach's alpha was 0.71 for the computer knowledge and skills index and 0.94 for the knowledge of technology use in education index.
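As an illustration of the reliability coefficient described above, the following is a minimal sketch of how Cronbach's alpha can be computed from item-level responses. It assumes the standard formula based on sample variances; the function name `cronbach_alpha` and the ratings are hypothetical and are not the study's data or software.

```python
from statistics import variance


def cronbach_alpha(items):
    """Return Cronbach's alpha for a list of survey items.

    `items` holds one list of scores per item; position i in every
    list belongs to the same respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if len({len(scores) for scores in items}) != 1:
        raise ValueError("every item needs one score per respondent")

    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    # Sample variance of each respondent's total (summed) score.
    totals = [sum(row) for row in zip(*items)]
    total_variance = variance(totals)

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings: three items, five respondents (not study data).
example_items = [
    [3, 4, 5, 2, 4],
    [2, 4, 5, 3, 4],
    [3, 5, 4, 2, 5],
]
alpha = cronbach_alpha(example_items)
print(f"alpha = {alpha:.2f}")  # prints "alpha = 0.89"
```

Under the 0.70 convention mentioned above, a value such as this one would be taken to justify summing the items into a single index.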
Post-task exam
Participants were also asked to complete a test consisting of fifteen questions. Of the fifteen questions, 12 were information recall and three were analytical, while seven were multiple-choice, five were fill-in-the-blanks, and three were true-false questions. To minimize threats to the exam's validity, participants were encouraged to avoid guessing if they did not know or remember the answer to a question.
Open-ended interview protocol
Participants were also invited to a focus group session. In these group discussions, participants were asked to discuss their experiences of interacting with the pedagogical agent. The focus group sessions were conducted in an open-ended and unstructured manner. Appendix A presents the guiding focus group questions. Expressiveness (i.e. the main variable of interest) was deliberately omitted from the guiding questions to avoid influencing participants' post hoc rationalization of their experience: expressiveness was to be discussed only if participants alluded to it.
Experimental design and treatments
A between subjects factorial design with two independent samples was employed. The experimental factor was pedagogical agent expressiveness with participants randomly assigned to either the expressive or the nonexpressive group.
Dependent measures
Perceived interaction ability
The agent's ability to interact with learners was evaluated as a composite measure of three survey items. Specifically, participants were asked to rate their communication with the agent in terms of smoothness, naturalness, and effectiveness. Smooth communication is perceived to be one that flows and is not abruptly interrupted; natural communication is one that occurs in a ‘normal’ way and is not imposed upon communicators; and effective communication refers to communication that is able to achieve its intended purpose. These three variables were combined to form the perceived interaction ability index. Cronbach's alpha for the index was assessed at 0.79.
Learning
Outcomes were assessed via the post-task exam described above. Answers were graded to form a total score for each participant.
Data analysis
Quantitative data
Experimental data were analyzed using the between-subjects Multivariate Analysis of Covariance (MANCOVA) procedure. Specifically, MANCOVA assisted in examining the extent to which expressiveness influences 1) learning outcomes; and 2) perceptions of pedagogical agent interaction ability. Significant MANCOVA effects were further examined with univariate ANOVA procedures. For all quantitative analyses, alpha was set at 0.05.
Qualitative data
To analyze the qualitative data collected, I used the constant comparative (grounded theory) method (Glaser & Strauss 1967), arriving at salient categories and data patterns. Qualitative data were analyzed in five passes. First, the data from each experimental group were analyzed individually to note emerging patterns and to gain a broad understanding of the learner experience. Second, the data from each experimental group were analyzed individually in search of higher-level patterns that would enable the researcher to understand and interpret the meaning of the learner experience. Third, data across groups were analyzed in search of common themes and meanings. Fourth, data from all groups were analyzed to probe the concept of agent expressiveness. Finally, once all focus group transcripts were analyzed in the manner described above, the patterns were compiled and reanalyzed in order to confirm and disconfirm the themes across all qualitative data. Analysis across and between focus group data continued until no more patterns could be identified.
Methodologically, the qualitative portion of this study falls within the broad framework of the interpretive research paradigm and the constructivist realm. Under the interpretive research paradigm I employed a case study method (Yin 2003), where the integration of the pedagogical agent in these classrooms was perceived to be the case under investigation. The case study method was chosen because of a desire to describe, understand, and explain complex real-life phenomena that occurred in semi-authentic situations (Haas Dyson & Genishi 2005). Finally, I understand this analysis to fall within the constructivist realm as I do not purport to uncover a single, monolithic truth. Rather, I perceive the existence of multiple, complementary, and contradictory truths that coexist within the use and deployment of pedagogical agents in education. Thus, the analysis keeps an open eye for the unknown and the unexpected that may coexist and be symbiotic to the known and the expected, ultimately informing the experimental results of this investigation.
Procedure
One researcher visited three educational technology course sections on their first course session. At that time, the students were informed of the research and the tasks involved. Participants were told that a virtual character would present them with information regarding the use of technology in the classroom. To avoid confounding the results, participants were not informed that expressiveness was the main variable examined in this research. Participation in the research was strictly voluntary and those students who chose not to participate were permitted to work on course assignments.
Participants were then directed to view the tutorial lesson presented by the pedagogical agents. Each participant was seated in front of a Windows desktop computer, equipped with headphones and the pedagogical agent software. Prior to commencing the task, participants were directed to pay attention to the lesson and to refrain from taking notes or engaging with any other computer task. Participants wore the headphones, launched the software, tested the audio equipment, and, if all worked fine, began viewing the tutorial lesson. If any issues arose prior to the commencement of the presentation, participants were directed by the software to raise their hands and a researcher would provide any necessary assistance. At the end of the lesson, participants were redirected to a website where they could enter their answers to the post-task survey and test. On average, this process lasted for approximately 40 min. At the end of the experiment, participants were asked to share and discuss their experiences in a focus group format. The researcher first explained the focus group setting and its intention and then engaged participants in a discussion regarding the experience of interacting with pedagogical agents. The focus group sessions lasted approximately 20 min each.
Results
Quantitative results
A MANCOVA indicated a significant main effect for the treatment factor [Wilks' Λ = 0.83, F (2, 52) = 5.38, P = 0.01, partial η² = 0.17]. Means and standard deviations for the treatment factor are shown in Table 1.
Dependent variable | Agent | n | Mean | sd |
---|---|---|---|---|
Learning | Nonexpressive | 29 | 6.72 | 1.73 |
Learning | Expressive | 30 | 7.90 | 1.61 |
Learning | Total | 59 | 7.32 | 1.76 |
Interaction ability | Nonexpressive | 29 | 7.55 | 1.74 |
Interaction ability | Expressive | 30 | 9.00 | 2.69 |
Interaction ability | Total | 59 | 8.29 | 2.37 |
Results indicated that the type of treatment that participants were assigned influenced learning outcomes and student perceptions of the agent's interaction ability. No significant effects were observed for the covariate factors of gender, grade point average, computer skills, and knowledge of classroom integration practices. In other words, agent expressiveness impacted learning outcomes and student perceptions of agent ability.
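The multivariate statistics reported above can be cross-checked against one another. The sketch below assumes the textbook relations that hold when an effect has a single hypothesis degree of freedom (two groups): partial η² = 1 − Λ, and F = ((1 − Λ)/Λ) × (df₂/df₁). The function names are hypothetical, and the source does not state which software or formulas produced the reported values.

```python
def partial_eta_squared(wilks_lambda, s=1):
    """Partial eta squared from Wilks' lambda.

    s = min(number of dependent variables, hypothesis df);
    s = 1 for a two-group comparison.
    """
    return 1 - wilks_lambda ** (1 / s)


def f_from_wilks(wilks_lambda, df_num, df_den):
    """F statistic implied by lambda (exact only when hypothesis df is 1)."""
    return ((1 - wilks_lambda) / wilks_lambda) * (df_den / df_num)


def wilks_from_f(f_value, df_num, df_den):
    """Invert the relation above to recover lambda from a reported F."""
    return 1 / (1 + f_value * df_num / df_den)


# Values as reported in the text: lambda = 0.83, F(2, 52) = 5.38.
print(f"partial eta^2 = {partial_eta_squared(0.83):.2f}")     # 0.17
print(f"F from rounded lambda = {f_from_wilks(0.83, 2, 52):.2f}")  # 5.33
print(f"lambda from reported F = {wilks_from_f(5.38, 2, 52):.2f}")  # 0.83
```

The small gap between 5.33 and the reported 5.38 is what one would expect from Λ having been rounded to two decimals; working backward from F = 5.38 gives a Λ that rounds to 0.83.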
Perceived interaction ability
Follow-up ANOVA tests (Table 2) indicated that the two pedagogical agents (expressive vs. nonexpressive) differed significantly in terms of their perceived interaction ability [F (1, 53) = 5.82, P = 0.019]. In other words, expressiveness influenced agent ratings with regard to agents' ability to interact effectively, smoothly, and naturally with the learners. Specifically, the expressive agent's perceived interaction ability was rated more favorably (M = 9.00, sd = 2.69) than the nonexpressive agent's interaction ability (M = 7.55, sd = 1.74). The standardized effect size for this difference was medium-large (Cohen's d = 0.64). The difference between the groups is illustrated in Fig 2.
- * Significant at the 0.05 level.

Fig 2. Dependent variable means by group.
Learning
Follow-up ANOVA tests (Table 3) indicated that learner outcomes significantly differed between the two treatment groups [F (1, 53) = 7.13, P = 0.01]. Specifically, participants assigned to the expressive agent group scored significantly higher on the post-task exam (M = 7.90, sd = 1.61) than participants assigned to the nonexpressive agent group (M = 6.72, sd = 1.73). The standardized effect size for this difference was medium-large (Cohen's d = 0.71). The difference between the groups is illustrated in Fig 2.
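The two effect sizes reported above can be reproduced from the group means, standard deviations, and sample sizes in Table 1. The sketch below assumes the pooled-standard-deviation form of Cohen's d applied to the unadjusted means; the source does not state which variant was used, and the function names are hypothetical.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference (group 1 minus group 2)."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Expressive (n = 30) vs. nonexpressive (n = 29), values from Table 1.
d_interaction = cohens_d(9.00, 2.69, 30, 7.55, 1.74, 29)
d_learning = cohens_d(7.90, 1.61, 30, 6.72, 1.73, 29)

print(f"interaction ability: d = {d_interaction:.2f}")  # 0.64
print(f"learning:            d = {d_learning:.2f}")     # 0.71
```

Both values match the reported effect sizes to two decimals, which is consistent with, though not proof of, the pooled-sd formula having been used.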
Qualitative results
A qualitative analysis of the focus group data revealed six themes that were divided into three levels of analysis (Table 4). These will be described in turn.
Level of analysis | Theme |
---|---|
Differences between groups | Expressiveness. |
Stated similarities across groups | Multidimensional and contrasting agent perceptions. |
Stated similarities across groups | Items of distraction. |
Stated similarities across groups | Pedagogical agent affordances. |
Unstated similarities across groups | Use of the ‘human’ as a measure of comparison. |
Unstated similarities across groups | Agent-focused with the lesson on the periphery. |
Expressiveness
Without being prompted, approximately 86% of participants raised the issue of agent expressiveness. Agent expressiveness dominated the focus group discussion, with about 73% of participants in the expressive pedagogical agent group noticing the heightened expressive abilities of the agent with whom they interacted. More specifically, about 54% of participants in both groups commented that the pedagogical agent lacked expressive capabilities, but only participants in the expressive group noted that the agent's voice encompassed expressive cues. This result indicates that the statistical differences between the two groups are meaningful.
Most frequently, in both groups, the agent was described as ‘monotone,’ ‘robotic,’ and as having ‘no emotion’. Sue summarized the perceived impact of the agent's oral expressiveness as, ‘Without having emphasis on anything you didn't really know if you needed to be paying attention to her, so it was easy to get distracted by what was on the screen and not pay attention to what she was saying and not focusing’. In the expressive group, however, the pedagogical agent was also described as encompassing expressive capabilities. One participant mentioned that ‘she [the agent] seemed to say some things louder to place emphasis on words’ while another noted that ‘there were stopping points and the voice flowed’. The importance of pedagogical agents' expressive capabilities is a point to which I return in the discussion and implications section of this paper.
Multidimensional and contrasting agent perceptions
Multiple readings of the transcripts indicated that, regardless of group assignment, participants held multidimensional and contrasting perceptions of the pedagogical agents. On the one hand, they perceived the agent as encompassing a natural and real appearance, while on the other they noted the unnatural voice and communicative interaction (Isbister & Nass 2000; Sanchez-Vives & Slater 2005). Mary described the agent as ‘interesting to look at, boring to listen to,’ while Joan noted that ‘her appearance was natural … her voice was off’. The conflict between the agent's appearance and voice, between the naturalness and unnaturalness of her overall presence was well captured by Vanja who noted that the agent ‘didn't feel too realistic, but it also captivated me cause it was moving like it was real’. Bill expressed the same idea by focusing on the agent's eyes and voice, ‘The blinking was very human. It kept me looking at her. Although her voice was so monotone. But the eyes kept going on’. Finally, Jen expressed her views of the agent's head movement, ‘I liked the fact that she was moving her head, but she seemed like she was moving it too much. I liked that she did it – but, it was a little bit TOO much!’ To summarize, the students were both critical of and complimentary to the pedagogical agents, noting their dissatisfaction with the agents' voice while at the same time highlighting their approval of various other agent features that they found appealing.
Items of distraction
Regardless of agent expressiveness, researchers have noted that pedagogical agents may be distracting to learners (Clark & Choi 2005; Choi & Clark 2006). Although students participating in this study did not mention that they were distracted by the pedagogical agent per se, they noted specific features of the agent that distracted them or reasons that the experience was distracting. A tabulation of the features mentioned most frequently by the students as distracting revealed that ‘expressiveness,’ ‘word emphasis,’ and ‘not sounding right’ topped the list. Following those items were the agent's ‘nonenthusiasm,’ her blinking and moving eyes, the frequent movement of her head, students' preoccupation with the agent's ability to lip-sync, and the novelty of the learning experience.
As already noted in the first identified theme, learners were distracted by the agent's inability to be expressive. Yet, they were distracted by other features as well. For instance, Michelle said, ‘I almost was obsessed with watching her eyes cause there were times when her eyes to me just looked like really off … that was kind of distracting to me’. Jenny's distraction arose because of her desire to understand whether the agent's speech matched her eye blinking, ‘I was focusing more on the pace of the blink. Was it in line [with the agent's speech]? I was more preoccupied with that besides paying attention to what she was saying’. In addition, Mark explained that when something is novel it may also be distracting, ‘I was paying attention more to the actual 3D character than what she was saying … Maybe because I am not used to seeing and interacting with a 3D image like that – in education we are used to presentations and slideshows that are often still and mute’. Finally, 50% of participants in one group agreed with Sally's comment that if the agent was more natural and expressive, students would not have been distracted as much by her flaws, essentially allowing those qualities to compensate for some of the inadequacies of the agent, ‘I think a lot of it I would have overlooked more if she would have been more natural and refined. I would have been able to listen to her so I wouldn't have been looking for the … I don't know. Not looking for faults’.
Pedagogical agent affordances
The experience of interacting with a pedagogical agent encouraged participants to think about the possibilities of using a pedagogical agent in learning and teaching contexts. In the words of one participant, ‘What opportunities could this provide for students?’ Affordances were initially defined by Gibson (1979) as ‘possibilities for action’. For example, an apple affords to be eaten, while a baseball affords to be thrown. The idea of affordances was refined by Norman (1988) as ‘perceived possibilities for action’. Specifically, affordances are those possibilities for action that are perceived by the observer. To illustrate the difference, in the context of the apple and baseball example, Gibson's definition allows the baseball to be eaten and the apple to be thrown, while Norman's definition allows for the more realistic likelihood of the apple to be eaten and the baseball to be thrown. Thus, affordances are suggestions based on a multiplicity of factors ranging from cultural to social to contextual considerations.
In our conversations, learners discussed the agents' educational and social affordances (Kirschner et al. 2004). In terms of social affordances, Kathy highlighted the social dimensions that pedagogical agents could introduce in distance learning courses, ‘I took a distance learning class and the format was that you had to go through the class and you don't ever interact with anyone. You go through everything yourself, so in that case I guess maybe the computer, maybe it would be nice to have someone talking to you, maybe being with you, instead of being alone in front of a computer screen’. In terms of educational affordances, Kris explained how a pedagogical agent could engage and motivate children to interact with the content, ‘it would be really interesting for learning for elementary kids. Not teaching a whole lesson or anything but if you like gave the agent a persona and gave them a name and like during your lesson they give you the fun facts or something and like the person would be theirs [the students']. They would know this person with its name and it is the person that pops up in the screen when you are doing this or that’.
Use of the ‘human’ as a measure of comparison
A universal observation that arose during conversations with the learners was that the pedagogical agent was compared to and evaluated according to human considerations, even though the agent was not always referred to as being ‘human’. From one point of view, this observation should not be surprising since 1) previous literature has shown that humans respond to media in human-like ways (Reeves & Nass 1996); and 2) the agent was portrayed as a human-like figure with human-like abilities. From another point of view, however, the pedagogical agent is simply a technology tool that was consciously referred to by participants as ‘it,’ ‘a robot,’ and ‘technology that was speaking, but not an actual person’. In addition, participants were quick to criticize the agent, whereas such overt assessments are frequently self-regulated and controlled in human–human interactions (Gilbert et al. 1988). Yet, students noted that ‘she didn't seem friendly,’ ‘her blinking was very human,’ ‘the voice … was not human,’ ‘she kind of looked away … that was in a sense natural,’ and ‘it's trying to do intonations, but it's still not natural’. All of the student quotes above illustrate the idea that the agent was judged according to human and social norms. Rather than evaluating various facets of the agent in terms of the degree to which they were effective, appealing, or even efficient, the students chose to evaluate these facets in terms of human-likeness. Essentially, the students argued that ‘the technology’ was not friendly, the voice and intonations were not natural, and the blinking and gaze were human – all agent features compared to the human norm. Finally, it is also important to note that recommendations for improvement not only focused on how to make the agent more human-like, but also centered on student perceptions of what the teacher is supposed to do while teaching.
For instance, Penny noted that it would be helpful if the agent used more than one mode of communication to convey a point, ‘if she had something on the screen while she was talking and maybe some highlights of what she was saying. Or, you know, how teachers sometimes write on the boards and show major points’. John echoed Penny's thoughts by noting that if the agent lost student attention she would not be able to regain it whereas a teacher could ‘throw something else in there that can maybe grab the student, bring everybody back maybe’.
Agent-focused with the lesson on the periphery
The focus group conversations centered on the agent and the learner experience of interacting with the agent, whereas only about 5% of student comments were related to the lesson. Importantly, students also indicated that they paid more attention to the agent than to the content of the lesson. For example, in one focus group 20 of 25 students agreed with the comment, ‘I paid more attention to the agent than the information in the lesson’. Although the focus of this research was the pedagogical agent, it was surprising that the actual lesson received such minimal attention, especially since the content of the lesson was relevant to the course and this was the students' first class session.
Discussion
The quantitative findings of this research support the hypothesis that an expressive agent, as defined in this paper, enabled participants to recall more information than a nonexpressive agent. In addition, the expressive agent's interaction ability was rated more favorably than the nonexpressive agent's interaction ability. Nevertheless, the qualitative results of this study reveal the bigger picture in which agent expressiveness is situated: participants in both treatment groups commented that the agent's expressive ability was disappointing, even though participants in the expressive agent group noticed the agent's heightened expressive abilities. These findings indicate that 1) changes in agents' expressive qualities appear to influence learning outcomes and student perceptions; and 2) agent expressiveness is an important element of pedagogical agent design. In light of these findings, and of the supportive theoretical propositions and related empirical literature presented at the beginning of this paper, designers are advised to embed verbal expressive qualities in agent implementations. Researchers are also encouraged to explore the issue of agent expressiveness beyond the limited notion presented in this paper. Specifically, future research could focus on further delineations of verbal expressiveness (e.g. tone and pitch), oral cues, and/or facial expressiveness. Additionally, future research can extend the work presented herein by investigating finer gradations of agent expressivity in line with message coherency.
Student comments regarding the agents' demeanor and representation indicated the complexity of designing pedagogical agents for real-world settings. Whereas the design community has mostly focused on designing agents for use by learners, the current findings indicate that learners have valuable insights on how pedagogical agents should be designed. Rather than treating learners as mere users, it may be worthwhile to involve them in the design process to enhance pedagogical agents. As evidenced in this study, users can offer feedback that may prove beneficial in the development of pedagogical agents. Importantly, because learners have experienced the interaction that was planned for them by the designer, they can provide valuable insight on the experience and on the intended and unintended consequences of design decisions.
This study also revealed that learners may be distracted by numerous items when interacting with pedagogical agents (c.f. Clark & Choi 2005; Choi & Clark 2006). Researchers and designers should enhance various aspects of pedagogical agents to eliminate distractions. This can be achieved by investigating 1) which distracting features can be improved; and 2) which pedagogical agent features can be eliminated because they serve no social, cognitive, or pedagogical value. For instance, researchers could investigate additional ways to enhance pedagogical agent expressiveness to reduce learner distraction and enhance learning and agent–student interactions. It may be, for example, that the agent's blinking eyes sidetrack learners. Nevertheless, the agent's blinking eyes also make the agent appear more natural. The differential impact of such variables needs to be investigated to understand the relative influence that these variables exert on learning and the learner experience. Another example relates to agent believability: would these distractions persist if the agent's behavior were more natural and believable? Or would more natural and believable behavior still be distracting because the agent appears to be so ‘real’? Future research on these issues will be worthwhile.
Students also discussed the differential benefits that pedagogical agents could bring to learning contexts, raising the important issue of agent affordances. Pedagogical agents are often viewed with a technological affordance lens, concerned with the efficient and effective accomplishment of tasks (c.f. Reigeluth 1983). Nevertheless, this view ignores the educational and social capabilities that pedagogical agents can add to online learning contexts. For instance, well-designed pedagogical agents can establish social and cultural links with learners, enhancing affective aspects of learning. In addition, pedagogical agents can offer important pedagogical affordances. For example, agents can act as collaborators to learners, providing assistance only after learners have attempted to solve a given task and have demonstrated that they are in need of a scaffold. In essence, the notions of educational, social, and technological affordances appear important to enable designers and researchers to enhance the design of pedagogical agents. Delineating the advantages of pedagogical agents in terms of what pedagogical agents make possible may be beneficial for both researchers and designers.
In our attempts to enhance education via the use of technology, we often focus on the media and disregard the value and power of the lesson. Indeed, when pedagogical agents are used in learning and teaching contexts, researchers and designers need to 1) enhance the agent, its interaction capabilities, and its demeanor; and 2) ensure that the lesson, tutorial, or task is also interesting and engaging to the learner (c.f. Merrill 2008). The impact of pedagogical agent research and advances will be minimal if the agent is designed to deliver prerecorded and dispassionate lectures. As designers of instructional and learning experiences, we need to focus not only on designing media, but also on transforming content to engage and capture student attention. Rather than focusing our efforts strictly on researching agent characteristics and features, I suggest investigating how to improve the learner experience through a creative exploration of agent–learner interactions, agent-enhanced pedagogies (e.g. playful simulations), and learning environments mediated by virtual and pedagogical agents (e.g. MultiUser Virtual Worlds).
Finally, the idea that learners perceive agents as humans and interact with them in inherently social ways may be a double-edged sword. Even though perceiving the agent in a social way may enhance affective aspects of learning, judging the agent using a human gauge poses immense difficulty for designers and researchers. For example, the divergent comments made by the students could also be attributed to a mismatch between the agent's visual realism and behavioral fidelity. This issue is important because prior research has highlighted the benefits of perceiving media in inherently human ways (e.g. Reeves & Nass 1996), without paying sufficient attention to what it means for media to be evaluated using the human as the point of comparison. To resolve this problem, we are faced with two possible options.
The first option entails the elimination of some of the more advanced features of pedagogical agents (e.g. smooth and natural head movements) so as to lower user expectations (c.f. Norman 1997). Logically, reducing anthropomorphism would also reduce expectations of human behavior, demeanor, and intellect. Yet, this line of thought assumes that further agent improvements could make agents ‘more human’ and that eliminating agents' advanced features makes them ‘less human’, a line of reasoning that is not supported by prior empirical work. Humans apply human-like qualities to media even when such advanced characteristics are absent. For instance, Nass et al. (1997) discovered that vocal cues alone were sufficient to evoke sex-based stereotypes.
The second option, in accordance with previous work such as that described by Woo (2009), entails the holistic improvement and transformation of pedagogical agents. For instance, it is not enough to enhance pedagogical agents' conversational ability. Rather, designers and researchers need to enhance the interaction capabilities of pedagogical agents such that agents interact with learners in a smooth, natural, effective, efficient, engaging, and socially appropriate manner. One step towards this outcome is provided by the [EnALI] framework (Veletsianos et al. 2009), which represents a well-defined, extensive, and multifaceted framework for the design of agents and their interaction capabilities. Efforts aimed at holistically enhancing pedagogical agents and their interaction capabilities will pave the way for truly effective and engaging virtual companions.
Appendix A
- What did you think of the virtual character?
- What was difficult about your interaction with the virtual character?
- What was easy about your interaction with the virtual character?
- What did you like the most about the virtual character?
- What did you like the least about the virtual character?
- What are some adjectives that come to mind when asked to describe the virtual character?
If and when participants mention agent expressiveness:
- What did you think of the character's expressiveness or lack thereof? Was her demeanor helpful? Was it appropriate?