Volume 47, Issue 6 p. 1083-1095
Original Article

Using eye-tracking technology as an indirect instruction tool to improve text and picture processing and learning

Lucia Mason (Corresponding Author), Patrik Pluchino, Maria Caterina Tornatora

Address for correspondence: Professor Lucia Mason, Developmental Psychology and Socialization, University of Padova, Via Venezia 8, Padova 35131, Italy. Email: lucia.mason@unipd.it
First published: 20 March 2015
Citations: 66

Abstract

This study used an eye-movement modelling example (EMME) in the school context to corroborate and extend recent findings about the educational potential of eye-tracking technology for supporting strategic processing and learning from an illustrated text. Sixty-four seventh graders were randomly assigned to the modelling and non-modelling conditions to investigate whether (1) those with the opportunity to observe a model's eye movements while reading an illustrated text show greater integrative processing in their own reading and (2) they learn more deeply from text. Findings reveal that the students who observed the model's visual behaviour showed greater integrative processing of text and picture. They made more transitions from one representation to the other and strategically spent longer re-inspecting the picture while rereading the text and vice versa. These students also outperformed those in the non-modelling condition for deeper learning as revealed in the transfer of knowledge task. Moreover, students with lower reading comprehension skills benefitted more from observing the model's gaze replay when considering both the acquisition and transfer of knowledge.

Practitioner Notes

    What is already known about this topic
  • Successful learning from an illustrated text requires the integration of verbal and graphical information.
  • Often students do not pay much attention to pictures as they believe they comprehend them very easily.
  • Eye-tracking technology can be used as a research tool to track cognitive processing while reading.
    What this paper adds
  • Students who observed a video of a model's eye-movements while reading an illustrated text showed a more integrative processing of text and picture during their own reading of an illustrated text on a different topic.
  • Students who observed the model's visual behaviour were also superior in the task requiring the transfer of new knowledge.
  • Students with low reading comprehension skills benefitted more from observing the video with the model's gaze replay than students with higher reading comprehension skills.
    Implications for practice and/or policy
  • Eye-tracking technology has the potential to be an educational technology. The study suggests that it can be used to model one of the most common learning activities in the school context, that is reading an illustrated informational text.
  • The potential benefit of an eye-movement modelling example seems to extend to deeper learning from text.
  • Less skilled readers in particular can benefit more from observing a model's visual behaviour while strategically reading an illustrated informational text.

Introduction and rationale

This study aimed to corroborate and extend current research on the potential of eye-tracking technology in the educational context to model one of the most common activities carried out to learn disciplinary knowledge, that is, reading an illustrated text.

It is well known that eye-tracking technology can be used as a research tool in learning contexts to track cognitive processing while carrying out various tasks (eg, Wang, Chen & Lin, 2014; Zheng & Cook, 2012). It provides unique information concerning perceptual and cognitive processes that underlie learning performances (van Gog & Scheiter, 2010). Recently, this technology has attracted interest in educational research on multimedia learning (Eitel, Scheiter, Schüler, Nyström & Holmqvist, 2013; Mason, Pluchino & Tornatora, 2013; Mason, Tornatora & Pluchino, 2013).

What is less known is that eye-tracking technology can be used as an indirect instruction tool for sustaining the processes and outcomes of learning in educational contexts (Jarodzka et al, 2012; Mason, Pluchino & Tornatora, 2015; Seppänen & Gegenfurtner, 2012). In other words, eye-tracking technology has been increasingly used as a tool to investigate the allocation of attention while executing crucial tasks and its relation with learning performance. However, eye-tracking technology has not yet been used as an educational technology in the classroom. How can this technology become educational? Nowadays, there are non-intrusive eye-trackers that not only record eye-movements while performing a learning activity, but also make gaze replays available in the form of videos. These videos can easily be used in the service of learning. In this regard, eye-movement modelling examples (EMMEs) are a recent technology-supported tool in which videos of the gaze replays of an expert model a learner's behaviour. Specifically, the position of a skilful expert's eye-movements while performing an activity is shown to less skilful learners to help them carry out the same activity (van Gog, Jarodzka, Scheiter, Gerjets & Paas, 2009). Therefore, eye-tracking technology becomes an educational technology when it is used to teach, indirectly, how to perform a learning activity well.

Theoretically rooted in observational learning (Bandura, 1977), EMMEs are a special case of video-based modelling that has been increasingly used in education to model various tasks and activities (eg, Groenendijk, Janssen, Rijlaarsdam & van den Bergh, 2013). In the present study, we focused on modelling the strategic reading of an illustrated science text for two reasons: (1) it is crucial for successful learning in the school context given that almost all textbooks include text and illustrations and (2) students usually spend more time reading the verbal parts than they do inspecting the graphical parts of learning materials (Cromley, Snyder-Hogan & Luciw-Dubas, 2010a; Hannus & Hyönä, 1999; Schmidt-Weigand, Kohnert & Glowalla, 2010) and are often under the illusion that they comprehend illustrations (Schroeder et al, 2011).

When learning complex concepts, multimedia materials are potentially beneficial for the different functions that they offer (Ainsworth, 2006). The current study deals with the most common multimedia materials used in schools—texts accompanied by instructional pictures. Examining how students process and learn from illustrated texts is important for understanding how this crucial learning activity can be supported to make it more effective.

Theoretical accounts of learning from text and pictures (Mayer, 2009, 2014; Schnotz, 2014; Schnotz, Baadte, Johnson & Mengelkamp, 2012) agree that integrating verbal and graphical information is crucial for learning better from an illustrated text than from a text alone. Eye-tracking studies have investigated how the integration of text and pictures can be fostered by presenting signals (Jamet, 2014). For example, the most important terminology is written in the same colour within the text and the picture to increase its salience (Ozcelik, Arslan-Ari & Cagiltay, 2010; Ozcelik, Karakus, Kursun & Cagiltay, 2009).

In this study, we sought to sustain the integration of verbal and graphical information in a different way, that is, through the use of eye-movement modelling in the school context. To our knowledge, only one study has been carried out using this technology-supported tool for modelling lower secondary school students' integrative behaviour while reading an illustrated text to learn conceptual knowledge from it (Mason et al, 2015). In the present study, we sought to corroborate and extend the positive findings of this previous investigation. Corroborating means providing new evidence for the educational potential of eye-tracking technology in sustaining integrative processing during the more strategic second-pass, that is, rereading, in the service of learning. Extending the previous findings means examining its educational potential in relation to the crucial individual characteristic of reading comprehension skill which, by definition, is related to learning from text (Cromley, Snyder-Hogan & Luciw-Dubas, 2010b).

Specifically, three research questions (RQs) guided our study. RQ1: Does an EMME of integrative processing of text and picture in reading an illustrated text help students to integrate more verbal and graphical information during their own reading of an illustrated text? RQ2: Does the EMME of integrative processing of text and picture also help students to learn more deeply from the illustrated text? RQ3: Do reading comprehension skills moderate the effect of the EMME on learning?
Based on previous research (Mason et al, 2015), we formulated the following hypotheses. H1: Students with the opportunity to observe a model's gaze replay would show stronger integrative processing as indicated by the frequency of gaze shifts from the verbal to the graphical representation and vice versa, and the duration of their re-inspection of the graphical information while rereading the verbal segments and vice versa, during the delayed and more purposeful second-pass reading. H2: Students who observed the model's gaze replay would also perform better in the transfer of knowledge. More than surface learning, this measure of deeper learning should require the integration of verbal and graphical information. H3: Reading comprehension skill would moderate the benefit of the EMME, helping low-comprehenders to benefit more from the modelling of strategic reading of an illustrated text, according to an aptitude-treatment interaction. This refers to the fact that some instructional strategies (treatments; in this case the EMME) are more or less effective for particular individuals depending on their specific abilities (aptitudes; in this case reading comprehension) (Park & Lee, 2003; Snow & Swanson, 1992).

Method

Participants and design

Sixty-four students (35 girls and 29 boys) attending seventh grade (mean age = 12.52 years, standard deviation [SD] = 0.56) in two lower secondary schools in a north-eastern region of Italy were involved on a voluntary basis with parental consent. All were white, native-born Italians with Italian as their first language and shared a homogeneous middle-class social background. At the start of the experiment, participants were randomly assigned to two conditions: EMME (n = 33) and no-EMME (n = 31).

Only in the EMME condition was a video shown depicting the gaze replay of a model reading an illustrated text. The video had already been used in a previous study (Mason et al, 2015). It was introduced as the video of the eye-movements of a student who read an illustrated text and learned very well from it. Students were told that they would see red dots on the text and picture. Each red dot represents how long the model fixated the specific information: the larger the dots, the longer the time spent on the information. The video lasted 2 minutes and 53 seconds.

The video was shown individually to each participant in the EMME condition in a quiet room in the school, for the same period of time, after recording the baseline eye-movements and before the learning episode. Eye-movement recording necessarily requires individual presentation of the learning material on a computer screen. Thus, as well as the classroom, another room should also be available when using an eye-tracker in the school context.

The video was prepared taking into account research on successful strategies for learning from text and picture (Bartholomé & Bromme, 2009). Through the gaze replay, the video showed that the model initially read all the text for an overview of the verbal part of the learning material, which guided the model's picture inspection. The model thus began connecting the text and picture to each other, shifting from one to the other part of the learning material. The model also focused on the text segments that were not visualised.

It is worth noting that the topic of the text read by the model in the video (the water cycle) was different from the text topic used in the learning episode (the greenhouse effect). This allowed us to avoid participants in the EMME condition being more exposed to the content to be learned. It is also worth noting that the video with the model's gaze replay was shown without any simultaneous verbal accompaniment.

Reading material

The illustrated text used in both conditions concerned the atmospheric greenhouse effect. The topic had not been previously presented in any of the science classes attended by the participants. The text was 220 words long (in Italian) and was illustrated by a picture (Figure 1). The text and picture were divided into areas of interest (AOIs) for eye-fixation analyses. The text was divided into 12 sentences (AOIs). Specifically, five sentences were considered as corresponding AOIs (ie, AOIs that contain the same information depicted in the illustration) and seven sentences were considered as non-corresponding AOIs (ie, AOIs that contain information about the greenhouse effect that is not depicted in the illustration). The illustration was also divided into corresponding AOIs (areas that visualise text information) and non-corresponding AOIs (areas that do not visualise text information). Both text and picture were saved as images in .tif format (1024 × 768 pixels). The text was written in Courier 13 font and presented in double interlinear spacing.

Figure 1. The learning material with text and picture on the topic of the greenhouse effect. Highlighted parts of the text and picture are the corresponding segments of the verbal and graphical representations

Apparatus

Eye-fixation data were collected using the Tobii T120 eye tracker manufactured by Tobii Technology (Stockholm, Sweden). It is a portable, non-intrusive eye-tracker integrated into a 17-inch TFT monitor with a maximum resolution of 1280 × 1024 pixels. The eye-tracker embeds five near-infrared light emitting diodes and a high-resolution camera that samples pupil location at the rate of 120 Hz. Data were recorded with Tobii-Studio software.

Eye-fixation measures

We computed indices of first- and second-pass fixation times (Hyönä & Nurminen, 2006; Hyönä, Lorch & Rinck, 2003). First-pass fixation time is considered to reflect immediate and more automatic processing, whereas the second-pass reflects the delayed and more purposeful processing. We also computed the frequencies of first- and second-pass transitions from the verbal to the graphical representation and vice versa (Johnson & Mayer, 2012), both for corresponding and non-corresponding segments. For reasons of space, and given the main focus of this paper, we present only the data regarding the indices of frequency and duration during the more purposeful second-pass, that is, rereading or picture re-inspection. Specifically, the frequency of second-pass transitions indicates how many times a reader's gaze shifted from a given area of the verbal representation to a given area of the graphical representation, or vice versa, during the second encounter with the reading material. Transitions reflect the learner's attempts to integrate words and pictorial elements (Johnson & Mayer, 2012).
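
The transition count described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Fixation` tuple, the AOI labels and the function name are hypothetical, the fixation sequence is made up, and the sketch does not separate first-pass from second-pass fixations as the study's analysis does. It is not the procedure used in Tobii Studio.

```python
from collections import Counter
from typing import NamedTuple


class Fixation(NamedTuple):
    aoi: str            # hypothetical AOI label: "C_TXT", "NC_TXT", "C_PICT" or "NC_PICT"
    duration_ms: float  # fixation duration in milliseconds


def count_transitions(fixations):
    """Count gaze shifts between a text AOI and a picture AOI, in either direction."""
    counts = Counter()
    for prev, curr in zip(fixations, fixations[1:]):
        prev_is_text = prev.aoi.endswith("_TXT")
        curr_is_text = curr.aoi.endswith("_TXT")
        if prev_is_text != curr_is_text:  # the gaze crossed from one representation to the other
            counts[(prev.aoi, curr.aoi)] += 1
    return counts


# Made-up fixation sequence, for illustration only.
sequence = [
    Fixation("C_TXT", 210), Fixation("C_PICT", 340), Fixation("C_TXT", 180),
    Fixation("NC_TXT", 150), Fixation("C_PICT", 400), Fixation("NC_PICT", 120),
]
transitions = count_transitions(sequence)
```

Shifts within the same representation (text to text, picture to picture) are ignored, so only integrative text-picture moves are counted.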

For the duration of the second-pass fixations, we computed the look-from fixation time. Look-from text to picture was computed for the corresponding and non-corresponding AOIs by summing the duration of all re-fixations that “took off” from a segment (AOI) of the text and “landed” on a segment (AOI) of the picture. Similarly, the look-from picture to text was computed by summing the durations of all re-inspections that “took off” from a segment of the picture and “landed” on a segment of the text. Look-from measures indicate the integrative processing of the verbal and graphical information of the learning material.
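
A minimal Python sketch of the look-from measure is given here, under a simplifying assumption that is ours and not stated in the text: only the first fixation that "lands" after each shift is summed. The function name, the AOI labels and the data are hypothetical.

```python
from collections import defaultdict


def look_from_times(fixations, source_suffix, target_suffix):
    """Sum landing-fixation durations (ms) for shifts from source AOIs to target AOIs.

    fixations: ordered list of (aoi_label, duration_ms) tuples.
    Assumption: only the first fixation after each shift is counted.
    """
    totals = defaultdict(float)
    for (prev_aoi, _), (curr_aoi, curr_ms) in zip(fixations, fixations[1:]):
        if prev_aoi.endswith(source_suffix) and curr_aoi.endswith(target_suffix):
            totals[(prev_aoi, curr_aoi)] += curr_ms
    return dict(totals)


# Made-up rereading sequence, for illustration only.
rereading = [
    ("C_TXT", 200.0), ("C_PICT", 350.0), ("C_TXT", 250.0),
    ("C_PICT", 150.0), ("NC_TXT", 100.0), ("C_PICT", 300.0),
]
text_to_picture = look_from_times(rereading, "_TXT", "_PICT")
picture_to_text = look_from_times(rereading, "_PICT", "_TXT")
```

Keying the totals by (take-off AOI, landing AOI) keeps corresponding and non-corresponding segments separate, as in the indices reported for this study.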

Due to inter-individual variance and non-normal distributions, all measures of eye-movements (duration in milliseconds) were logarithmically transformed to meet the assumptions of parametric statistical tests. We used the procedure available in the Statistical Package for Social Sciences (SPSS, IBM software). We added the constant 1 to all data values of a variable when there were values of zero. In this way, the procedure was able to bypass the problem that the logarithm of zero is not defined mathematically.
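
The transformation rule described above can be sketched in Python as follows. The natural logarithm is an assumption, since the text does not state the base, and the function name is hypothetical; this is not the SPSS procedure itself.

```python
import math


def log_transform(values):
    """Log-transform one variable, adding 1 to every value if any value is zero.

    Assumption: natural logarithm (the base is not stated in the text).
    """
    offset = 1.0 if any(v == 0 for v in values) else 0.0
    return [math.log(v + offset) for v in values]


# Made-up durations in milliseconds, for illustration only.
with_zero = log_transform([0.0, 9.0, 99.0])     # constant 1 is added to all values
without_zero = log_transform([1.0, math.e])     # no constant is needed
```

Adding the constant to every value of the variable, not only to the zeros, preserves the ordering of the observations.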

Pre- and post-reading measures

Reading comprehension skill

This was measured using the MT (Italian) test for seventh grade (Cornoldi & Colpo, 1995). The reliability coefficient, as measured by Cronbach's alpha, was 0.77.
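
For readers unfamiliar with the coefficient, Cronbach's alpha can be computed from item-level scores as sketched below. The function name and the scores are hypothetical; the study's item-level data are not reported here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from item-level data.

    item_scores: one list per item, each holding one score per respondent.
    """
    k = len(item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]  # total score per respondent
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up scores for three items and four respondents, for illustration only.
items = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 3, 5]]
alpha = cronbach_alpha(items)
```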

Prior knowledge

Factual prior knowledge of the topic was assessed by nine questions: two open-ended and seven multiple-choice questions that also required a justification for the chosen option. Cronbach's alpha for this instrument was 0.71. Interrater reliability for the coding of justifications, as measured by Cohen's κ, was 0.86.
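
As an illustration of the interrater statistic, Cohen's kappa for two raters can be sketched as follows. The function name and the codes are hypothetical and do not reproduce the study's coding data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters coding the same items.

    Undefined (division by zero) when expected agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up codes from two raters, for illustration only.
codes_a = ["y", "y", "n", "n"]
codes_b = ["y", "y", "n", "y"]
kappa = cohens_kappa(codes_a, codes_b)
```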

Posttest recall

Interrater reliability for coding the recall of the text, which included 22 information units, was 0.92.

Posttest factual knowledge

This was measured using the same nine factual knowledge questions that were asked at pretest (α = 0.74). Interrater reliability for coding the answers to the open-ended questions, and the justifications for the answers to the multiple-choice questions, was 0.95.

Posttest transfer

The transfer task included eight questions, six open questions and two multiple-choice questions that also required a justification for the chosen option (α = 0.75). Interrater reliability for coding the answers and justifications was 0.90.

Interviews

For “manipulation check,” after the learning episode, all readers in the EMME condition were individually interviewed to ensure that they had perceived the model's integrative visual behaviour and the aim of the video.

Results

Preliminary analyses

First of all, no significant differences emerged for participants' baseline eye-movements that were recorded before the learning episode while reading an illustrated text on a different topic, the food chain. No statistically significant differences between the groups emerged for prior knowledge and reading comprehension, with Fs < 1. In all analyses, Fs were < 1, except for the look-from corresponding and non-corresponding picture segments to corresponding text segments, F(2, 61) = 1.16, p = 0.319, ηp² = 0.03. These data indicate that at the onset of the study, participants in the EMME and no-EMME conditions were similar in the ways they read an illustrated science text and for their prior knowledge of the text topic. Finally, the interview data about the “manipulation check” indicated that readers in the EMME condition had substantially perceived the integrative behaviour of the model and the aim of the eye-movement replay.

RQ1: Does an EMME of integrative processing of text and picture in reading an illustrated text help students to integrate more verbal and graphical information during their own reading of an illustrated text?

To answer this research question, we carried out several multivariate analyses of variance (MANOVAs) for the frequency and duration indices of second-pass reading. Mean and SDs of the eye-movement (log-transformed) indices are reported in Table 1. As mentioned above, the data regarding the processing measures were log-transformed because of the non-normal distribution.

Table 1. Means and standard deviations of eye-movement indices (frequency and durations) as a function of condition

Index                               no-EMME (n = 31)     EMME (n = 33)
                                    M        SD          M        SD
Second-pass transition frequency
  From C_TXT to C_PICT              1.15     0.80        1.86     0.69
  From NC_TXT to C_PICT             1.01     0.86        1.66     0.69
  From C_PICT to C_TXT              1.20     0.83        1.95     0.77
  From NC_PICT to C_TXT             0.60     0.73        0.93     0.72
Look-from fixation time
  From C_TXT to C_PICT              6.27     3.29        8.77     0.90
  From NC_TXT to C_PICT             5.52     3.97        8.42     1.23
  From C_PICT to C_TXT              7.38     3.61        9.54     2.64
  From NC_PICT to C_TXT             5.13     4.57        7.12     3.96

Note. All reported measures are log-transformed. C_TXT, corresponding text segments; NC_TXT, non-corresponding text segments; C_PICT, corresponding picture segments; NC_PICT, non-corresponding picture segments.

Second-pass transitions from text to picture

A MANOVA with the frequency of transitions from corresponding and non-corresponding text segments to corresponding picture segments revealed an effect of condition, F(2, 61) = 7.69, p = 0.001, ηp² = 0.20. Univariate tests showed that learners in the EMME condition shifted more from the former, F(1, 62) = 13.97, MSE [mean square error] = 0.56, p < 0.001, ηp² = 0.18, and the latter, F(1, 62) = 11.01, MSE = 0.61, p = 0.002, ηp² = 0.15, text segments to the corresponding graphical representations than learners in the no-EMME condition. These findings indicate that when rereading, learners who observed the model's eye-movements made more attempts to integrate the more relevant verbal and graphical segments than learners who did not observe the model's gaze replay.

Second-pass transitions from picture to text

A MANOVA with the frequency of transitions from corresponding and non-corresponding picture segments to corresponding text segments revealed an effect of condition, F(2, 61) = 6.86, p = 0.002, ηp² = 0.18. Univariate tests showed that learners in the EMME condition shifted more from the corresponding graphical segments to the corresponding text segments than learners in the no-EMME condition, F(1, 62) = 13.86, MSE = 0.64, p < 0.001, ηp² = 0.18. These findings indicate that when re-inspecting, learners who had the opportunity to observe the model's gaze replay made a significantly higher number of attempts to integrate the more relevant pictorial and verbal parts of the learning material, compared with learners who were not shown the EMME.
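
The partial eta squared values reported with these tests can be recovered from the F statistic and its degrees of freedom. The Python sketch below uses the standard conversion, which for the multivariate tests is exact only when two groups are compared, as is the case here. The function name is hypothetical; the inputs are the values reported in the paragraph above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported above for picture-to-text transitions.
multivariate = partial_eta_squared(6.86, 2, 61)   # MANOVA effect of condition
univariate = partial_eta_squared(13.86, 1, 62)    # corresponding picture to corresponding text
```

Both values round to the reported 0.18, which offers a quick consistency check between a reported F, its degrees of freedom and its effect size.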

Look-from text to picture

A MANOVA with look-from corresponding and non-corresponding text segments to corresponding picture segment fixation times revealed an effect of condition, F(2, 61) = 12.91, p < 0.001, ηp² = 0.29. Univariate tests showed significant differences for both the former, F(1, 62) = 17.62, MSE = 5.68, p < 0.001, η² = 0.22, and the latter, F(1, 62) = 16.01, MSE = 8.41, p < 0.001, η² = 0.20, text segments. All look-from fixation times were longer for learners in the EMME condition.

Look-from picture to text

An effect of condition also emerged from a MANOVA with look-from corresponding and non-corresponding picture segments to corresponding text segment fixation times, F(2, 61) = 6.03, p = 0.004, ηp² = 0.16. Univariate tests showed that learners in the EMME condition re-fixated the verbal segments for a longer time when re-inspecting the corresponding graphical segments than learners in the no-EMME condition, F(1, 62) = 7.53, MSE = 9.92, p = 0.008, ηp² = 0.11.

These findings about the duration of look-from times indicate that the learners who observed the model's eye-movements were involved in longer integrative processing of the corresponding verbal and graphical information than the learners who did not see the model's gaze replay. This occurred when considering both the text and picture as the anchoring points for rereading or re-inspecting.

RQ2: Does the EMME of integrative processing help to learn more deeply from the illustrated text?

To answer this research question, we carried out a MANOVA with condition as the between-subject factor and the z-standardised values of reading comprehension as a covariate. The interaction term (covariate × condition) was also incorporated into the design to control for a possible aptitude–treatment interaction. The MANOVA revealed multivariate effects of condition, F(3, 58) = 3.66, p = 0.005, ηp² = 0.16, and reading comprehension, F(3, 58) = 7.98, p < 0.001, ηp² = 0.29.

Regarding the effect of condition, univariate tests showed that students in the EMME condition outperformed students in the no-EMME condition for the transfer of knowledge (no-EMME: M = 3.86, SE [standard error] = 0.35; EMME: M = 5.48, SE = 0.34), F(1, 60) = 10.72, MSE = 3.89, p = 0.002, η² = 0.15. For verbal recall (no-EMME: M = 5.64, SE = 0.55; EMME: M = 6.31, SE = 0.53) and factual knowledge (no-EMME: M = 5.56, SE = 0.43; EMME: M = 6.43, SE = 0.41), no significant differences between groups emerged (Figure 2). These findings indicate that readers who observed the model's gaze replay outperformed, for deeper learning, readers who did not have this opportunity.

Figure 2. Mean scores for the post-reading measures as a function of condition. Standard errors are represented by the error bars

Regarding the covariate, univariate tests showed that it correlated significantly with all the post-reading measures of verbal recall, F(1, 60) = 21.78, MSE = 9.53, p < 0.001, η² = 0.26, factual knowledge, F(1, 60) = 16.29, MSE = 5.77, p < 0.001, η² = 0.21, and transfer of knowledge, F(1, 60) = 9.53, MSE = 3.89, p = 0.003, η² = 0.13.

RQ3: Do reading comprehension skills moderate the effect of eye-movement modelling on learning?

The analysis of the moderating role of reading comprehension on learning from text revealed a significant interaction between the former and condition, F(3, 58) = 2.93, p = 0.041, ηp² = 0.13. This indicates that the effectiveness of the instructional intervention on information comprehension was to some extent moderated by participants' overall reading comprehension. Univariate tests revealed that this interaction effect concerned both the acquisition of factual knowledge, F(1, 60) = 6.92, MSE = 5.77, p = 0.011, ηp² = 0.10, and the transfer of knowledge, F(1, 60) = 6.39, MSE = 3.89, p = 0.014, ηp² = 0.09.

To follow up on the significant interaction, a simple slope analysis was conducted at −1 SD and +1 SD of the variable (Aiken & West, 1991) for each post-reading measure. For factual knowledge, the slope analysis indicates that readers with lower reading comprehension (ie, 1 SD below the mean) were influenced by condition and acquired better factual knowledge when they observed the model's gaze replay, compared with participants who did not observe the video (b = 3.92, SE = 1.13, β = 0.64, p = 0.010). On the other hand, readers with a higher reading comprehension (ie, 1 SD above the mean) were not influenced by condition in acquiring factual knowledge (b = −0.137, SE = 1.73, β = −0.24, ns [not significant]) (see Figure 3).
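
The logic of a simple slope analysis can be sketched in Python as follows. This is a minimal illustration with made-up, noise-free data and hypothetical function names, assuming condition is dummy-coded (0 = no-EMME, 1 = EMME) and the moderator is already standardised. It is not the authors' analysis and does not reproduce the reported coefficients.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_moderation(condition, z, outcome):
    """Least-squares fit of outcome = b0 + b1*condition + b2*z + b3*condition*z."""
    rows = [[1.0, c, s, c * s] for c, s in zip(condition, z)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(4)]
    return solve(xtx, xty)


def simple_slope(coefs, z_value):
    """Effect of condition at a given value of the standardised moderator."""
    return coefs[1] + coefs[3] * z_value


# Made-up, noise-free data; z values are treated as standardised for illustration.
condition = [0, 0, 0, 0, 1, 1, 1, 1]
z_scores = [-1.5, -0.5, 0.5, 1.5, -1.5, -0.5, 0.5, 1.5]
outcome = [4 + 1.5 * c + 1.0 * s - 1.2 * c * s for c, s in zip(condition, z_scores)]

coefs = fit_moderation(condition, z_scores, outcome)
slope_low = simple_slope(coefs, -1.0)   # condition effect 1 SD below the mean
slope_high = simple_slope(coefs, 1.0)   # condition effect 1 SD above the mean
```

With a negative interaction coefficient, the effect of condition is larger for low values of the moderator than for high values, which is the pattern an aptitude–treatment interaction of this kind would produce.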

Figure 3. Plot of a significant interaction effect of reading comprehension and condition on the acquisition of factual knowledge

The same slope analysis for the transfer of knowledge again revealed that readers with lower reading comprehension were influenced by condition. They were also better able to apply the new concepts when they observed the model's gaze replay compared with readers who were not shown the video (b = 4.28, SE = 1.04, β = 0.75, p = 0.001). On the other hand, readers with higher reading comprehension (ie, 1 SD above the mean) were again not influenced by condition in their transfer performance (b = 0.25, SE = 1.08, β = 0.07, ns) (see Figure 4).

Figure 4. Plot of a significant interaction effect of reading comprehension and condition on the transfer of knowledge

Discussion

The aim of this study was to corroborate and extend previous research on eye-movement modelling. The study examined the potential of this technological tool for indirectly teaching strategic reading of an illustrated text and contributing to deeper learning, especially for students with low reading comprehension skills. The first research question asked whether readers who observed a short video (showing a model's integrative processing of text and picture) would integrate more verbal and graphical information during their own reading of an illustrated text. Results confirmed our hypothesis: these readers made more gaze shifts from text to picture and vice versa and spent more time re-inspecting the depicted elements when rereading text segments. They also spent more time rereading the text segments when re-inspecting the graphical segments. Integrative processing during the second-pass reading indicates more strategic, purposeful and effective reading (Mason, Pluchino & Tornatora, 2013; Mason, Tornatora & Pluchino, 2013). This outcome corroborates the findings of the only previous study that has examined the effectiveness of eye-tracking technology in the school context for modelling text and picture integration (Mason et al, 2015).

The second research question asked whether the readers who had the opportunity to observe the video with the model's gaze replay would also learn better from the illustrated text at the deeper level, as indicated by the transfer task. Results confirmed our hypothesis, as the EMME students were better able to apply the new concept of the atmospheric greenhouse effect.

From the findings regarding the first and second RQs, it is plausible that the stronger integrative processing activated by the readers in the EMME condition underlies the deeper level of learning from text. This is revealed in the task requiring the application of newly learned knowledge to situations or phenomena not mentioned in the text. The study also provides further objective evidence for the theoretical assumptions of the multimedia principle (Mayer, 2009, 2014) when learning from multiple representations. Coherence formation through integrative processes is essential for successful learning from an illustrated text.

The third RQ asked whether reading comprehension skills would moderate the effect of eye-movement modelling on learning. An aptitude–treatment interaction occurred, that is, an interaction between the perceptual tool and a participant characteristic. The video that modelled a strategic reading of the illustrated text was in fact helpful to less proficient readers in particular. Their overall reading comprehension skills moderated not only deeper learning in the transfer task, but also the acquisition of factual knowledge at the text-base level (Kintsch, 1998). An important finding is that students with low reading comprehension skills in particular could benefit from the EMME. It means that they are responsive to a perceptual guide (with no simultaneous accompanying verbal instruction) for the execution of a task that is far from perceptual, although it entails perception processes. This finding regarding the special benefits for weaker readers of modelling through the gaze replay of an expert extends existing research both in the school context (Mason et al, 2015) and outside it (Jarodzka et al, 2012; Jarodzka, van Gog, Dorr, Scheiter & Gerjets, 2013), as well as in medical education (Seppänen & Gegenfurtner, 2012).

Nevertheless, the study has a limitation that future research needs to overcome. The design with two conditions is not optimal. Further investigation should include another condition of eye-movement modelling, or a condition in which a reading strategy is modelled without the use of the gaze replay, in order to examine in depth the effectiveness of an EMME.

Conclusion

This study has educational significance as it provides new evidence of the potential of eye-tracking technology as an indirect instruction tool. It offers quantitative and objective data on the allocation of visual attention to track cognitive processing while reading texts or inspecting pictures. Eye-tracking technology also makes it possible to replay eye-movements on a video, which can then easily be used in the classroom without an eye-tracker to indirectly teach complex activities. In the present study, the video of a model's eye-movements guided students' attention, helping them carry out a crucial learning activity successfully. Eye-movement modelling was advantageous to less skilful readers in particular, who were supported in online integrative processing during the reading of an illustrated text, and also in learning deeply from it. In this regard, the findings point to the potential of eye-tracking to assess online processing of text reading and picture inspection in students with different levels of reading skills. Seeing the processing patterns of less skilful readers allows teachers to identify the characteristics of their poor reading behaviour and performance. This enables teachers to plan specific interventions in favour of these students, based on objective data about the allocation of their visual attention during reading. For example, it seems crucial that less skilful readers should be made aware that rereading and re-inspecting, or the more purposeful processing of text and graphics, makes a difference.

The use of eye-tracking technology in school to guide students' attention requires a concrete and productive collaboration between researchers and teachers. Researchers can prepare the videos to be used in the classroom to show the ocular behaviour of a model who is carrying out a specific task. Teachers can readily use the videos to facilitate execution of the task in the service of students' learning. Teachers can also use videos with the gaze replays of their students while executing a given task to solicit student reflections on the processing behaviour that they activated (Penttinen, Anto & Mikkilä-Erdmann, 2013). In this way, students can be helped to refine their awareness of what strategies are more and less effective for carrying out a learning task successfully.

Data sharing

The data are currently available on individual application directly to the first author. In the near future, they may be made available through an institutional repository.

Ethical guidelines

We declare that the study was conducted in accordance with the American Psychological Association (APA) standards for ethical treatment of subjects. For each participant we obtained informed consent signed by their parents. They had been informed that the tests administered to the students would be easy and would not have any purpose of academic merit evaluation, nor a diagnostic purpose. The collected data would not pertain to an investigation on the individual characteristics of each student. The data would be collected and processed anonymously and exclusively at group level, and they would be subject to scientific communication (oral and written). A scientific report would be provided at the end of the study to share the results. The signed consents are at the university.

Conflict of interest

We declare that we do not have any conflicts of interest regarding the reported study.

Acknowledgement

The study is part of a research project on learning difficulties in the science domain (STPD08HANE_001) funded by a grant to the first author from the University of Padova (Italy), under the funding program for “Strategic Projects.” We are very grateful to all the students involved in the study, their parents and teachers, and the school principals.

    Biographies

    • Lucia Mason is a Professor of Educational Psychology in the Department of Developmental Psychology and Socialization of the University of Padova.

    • Patrik Pluchino is a post-doctoral research fellow in the Department of General Psychology of the University of Padova.

    • Maria Caterina Tornatora is a research fellow in the Department of Developmental Psychology and Socialization of the University of Padova.