The effect of guiding questions on students’ performance and attitude towards statistics
Abstract
Background. In this study, the effect of guidance on students’ performance was investigated. This effect was hypothesized to be manifested through a reduction of cognitive load and enhancement of self-explanations.
Aim. The goal of this study was to investigate the effect of guiding questions on students’ understanding of statistics.
Sample and Method. In an experimental setting, two randomly selected groups of students (N= 49) answered achievement and transfer questions on statistics as a measure of performance. Students in the intervention condition were given guiding questions to direct their way of reasoning before they answered the achievement questions. The students in the control condition were asked to write down their way of thinking before they answered the same achievement questions. In this way, both groups were stimulated to self-explain, but only the reasoning processes of the students in the intervention condition were guided.
Results and Conclusion. It was found that students in the intervention condition performed significantly better on achievement and transfer questions and that this effect of guidance was mediated by self-explanations. Attitude towards statistics was positively related to performance.
Background
The purpose of this study was to investigate the effect of guidance on students’ performance in statistics and their attitude towards statistics. In general, evidence from empirical studies has favoured guided instruction over unguided instruction (Kirschner, Sweller, & Clark, 2006; Mayer, 2004). Guidance during instruction has proven to be effective with regard to students’ performance both in research on human tutoring (Chi, 1996) and in research on worked examples (Tuovinen & Sweller, 1999). In the educational context of problem-based learning, Budé et al. (2005) have shown that extra guidance provided by the tutor improved students’ performance in statistics.
The positive effects of guidance on students’ performance may be attributed to a reduced cognitive load (CL) (van Merriënboer & Sweller, 2005) and to enhanced self-explanations (Chi, 1996; Renkl, 2002). Improved performance in turn may lead to a more positive attitude towards the subject matter (Graham & Weiner, 1987; Weiner, 1986, 1992). A more positive attitude may lead to more active study behaviour of the students in subsequent courses (Peterson, Maier, & Seligman, 1993; Pintrich & Schunk, 1996) and consequently to an even further improved performance.
The present study was set up to examine the relations between these concepts in an experiment in which students had to apply their statistical knowledge. A group that received directive guiding questions was compared to a group with unguided self-explanation instructions, with respect to their performance and their attitude towards statistics.
Cognitive load
CL can be defined as the amount of effort needed by the human cognitive system to process information (Sweller, 1988). The capacity to process newly presented information is limited, especially in working memory (Baddeley, 1992, 2000). CL is high when subject matter makes high demands on working memory, for example, when multiple concepts have to be processed simultaneously because the concepts are highly interrelated. This working memory load, directly imposed on the human cognitive system by the inherent nature of the subject matter, is called intrinsic CL (Sweller, van Merriënboer, & Paas, 1998). The way in which tasks are presented also affects working memory load. This load is called extraneous CL. When extraneous load is unnecessarily high, learning is hampered (e.g., van Merriënboer & Sweller, 2005). Instruction methods that take the limited capacity of working memory into account are therefore aimed at reducing this extraneous CL. As a result of the reduction of extraneous CL, working memory can be employed more optimally for processes relevant to learning and for the acquisition of schemata. The working memory load caused by such relevant information processing is called germane CL (Sweller, van Merriënboer, & Paas, 1998).
Although these different types of CL cannot be measured separately, it is postulated that extraneous and germane CL could be altered by instructional interventions (van Merriënboer & Sweller, 2005). When an instructional intervention leads to improved performance of the students, without raising the overall CL, it is inferred that extraneous load is decreased and germane CL is increased (e.g., Paas, Renkl, & Sweller, 2004; van Gog, Paas, & van Merriënboer, 2006; van Merriënboer & Sweller, 2005).
Carroll (1994) found that an intervention directed at the reduction of extraneous CL improved performance in mathematics. Paas (1992) found that students’ transfer performance in statistics improved due to this reduction. Students with limited prior knowledge profit most from instruction methods that are aimed at reducing extraneous CL. For more experienced students, however, the extra information given to reduce CL may be redundant and may impede their performance, the so-called expertise reversal effect (Kalyuga, Chandler, Tuovinen, & Sweller, 2001; Tuovinen & Sweller, 1999).
On the basis of these findings, it was expected that reducing extraneous CL for novice statistics learners by providing guidance would free cognitive capacity that consequently could be used for germane CL, that is, for cognitive activities relevant for correct reasoning, such as more elaborate and correct self-explanations.
Self-explanations
Self-explanations can be defined as the active internal explanations of the reasoning steps that the learner makes when answering a question or solving a problem. It has been observed that the majority of students do not spontaneously engage in self-explanations (Chi, Bassok, Lewis, Reimann, & Glaser, 1989; Renkl, 1999).
Stimulating or prompting self-explanations increases cognitive activity. Relevant cognitive activity promotes the construction of coherent knowledge structures, which enhances understanding of the subject matter and performance of the learner (Chi, 1996; Chi, Siler, Jeong, Yamauchi, & Hausman, 2001). Such relevant cognitive activity increases germane CL and can only be accomplished when the total CL does not lead to an overload of the limited cognitive system. Accordingly, Renkl and colleagues have shown that students’ performance can be improved considerably by combining the elicitation of self-explanations with the reduction of extraneous CL (Renkl, 1997, 1999, 2002; Renkl, Stark, Gruber, & Mandl, 1998).
The research model
In the present experimental study, the effect of guidance on performance in statistics and attitude towards statistics, via CL and correct self-explanations, was investigated. The model presented in Figure 1 shows the hypothesized relations between these constructs and their effects.

Figure 1. The effect of guidance on performance and attitude.
In the model, it is portrayed that guidance is expected to reduce extraneous CL and elicit more correct self-explanations. These effects are hypothesized to enhance performance. High performance, in turn, is expected to be related to a more positive attitude towards statistics.
The relation between CL and self-explanations can be characterized as recursive. The reduction of extraneous CL enables the working memory to be used to a greater extent for self-explanations. The increased number of self-explanations, in turn, heightens germane CL. The circular arrows in the model express this recursive relation.
Guidance
The guidance that was afforded to the students in the present study consisted of providing directive questions, which were aimed at successive reasoning steps. The line of reasoning of the students in the intervention condition was thus guided by the directive questions. In this way, the questions were supposed to reduce extraneous CL.
Research on reducing extraneous CL in instruction has mainly focused on worked examples (e.g., Tuovinen & Sweller, 1999; Renkl, 2002; Gerjets, Scheiter, & Catrambone, 2006). In worked examples often the successive steps that are needed to come to the solution of a problem are given. These worked examples are usually presented to students in an instructional phase. Subsequently, students have to solve similar problems themselves in a test phase. Worked examples can be considered a form of guidance, because learners do not have to discover all the solution steps themselves.
The guidance in the present study differed from that in research on worked examples in three respects. First, there was no separate instructional phase, because all students had already studied the topics that were questioned in the present study in an introductory statistics course. The intervention with the directive questions was integrated in the test phase. The question was whether it would lead to a better performance. Second, we did not provide the students with information about the correct answers, as is usually done in research on worked examples (e.g., Renkl, 1999, 2002; Ward & Sweller, 1990). The directive questions were only intended to guide students’ reasoning. We wanted the students to come up with the correct answers themselves and thus to actively apply and expand their knowledge. It is well known that active processing of knowledge leads to better performance (e.g., Bodemer, Ploetzner, Feuerlein, & Spada, 2004; Catrambone & Yuasa, 2006; McNamara, Kintsch, Butler-Songer, & Kintsch, 1996). Third, we did not give students problems to solve, but open-ended questions for which they had to formulate the answers in their own words.
Human tutors can also guide student reasoning. By means of successive dialogue exchanges between the tutor and the students, and by focussing students’ attention, the tutor can scaffold the construction of self-explanations, which in turn promotes learning (Chi, 1996; Chi et al., 2001; Merrill, Reiser, Merrill, & Landes, 1995; Merrill, Reiser, Ranney, & Trafton, 1992). The intervention with the directive questions mimicked human tutors who guide student reasoning, because it focused students’ attention and was expected to elicit self-explanations.
Our approach is different from human tutoring, because no dialogue interaction between the students and their tutor took place. Dialogue interaction and individual differences of tutors are difficult to control in an experimental study and can cause confounding effects. To eliminate this threat to internal validity, we standardized guidance by using only written directive questions. The directive questions were expected to combine the best of both worked examples and human tutoring, respectively, the reduction of CL and the promotion of correct self-explanations. However, by only providing directive questions instead of giving students specific information about the correct answers, we diminished the amount of guidance compared to worked examples. Moreover, the standardized guidance through the directive questions was less flexible than that of human tutoring. Finally, the directive questions were given in the test phase, in order to guide the activation of relevant prior knowledge. A key question was whether this trimmed version of guidance would still reduce extraneous CL and stimulate correct self-explanations, and would thus lead to active application and expansion of knowledge and to enhanced performance of the students.
Performance
Performance of all students was measured in two ways, with achievement and transfer questions. Both were open-ended questions, where students had to write down the answers themselves. These open-ended questions were intended to measure higher levels of understanding of statistical concepts. The achievement questions required students to combine a number of statistical concepts, for example, by explaining the relations between different statistical techniques in a hypothetical study or by interpreting the results of the statistical analyses. Answering questions that require the explanation of the relations between several concepts and the explanation of the applicability of those concepts is only possible when students have reached a high level of understanding (Dochy, 2001; Feltovich, Spiro, & Coulson, 1993; Gijbels, Dochy, van den Bossche, & Segers, 2005; Jonassen, Beissner, & Yacci, 1993).
In addition to these achievement questions, performance was also measured with transfer questions. The questions that were used in this study can be characterized as pertaining to near transfer, that is, questions covering similar but slightly differently presented statistical topics. Transfer questions can be regarded as the ultimate measure of understanding. Previous studies have shown that performance on transfer questions is better when students develop a high level of understanding (Barnett & Ceci, 2002; Catrambone, 1998; Mayer, 1989; Olson & Biolsi, 1991).
Efficiency
Guidance is expected to reduce extraneous CL and to enhance correct self-explanations. Enhanced self-explanations were expected to recursively raise germane CL. Due to the guidance, the resulting overall level of CL could therefore be higher, lower, or remain the same. However, the overall level of CL per se is not very meaningful; the combination with performance is more important. Efficiency is a concept that combines performance with perceived CL (Paas & van Merriënboer, 1993; Tuovinen & Paas, 2004). When students in one group perceive the same CL as students in another group, but their performance is higher, then their efficiency is higher. Alternatively, when students in one group perform equal to students in another group, but their CL is lower, then their efficiency is also higher. In the present study, it was expected that due to the guidance, the efficiency would be raised. For an explanation of how performance was measured see sections ‘Achievement questions’ and ‘Transfer questions’, of how CL was measured see section ‘Mental effort rating’, and for the formula to compute efficiency see section ‘Variables’.
Attitude
Attitudes and beliefs of learners are rarely studied in research on the effect of guidance during instruction. However, attitudes and beliefs are known to affect students’ learning of statistics (Gal, Ginsberg, & Schau, 1997). Negative events can trigger a downward spiral of an increasingly negative attitude (Graham & Weiner, 1987; Peterson et al., 1993; Pintrich & Schunk, 1996; Weiner, 1986, 1992). For example, a negative attitude may impede learning of statistics. Unsuccessful learning and failing exams in turn will lead to an even more negative attitude (Budé et al., 2006). In the present study, it was investigated whether a positive experience would lead to a more positive attitude. More specifically, the question was whether guidance, via an improvement of the performance, would lead to a more positive attitude towards statistics. This effect is displayed on the right-hand side of the model in Figure 1.
In the present study, the effects displayed by this model were investigated. The research hypotheses were:
1. Guidance will improve performance via reduced extraneous CL and enhanced correct self-explanations.
2. Guidance will raise the efficiency of the students.
3. Guidance will lead to a more positive attitude towards statistics via a higher performance.
Method
Participants
Second-year bachelor students from the faculty of Health Sciences were recruited during educational activities, half a year after they had followed an introductory course in statistics. This introductory course covered the subject matter that was tested in this study, that is, the students had previously studied the topics that were questioned. They were told that they had to answer questions about statistics and that they would be paid 10 euro. This payment was given to avoid attracting only motivated students who are particularly interested in statistics. Thus, a pool of 110 volunteers was obtained. From this pool, 50 students were randomly selected and randomly assigned to two conditions. One student in the intervention condition did not show up, resulting in n control = 25 and n intervention = 24. Thirty-nine of these students were female and 10 were male. Usually, approximately 75% of the Health Sciences’ students are female. The age of the students in this study ranged from 20 to 26 years (M= 20.6; SD= 0.4).
Design and procedure
In this experiment, there was no separate instructional phase and test phase, because all students had previously studied the topics that were tested in this study in an introductory statistics course. Therefore, it was assumed that all students were acquainted with these topics before the start of the test sessions. Each session comprised either the intervention or the control condition. Before the start of the test, students in both conditions received written and oral instructions about how to answer all the questions, to rate the CL, and to fill in an attitude questionnaire. In these instructions, it was emphasized that all test questions should be answered as elaborately as possible. Students in the control condition were asked to report their line of reasoning while trying to answer the test questions, that is, to write down all their thinking steps. Students in the intervention condition were asked to answer some guidance questions before answering the test questions. After receiving these instructions, students in both conditions answered 10 achievement questions. In the intervention condition, guidance was afforded to the students to direct their line of reasoning. The guidance consisted of a few directive questions prior to answering each achievement question. The students in the intervention condition first answered these guiding questions, and then answered the corresponding achievement question. In the control condition, the students wrote down all their thinking steps before they answered the same achievement questions. That is, they were asked to report how they arrived at their answers. This approach was chosen for two reasons. First, in this way students in both conditions were stimulated to engage in self-explanations, but only the students in the intervention condition were guided by directive questions. This enabled the comparison of mere activation versus guidance of the reasoning process. Second, in this way the time-on-task for both conditions was balanced.
After each achievement question, students in both conditions rated the CL they perceived with regard to answering that question. After completing the 10 achievement questions, students in both conditions answered three transfer questions. Answering the transfer questions was the same in the two conditions. Finally, they filled in a questionnaire measuring their attitude towards statistics.
So, students in the intervention condition (1) received instructions, (2) received the first open-ended achievement question, (3) received some guiding questions belonging to that achievement question, (4) answered these guiding questions, (5) answered the corresponding achievement question, and (6) rated their CL. After that, they proceeded to the second achievement question. This procedure was repeated until all 10 achievement questions were answered. Next, they answered the three transfer questions. Finally, they filled in the attitude questionnaire. The students in the control condition followed the same procedure except for steps 3 and 4. Instead, they had to report their thinking steps before they answered each achievement question.
There were no time limits. It was recorded how much time the students needed for the whole procedure (time-on-task). No difference between the conditions in the time required for the transfer questions and the questionnaire was expected, because, as explained, the same transfer questions and the same questionnaire were used for both conditions. Therefore, possible differences in the recorded time were supposed to reflect differences in time needed for answering the achievement questions. The whole procedure took both groups of students a little over 1 hr (see Table 1). Students were tested in small group sessions ranging from four to eight people. On completion of the attitude questionnaire students received €10.
Table 1. Results of the t-tests comparing the control and intervention conditions

Variable | M control | M intervention | SE | t | Cohen's d | p-value |
---|---|---|---|---|---|---|
Achievement Questions (ten questions, range 0–28) | 5.40 | 9.71 | 1.16 | 3.72 | 1.06 | .002 |
Transfer Questions (three questions, range 0–12) | 2.96 | 4.75 | .52 | 3.42 | 0.98 | .002 |
Self-Explanations | 5.76 | 19.21 | 1.66 | 8.19 | 2.33 | < .001 |
Mental Effort | 64.88 | 66.88 | 2.56 | .78 | 0.22 | .876 |
Efficiency | –.38 | .49 | .29 | 3.04 | 0.86 | .004 |
Attitude | 80.48 | 86.46 | 4.83 | 1.24 | 0.35 | .222 |
Time | 64.23 | 67.42 | 2.55 | 1.25 | 0.36 | .567 |
Grades | 7.62 | 7.75 | .47 | .27 | 0.07 | .792 |
- Note. p-values Bonferroni corrected
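As a rough arithmetic check on Table 1, the sketch below recomputes each t value as the difference between the condition means divided by the reported standard error, and derives Cohen's d from t and the group sizes (25 control, 24 intervention). Both formulas are assumptions about how the reported values were obtained; the text does not state them. Small deviations from the table are expected because the inputs are rounded.

```python
import math

# Values copied from Table 1:
# (M control, M intervention, SE, reported t, reported Cohen's d)
TABLE_1 = {
    "Achievement Questions": (5.40, 9.71, 1.16, 3.72, 1.06),
    "Transfer Questions": (2.96, 4.75, 0.52, 3.42, 0.98),
    "Self-Explanations": (5.76, 19.21, 1.66, 8.19, 2.33),
    "Mental Effort": (64.88, 66.88, 2.56, 0.78, 0.22),
    "Efficiency": (-0.38, 0.49, 0.29, 3.04, 0.86),
    "Attitude": (80.48, 86.46, 4.83, 1.24, 0.35),
    "Time": (64.23, 67.42, 2.55, 1.25, 0.36),
    "Grades": (7.62, 7.75, 0.47, 0.27, 0.07),
}
N_CONTROL, N_INTERVENTION = 25, 24


def t_from_means(m_control, m_intervention, se):
    """Assumed formula: mean difference divided by its standard error."""
    return (m_intervention - m_control) / se


def d_from_t(t, n1=N_CONTROL, n2=N_INTERVENTION):
    """Assumed conversion for two independent groups: d = t * sqrt(1/n1 + 1/n2)."""
    return t * math.sqrt(1 / n1 + 1 / n2)


def recompute(table=TABLE_1):
    """Return {variable: (recomputed t, recomputed d)}."""
    results = {}
    for name, (m_c, m_i, se, _t, _d) in table.items():
        t = t_from_means(m_c, m_i, se)
        results[name] = (t, d_from_t(t))
    return results


if __name__ == "__main__":
    for name, (t, d) in recompute().items():
        print(f"{name:22s} t = {t:5.2f}   d = {d:4.2f}")
```

Under these assumptions the recomputed values agree with the reported ones to within rounding, which suggests that the SE column is the standard error of the difference between the two means.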
Materials
Guiding questions
For the intervention condition, directive questions were formulated. These directive questions helped the students to gradually build up their line of reasoning. They focused the students on relevant issues, indicating the direction of reasoning step by step. Students had to explain each step themselves. In comparison to the control condition, no extra information was given about the correct answers to the achievement questions. For examples of guiding questions, see Appendix A.
Achievement questions
Ten achievement questions (Achievement Questions) were formulated about a hypothetical health sciences study in which two independent groups were compared. The difference between the means of the two groups was analysed with three techniques: a t-test, a regression analysis, and a one-way analysis of variance. The hypothetical study and the results of the analyses were presented to the students. The achievement questions asked for similarities and differences among the three techniques or for the interpretation of the results of the analyses. Correct answers required relating the three statistical techniques to each other, to the study, or to the analysis results.
The hypothetical study and examples of the questions are also given in Appendix A. Students in both conditions received the same achievement questions. Their answers were scored blindly making use of an answer key. A reliability analysis showed that the Achievement Questions (α= .72) were quite reliable.
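The reliability coefficients reported here and in the following sections are presumably Cronbach's α. As an illustration of what the coefficient measures, the sketch below computes it from item-level scores with the usual formula α = k/(k − 1) · (1 − sum of item variances / variance of the sum score), where k is the number of items. The function name and the item scores are hypothetical; the study's raw data are not given in the text.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list with one score per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Sum score per respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores of four respondents on three items.
consistent_items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
# Hypothetical scores on two items that are unrelated to each other.
unrelated_items = [[1, 2, 1, 2], [1, 1, 2, 2]]

if __name__ == "__main__":
    print(cronbach_alpha(consistent_items))
    print(cronbach_alpha(unrelated_items))
```

Items that rank respondents identically give an α of 1, and items that are uncorrelated give an α of 0; the values around .7 reported in this study lie between those extremes.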
Transfer questions
Three open-ended transfer questions (Transfer Questions) were formulated about a similar hypothetical study and the same statistical techniques for the comparison of two independent groups. However, a different research question, different figures, and different p-values were presented. For this study and a corresponding question, see Appendix B. The transfer questions were identical for both conditions and were also scored blindly using an answer key.
The Transfer Questions were also quite reliable (α= .68). The interrater reliability for two raters, calculated for the combination of the Achievement Questions and the Transfer Questions, was high. The mean ratings did not significantly differ (N= 10; Mrater 1= 14.7; Mrater 2= 13.8; t= 1.13, p= .287), the correlation was high (r= .955), and the standard deviations were similar (SDrater 1= 7.9; SDrater 2= 6.7).
Mental effort rating
CL was operationalized as the intensity of mental effort being expended by students. Students rated this mental effort (Mental Effort) on a 9-point Likert scale, responding to the following question: Indicate how much mental effort it demanded to answer the above question. The labels assigned to the numerical values of the scale ranged from: very, very low mental effort (1), neither low nor high mental effort (5), to very, very high mental effort (9). After each achievement question, it was asked how much mental effort the answer to that particular question required.
This scale for mental effort can be used as an indicator of CL (Paas, 1992; Paas & van Merriënboer, 1993), and has been shown to be a reliable and valid measure (Gimino, 2002; Paas, Tuovinen, Tabbers, & Van Gerven, 2003; Paas, van Merriënboer, & Adam, 1994). The reliability analysis in this study also showed adequate reliability: Mental Effort (α= .73).
Attitude questionnaire
An existing attitude questionnaire, the Survey of Attitudes Toward Statistics (SATS; Gal et al., 1997), was modified to measure the attitudes towards statistics in this study. The SATS focuses on attitude towards statistics in general. It consists of statements that have to be rated on a 7-point Likert scale. For this study, some of the statements were adapted and some statements were added. The original statements focused on studying statistics in general and the adapted statements focused on the participation of the students in the present study. For the statements of our attitude questionnaire, including the original and the adapted statements, see Appendix C.
Our questionnaire, like the SATS, comprises four subscales, measuring separate aspects of the attitude towards statistics. These four aspects are: value, difficulty, affect, and cognitive competence. Statements regarding value focus on the usefulness, relevance, and worth of statistics. Statements regarding difficulty focus on the difficulty of statistics. Statements regarding affect focus on emotional aspects and feelings towards statistics. Finally, statements regarding cognitive competence focus on students’ intellectual ability and skills with respect to statistics.
Reliability analyses showed that all four subscales of the attitude questionnaire: Value (α= .74), Difficulty (α= .76), Affect (α= .73), and Cognitive Competence (α= .74) were quite reliable.
Analysis
Variables
Guidance was the independent variable with two levels: the control condition and the intervention condition. Achievement Questions consisted of the sum scores of the 10 achievement questions. Some questions were more extensive and were allotted a greater weight. The scores could range from 0 to 28. Transfer Questions consisted of the sum scores of the three transfer questions, ranging from 0 to 12. Mental Effort is composed of the sum scores of the mental effort ratings. Self-Explanations comprises the correct answers to the guiding questions of the students in the intervention condition and the counted correct steps in the lines of reasoning of the students in the control condition, as we were only interested in correct answers and correct reasoning. Performance stands for the overall performance score and was obtained by summing up Achievement Questions and Transfer Questions. The four subscales of the attitude questionnaire constitute: Value, Difficulty, Affect, and Cognitive Competence. Attitude is the sum score of the four subscales.

Time stands for the registered time needed for students in both conditions to answer all questions and to fill in the attitude questionnaire. Finally, Grades are the students’ grades on the exam of the preceding statistics course in which students studied the topics that were questioned in the present study.
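Efficiency combines standardized performance with standardized mental effort. A minimal sketch follows, assuming the relative-efficiency measure usually attributed to the cited work of Paas and van Merriënboer (1993), E = (z performance − z mental effort) / √2, with both scores standardized across all participants. Whether the study used exactly this form, which performance score entered it, and the choice of the sample standard deviation are assumptions; the function names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def z_scores(values):
    """Standardize scores across all participants (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def efficiency(performance, mental_effort):
    """Assumed formula: E = (z_performance - z_mental_effort) / sqrt(2), per participant."""
    z_perf = z_scores(performance)
    z_effort = z_scores(mental_effort)
    return [(p - e) / math.sqrt(2) for p, e in zip(z_perf, z_effort)]


# Hypothetical scores for four participants.
performance = [2, 10, 6, 6]
mental_effort = [70, 50, 60, 60]
scores = efficiency(performance, mental_effort)

if __name__ == "__main__":
    for p, e, s in zip(performance, mental_effort, scores):
        print(f"performance {p:2d}, effort {e}: efficiency {s:+.2f}")
```

With this definition, a participant who performs above average while reporting below-average effort gets a positive score, and the scores average to zero over the whole sample. That is consistent with the opposite signs of the two condition means for Efficiency in Table 1.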
Statistical analyses
The first hypothesis was that guidance would improve performance via CL and self-explanations. To test this hypothesis, direct and indirect effects of Guidance were examined, that is, once without and once with controlling for mediating variables (Baron & Kenny, 1986). To test for the direct effects of Guidance on the Achievement Questions, Transfer Questions, Mental Effort, Self-Explanations, and Time, the two conditions were compared with t-tests. Regression analyses were done to establish the indirect effect of Guidance via Mental Effort and Self-Explanations on both Achievement Questions and Transfer Questions separately. In both regression analyses, Time was included in the model as a covariate.
The second hypothesis was that guidance would raise the efficiency of student learning. To test this hypothesis, a t-test was done to compare the Efficiency of the two conditions.
The third hypothesis was that guidance would lead to a more positive attitude via performance. To test this hypothesis, again a t-test and a supplementary regression analysis were done. To test for the direct effect of Guidance on Attitude, a t-test was done. To establish the effect of Guidance via Performance on Attitude, a regression analysis was done (Baron & Kenny, 1986).
To determine the construct validity of the attitude questionnaire, the correlations were calculated among the four separate attitudinal aspects (Value, Difficulty, Affect, and Cognitive Competence) and Performance. Finally, as a randomization check, Grades of the students in both conditions were compared with a t-test.
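The mediation logic used for the first and third hypotheses (Baron & Kenny, 1986) can be sketched in a few lines: estimate the effect of the condition on the outcome without the mediator, estimate it again with the mediator in the model, and compare the two coefficients. The sketch below writes out ordinary least squares for one and two predictors from variances and covariances. The data are hypothetical and constructed so that the outcome depends on the mediator only. The covariate Time and the second mediator in the study's models are left out.

```python
from statistics import mean


def cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def total_effect(x, y):
    """Slope of y on x alone (effect without controlling for the mediator)."""
    return cov(x, y) / cov(x, x)


def direct_and_mediator_effects(x, m, y):
    """Slopes of y on x and m jointly: (effect of x controlling for m, effect of m)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    b_x = (sxy * smm - smy * sxm) / det
    b_m = (smy * sxx - sxy * sxm) / det
    return b_x, b_m


# Hypothetical data: condition (0 = control, 1 = intervention), a mediator that is
# higher under the intervention, and an outcome driven by the mediator only.
condition = [0, 0, 0, 0, 1, 1, 1, 1]
mediator = [4, 6, 5, 7, 17, 19, 18, 22]
outcome = [2 + 0.4 * value for value in mediator]

c_total = total_effect(condition, outcome)
c_direct, b_mediator = direct_and_mediator_effects(condition, mediator, outcome)

if __name__ == "__main__":
    print(f"effect of condition without mediator: {c_total:.2f}")
    print(f"effect of condition with mediator:    {c_direct:.2f}")
    print(f"effect of mediator:                   {b_mediator:.2f}")
```

In this constructed case the condition effect is clearly present when the mediator is ignored and vanishes once the mediator is controlled for, which is the pattern interpreted as mediation.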
Results
With respect to the first hypothesis, the t-tests showed significant differences on Achievement Questions, Transfer Questions, and Self-Explanations, but no significant differences on Mental Effort and Time. Thus, students in the intervention condition performed better and reported more correct self-explanations, but perceived the same CL as students in the control condition and used the same amount of time. The results of the t-tests are presented in Table 1.
The regression analysis of Achievement Questions on Guidance, Mental Effort, and Self-Explanations showed a significant effect of Mental Effort and Self-Explanations, but no significant effect of Guidance. The covariate Time had no significant effect and was excluded from the model. The results of the final regression model are presented in Table 2.
Table 2. Final regression model of Achievement Questions on Guidance, Mental Effort, and Self-Explanations

Model | B | SE | t | p-value |
---|---|---|---|---|
Constant | 10.00 | | | |
Guidance | –1.29 | 1.34 | –0.97 | .337 |
Mental Effort | –0.11 | 0.05 | –2.24 | .030 |
Self-Explanations | 0.43 | 0.08 | 5.68 | < .001 |
- Note. The non-significant covariates, Time and Mental Effort × Self-Explanations, were excluded from the model in a backward stepwise model-selection procedure
The regression analysis of Transfer Questions on Guidance, Mental Effort, and Self-Explanations analogously showed a significant effect of Mental Effort and Self-Explanations, but no significant effect of Guidance and Time. The results of this analysis are presented in Table 3.
Table 3. Final regression model of Transfer Questions on Guidance, Mental Effort, and Self-Explanations

Model | B | SE | t | p-value |
---|---|---|---|---|
Constant | 9.47 | | | |
Guidance | 0.37 | 0.58 | 0.63 | .533 |
Mental Effort | –0.11 | 0.02 | –5.22 | < .001 |
Self-Explanations | 0.12 | 0.03 | 3.70 | < .001 |
- Note. The non-significant covariates, Time and Mental Effort × Self-Explanations, were excluded from the model in a backward stepwise model-selection procedure
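One way to read the coefficients in Tables 2 and 3 is to plug the condition means from Table 1 into the fitted equations. The sketch below does this, assuming Guidance was coded 0 for the control condition and 1 for the intervention condition; the coding is not stated in the text. Under that assumption the predicted scores land close to the observed condition means, with small gaps attributable to the rounding of the reported coefficients.

```python
# Coefficients copied from Tables 2 and 3:
# (constant, Guidance, Mental Effort, Self-Explanations)
MODELS = {
    "Achievement Questions": (10.00, -1.29, -0.11, 0.43),
    "Transfer Questions": (9.47, 0.37, -0.11, 0.12),
}
# Condition means copied from Table 1: (Mental Effort, Self-Explanations)
GROUP_MEANS = {
    "control": (64.88, 5.76),
    "intervention": (66.88, 19.21),
}
# Observed condition means from Table 1, for comparison.
OBSERVED = {
    ("Achievement Questions", "control"): 5.40,
    ("Achievement Questions", "intervention"): 9.71,
    ("Transfer Questions", "control"): 2.96,
    ("Transfer Questions", "intervention"): 4.75,
}
# Assumed dummy coding of Guidance; not stated in the text.
GUIDANCE_CODE = {"control": 0, "intervention": 1}


def predict(outcome, condition):
    """Predicted score at the condition's mean Mental Effort and Self-Explanations."""
    const, b_guidance, b_effort, b_selfexp = MODELS[outcome]
    effort, selfexp = GROUP_MEANS[condition]
    return (const
            + b_guidance * GUIDANCE_CODE[condition]
            + b_effort * effort
            + b_selfexp * selfexp)


if __name__ == "__main__":
    for (outcome, condition), observed in OBSERVED.items():
        predicted = predict(outcome, condition)
        print(f"{outcome:22s} {condition:13s} predicted {predicted:5.2f}  observed {observed:5.2f}")
```

The check also shows where the difference between the conditions comes from in these models: almost all of the predicted gap is carried by the Self-Explanations term, not by the Guidance coefficient.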
The results of the t-tests and both regression analyses partially support the first hypothesis: the effect of Guidance is mediated by Self-Explanations but not by Mental Effort.
The second hypothesis with respect to the effect of Guidance on Efficiency was confirmed. Efficiency was significantly higher for students in the intervention condition (see Table 1).
The third hypothesis referred to the attitude towards statistics. It was hypothesized that the students, who perform better due to guidance, would display a more positive attitude towards statistics after the experiment. This hypothesis was not confirmed. The t-test comparing the Attitude of participants in the two conditions showed no significant difference (see Table 1). Additionally, the regression of Attitude on Performance, Grades, and Guidance showed only a significant effect of Performance and Grades. This means that our intervention did not lead to a more positive attitude, but that students who performed better in our study and on the preceding statistics course showed a more positive attitude towards statistics. The results of the final regression model are presented in Table 4.
Model | B | SE | t | p-value |
---|---|---|---|---|
Constant | 42.24 | | | |
Performance | 1.29 | 0.33 | 3.87 | .000 |
Grades | 3.50 | 1.27 | 2.77 | .008 |
The correlations among the subscales of the attitude questionnaire and Performance are presented in Table 5. The pattern of correlations is in line with the results reported by Gal et al. (1997) and Budé et al. (2006). This suggests that our modification of the SATS has adequate construct validity.
| | Value | Difficulty | Affect | Cognitive Competence |
---|---|---|---|---|
Value | 1 | .362 | .691* | .484* |
Difficulty | | 1 | .646* | .611* |
Affect | | | 1 | .701* |
Cognitive Competence | | | | 1 |
Performance | .455* | .450* | .650* | .514* |
- Notes. *p < .05; p-values Bonferroni corrected. N = 49
The randomization check showed no difference between the intervention and the control condition, as Grades of students in both conditions did not differ significantly (see Table 1).
Discussion
In this study, the positive aspects of human tutoring and learning with worked examples were combined in our intervention: guidance by directive questions. It was expected that this form of guidance would stimulate correct self-explanations, decrease extraneous CL, and enhance performance in statistics. The results show that the directive questions stimulated correct self-explanations and enhanced performance on the achievement questions without raising the overall CL. Consequently, the efficiency of the students in the intervention condition was significantly higher than that of the students in the control condition.
These results indicate that providing directive questions is an efficient way to successively guide the students’ line of reasoning, which in turn enhances the performance without causing cognitive overload.
The direct effect of guidance, as shown by the significant results of the t-tests on the achievement and transfer questions, disappeared in the regression analyses when self-explanations were included as a covariate in the model. This indicates that the effect of guidance is mediated by self-explanations (Baron & Kenny, 1986), as we hypothesized. It was not mediated by CL, because the overall CL did not differ significantly between the two conditions. Time-on-task had no significant effect, either in the t-test or in the regression analyses. There is no reason to suppose that the two groups distributed their time differently over the tasks: both groups were stimulated to self-explain while answering the achievement questions, answered the same transfer questions, and filled in the same questionnaire. It can, therefore, be concluded that the time spent on answering the achievement questions had no confounding effect in this study, although only the total time was recorded.
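The mediation logic referred to above (Baron & Kenny, 1986) can be illustrated with a small sketch. The Python code below is not the analysis of this study: the data are toy numbers invented purely for illustration, the helper `ols` is a hypothetical function written for this example, Mental Effort is left out, and no significance tests are performed. It only shows the pattern described in the text: the coefficient of guidance shrinks once the mediator is added to the model.

```python
def ols(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]

# Toy data, invented for illustration only (NOT the data of this study).
guidance = [0, 0, 0, 0, 1, 1, 1, 1]           # 0 = control, 1 = intervention
self_expl = [1, 2, 3, 4, 3, 4, 5, 6]          # number of correct self-explanations
performance = [6, 7, 10, 15, 12, 13, 16, 21]  # score on the achievement questions

# Step 1: total effect of guidance on performance.
_, total_effect = ols([guidance], performance)
# Step 2: effect of guidance on the proposed mediator.
_, a_path = ols([guidance], self_expl)
# Step 3: guidance and mediator together; the guidance coefficient is the direct effect.
_, direct_effect, b_path = ols([guidance, self_expl], performance)

print(f"total effect of guidance:  {total_effect:.2f}")
print(f"direct effect of guidance: {abs(direct_effect):.2f}")
print(f"indirect effect (a * b):   {a_path * b_path:.2f}")
```

In these toy data the total effect is carried entirely by the indirect path through self-explanations. In real data the direct effect would rarely vanish exactly, and each path would have to be tested for significance.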
The achievement questions were designed to elicit from the students the explanation of the relations between several statistical concepts and the applicability of these concepts. Answering such questions correctly is only possible when a higher level of conceptual understanding is attained (Dochy, 2001; Feltovich et al., 1993; Gijbels et al., 2005; Jonassen et al., 1993). Students in the intervention condition showed a better performance on the achievement questions. Moreover, they also answered the transfer questions significantly better than the students in the control condition. As transfer questions are regarded as the ultimate measure of understanding (Barnett & Ceci, 2002; Catrambone, 1998; Mayer, 1989; Olson & Biolsi, 1991), the enhanced performance on especially the transfer questions can be interpreted as a higher level of conceptual understanding of the students in the intervention condition. These results indicate that, even within one session, students in the intervention condition more actively applied and consequently expanded their knowledge of the subject matter.
The attitude towards statistics was hypothesized to be more positive for students in the intervention condition. The attitude questionnaire consisted of statements on studying statistics in general as well as on participation in the experiment. Although the effect of guidance on attitude did not reach significance, there was a positive trend in the expected direction. This trend is in line with our hypothesis, but the effect of the intervention on attitude was probably too small to reach significance. An additional problem is that it cannot be ruled out that the students who volunteered to participate in the study had a more positive attitude than the overall population of students. In this light, it is worthwhile to consider the mean grades of the preceding statistics course, which are rather high for the students in both conditions. Possibly, relatively good students who already had a positive attitude prior to the study volunteered to participate. This restricted range may have lowered r².
We did find, however, a significant effect of performance and the students’ grades on attitude. This means that, aggregated over the two conditions, students who performed better on the questions in this study and in the preceding introductory statistics course showed a more positive attitude. So, it can be concluded that performing well is positively correlated with attitude towards statistics. This conclusion is in line with theories regarding attitudes (Graham & Weiner, 1987; Peterson et al., 1993; Pintrich & Schunk, 1996; Weiner, 1986, 1992). Of the four attitude scales, affect was most strongly correlated with performance. This corroborates previous research on attitudes and performance (Budé et al., 2006). Stimulating a positive affect of students towards statistics may improve performance in statistics education.
The results also showed that the reliability of all the measurements was relatively high. Moreover, the reliability of the subscales of the attitude questionnaire as well as the pattern of correlations among them is in line with previous research. We found no significant correlation between value and difficulty, the strongest correlation between affect and cognitive competence, and moderate correlations for the other combinations. This resembles the pattern of the SATS that is reported by Gal et al. (1997). The similarities of these patterns can be interpreted as a validation of the modification of the SATS that was used in this study.
A limitation of this study is that possibly more proficient students volunteered to participate. The randomization check showed no significant difference between the two conditions regarding the mean grades on the exam of the preceding statistics course. As mentioned above, however, the mean grades of students in both conditions were higher than the mean grade of the rest of the cohort of bachelor students. This might indicate that the students in the study had more prior knowledge of the subject matter than the overall population of students. It might be that directing reasoning processes by asking guiding questions requires a certain level of prior knowledge. Future research could investigate whether the guiding questions have the same positive effect on the conceptual understanding of students with different levels of prior knowledge.
A second limitation of the study is that we could not measure how strongly the control treatment activated the students in that condition. The students in the control condition were stimulated to reflect on their responses and asked to write down their thinking steps. It was expected that this procedure would enable us to answer our research question whether guiding the self-explanations would lead to an improved performance compared to unguided self-explanations. For this reason, we did not add a condition in which no treatment was given at all. It has already been shown that more active reasoning leads to an improved understanding (e.g., Bodemer et al., 2004; Catrambone & Yuasa, 2006; McNamara et al., 1996). However, asking students to write down their thinking steps may not have activated them as much as the guiding questions activated the students in the intervention condition. Future research could be directed at a comparison between guiding questions and questions that are less directive for reasoning activities. In this way, mere activation of reasoning can better be contrasted with directing the reasoning processes.
In conclusion, guiding the students by providing directive questions, that is, without spoon-feeding them specific information about the correct answers on the achievement questions, increased the number of correct self-explanations without raising the overall CL, and this resulted in a higher level of understanding. It can be inferred that this higher level of understanding was caused by the guiding questions in this randomized experiment, because it can be presumed that before the start of the study, students in both conditions had an equal understanding of the topics that were questioned.
The conclusion that directive guidance leads to better understanding may have wide implications for educational practice. Directive questions that guide the correct activation of students’ prior knowledge step by step could, for example, be used in problem-based learning, in textbooks, in formative testing, or in electronic learning environments.
Appendices
Appendix A
A hypothetical study and examples of the achievement questions
A health scientist studied the effect of a drug in two randomized groups of patients with hypertension. One group received a placebo (condition = 0); the other group received the drug for 1 month (condition = 1). After this month, the researcher measured the diastolic blood pressure of all subjects and compared the two groups. The research hypothesis was: the mean diastolic blood pressure of the group patients who received the drug will be lower than the mean diastolic blood pressure of the group patients who received the placebo. The data were analysed with a t-test, linear regression analysis, and analysis of variance. The results of these analyses are presented below.
t-Test
Group statistics
| | Condition | N | Mean | Std. deviation | Std. error mean |
---|---|---|---|---|---|
Blood pressure | 0 | 30 | 98.367 | 9.6888 | 1.7689 |
| | 1 | 30 | 93.043 | 8.5685 | 1.5644 |
Independent samples test
| | | Levene's test for equality of variances: F | Levene's test for equality of variances: Sig. | t | df | Sig. (two-tailed) | Mean difference | Std. error difference | 95% confidence interval of the difference: lower | 95% confidence interval of the difference: upper |
---|---|---|---|---|---|---|---|---|---|---|
Blood pressure | Equal variances assumed | .111 | .740 | 2.255 | 58 | .028 | 5.324 | 2.3615 | .5972 | 10.0511 |
| | Equal variances not assumed | | | 2.255 | 57.146 | .028 | 5.324 | 2.3615 | .5957 | 10.0526 |
- Note: The columns from t onwards belong to the t-test for equality of means.
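The entries in the "equal variances assumed" row can be recomputed from the group statistics alone. The following is a minimal Python sketch under that equal-variance assumption; the function name `pooled_t_test` is a hypothetical helper introduced here and is not part of the original study materials. Because the inputs are rounded table values, the results agree with the SPSS output only up to rounding.

```python
import math

def pooled_t_test(mean0, sd0, n0, mean1, sd1, n1):
    """Student's t-test for two independent groups from summary statistics.

    Assumes equal population variances (pooled variance estimate).
    Returns (mean difference, standard error of the difference, t, df).
    """
    df = n0 + n1 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n0 - 1) * sd0 ** 2 + (n1 - 1) * sd1 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    diff = mean0 - mean1
    return diff, se_diff, diff / se_diff, df

# Group statistics from the table above (condition 0 = placebo, condition 1 = drug).
diff, se_diff, t_value, df = pooled_t_test(98.367, 9.6888, 30, 93.043, 8.5685, 30)
print(f"difference = {diff:.3f}, SE = {se_diff:.4f}, t = {t_value:.3f}, df = {df}")
```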
Regression
Variables entered/removedᵇ
Model | Variables entered | Variables removed | Method |
---|---|---|---|
1 | Conditionᵃ | . | Enter |
- ᵃ All requested variables entered.
- ᵇ Dependent variable: blood pressure.
Model summary
Model | R | R² | Adjusted R² | Std. error of the estimate |
---|---|---|---|---|
1 | .284ᵃ | .081 | .065 | 9.1459 |
- ᵃ Predictors: (constant), condition.
ANOVAᵇ
Model | | Sum of squares | df | Mean square | F | Sig. |
---|---|---|---|---|---|---|
1 | Regression | 425.198 | 1 | 425.198 | 5.083 | .028ᵃ |
| | Residual | 4,851.512 | 58 | 83.647 | | |
| | Total | 5,276.709 | 59 | | | |
- ᵃ Predictors: (constant), condition.
- ᵇ Dependent variable: blood pressure.
Coefficientsᵃ
Model | | Unstandardized coefficients: B | Unstandardized coefficients: Std. error | Standardized coefficients: Beta | t | Sig. | 95% confidence interval for B: lower bound | 95% confidence interval for B: upper bound |
---|---|---|---|---|---|---|---|---|
1 | (Constant) | 98.367 | 1.670 | | 58.909 | .000 | 95.024 | 101.709 |
| | Condition | –5.324 | 2.361 | –.284 | –2.255 | .028 | –10.051 | –.597 |
- ᵃ Dependent variable: blood pressure.
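With a single 0/1 predictor, the fitted regression line passes through the two group means, so the coefficients table and the group statistics carry the same information. A minimal Python sketch using the rounded coefficients from the table above; the constant and function names are illustrative and not part of the original materials.

```python
INTERCEPT = 98.367  # B for (Constant): mean of the placebo group (condition = 0)
SLOPE = -5.324      # B for Condition: drug mean minus placebo mean
SE_SLOPE = 2.361    # standard error of the Condition coefficient

def predicted_blood_pressure(condition):
    """Predicted diastolic blood pressure for condition 0 (placebo) or 1 (drug)."""
    return INTERCEPT + SLOPE * condition

placebo_mean = predicted_blood_pressure(0)
drug_mean = predicted_blood_pressure(1)
# Test of H0: slope = 0, which is the hypothesis of no difference between the groups.
t_slope = SLOPE / SE_SLOPE

print(f"placebo: {placebo_mean:.3f}, drug: {drug_mean:.3f}, t = {t_slope:.3f}")
```

The predicted values reproduce the group means of the t-test output, and the t-value of the slope has the same magnitude as the t-value of the independent samples test.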
One-way ANOVA
Blood pressure
ANOVA
| | Sum of squares | df | Mean square | F | Sig. |
---|---|---|---|---|---|
Between groups | 425.198 | 1 | 425.198 | 5.083 | .028 |
Within groups | 4,851.512 | 58 | 83.647 | | |
Total | 5,276.709 | 59 | | | |
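The F-ratio in the ANOVA table is the between-groups mean square divided by the within-groups mean square, and with only two groups it equals the square of the t-value from the independent samples test. A minimal Python sketch using the tabulated sums of squares; the variable names are illustrative, and small discrepancies are due to rounding in the table.

```python
ss_between, df_between = 425.198, 1
ss_within, df_within = 4851.512, 58

# Mean square = sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

t_value = 2.255  # from the independent samples t-test (equal variances assumed)
print(f"MS within = {ms_within:.3f}, F = {f_ratio:.3f}, t squared = {t_value ** 2:.3f}")
```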
Examples of achievement questions
Question 1. How would you interpret the result of the t-test? Explain your conclusion with regard to the research hypothesis.
Question 2. What is the value of the regression coefficient for the independent variable? Explain what effect this coefficient represents.
Question 3. In the regression analysis table a t-test is given for condition. Describe and explain what is tested with this test and why.
Question 4. Write down the regression equation and calculate the mean diastolic blood pressure of both groups with this equation.
Question 5. In the ANOVA-table an F-value is given. Describe how it is calculated and explain why this is done in this way (explain the rationale for this procedure).
Examples of guiding questions (pertaining to the achievement questions above).
Guiding questions for achievement question 1:
How is a t-value calculated, what formula is used?
What is the t-value here, what is its p-value?
When is a null hypothesis rejected?
What does the t-test say about the two groups in the presented experiment?
Guiding questions for achievement question 2:
Plot the regression line in an x- and y-axis diagram.
What is the independent variable, what is the dependent variable?
Through which points does the regression line go?
What do regression coefficients stand for?
What does the coefficient for the independent variable in this analysis stand for?
Guiding questions for achievement question 3:
If there is no difference between the two groups, what would the slope of the regression line be?
What value would the regression coefficient of the independent variable have?
What test can be done to establish whether there is a difference between the groups?
Guiding questions for achievement question 4:
Consider again the regression line that you drew. What/where is the intercept?
What does the intercept stand for in this case?
Where did you pinpoint the two conditions?
Guiding questions for achievement question 5:
What information is used in the analysis of variance?
What sources can you distinguish and what do they mean?
How is the F-ratio constructed?
What is in the numerator, what is in the denominator of the F-ratio?
Appendix B
An example of a transfer question and the alternative hypothetical study
A dermatologist studied the effect of an ointment in two randomized groups of patients with a skin disease. One group received the ointment for 1 month; the other group received a placebo. After this month, the researcher measured the number of complaints of all subjects and compared the two groups. The research hypothesis was: the number of complaints of patients who received the ointment will be lower than of patients who received the placebo. The data were analysed with linear regression.
Interpret the results of the analyses below as completely as possible, in terms of means, R²-value, p-value, research hypothesis, etc. Explain your conclusions.
Variables entered/removedᵇ
Model | Variables entered | Variables removed | Method |
---|---|---|---|
1 | Conditionᵃ | . | Enter |
- ᵃ All requested variables entered.
- ᵇ Dependent variable: eczema.
Model summary
Model | R | R² | Adjusted R² | Std. error of the estimate |
---|---|---|---|---|
1 | .131ᵃ | .017 | .000 | 17.5263 |
- ᵃ Predictors: (constant), condition.
ANOVAᵇ
Model | | Sum of squares | df | Mean square | F | Sig. |
---|---|---|---|---|---|---|
1 | Regression | 312.209 | 1 | 312.209 | 1.016 | .318ᵃ |
| | Residual | 17,815.974 | 58 | 307.172 | | |
| | Total | 18,128.184 | 59 | | | |
- ᵃ Predictors: (constant), condition.
- ᵇ Dependent variable: eczema.
Coefficientsᵃ
Model | | Unstandardized coefficients: B | Unstandardized coefficients: Std. error | Standardized coefficients: Beta | t | Sig. | 95% confidence interval for B: lower bound | 95% confidence interval for B: upper bound |
---|---|---|---|---|---|---|---|---|
1 | (Constant) | 23.977 | 3.200 | | 7.493 | .000 | 17.572 | 30.383 |
| | Condition | –4.562 | 4.525 | –.131 | –1.008 | .318 | –13.621 | 4.496 |
- ᵃ Dependent variable: eczema.
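The model summary of this transfer example follows directly from the ANOVA table: R² is the regression sum of squares divided by the total sum of squares, and the adjusted R² corrects it for the number of predictors. A minimal Python sketch; the sample size of 60 is inferred from the total degrees of freedom (59) and is therefore an assumption, and the variable names are illustrative.

```python
import math

ss_regression = 312.209
ss_total = 18128.184
n, k = 60, 1  # n inferred from total df = 59 (assumption); k = number of predictors

# Proportion of the total variation that is accounted for by condition.
r_squared = ss_regression / ss_total
r = math.sqrt(r_squared)
# Adjusted R-squared penalizes R-squared for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R = {r:.3f}, R squared = {r_squared:.3f}, adjusted R squared = {adjusted_r_squared:.3f}")
```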
Appendix C
The modified SATS, per subscale
The modified or added statements are marked with an asterisk. All statements were presented in Dutch and in a random order. Each of the items had to be rated on a 7-point Likert scale.
Value
Statistics is worthless.
Participating in this study was useful*.
I consider other courses to be more interesting.
Statistical thinking is not applicable in my everyday life.
Statistics is irrelevant in my life.
The questions in this study were interesting*.
The questions in this study were relevant for upcoming statistics courses*.
Difficulty
Statistics is easy to understand.
Statistics is a complicated subject.
It was easy to answer the questions of this study*.
Statistics is a subject quickly learned by most people.
Affect
I like statistics.
I enjoy taking statistics courses.
I did not like to participate in this study*.
I feel insecure when I have to do statistics problems.
I was under stress during this study*.
I am scared by statistics.
Cognitive competence
I have trouble understanding statistics because of how I think.
I had no idea what the questions in this study were about*.
I feel confident that I will pass all my statistics exams*.
I can learn statistics.