Volume 56, Issue 12 p. 1148-1160
Original Article

Demystifying moderators and mediators in intellectual and developmental disabilities research: a primer and review of the literature

C. Farmer (Corresponding Author)

Ohio State University, Psychology, Columbus, OH, USA

Dr Cristan Farmer, 10 Center Drive, Room 1C250, Bethesda, MD 20892, USA (e-mail: farmerca@mail.nih.gov).
First published: 29 January 2012

Abstract

Background  Intellectual and developmental disability (IDD) researchers have been relatively slow to adopt the search for moderators and mediators, although these variables are key in understanding how and why relationships exist between variables. Although the traditional method of causal steps is useful for describing and understanding moderators and mediators, it is not sufficient for statistical analysis.

Methods  The theoretical and statistical processes of evaluating moderators and mediators are explained in terms familiar to IDD psychologists, using examples from IDD literature. Moderator and mediator analyses in five leading IDD journals are assessed for patterns of usage.

Results  Although the number of publications in the past decade exceeds previous years, the field is still behind others in both the quantity and quality of the use of moderators and mediators.

Conclusion  The field as a whole will advance if the recent theoretical and technical advances outlined in this paper are employed.

Introduction

The examination of ‘third variables’ [i.e. variables other than independent variables (IV) and dependent variables (DV)] in psychological research has a long tradition. A social psychology article by Baron & Kenny (1986) has proven to be especially useful to researchers. According to the ISI Citation Database, the article has been cited over 13 000 times in the years since its publication. The authors laid out definitions for moderation and mediation that are still widely used, although clarifications (mostly pertaining to randomised controlled trials) have been widely accepted (i.e. the MacArthur Guidelines; Kraemer et al. 2001, 2002, 2008). Both the Baron and Kenny (referred to here as ‘BK’) and MacArthur (‘MA’) approaches are couched in terms of causal steps, where a series of paths in a diagram are shown to be significant in order to infer moderation or mediation. Although the causal steps approach has a long history in the social sciences, it is currently being supplanted in the literature by an approach that applies direct statistical testing to the moderator and mediator effects (e.g. Preacher & Hayes 2004; Hayes 2009; Hayes & Matthes 2009). Here, the causal steps approach is utilised as a way to (1) understand a popular method that remains in wide use and (2) graphically depict and describe moderated and mediated effects, but it is the direct-testing method that is suggested to be the ‘gold standard.’

Moderator and mediator analyses are extremely sparse in intellectual and developmental disability (IDD) research, as compared with social psychology research. It is not certain why IDD researchers have been relatively slow to adopt the search for moderators and mediators, although several potential roadblocks are readily identified. First, moderator and mediator variables were initially couched in social psychology terms, which can be confusing and daunting to outsiders. Second, even following the Baron & Kenny (1986) paper, the distinction between the variables themselves is often confusing. Third, although technical papers have been written on the subject (e.g. MacKinnon et al. 2002), the exact mechanism for testing the statistical significance of moderators and mediators remains unclear to many researchers.

The goal of this manuscript is to address the three roadblocks described above; I shall attempt to explain moderators and mediators in terms familiar to IDD psychologists, using examples from our literature where possible. I employ some aspects of the BK/MA guidelines in order to limit the confusion between moderator and mediator variables, and describe newer methodology for the direct testing of moderators and mediators. Finally, I review the IDD literature, with a critical eye towards moderation and mediation analysis, in order to identify patterns of usage that should be addressed by future researchers.

It is relevant to note that the discussion in this paper will be limited to cross-sectional (rather than longitudinal) designs. The reasons for this are twofold: those studies in IDD that have used moderator and mediator analyses are almost exclusively cross-sectional, and the addition of time into the moderation or mediation model complicates the definition and statistical analysis in a way that is beyond the scope of this paper. An important debate exists regarding the use of such cross-sectional data: it is technically incorrect to evaluate causal relationships (such as those in a mediated effect) with the cross-sectional data that are most abundant in the IDD field (see Selig & Preacher 2009). However, the fact remains that researchers have used, and will continue to use, these data in the exploration of moderators and mediators. This manuscript is meant to aid researchers in employing the best possible techniques for this type of data.

Note about figures

Causal steps diagrams will be referred to in this paper; Fig. 1 depicts the effect of the IV (also referred to as ‘X’) on the DV (also referred to as ‘Y’), Fig. 2 illustrates a moderated effect, and Fig. 3 shows a mediated effect. The X*Mod term in Fig. 2 is the interaction between the IV (X) and the moderator (Mod) in predicting the DV (Y). The lines between the boxes in each of the figures represent a predictive relationship in the direction in which the arrow points.

Figure 1. A diagram of the relationship between the independent variable (IV) and the dependent variable (DV). Path c represents the regression coefficient in an equation where X predicts Y.

Figure 2. A causal steps diagram of the effect of a moderator (Mod) on the relationship between the independent variable (X) and the dependent variable (Y). Paths c′, b, and d represent the regression coefficient of the respective variable in predicting Y. Mod is a moderator if the interaction X*Mod significantly predicts outcome (path d), indicating that the relationship between X and Y differs depending on the level of Mod.

Figure 3. A causal steps diagram of a mediator (Med) of the relationship between an independent variable (X) and a dependent variable (Y). Path a represents the regression of Med on X, path b the regression of Y on Med, and c′ the regression of Y on X. According to the Baron & Kenny (1986) guidelines, Med is a mediator if paths a and b are significant, and if path c′ is weaker than path c from Fig. 1. The MacArthur guidelines add that a significant interaction between X and Med is also a possible condition for mediation.

Moderators

A moderated relationship is simply an interaction between two variables, where there is little or no causal relationship between the variables. A moderator model specifies conditions under which the relationship between X and Y differs (Chaplin 2007). Psychological science depends on generalising findings to populations, so it is important to be able to identify the group to which the X/Y relationship can be generalised. As Chaplin (2007) pointed out, a moderator model specifies the ‘it depends’ relationship in scientific questions. For example, let us imagine the research question: ‘Is parental age associated with a child's autism diagnosis?’ The answer, according to Reichenberg et al. (2006), is ‘It depends.’ If we apply the results of this study to Fig. 2, we see an interaction between parent age (X) and parent gender (Mod) that predicts significantly (path d) the diagnosis of autism (Y). Such an interaction indicates that the effect of the IV varies across the levels of the moderator variable. Therefore, parent gender could be said to moderate the relationship between parent age (the IV) and autism (the DV): a significant relationship existed between parent age and autism, but only for fathers.
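For readers who prefer an equation to a diagram, the moderator model in Fig. 2 can be sketched as a single regression. The path labels c′, b and d follow the figure caption; the intercept i and residual e are notation assumed here for illustration.

```latex
Y = i + c'X + b\,\mathrm{Mod} + d\,(X \times \mathrm{Mod}) + e
\qquad\Longrightarrow\qquad
\frac{\partial Y}{\partial X} = c' + d\,\mathrm{Mod}
```

Read this way, the effect of X on Y is not a single number but a line in Mod: when d is zero the effect is c′ everywhere, and when d is non-zero the effect ‘depends’ on the level of the moderator.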

Although a significant interaction is the single statistical criterion for achieving moderation, two additional theoretical criteria are important. In cases of simple moderation, there should be no causal relationship between the moderator and the IV. As Kraemer et al. (2008) pointed out, if the moderator and IV are causally related, then the effect of the IV on the outcome in the total population is in some way dependent on the moderator. This is inconsistent with the role of a moderator, which is to help determine the subpopulations within which the relationship between IV and DV changes. Thus, the theoretical criteria for moderation address this issue via the temporal and causal relationships between the moderator and the IV.

The causal steps approach requires that the moderator must precede the IV, although it should be noted that especially in cross-sectional studies, the moderator may not be measured before the IV. This criterion clarifies the causal relationship: if the moderator precedes the IV, then it cannot be caused by the IV. This is also useful for distinguishing moderators from IVs or mediators. Often (but not always), candidate moderators are obvious grouping variables that precede the IV (e.g. gender, age, or diagnosis). However, the temporal relationship may be less clear with variables that have an ambiguous ‘start date’ (these are often continuous variables). In these cases, temporal precedence allows the researcher to identify which variable is the moderator and which is the IV. Thus, as a prerequisite of many cases of simple moderation, the researcher should provide a theoretical justification regarding the temporal precedence of the moderator. If the variables are both trait-type variables that occur contemporaneously, then the moderator and IV designations may be interchanged. This is acceptable and expected in cross-sectional studies; the Reichenberg et al. (2006) example cited above illustrates this phenomenon with age (IV) and gender (Mod). Because temporal precedence is not theoretically justifiable for either variable, the study model determines which variable is considered the moderator. In this case, because age was the IV of the study, gender is considered the moderator.

A second mechanism for establishing a lack of a causal relationship between the IV and the moderator is the degree of correlation between the two. It is very unlikely for any two constructs in the behavioural sciences to be totally uncorrelated, so it is up to the researcher to decide what degree of correlation is acceptable. For example, if two variables share 10% of variance, then the causal relationship may be inconsequential. However, if the variables share 80% of the variance, the relationship may be more complicated than simple moderation (e.g. moderated mediation). These relationships are beyond the scope of this paper, and interested readers should consult Preacher et al. (2007) for further details. In addition to the brief discussion of the temporal relationship between the IV and the moderator, the researcher should also provide readers a description of the correlation between the two variables.
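Because shared variance is the square of the correlation coefficient, the two illustrative thresholds above can be translated into correlations with a short calculation (rounded values):

```latex
r = \sqrt{r^{2}}: \qquad \sqrt{0.10} \approx 0.32, \qquad \sqrt{0.80} \approx 0.89
```

In other words, a correlation of about 0.3 between the IV and the moderator corresponds to the ‘inconsequential’ example, whereas a correlation near 0.9 corresponds to the case in which simple moderation is doubtful.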

A final important point is that it is not necessary to observe a significant IV/DV relationship in order to show moderation. To revisit the parental age example: it is possible for a significant relationship between paternal age and autism diagnosis to be ‘masked’ by a non-significant relationship between maternal age and autism diagnosis. In other words, any main effect of the IV may be rendered non-significant by competing effects among the levels of the moderator. Therefore, a main effect of the IV is not a precondition of moderation.

To summarise, a moderator helps to identify subgroups in which the IV exhibits different effects from the rest of the total population. To be considered a moderator, a variable must display a statistically significant interaction with the IV in predicting the DV. For theoretical reasons, it is necessary to address (a) the temporal precedence of the moderator to the IV and (b) any causal relationship between the IV and the putative moderator. The interpretation of statistical and practical significance for each of these criteria is up to the researcher, who should be explicit in defence of the choices. If the moderator does not temporally precede the IV or if they are strongly correlated, then the relationship may not be one of simple moderation, although it may warrant some other classification of interest. Understanding these criteria is important for both the researcher and the consumer; we as a scientific community must be prepared to identify mistakes in appropriate labelling of variables as moderators.

Probing of moderator effects

Satisfying the above criteria for moderation is only descriptive; that is, the researcher now knows that the answer to her or his question is ‘It depends,’ but may not yet know exactly what it depends upon. The second step in moderation analysis is to probe the significant interaction in order to determine under which values of the moderator the relationship between the IV and the DV changes (Hayes & Matthes 2009). When either the moderator or IV is continuous, a simple manipulation of the data can help significantly in the interpretation of the resulting interaction. Centring refers to the practice of subtracting some set value (usually the sample mean or median) from each observation (see Aiken & West 1991). This procedure ensures that relevant values of the variables are included in the analysis, and can mitigate the effects of multicollinearity. This procedure has also been recommended for interval data [i.e. (0, 1, 2) recoded to (−1, 0, +1)] (Kraemer & Blasey 2004).
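The effect of centring on multicollinearity can be illustrated with a small simulation. The following is a minimal sketch on simulated data, not an analysis from the literature; the variable names, means, standard deviations and sample size are all assumptions chosen for illustration.

```python
import random
from statistics import mean

def pearson_r(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

random.seed(1)  # simulated data, purely illustrative
x = [random.gauss(10, 2) for _ in range(500)]    # hypothetical continuous IV
mod = [random.gauss(5, 1) for _ in range(500)]   # hypothetical continuous moderator

# Raw product term: it carries much of the same information as X itself.
raw_product = [a * b for a, b in zip(x, mod)]
r_raw = pearson_r(x, raw_product)

# Centre each variable at its sample mean, then form the product term.
x_bar, mod_bar = mean(x), mean(mod)
xc = [a - x_bar for a in x]
mc = [b - mod_bar for b in mod]
centred_product = [a * b for a, b in zip(xc, mc)]
r_centred = pearson_r(xc, centred_product)

print(f"r(X, X*Mod) before centring: {r_raw:.2f}")
print(f"r(X, X*Mod) after centring:  {r_centred:.2f}")
```

Centring leaves the interaction coefficient unchanged; what changes is the meaning of the lower-order coefficients, which become effects at the mean of the other variable rather than at a possibly meaningless value of zero.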

It may seem logical simply to split the sample into groups based on levels of the moderator (e.g. men vs. women or high anxiety vs. low anxiety) and re-estimate the effect of the IV in each of the subgroups, but this method (called subgroup analysis) is suboptimal. The null hypothesis of a subgroup analysis is that the effect in each subgroup is equal to zero. This does not allow for comparison of the size or significance of the two effects. Additionally, if the moderator is continuous, manual dichotomisation can lead to decreased power to detect real interactions and increased Type I error due to spurious interactions (MacCallum et al. 2002; Newsom et al. 2003). Other limitations are discussed in detail in Newsom et al. (2003).

Hayes & Matthes (2009) suggested two alternatives to subgroup analysis: the ‘pick-a-point’ approach (Rogosa 1980) and the superior Johnson-Neyman method (Johnson & Neyman 1936). Both are described at length in Hayes & Matthes (2009), and the authors provided SAS and SPSS macros (pre-written syntax) for use on any appropriately structured data set. Additionally, Preacher provided helpful calculation tools on his website, http://www.quantpsy.org. The general methods will be described here, but it is recommended that interested readers obtain and use the available macros to probe significant interactions.

The ‘pick-a-point’ procedure (Rogosa 1980) addresses a question of simple slopes by comparing the conditional effect of X across pre-selected values of the moderator (usually low, medium and high) without dividing the sample into subgroups. A similar approach was advocated by Aiken & West (1991), wherein the moderator is centred at one standard deviation below the mean, at the mean, and at one standard deviation above the mean. The pick-a-point approach avoids most of the pitfalls associated with the subgroup method because it uses data from the entire group. The researcher is encouraged to use data-based or clinically meaningful cut points. Although these selections are not arbitrary, there is no hard-and-fast rule for choosing the values of the moderator at which to compare the conditional effects of X (unless, of course, Mod has naturally occurring categories).
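The pick-a-point logic can be sketched in a few lines. This is an illustrative implementation on simulated data, not the Hayes & Matthes (2009) macro; the helper functions `invert` and `ols`, the simulated coefficients and the choice of plus or minus one standard deviation are assumptions made for the example. The conditional effect of X at a moderator value m is c′ + d·m, and its standard error comes from the coefficient covariance matrix of the full-sample model.

```python
import random
from statistics import mean, stdev

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def ols(rows, y):
    """Ordinary least squares; returns coefficients and their covariance matrix."""
    n, p = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    s2 = sse / (n - p)  # residual variance
    return beta, [[s2 * v for v in row] for row in inv]

random.seed(2)  # simulated data with a true interaction, for illustration only
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
mod = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 2 * a + 0.5 * b + 1.5 * a * b + random.gauss(0, 1) for a, b in zip(x, mod)]

# Centre X and Mod, then fit Y on X, Mod and X*Mod using the whole sample.
x_bar, mod_bar = mean(x), mean(mod)
xc = [a - x_bar for a in x]
mc = [b - mod_bar for b in mod]
design = [[1.0, a, b, a * b] for a, b in zip(xc, mc)]
beta, cov = ols(design, y)
c_prime, d = beta[1], beta[3]

# Conditional effect of X at low, mean and high values of the moderator.
sd_mod = stdev(mc)
points = {"low (-1 SD)": -sd_mod, "mean": 0.0, "high (+1 SD)": sd_mod}
slopes = {}
for label, m_val in points.items():
    slope = c_prime + d * m_val
    se = (cov[1][1] + 2 * m_val * cov[1][3] + m_val ** 2 * cov[3][3]) ** 0.5
    slopes[label] = (slope, se, slope / se)
    print(f"{label:>13}: effect of X = {slope:.2f}, SE = {se:.2f}, t = {slope / se:.1f}")
```

Note that no observation is discarded: every conditional effect is computed from the same model fitted to all cases, which is what distinguishes this procedure from subgroup analysis.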

Therein lies the strength of the Johnson-Neyman method; this procedure essentially works backwards to find regions of significance (Hayes & Matthes 2009). The Johnson-Neyman technique finds the points along the continuum of the moderator where the effect of the IV becomes significant and non-significant (Hayes & Matthes 2009). As with all statistical procedures, it is up to the researcher to determine the level of significance. The mathematical underpinnings of the Johnson-Neyman approach are complicated and are beyond the scope of this paper, but the macro provided in Hayes & Matthes (2009) generates the solution quite painlessly.
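Although the full derivation is left to the cited sources, the core idea can be sketched briefly: the boundaries of the regions of significance are the moderator values at which the test statistic for the conditional effect of X exactly equals the critical value, which reduces to solving a quadratic. The sketch below is not the published macro. The coefficient estimates and (co)variances are made-up numbers, the function names are hypothetical, and the critical value of 1.96 is a normal approximation assumed here in place of the exact t value.

```python
import math

def conditional_t(m, c, d, var_c, var_d, cov_cd):
    """Test statistic for the effect of X at moderator value m."""
    slope = c + d * m
    se = math.sqrt(var_c + 2 * m * cov_cd + m ** 2 * var_d)
    return slope / se

def johnson_neyman(c, d, var_c, var_d, cov_cd, t_crit=1.96):
    """Moderator values at which |t| equals t_crit (roots of a quadratic in m)."""
    # (c + d*m)^2 = t_crit^2 * (var_c + 2*m*cov_cd + m^2*var_d), rearranged:
    qa = d ** 2 - t_crit ** 2 * var_d
    qb = 2 * (c * d - t_crit ** 2 * cov_cd)
    qc = c ** 2 - t_crit ** 2 * var_c
    disc = qb ** 2 - 4 * qa * qc
    if qa == 0 or disc < 0:
        return []  # degenerate case or no real boundary in this sketch
    root = math.sqrt(disc)
    return sorted([(-qb - root) / (2 * qa), (-qb + root) / (2 * qa)])

# Made-up estimates: c is the coefficient of X, d the coefficient of X*Mod.
est = dict(c=0.5, d=0.4, var_c=0.04, var_d=0.01, cov_cd=0.0)
bounds = johnson_neyman(**est)
print("Boundaries of the region of significance:", [round(b, 2) for b in bounds])
```

With these illustrative inputs, the effect of X is non-significant only for moderator values between the two returned boundaries and significant outside them. Boundaries that fall outside the observed range of the moderator have no practical meaning and should be reported as such.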

In summary, although the outlined criteria for moderation suffice to define a moderator, most researchers are naturally interested in describing the moderated effect. This amounts to simply probing the significant interaction. The traditional approach, subgroup analysis, has several significant weaknesses that stem from dividing a sample (even when the moderator is categorical). Two alternatives, the pick-a-point method and the preferred Johnson-Neyman approach, are statistically superior to subgroup analysis. These methods are now easily implemented using SAS or SPSS macros provided in the literature. Additionally, one may quantify the moderated effect using the effect size f² (Aiken & West 1991), which is commonly used for interactions.
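As a reminder of how the f² effect size for an interaction is usually computed, the standard formula compares the variance explained with and without the interaction term. The subscripts below are notation assumed here: ‘main’ denotes the model containing only X and Mod, and ‘int’ the model that adds X*Mod.

```latex
f^{2} = \frac{R^{2}_{\mathrm{int}} - R^{2}_{\mathrm{main}}}{1 - R^{2}_{\mathrm{int}}}
```

For example, if adding the interaction raised R² from 0.30 to 0.33, then f² = 0.03 / 0.67 ≈ 0.045; these values are hypothetical and serve only to show the arithmetic.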

Mediators

A mediator is a mechanism through which the IV brings about the DV. So, while a moderator helps answer what an effect depends upon, a mediator explains how the effect happens. For example, Benson & Karlof (2009) found that stress proliferation (a phenomenon where one stressor creates secondary stressors) mediated the effect of child autism symptom severity on parent stress. The authors showed that the child's autism severity predicted parent stress, but that did not explain how the effect worked. To that end, the authors performed analyses that led them to conclude that stress proliferation was at least one of the mechanisms through which the child's behaviour affected the parent. The child's symptom severity (X) predicted increased stress proliferation (Med), which in turn predicted increased overall parent stress (Y).

A mediator is depicted in Fig. 3: the IV affects the mediator, which in turn impacts the DV in a specific and predictable way. Unlike moderation, there is an essential causal relationship between the IV, the mediator and the DV. This suggests four operational criteria for the BK/MA approach: (a) the IV must precede the mediator; (b) the IV must be correlated with the mediator (path a); (c) the mediator must be related to the DV (path b); and (d) the relationship between the IV and the DV (path c in Fig. 1) should be at least partially accounted for by the mediator (i.e. path c′ in Fig. 3 must be closer to zero than path c in Fig. 1). Let us revisit the Benson & Karlof (2009) mediation example, where the child's symptom severity (X) caused the parent's stress proliferation (Med) (criteria a and b). Although symptom severity (X) itself predicted parent stress (Y), when stress proliferation (Med) was added to the model, it predicted Y (criterion c) and the relationship between symptom severity and parent stress weakened (criterion d).

Those promoting the direct-testing approach to mediation analysis suggest that this list of criteria is merely descriptive and non-essential (e.g. Preacher & Hayes 2004; Hayes 2009). In this approach, the indirect effect (mathematically equivalent to path c minus path c′) is estimated and tested as the sole criterion for mediation. It is important to note that the temporal criterion remains as a definitional characteristic of mediation (i.e. the IV precedes the mediator). As with the discussion of moderation, causal steps are used here to describe the mediated effect, but the direct test of the mediated effect is suggested as the current gold standard for mediation analysis.

Causal steps approach to mediation

One overarching requirement for mediation is that the IV causes the mediator. This makes criteria a and b necessary; although criterion a was implicit in the BK model, it was unstated, and researchers often ignore it. These criteria have contrary counterparts in the moderation criteria; criteria a and b make moderation and mediation mutually exclusive.

Criterion c is most often satisfied by demonstrating a main effect of the mediator; the BK mediation model did not include the interaction term (X*Med). The MA guidelines added the interaction between the IV and the mediator to this criterion; they suggested that excluding the interactive effect may upset the estimates of the main effects and increase the error, as the interaction effect must be absorbed somewhere (Kraemer et al. 2008). The inclusion of the interaction term in the MA guidelines recognised the possibility that the IV may not only alter the mediator, but may also change the relationship between the mediator and outcome. However, it should be noted that this special case of mediation, where the mediator interacts with the IV, might also be considered moderated mediation if applicable temporal requirements are satisfied (see Preacher et al. 2007). This again underscores the importance of the explicit definition of criteria by the researcher.

One often encounters discussions of ‘full’ and ‘partial’ mediation in the literature. Usually, ‘full mediation’ describes a case where the mediator renders the main effect of the IV statistically and practically non-significant, suggesting that the effect of X is fully attributable to its effect on the mediator. However, it is unrealistic to expect that the IV has only one mechanism of effect (Baron & Kenny 1986), so most cases of mediation are classified as ‘partial mediation.’ There is no set threshold for the decrease in magnitude required by criterion d; instead, it is up to the researcher (in interaction with the research community) to determine the practical significance of the effect using the quantification techniques described here.

For mediation to be shown using causal steps, path c′ in Fig. 3 should be closer to zero than path c in Fig. 1 (criterion d). The most common statistical method is multiple regression with Equations 1–3. Equation 1 is the regression of the DV (Y) on the IV (X), Equation 2 is the regression of the suspected mediator (Med) on the IV (X), and Equation 3 is the regression of the DV (Y) on X, Med, and the interaction between X and Med. Coefficient d in Equation 3 is constrained to zero in the BK model.

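$$Y = i_1 + cX + e_1 \tag{1}$$
$$\mathrm{Med} = i_2 + aX + e_2 \tag{2}$$
$$Y = i_3 + c'X + b\,\mathrm{Med} + d\,(X \times \mathrm{Med}) + e_3 \tag{3}$$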

MacKinnon (2008) provided syntax and sample output for both SAS and SPSS (pp. 57–58) implementation of causal steps to satisfy BK criteria.
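For readers without access to SAS or SPSS, the three regressions can also be sketched in plain Python. The example below uses simulated data and the BK form of Equation 3 (interaction coefficient d constrained to zero); the simulated path values, the sample size and the helper `centred_sums` are assumptions made for illustration, and no significance tests are included.

```python
import random
from statistics import mean

def centred_sums(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v))

random.seed(3)  # simulated data, for illustration only
n = 150
x = [random.gauss(0, 1) for _ in range(n)]
med = [0.8 * xi + random.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.8 * mi + random.gauss(0, 1) for xi, mi in zip(x, med)]

sxx, smm = centred_sums(x, x), centred_sums(med, med)
sxm, sxy, smy = centred_sums(x, med), centred_sums(x, y), centred_sums(med, y)

c = sxy / sxx                              # Equation 1: Y regressed on X
a = sxm / sxx                              # Equation 2: Med regressed on X
det = sxx * smm - sxm ** 2
c_prime = (smm * sxy - sxm * smy) / det    # Equation 3: coefficient of X
b = (sxx * smy - sxm * sxy) / det          # Equation 3: coefficient of Med

print(f"c = {c:.3f}, a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
print(f"c - c' = {c - c_prime:.3f}, a*b = {a * b:.3f}")
```

The last line illustrates why the indirect effect can be described either as c minus c′ or as the product of paths a and b: for ordinary least squares regressions fitted to the same cases, the two quantities coincide.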

Another approach to establishing mediation is structural equation modelling (SEM). SEM is a technique wherein constructs are represented as latent (unobservable) variables with multiple sources of measurement (the manifest variables). In the special case of path analysis, there are no latent variables. Instead, the relationships between the manifest variables (the measured constructs) are modelled (MacCallum & Austin 2000). A simplified illustration of the basic procedure is a comparison of the fit of a model based on Fig. 3 to the fit of the same model with the c′ path constrained to zero. If the constraint of c′ to zero does not reduce the fit of the model, mediation is supported because paths a and b are accounting for a significant proportion of the effect (Holmbeck 1997). Many researchers are not familiar with SEM, and it requires specialised software such as LISREL, EQS, AMOS or Mplus. However, for individuals motivated to use SEM, it is a good option, and is de rigueur in fields such as developmental psychology.

Direct testing of mediator effects

Establishing mediation through the causal steps process described above does not directly test the practical or statistical significance of the mediated effect. The causal steps method is inductive; if the criteria are satisfied, then the variable in question must be mediating the effect of the IV. BK did include an approximate significance test for the mediated effect, known as the Sobel test (Sobel 1982). BK recommended a slight modification to the Sobel test; the recommended formula included the product of the standard errors while the Sobel version excluded it. Although BK ostensibly believed the test should be used in addition to the causal steps procedures, few researchers actually use the Sobel test (Preacher & Hayes 2004). This is likely due to the fact that the BK guidelines did not explicitly require the Sobel test.

A direct test of the mediated effect is sufficient (and more rigorous) to show mediation. Therefore, it has been proposed that it be done in place of the causal steps procedures (e.g. Preacher & Hayes 2004). Although the Sobel test was recommended by BK, it should be noted that the test is based upon two commonly violated assumptions: (1) that the sample size is large, and (2) that the product of paths a and b in Fig. 3 is normally distributed (Preacher & Hayes 2004). In practice, sample sizes are often not large, and products of estimates rarely have normal distributions (MacKinnon et al. 2007).
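The two versions of the test discussed above differ only in one term of the standard error, which a short sketch makes concrete. The function name and the input values are hypothetical; the formulas are the commonly cited first-order Sobel standard error and the variant that adds the product of the squared standard errors, and the p-value relies on the same normality assumption criticised in the text.

```python
import math

def mediated_effect_z(a, b, se_a, se_b, include_product=False):
    """z test of a*b; include_product=True adds the se_a^2 * se_b^2 term."""
    var_ab = b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2
    if include_product:
        var_ab += se_a ** 2 * se_b ** 2
    z = a * b / math.sqrt(var_ab)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p under a normal approximation
    return z, p

# Made-up path estimates and standard errors, for illustration only.
z_sobel, p_sobel = mediated_effect_z(0.5, 0.4, 0.1, 0.1)
z_bk, p_bk = mediated_effect_z(0.5, 0.4, 0.1, 0.1, include_product=True)
print(f"Sobel version:        z = {z_sobel:.3f}, p = {p_sobel:.4f}")
print(f"With product term:    z = {z_bk:.3f}, p = {p_bk:.4f}")
```

Because the extra term can only enlarge the standard error, the version that includes it is always slightly more conservative; in practice the two rarely lead to different conclusions unless the standard errors are large relative to the path estimates.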

In light of the weaknesses of the Sobel test, a bootstrapping approach has been described by Preacher & Hayes (2004). Bootstrapping is a method of estimation based on numerous subsamples drawn from the original sample (Bollen & Stine 1990; see also Mooney & Duval 1993). Essentially, a sample is treated as a population from which k samples (>1000; Efron & Tibshirani 1993) are drawn with replacement. Preacher & Hayes (2004) outlined the process for estimating the mediated effect using the bootstrap method. For each of k samples, the mediated effect (a*b) is calculated; the point estimate of the mediated effect is the mean value of the statistic obtained across k samples. A confidence interval for the point estimate of a*b is found by: (1) ordering the k estimates from low to high; (2) determining the lower limit of the confidence interval, which is the k(α/2)th observation; and (3) determining the upper limit of the interval, the [1 + k(1 − α/2)]th observation. To borrow the example provided by Preacher & Hayes (2004), if k = 1000, the 95% confidence interval would span from the 25th observation to the 976th observation. The confidence interval is then interpreted in the normal fashion; if zero is contained within the interval, the conclusion is that the mediated effect is not significantly different from zero.
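The percentile bootstrap just described can be sketched as follows. This is an illustrative implementation on simulated data rather than the Preacher & Hayes (2004) macro; the function names, the simulated path values, the sample size and the random seeds are assumptions. The upper limit is taken as the [1 + k(1 − α/2)]th ordered estimate, which reproduces the 25th and 976th observations of the k = 1000 example.

```python
import random

def indirect_effect(x, med, y):
    """a*b, with a from Med on X and b from Y on X and Med."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(med) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    smm = sum((v - mm) ** 2 for v in med)
    sxm = sum((u - mx) * (v - mm) for u, v in zip(x, med))
    sxy = sum((u - mx) * (w - my) for u, w in zip(x, y))
    smy = sum((v - mm) * (w - my) for v, w in zip(med, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def bootstrap_ci(x, med, y, k=1000, alpha=0.05, seed=0):
    """Percentile bootstrap: point estimate and confidence limits for a*b."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [med[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[round(k * alpha / 2) - 1]    # k(alpha/2)th ordered estimate
    upper = estimates[round(k * (1 - alpha / 2))]  # [1 + k(1 - alpha/2)]th ordered estimate
    return sum(estimates) / k, lower, upper

random.seed(4)  # simulated data with a true mediated effect, for illustration only
n = 150
x = [random.gauss(0, 1) for _ in range(n)]
med = [0.8 * xi + random.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.8 * mi + random.gauss(0, 1) for xi, mi in zip(x, med)]

point, lower, upper = bootstrap_ci(x, med, y)
print(f"a*b = {point:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

Whole cases are resampled (the same index is used for X, Med and Y), so the relationships among the three variables are preserved within each bootstrap sample.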

The strength of this non-parametric approach lies in the fact that no assumptions are made regarding the distribution of the variables or sampling distributions of the associated statistic; instead, it is assumed that sampling from the empirical ‘population’ is very similar to sampling from the true population (Bollen & Stine 1990). Although it is true that a larger n may provide more stable estimates, the validity of this method is not contingent on sample size (Preacher & Hayes 2004), so it is appropriate for the relatively smaller samples found in IDD research.

Bootstrapping may be computer-intensive, but is not labour-intensive for the researcher. Preacher & Hayes (2004) provided SAS and SPSS macros for the bootstrapping method as well as for the conventional BK approach (with the Sobel test) (also available at Hayes' website, http://www.afhayes.com/spss-sas-and-mplus-macros-and-code.html). Once data are appropriately structured, the macros can be easily run and readily interpreted (specific requirements are described in the text accompanying the macros).

In summary, the inductive causal steps approach does allow a researcher to classify a variable as a mediator or not, and it helps to guide future research by generating hypotheses about possible mediators. The steps are simple and several are done in the context of the main analysis, so little additional work is needed to evaluate a potential mediator. Given the importance of mediators, it may be productive and enlightening to incorporate such exploration whenever the investigator wants to understand the mechanisms behind an effect. However, if the goal of the study is to quantify or confirm the presence of a mediator, which it usually is, more powerful statistics are necessary. Preacher & Hayes (2004) have made it relatively easy for researchers to employ the bootstrapping approach for estimating mediated effects. Thus, the direct test of the mediated effect is the most rigorous approach to mediation analysis and should be considered the ‘gold standard.’

Literature review

The preceding sections are meant to demystify moderators and mediators, and it is hoped that researchers will be more likely in the future to assess these types of variables. The other goal of this article is to turn attention to the state of the IDD field in terms of moderation and mediation analyses. The goal of this section is to identify any patterns in the research being performed in the field, with the hope that illuminating consistent problems will allow the field to use moderators and mediators as research tools more efficiently. As the focus of this review is on patterns and not individual studies, results are presented in aggregate, and specific studies are referred to only rarely and in cases where it is particularly instructive for future investigations. A list of the studies reviewed and the individual evaluations according to stated criteria are available at http://psychmed.osu.edu.

Procedure

PsycInfo was used to search five major journals in the IDD field for studies that addressed moderation or mediation: American Journal on Mental Retardation (AJMR, now American Journal on Intellectual and Developmental Disabilities), Journal of Autism and Developmental Disorders (JADD), Journal of Intellectual and Developmental Disability (JIDD), Journal of Intellectual Disability Research (JIDR), and Research in Developmental Disabilities (RIDD). These journals were selected because they are well respected and contain empirical research of the type that is conducive to tests for moderation and mediation. The search included publications indexed in PsycInfo up to June 2010.

Results

Table 1 contains the results of the literature review. All told, the review produced 25 papers with moderator analyses and 38 papers with mediator analyses. Eleven studies each included both moderator and mediator analyses, for a total of 63 eligible analyses.

Table 1. Results of review

| Evaluation variable | Moderator studies (n = 25), n (%) | Mediator studies (n = 38), n (%) |
| --- | --- | --- |
| Journal* | | |
| American Journal on Mental Retardation | 4 (16) | 8 (21) |
| Journal of Autism and Developmental Disorders | 8 (32) | 9 (24) |
| Journal of Intellectual and Developmental Disability | 0 | 0 |
| Journal of Intellectual Disability Research | 10 (40) | 16 (42) |
| Research in Developmental Disabilities | 3 (12) | 5 (13) |
| Year | | |
| Pre-1990 | 0 | 1 (3) |
| 1990–1999 | 3 (12) | 4 (11) |
| 2000–2005 | 5 (20) | 8 (21) |
| 2006–2010 | 17 (68) | 25 (66) |
| Did the IV and the moderator correlate? | | |
| No | 4 (16) | |
| Yes, with no discussion of implications | 7 (28) | |
| Not reported | 14 (56) | |
| Criteria for moderation | | |
| X*Moderator interaction | 22 (88) | |
| Other | 3 (12) | |
| Criteria were explicitly defined | | |
| Yes | 11 (44) | 25 (66) |
| No | 14 (56) | 13 (34) |
| Temporal precedence was addressed | | |
| Yes | 12 (48) | 21 (55) |
| No | 13 (52) | 17 (45) |
| The mediated effect was directly tested | | |
| Yes [Sobel; Bootstrap; Other] | | 18 (47) [34; 5; 8] |
| N/A | | 3 (8) |
| No | | 17 (45) |
| Criteria for mediation | | |
| Direct test only | | 2 (5) |
| Causal steps [Direct Test; No Direct Test] | | 19 (50) [39, 37] |
| Subset of causal steps | | 5 (13) |
| Other | | 2 (5) |
| Variable tested as moderator AND mediator (‘pluralism’) | | |
| No | 17 (68) | 30 (79) |
| Yes | 8 (32) | 8 (21) |

  • *  Rates of publication for each journal vary. Through 2010, PsycInfo has the following numbers of articles indexed for each journal: AJMR/AJIDD 1128; JADD 2136; JIDD 496; JIDR 1454; and RIDD 1093.
  • Studies were coded with respect to whether or not data were presented regarding the causal relationship between the independent variable and the putative moderator (‘Did the IV and Mod correlate?’), the criteria used to classify a moderator/mediator (‘Criteria for Moderation/Mediation’), whether or not criteria were explicitly stated by the authors (‘Criteria were explicitly defined’), whether or not the precedence of the moderator variable to the IV OR the precedence of the IV to the mediator variable was addressed (‘Temporal precedence addressed?’), whether or not the mediated effect was directly tested, and whether or not a single variable was analysed as both a moderator and mediator (‘Pluralism’).
  • IV, independent variable.

Moderator analyses were coded with respect to: (1) explicit discussion of the causal and temporal relationship between the moderator and the IV; (2) criteria used to classify a moderator; (3) presence of an explicit statement of these criteria; and (4) whether or not the moderator variable was also analysed as a mediator.

Mediator analyses were coded with respect to: (1) explicit discussion of the temporal relationship between the mediator and the IV; (2) criteria used to classify a mediator (including whether or not it was directly tested); (3) presence of an explicit statement of these criteria; and (4) whether or not the mediator variable was also analysed as a moderator.

Discussion

The methods used in this literature search were chosen to maximise the number of moderator and mediator analyses located. In all, only 63 analyses were identified (25 moderator and 38 mediator analyses; 52 unique papers). One journal, the Journal of Intellectual and Developmental Disability, contained no relevant articles. For comparison, as of July 2010, Baron & Kenny (1986) had been cited 151 times in the Journal of Consulting and Clinical Psychology and 502 times in the Journal of Personality and Social Psychology (the article had been cited a grand total of 13 809 times across all sources). Combined, the five journals surveyed here cited Baron & Kenny (1986) 33 times. Nothing about the type of studies or data obtained in the IDD field prevents these analyses from being done, although relatively small samples may be a factor. It is also possible that the types of research questions asked by members of the field are less suited to moderator and mediator analyses than those asked by social psychologists. Perhaps the paucity is caused by a lack of knowledge about the utility and/or mathematical determination of moderators and mediators. It is encouraging that the publications located in the five journals of interest increased in frequency in the past 10 years; hopefully, this trend will continue.

A subtype of the IDD literature is more likely to include moderator and mediator analyses; quality of life studies were the most prevalent (72% of moderator studies and 58% of mediator studies). This may be because the constructs measured in quality of life studies are more similar to those in the social psychology literature, where the concepts of moderation and mediation have been particularly fruitful. It was disappointing that there were not more symptomatology studies; mediator analyses may be especially helpful in understanding the processes associated with a disorder. Treatment studies (perhaps published in other journals) may be more likely to include moderator and mediator variables, but a survey of longitudinal studies (a category within which treatment studies would fit) was outside the scope of this paper. This review is meant to be constructive, and therefore I move to a discussion of how IDD researchers are doing moderator and mediator analyses. One hopes that this will lead to refinement in the future.

The temporal relationship is important for both moderators and mediators. In cross-sectional designs such as those examined in this review, it is impossible to ‘prove’ the causal relationships between the variables. Rather, the onus is on the researcher to establish his or her case by compiling relevant evidence. This might include longitudinal data from other studies characterising the causal relationship between IV and Mod or Med, or even a set of logical hypotheses that support such a relationship. This process is already established as necessary in cross-sectional studies: like mediators and moderators, IVs are assumed to precede the DV, though measurement may have been contemporaneous.

In one mediation analysis, Baker et al. (2005) made a strong case for the idea that optimism is a trait, consistent with their hypothesis that parent optimism moderated the relationship between child behaviour and parent stress. As Baker et al. pointed out (p. 586), hypothesising optimism as a mediator would have been inappropriate, because they believed that it is a trait (i.e. a trait is, by definition, present in the parent before the child behaviour occurs). Despite this, optimism was analysed as a mediator. Had the results of the regression analysis been significant (they were not), the authors would have concluded that the trait of parent optimism both moderated and mediated the effect of child behaviour on parent stress, which is a more complicated relationship than could be proven by their chosen analyses. Only 21 of the 38 mediator studies (55%) and 12 of the 25 moderator studies (48%) addressed the temporal relationship, despite the confusion introduced by the absence of such discussion. These results highlight a significant blind spot in both moderation and mediation analyses.

Similar results were observed for the relationship between the IV and the moderator. Only 10 studies reported this relationship. Seven of those studies showed that the moderator did correlate with the IV, but no discussions of the implications of the causal relationship were offered. In one case, the putative moderator correlated at a level of 0.9 with the IV (Manikam et al. 1995), which suggests that the relationship was something different from simple moderation. Authors should decide what level of shared variance is acceptable between the IV and the moderator, and should take care to defend that decision.
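The two checks discussed for moderators, the shared variance between the IV and the putative moderator and the significance of the X*Moderator interaction, can be illustrated with a minimal sketch. It uses only the Python standard library and simulated data; the function names (pearson_r, ols, moderation_check, simulate) and the simulated effect sizes are assumptions made for illustration and are not taken from any of the reviewed studies or from any published macro.

```python
import math
import random


def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def ols(rows, y):
    """Ordinary least squares via the normal equations.

    rows holds one list of predictor values per case; an intercept is
    added here. Returns (coefficients, standard errors).
    """
    X = [[1.0] + list(r) for r in rows]
    n, k = len(X), len(X[0])
    xtx = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    # Invert X'X by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[p] + [1.0 if p == q else 0.0 for q in range(k)]
           for p in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[p][q] * xty[q] for q in range(k)) for p in range(k)]
    resid = [y[i] - sum(X[i][p] * beta[p] for p in range(k)) for i in range(n)]
    mse = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(mse * inv[p][p]) for p in range(k)]
    return beta, se


def moderation_check(x, z, y):
    """Mean-centre X and Z, fit Y ~ X + Z + X*Z, and report r(X, Z)."""
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    xc = [v - mx for v in x]
    zc = [v - mz for v in z]
    rows = [[a, b, a * b] for a, b in zip(xc, zc)]
    beta, se = ols(rows, y)
    r = pearson_r(x, z)
    return {
        "r_xz": r,                      # IV-moderator correlation
        "shared_variance": r * r,       # proportion of shared variance
        "b_interaction": beta[3],       # coefficient of the X*Z term
        "t_interaction": beta[3] / se[3],
    }


def simulate(n=300, seed=2):
    """Hypothetical data with a true interaction coefficient of 0.5."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.3 * xi + 0.2 * zi + 0.5 * xi * zi + rng.gauss(0, 1)
         for xi, zi in zip(x, z)]
    return x, z, y


if __name__ == "__main__":
    print(moderation_check(*simulate()))
```

In this sketch the interaction t statistic has n − 4 degrees of freedom. What counts as an acceptable value of shared_variance is left to the analyst, in keeping with the recommendation above that authors choose and defend that threshold.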

The need for explicitly defining the criteria for moderation and mediation is great. Simply using the term ‘moderator’ or ‘mediator’ is not enough, given the confusion regarding definitions. Although only 11 of the 25 moderator studies stated it explicitly, it was clear from the Results sections that most studies (88%) did require a significant interaction term for moderation. Thirteen of the 38 mediator studies (34%) did not explicitly state the criteria for mediation, but it was clear from the Results sections that most studies employed the BK causal steps method. Several studies employed subsets of causal steps. This is worrisome and is reflective of a lack of understanding regarding the function of a mediator. A variable that has a main effect on the DV, even if it makes the relationship between the IV and DV significantly weaker, is only a mediator if the IV is causing the variable (the criterion neglected by those using a subset). Furthermore, the mediated effect was not directly tested in 17 of the 38 mediator studies (45%). It is not possible to tell from a causal steps analysis if the mediated effect is statistically significant or of significant magnitude, so this direct test is necessary. In three cases reviewed here (Baker et al. 2007; Weiss 2008; van Nieuwenhuijzen et al. 2009), the Sobel test actually contradicted the results of the causal steps procedure and showed that the effect was not significant. Only two (5%) studies employed the bootstrap method (Deveraux et al. 2009; Pollman et al. 2010). The fact that nearly half of the studies neglected a direct test of the mediated effect leaves open the possibility that the mediated effect suggested by causal steps was neither practically nor statistically significant, a result that would be unacceptable in most other types of statistical evaluation.
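To make the idea of a direct test concrete, the following is a minimal sketch of the two approaches named above: the Sobel test and a percentile bootstrap confidence interval for the indirect effect a*b. It is not the macro used by any of the reviewed studies. It assumes a single mediator, ordinary least squares regressions (M on X, and Y on X and M), and simulated data; the function names (paths, sobel, bootstrap_ci, simulate) and all effect sizes are hypothetical.

```python
import math
import random
from statistics import NormalDist


def paths(x, m, y):
    """OLS estimates for the two mediation regressions.

    Returns (a, se_a, b, se_b): a is the slope of M on X, and b is the
    slope of Y on M controlling for X.
    """
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    dx = [v - mx for v in x]
    dm = [v - mm for v in m]
    dy = [v - my for v in y]

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    sxx, smm, syy = dot(dx, dx), dot(dm, dm), dot(dy, dy)
    sxm, sxy, smy = dot(dx, dm), dot(dx, dy), dot(dm, dy)

    # Path a: M = i1 + a*X + e1
    a = sxm / sxx
    se_a = math.sqrt((smm - a * sxm) / (n - 2) / sxx)

    # Path b: Y = i2 + c'*X + b*M + e2
    det = sxx * smm - sxm ** 2
    c_prime = (smm * sxy - sxm * smy) / det
    b = (sxx * smy - sxm * sxy) / det
    mse = (syy - c_prime * sxy - b * smy) / (n - 3)
    se_b = math.sqrt(mse * sxx / det)
    return a, se_a, b, se_b


def sobel(x, m, y):
    """Sobel test: returns (indirect effect a*b, z, two-sided p)."""
    a, se_a, b, se_b = paths(x, m, y)
    z = (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return a * b, z, p


def bootstrap_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a*b, resampling cases."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a, _, b, _ = paths([x[i] for i in idx],
                           [m[i] for i in idx],
                           [y[i] for i in idx])
        estimates.append(a * b)
    estimates.sort()
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def simulate(n=200, seed=1):
    """Hypothetical data with a true indirect effect of 0.5 * 0.4 = 0.2."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
    y = [0.2 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y


if __name__ == "__main__":
    x, m, y = simulate()
    print("Sobel (ab, z, p):", sobel(x, m, y))
    print("95% bootstrap CI for ab:", bootstrap_ci(x, m, y))
```

Under this approach, the indirect effect would be judged significant if the bootstrap interval excludes zero. The bootstrap does not rely on the normality of a*b that the Sobel z statistic assumes, which is one reason the two tests can disagree in small samples.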

There was some confusion regarding the meanings of ‘moderator’ and ‘mediator.’ One source of confusion is that the statistical meaning of moderator (a variable that can be used to predict varying X/Y relationships in subgroups of the sample) is different from a common usage (to preside over or to change). For example, one paper reported that ‘. . . only the Smith-Magenis analyses revealed friends as an important moderator of family stress’ (Hodapp et al. 1998, p. 336), but the model reflected a simple predictive relationship between friendship (IV) and stress (DV), not one in which the relationship between the disorder and family stress was dependent on the level of friendship. In some cases, it was not clear whether the authors intended to call a variable a moderator or a mediator; Baxter (1992) posited a model that included ‘mediating variables, which may be exacerbating or moderating in their influence on the stress process’ (p. 520, emphasis added). It is likely that one term or the other was meant to reflect something other than the statistical definition of the term, but this sowed the seeds for possible confusion. In other cases, the authors called the variables mediators and applied mediational criteria, but were examining what should have been called moderator variables. These results emphasise the need for information in addition to the terms ‘moderator’ or ‘mediator.’ Preferably, an author would state the exact criteria a variable must fulfil in order to be labelled a moderator or mediator.

Finally, eight of the total 52 papers (15%) hypothesised that the same variable would both moderate and mediate a given X/Y relationship (‘pluralism’). This is a logical impossibility, because the definitions put forth in this manuscript make the moderator and mediator variables mutually exclusive without significantly more complicated analyses than those found in this review. Again, a lack of explicitly stated criteria no doubt contributed to the confusion; the temporal criteria leave little distinction between moderators and mediators.

Limitations

First, some papers with moderator or mediator analyses may have been missed in this review. Second, the criteria used to evaluate the studies stem from the recommendations made in the current paper. Although they reflect the theoretical considerations from causal steps as well as the more rigorous requirements of the direct-test approaches, they might not be considered gold standard throughout the field. Therefore, failing to meet the criteria set forth in this review is not unequivocally damning.

Conclusions

The methods and guidelines described in the first three sections of this paper are meant to inform and encourage workers in the IDD field to include the evaluation of moderators and mediators as a regular part of the research exercise. It is clear from this review that many IDD investigators are insufficiently familiar with the appropriate procedures for evaluating moderators and mediators, which is disappointing given the importance of such variables. Moderators are integral in establishing that research findings are generalisable (or not) across subgroups of a population, which may be especially relevant to the field of IDD, an umbrella term that covers many different subpopulations. Mediators, on the other hand, help us to answer one of the most important questions in science: ‘How?’ Usually, it is not enough to show that an effect exists; ideally, the scientific exercise results in a complete elucidation of the causal mechanisms. Unfortunately, ‘definitional’ confusion seems to have contributed to an air of mystery surrounding moderators and mediators. In keeping with recent literature, I have suggested the use of causal steps only to understand the function of a moderator or mediator, and the use of a direct statistical test to determine their presence. As outlined here, directly testing moderators and mediators only requires the use of a few readily available macros. An increased emphasis on moderation and mediation analyses will enhance the quality of the information that we uncover and the services we provide.
