Peer reviewed
ERIC Number: ED663547
Record Type: Non-Journal
Publication Date: 2024-Sep-18
Pages: N/A
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Meta-Analysis Reconceived from a Finite Population Sampling Perspective
Matthew Forte; Elizabeth Tipton
Society for Research on Educational Effectiveness
Background/Context: Over the past twenty-plus years, the What Works Clearinghouse (WWC) has reviewed over 1,700 studies, cataloging effect sizes for 189 interventions. Some 56% of these interventions include results from multiple, independent studies; on average, these include results of approximately 3 studies, though some include as many as 32 results. This has produced a new problem: when there are multiple studies, how should these results be summarized? One approach, known to be problematic but common in the early days of the WWC, is to 'vote count'; this involves reporting the number of independent effect sizes that are statistically significant. Another approach, more recently implemented in the WWC, is to combine these effects meta-analytically. In the WWC context, the meta-analytic estimator is simply a weighted combination of the independent study effect size estimates. Here the question becomes how these weights are selected. One approach is the Fixed Effect (FE) model. This model assumes that all studies are estimating the same true effect size and that differences in estimates between studies are attributable to sampling error. For a study j, an effect size estimate T_j receives weight w_j^FE = 1/SE(T_j)^2 (Borenstein et al., 2010). Another approach is the Fixed Effects (FES) model, which generally uses the same weights as the FE model but allows true effect sizes to differ across studies (Rice et al., 2018). A third approach is the Random Effects (RE) model, which assumes that there is a distribution of true effects across studies.
If sample sizes in each study were large, this would result in each study estimate receiving equal weight, w_j^%; because finite samples are used within studies, however, the weights are not equal, with w_j^RE = 1/(SE(T_j)^2 + τ^2), where τ^2 is the variation in true effect sizes (Borenstein et al., 2010). Thus, the RE weights lie between w_j^FE and w_j^%. Table 1 provides details on these three models.

Purpose/Research Question: The choice among the FE, FES, and RE models is often fraught. The prevailing guidance for this choice typically focuses on three criteria: generalizability, estimation, and precision. First, it is said that the RE model provides a clear means to generalize to "studies" outside of those included, while the FE and FES models are interpreted as "conditional" effects, generalizable only to the "students in the meta-analysis" (Hedges & Vevea, 1998). Second, it is widely accepted that at least 8-10 studies are required for stable estimation of the between-study variation τ^2 required for RE weights (Langan et al., 2019). Third, many note that the FE and FES models produce estimates that are more precise. In the WWC, for example, this combination of concerns has led to the use of the FES model (with weights w_j = n_j ≈ w_j^FE; Institute of Education Sciences, 2022). In this paper, we take a different approach. Instead of starting our model building with the data found in a meta-analysis, we begin with a clearly defined population and focus on how meta-analytic data could be obtained through different sampling models.
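As a concrete illustration of the weighting schemes above, the following minimal sketch (with invented numbers, not data from the paper) computes the FE pooled estimate, the RE pooled estimate for an assumed τ^2, and the equal-weight mean, showing that the RE estimate lies between the other two:

```python
import numpy as np

def pooled(T, se, tau2=0.0):
    """Weighted pooled effect: tau2 = 0 gives the FE estimate,
    tau2 > 0 gives the RE estimate with weights 1/(SE^2 + tau2)."""
    T, se = np.asarray(T, float), np.asarray(se, float)
    w = 1.0 / (se**2 + tau2)
    return np.sum(w * T) / np.sum(w)

# Two hypothetical studies: a precise one and a noisier one.
T, se = [0.2, 0.4], [0.1, 0.2]

fe = pooled(T, se)              # 0.24: dominated by the precise study
re = pooled(T, se, tau2=0.05)   # 0.28: between FE and equal weights
eq = float(np.mean(T))          # 0.30: equal weights (the large-tau2 limit)
```

As τ^2 grows, the τ^2 term swamps the sampling variances, so the RE weights approach equality; as τ^2 goes to zero, the RE weights collapse to the FE weights.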
Beginning with the population in this way allows us to frame the problem as one of aggregating effects for students rather than studies, and to align the choice of model with the data generating process, not the limitations of estimation or the purported goals (generalizability) of the analysis.

Research Design: We begin with a finite population with an average effect size θ and conceive of three different types of random samples: simple random samples (SRS), stratified simple random samples (SSRS), and cluster samples (CS). Table 2 compares these designs. In the paper, we then align these three estimators with the FE, FES, and RE models, noting where the assumptions between them are similar or different.

Findings/Results: To preview a few findings, we show that: (1) In the SRS design, each study's sample is a random sample from the same population; we might think of this as a sequence of 'direct' replications. We show that this aligns nearly exactly with the FE model, though the assumptions and interpretation of θ differ. (2) The SSRS and CS designs each include an additional stage in the sampling process. First, the population is divided into clusters, what we call 'potential studies', and then random samples from these clusters are observed. In SSRS, we observe all the clusters, whereas in CS, we observe a sample of these clusters. We show that for both, the weight w_j that each 'observed sample' should receive in the estimator of θ should be proportional to the size of that cluster in the population. Thus, if we observe two studies, one large and one small, the weight we give to each study should be with respect to the population size represented by that study, "not" the sample size.
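The distinction between sample-size and population-size weighting can be made concrete with a hypothetical two-study example (all numbers invented for illustration): when sampling rates differ across clusters, weighting the study estimates by n_j and by N_j give different pooled effects.

```python
import numpy as np

# Two observed studies drawn from two equally sized population clusters
# ('potential studies'), but with very different sampling rates.
T = np.array([0.2, 0.5])      # effect size estimates T_j
n = np.array([400, 50])       # sample sizes n_j
N = np.array([1000, 1000])    # population cluster sizes N_j

by_sample = np.sum(n * T) / np.sum(n)  # weight by n_j: ~0.233
by_pop = np.sum(N * T) / np.sum(N)     # weight by N_j: 0.35
```

Here the large study dominates under sample-size weighting even though it represents no more of the population than the small study; under population-size weighting each cluster counts equally. When n_j ∝ N_j, the two weightings coincide.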
This means that the weights w_j^FE are appropriate for both the SSRS and CS designs so long as we can assume that the sampling rates are the same in each cluster ('potential study') in the population (i.e., n_j ∝ N_j). Conversely, equal weights (like w_j^%) could be appropriate if we conceive of the population sizes of each potential cluster as the same. (3) While the CS design and RE model are both the most general, we show that there are important differences between the two. Furthermore, we argue that the CS design is more general and thus provides benefits over the RE model, particularly when combined with the use of robust variance estimation.

Conclusion: In this paper, we show that there are many benefits to reconceiving meta-analysis in relation to sampling models, particularly for the small meta-analyses common in the WWC. These include: (1) in all three designs, the population parameter θ is the same, with all generalizing to the population of "students," not studies; (2) the choice of weights has to do with assumptions regarding population characteristics (sampling proportions), not sample sizes; and (3) it is possible to incorporate variation in effects (via the CS design plus robust variance estimation) even with a small number of studies.
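The abstract does not spell out its robust variance estimator, but a generic sandwich-style version for a weighted mean of study effects (a common form in the robust variance estimation literature, not necessarily the authors' exact estimator) can be sketched as follows, again with invented numbers:

```python
import numpy as np

def weighted_mean_robust(T, w):
    """Weighted pooled effect with a sandwich-style robust SE.

    Generic sketch: V = sum(w_j^2 * (T_j - theta_hat)^2) / (sum w_j)^2.
    Unlike the model-based SE, this remains valid even when the chosen
    weights are not inverse-variance optimal (e.g., population weights).
    """
    T, w = np.asarray(T, float), np.asarray(w, float)
    theta_hat = np.sum(w * T) / np.sum(w)
    V = np.sum(w**2 * (T - theta_hat)**2) / np.sum(w)**2
    return theta_hat, np.sqrt(V)

est, se = weighted_mean_robust([0.2, 0.4], [100.0, 25.0])
```

Because the variance is built from the observed residuals rather than an estimated τ^2, this style of estimator can accommodate between-study variation without the 8-10 studies typically recommended for stable τ^2 estimation, which is the appeal noted in the conclusion.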
Society for Research on Educational Effectiveness. 2040 Sheridan Road, Evanston, IL 60208. Tel: 202-495-0920; e-mail: contact@sree.org; Web site: https://www.sree.org/
Publication Type: Reports - Research; Information Analyses
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: Society for Research on Educational Effectiveness (SREE)
Grant or Contract Numbers: N/A