Table 1

Some frequently used measures for heterogeneity

Measure | Advantages | Disadvantages
τ²
  Advantages

  • τ (the square root of τ²) is the SD of the between-study variation, expressed on the scale of the original outcome.

  • τ² is a direct estimate of the between-study variance and is therefore useful in calculations, for example, of the prediction interval.

  Disadvantages

  • A direct clinical interpretation of τ² is difficult, especially when τ² relates to an outcome analysed on a log scale, for example, ORs.

  • When the τ² estimate is based on only a few studies, it will be imprecise.

I²
  Advantages

  • I² expresses the inconsistency between the study results and quantifies the proportion of observed dispersion that is real, that is, due to between-study differences rather than random error [2, 3].

  • I² reflects the extent of overlap of the CIs of the study effects.

  • I² always expresses the inconsistency on a scale from 0% to 100%, so it can be compared with suggested limits for low or high inconsistency [13].

  Disadvantages

  • A direct clinical interpretation of I² is difficult.

  • I² is also ambiguous because its value depends on the size of the included studies:

    • With very large studies, even tiny between-study differences in effect size may result in a high I²;

    • With small (imprecise) studies, very different treatment effects can yield an I² of 0.

CI
  Advantages

  • The CI in a random-effects model contains highly probable values for the summary (mean) treatment effect.

  Disadvantages

  • The CI gives no information on the range of true treatment effects that are likely to be seen in other settings, for example, in the next study or in the patients a clinician wants to treat in their own clinic.

Prediction interval
  Advantages

  • The prediction interval in a random-effects model contains highly probable values for the true treatment effects in future settings, provided those settings are similar to the settings in the meta-analysis.

  • The values in the interval can be compared with clinically relevant thresholds to see whether they correspond to benefit, null effects or harm.

  • The prediction interval can be used to estimate the probability that the treatment in a future setting will have a true-positive or true-negative effect, and to perform better power calculations.

  Disadvantages

  • Conclusions drawn from the prediction interval rest on the assumption that τ² and the study effects are normally distributed.

  • The estimate of the prediction interval will be imprecise if the estimates of the summary effect and of τ² are imprecise, for example, if they are based on only a few studies and these studies are small.
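The four measures in the table can be illustrated with a small calculation. The sketch below is not taken from the article. It assumes the DerSimonian-Laird estimator for τ² and the commonly used prediction interval formula, summary effect ± t(k−2) × √(τ² + SE²); the table itself prescribes neither. The function name `random_effects_summary`, its parameters and the study data are hypothetical. The t critical value is passed in by hand so that the example needs only the standard library.

```python
import math
from statistics import NormalDist


def random_effects_summary(effects, variances, t_crit):
    """Sketch of a DerSimonian-Laird random-effects meta-analysis.

    effects   : study effect estimates (e.g. log ORs), hypothetical input
    variances : within-study variances of those estimates
    t_crit    : 97.5% quantile of the t distribution with k - 2 df,
                supplied by the caller (assumption: 95% intervals)
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1

    # DerSimonian-Laird moment estimate of the between-study variance
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # I^2: percentage of observed dispersion not attributable to random error
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects summary (mean) effect and its standard error
    w_star = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # 95% CI for the mean effect (normal approximation)
    z = NormalDist().inv_cdf(0.975)
    ci = (mu - z * se, mu + z * se)

    # 95% prediction interval for the true effect in a new, similar setting
    half = t_crit * math.sqrt(tau2 + se ** 2)
    pi = (mu - half, mu + half)

    return {"tau2": tau2, "tau": math.sqrt(tau2), "i2": i2,
            "mu": mu, "ci": ci, "pi": pi}


# Hypothetical effects from five studies, analysed twice:
# once as precise (large) studies, once as imprecise (small) studies.
effects = [0.10, 0.30, 0.50, 0.70, 0.90]
T_CRIT_DF3 = 3.182  # t quantile (0.975) for k - 2 = 3 degrees of freedom

precise = random_effects_summary(effects, [0.01] * 5, T_CRIT_DF3)
imprecise = random_effects_summary(effects, [1.0] * 5, T_CRIT_DF3)

for label, res in (("precise", precise), ("imprecise", imprecise)):
    print(label, {name: res[name] for name in ("tau2", "i2", "ci", "pi")})
```

With these made-up numbers, the same five effect estimates give an I² of 90% when the studies are precise and an I² of 0% when they are imprecise, which is the ambiguity described for I² in the table. In the precise case the prediction interval is clearly wider than the CI, because it adds τ² to the uncertainty of the mean.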