Introduction
The term attrition is widely used in the clinical trials literature to refer to situations where outcome data are not available for some participants. Missing data may invalidate results from clinical trials by reducing precision and, under certain circumstances, yielding biased results. Missing outcome data are an important and common problem in mental health trials, as the dropout rate may exceed 50% for certain conditions.1 The Cochrane Collaboration regards incomplete outcome data as a major factor affecting the credibility of a study and requires systematic reviewers to assess the level of bias in all included trials via the Cochrane Risk of Bias tool.2,3 The ideal solution is to avoid missing data altogether, and the National Research Council has suggested ways of limiting missing data through the design of clinical trials.4 Systematic reviews and meta-analyses, however, are retrospective by nature, so preventive measures against missing outcome data cannot be used.
The intention-to-treat (ITT) principle is widely accepted as the most appropriate way to analyse data in randomised controlled trials (RCTs).5,6 The ITT principle requires analysing all participants in the group to which they were originally randomised, irrespective of the treatment they actually received. The Cochrane Handbook suggests employing an ITT analysis as the least biased way to estimate intervention effects from randomised trials.7 However, in order to include in the analysis participants whose outcomes are unknown, one needs to employ an imputation technique and make assumptions about the missing data, which may affect the reliability and robustness of study findings.
Missing data mechanisms
There are several reasons why data may be missing, and not all of them introduce bias. The risk of bias due to missing data depends on the missing data mechanism, which describes how the propensity for missing data depends on participants' characteristics and outcomes. Missing data mechanisms can be classified as follows:
I. Missing completely at random (MCAR)
The probability of a missing outcome is the same for all participants and does not depend on any participant characteristic (eg, if a participant misses some appointments due to scheduling difficulties). The MCAR assumption means that the group of participants who provided data is a random sample of the total population of participants, but this is often unrealistic in practice.
II. Missing at random (MAR)
The propensity for missingness is related to participants’ characteristics, but the probability of a missing outcome is not related to the outcome itself. For instance, suppose that primary school children are randomised to different interventions aiming to reduce school-related anxiety measured on a symptom severity scale. Younger children are less likely to provide data because they have a harder time understanding the items of the symptom severity scale. In the study, the proportion of young children and the missing rates are expected to be comparable across interventions. The MAR assumption implies that outcomes for the younger children who dropped out are expected to be similar to outcomes for the younger children who completed the study.4 Under the MAR assumption, missingness does not depend on the actual outcome, although it is associated with some characteristics of the participants or the setting. The term MAR is often confusing and is sometimes mistaken for MCAR. Statistical analyses usually start with the MAR assumption and, if it is true, analysis of completers only can provide an unbiased estimate of the relative treatment effect.8 However, the MAR assumption is formally untestable in meta-analyses, as we usually have aggregated data and not enough information about those who dropped out; hence, we should consider a different approach for dealing with missing data. By contrast, it is possible to explore the MAR assumption using auxiliary data within an individual trial. For example, if baseline disease severity predicts missingness, it is sensible to assume that final disease severity would predict missingness as well, and the data may not be MAR.9
III. Missing not at random (MNAR) or informatively missing (IM)
Even after accounting for all the available observed information, the probability that an observation is missing still depends on the unseen observations themselves. Participants may drop out for reasons that are associated with the actual effect of the intervention. In schizophrenia trials, for example, placebo arms show larger dropout rates than arms treated with antipsychotics because of placebo's lack of efficacy. Under MNAR, analysis of only the participants who completed the study would provide a biased estimate of the relative treatment effect. When missing data are MCAR or MAR they are termed ignorable. An MNAR mechanism is termed non-ignorable.
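The distinction between MAR and MNAR can be made concrete with a small calculation. The following is a minimal sketch rather than an analysis from any trial: it assumes a dichotomised response outcome and considers one age group of one arm, and the group size, response rate and dropout probabilities are all hypothetical values chosen only for illustration.

```python
def completers_rate(n, p_response, drop_responders, drop_non_responders):
    """Expected response rate among completers in one group.

    The two dropout probabilities are applied separately to responders and
    non-responders, so setting them equal means dropout ignores the outcome.
    """
    responders = n * p_response * (1 - drop_responders)
    non_responders = n * (1 - p_response) * (1 - drop_non_responders)
    return responders / (responders + non_responders)


# Hypothetical values, not taken from the source.
N_YOUNGER = 50         # younger children in one arm
TRUE_RESPONSE = 0.30   # true response rate in this age group

# MAR: younger children drop out with probability 0.40 whatever their outcome.
mar_rate = completers_rate(N_YOUNGER, TRUE_RESPONSE, 0.40, 0.40)

# MNAR: only children who did not respond drop out (probability 0.40).
mnar_rate = completers_rate(N_YOUNGER, TRUE_RESPONSE, 0.0, 0.40)

print(f"true rate {TRUE_RESPONSE:.3f}, MAR {mar_rate:.3f}, MNAR {mnar_rate:.3f}")
```

With these made-up numbers, the completers' response rate equals the true rate of 0.30 when dropout depends only on age, and rises to about 0.42 when dropout depends on the outcome itself.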
We use a hypothetical example to illustrate differences between the three categories. Consider an RCT with 200 participants randomised equally (1:1) to the experimental or control group (table 1). We assume that the true response rate is 33.3% in the control group and 50% in the experimental group, so that the estimated OR should be 2. In the MCAR scenario, 10% of the participants dropped out because they missed the appointment. In the MAR scenario, young people dropped out because summer started and they left for vacation. In both these scenarios the missing rate is the same across treatment groups, because the groups are expected to have participants with similar baseline characteristics (eg, age) and the probability of dropping out is therefore the same in both groups. In the MNAR scenario, 40% of the participants who did not see any improvement dropped out of the study. The number of those not improved is larger in the control group; hence, the missing rate is larger in the control group. Reasons for dropping out are related to the intervention received and, more specifically, to the actual outcome of the study. Missingness introduces bias in the results that favours the control group, because the number of successes among completers remains unchanged in both groups but the number of completers is now smaller in the control group. We get unbiased results in both the MCAR and MAR scenarios, but ignoring missing data in the MNAR scenario may give biased results. This bias does not shrink as sample size grows; hence, large studies with data that are MNAR may give precise but seriously biased results.
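The arithmetic of this example can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reproduction of table 1. It works with expected (fractional) counts for 100 participants per arm, applies the 10% MCAR dropout equally to responders and non-responders, and applies the 40% MNAR dropout only to participants who did not improve. The function and variable names are introduced here for illustration.

```python
N_PER_ARM = 100  # 200 participants randomised 1:1
TRUE_RATE = {"control": 1 / 3, "experimental": 0.5}

# (dropout probability among responders, dropout probability among non-responders)
SCENARIOS = {
    "full": (0.0, 0.0),    # no missing data
    "mcar": (0.10, 0.10),  # 10% drop out regardless of outcome
    "mnar": (0.0, 0.40),   # 40% of those not improved drop out
}


def summarise(drop_resp, drop_non):
    """Expected completers-only summary for one dropout scenario."""
    arms = {}
    for arm, rate in TRUE_RATE.items():
        responders = N_PER_ARM * rate * (1 - drop_resp)
        non_responders = N_PER_ARM * (1 - rate) * (1 - drop_non)
        completers = responders + non_responders
        arms[arm] = {
            "missing": N_PER_ARM - completers,
            "rate": responders / completers,
            "odds": responders / non_responders,
        }
    exp, ctrl = arms["experimental"], arms["control"]
    return {
        "missing_control": ctrl["missing"],
        "missing_experimental": exp["missing"],
        "rate_control": ctrl["rate"],
        "rate_experimental": exp["rate"],
        "risk_ratio": exp["rate"] / ctrl["rate"],
        "odds_ratio": exp["odds"] / ctrl["odds"],
    }


results = {name: summarise(*drops) for name, drops in SCENARIOS.items()}

for name, summary in results.items():
    print(name, {key: round(value, 3) for key, value in summary.items()})
```

Under these assumptions the MCAR scenario leaves the response rates and the OR of 2 unchanged. In the MNAR scenario the control arm loses more participants (about 27 versus 20), and the completers' response rates rise to about 45% and 62.5%, so the risk ratio falls from 1.5 to 1.375, in favour of the control group. In this particular sketch the completers' OR stays at 2, because removing the same fraction of non-responders from both arms scales both odds equally; table 1 may be built on different assumptions.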