Overview
Details on methods and participant characteristics of IRL-GRey have been published elsewhere.11 12 Briefly, the IRL-GRey dataset included 454 patients 60 years and older with MDD according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV)13 confirmed with the Structured Clinical Interview for DSM-IV14 and at least moderate depressive symptoms as reflected by a Montgomery-Asberg Depression Rating Scale (MADRS)15 score ≥15. Patients were excluded if they had dementia, bipolar disorder, schizophrenia, current psychotic symptoms or substance abuse or dependence within the past 6 months.
Participants were assessed with the MADRS at weeks 1, 2, 4, 6, 8, 10 and 12. They were treated with venlafaxine XR starting at 37.5 mg/day, titrated to 150 mg/day over 2 weeks as tolerated; if remission was not attained after 6 weeks, venlafaxine XR was further increased up to 300 mg/day for up to six additional weeks. Participants completed open treatment with venlafaxine XR either when they attained remission (defined as a MADRS score ≤10 for two consecutive visits) or when they completed 12 weeks of treatment.
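As an illustration only, the remission rule described above (a MADRS score ≤10 at two consecutive visits) can be sketched in a few lines of Python. The function name and the plain list of visit scores are assumptions made for this sketch; they are not part of the study's software, and missing visits are not handled.

```python
from typing import Optional, Sequence

REMISSION_CUTOFF = 10  # a visit counts towards remission when MADRS <= 10


def first_remission_index(scores: Sequence[float]) -> Optional[int]:
    """Return the index of the second of two consecutive visits with
    MADRS <= 10, or None if remission is never attained.

    `scores` is a hypothetical list of MADRS totals in visit order.
    """
    for i in range(1, len(scores)):
        if scores[i - 1] <= REMISSION_CUTOFF and scores[i] <= REMISSION_CUTOFF:
            return i
    return None
```

A single low score followed by a higher one does not count, which mirrors the requirement that the low score be observed at two consecutive visits.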
Participant characteristics
We included in our analysis the following participant characteristics obtained at baseline as potential predictors of treatment response: age; self-reported sex at birth; self-reported race (white vs other); years of education; burden of comorbid physical illness measured with the Cumulative Illness Rating Scale-Geriatrics (CIRS-G)16; diagnosis of hypertension, heart disease or diabetes; severity of depressive symptoms measured with the MADRS total score; suicidality defined as a score ≥3 on the MADRS suicide item; severity of comorbid anxiety measured with the Brief Symptom Inventory (BSI)17 anxiety score; not having responded to at least one previous adequate antidepressant trial before starting venlafaxine XR (as opposed to being treatment naïve or having only inadequate antidepressant trials) based on a score of 3 or higher on the antidepressant treatment history form2 18; duration of current episode; single or recurrent depression status; and age at onset of first MDD episode. These variables were chosen based on previous studies showing their association with antidepressant outcomes in the IRL-GRey dataset2 11 and in other treatment studies of MDD in younger5 8 or older adults.2 5 7 11 18 19
For this analysis, the primary outcome was a full antidepressant response, defined as a decrease in MADRS score of more than 50% from baseline. This outcome and this definition were chosen because they have been shown to be suitable for providing MBC and they do not depend on the scale used.3 Based on previous studies,4 20 partial response was defined as a decrease in MADRS score of 25%–50% from baseline and non-response as a decrease in MADRS score of less than 25%.
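The three response categories can be expressed as a small classification function. This is a minimal sketch under the definitions given above (full response: decrease of more than 50%; partial response: 25%–50%; non-response: less than 25%); the function names and the string labels are illustrative assumptions.

```python
def percent_decrease(baseline: float, current: float) -> float:
    """Percentage decrease in MADRS score relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline MADRS score must be positive")
    return 100.0 * (baseline - current) / baseline


def classify_response(baseline: float, current: float) -> str:
    """Label a participant as 'full', 'partial' or 'non' responder."""
    decrease = percent_decrease(baseline, current)
    if decrease > 50:    # more than 50%: full response
        return "full"
    if decrease >= 25:   # 25% to 50% inclusive: partial response
        return "partial"
    return "non"         # less than 25% (including worsening): non-response
```

Because the categories are defined on the relative change, the same function applies to any depression scale with a positive baseline score, in line with the scale-independence noted above.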
Data analysis
All analyses (except for the receiver operating characteristic (ROC) analysis) were performed using IBM SPSS Statistics 26. Because length of treatment was variable (ie, some participants dropped out before they completed the study, while other participants completed the study when they attained remission), imputations were used for both intermittent missing and monotone missing observations using the Markov Chain Monte Carlo option of the SPSS multiple imputation procedure. Missing values were replaced with the average of five imputations, and a full dataset was created for the 454 participants, as we have done in previous analyses.4 10 11 Imputation details, including t-tests comparing original and imputed data, can be found in online supplemental table 1.
Identification of an early decision point
This first analysis aimed at identifying a time point that would fulfil the following three conditions: (1) more than 40% of partial responders at this time point attain full response at the end of treatment (ie, there is still hope for partial responders); (2) less than 25% of non-responders at this time point attain full response at the end of treatment (ie, most non-responders have been identified); (3) participants who are full responders at this time point sustain their full response status at week 12 (ie, full response is sustained). We call this time point the ‘early decision point’. Condition #1 used 40% as a threshold because placebo-controlled clinical trials typically report antidepressant response rates of about 40% in geriatric depression.21 Condition #2 was added because a recent meta-analysis showed that a third of patients with MDD and early non-improvement fully respond when treatment is extended to 12 weeks.22 We therefore set the threshold for condition #2 at a quarter of participants, which is a smaller minority. Condition #3 was added because some patients showing very early response revert to non-response status with longer treatment as their initial improvement may represent a ‘placebo effect’ rather than a true antidepressant effect.23
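The screening logic implied by the three conditions can be sketched as a simple check. The thresholds for conditions #1 and #2 come from the text; condition #3 has no numeric threshold here, so in this sketch it is passed in as a yes/no judgement made by the analyst. The function and parameter names are hypothetical.

```python
PARTIAL_TO_FULL_MIN = 0.40  # condition 1: more than 40% of partial responders
NON_TO_FULL_MAX = 0.25      # condition 2: less than 25% of non-responders


def is_early_decision_point(
    partial_to_full: float,
    non_to_full: float,
    full_response_sustained: bool,
) -> bool:
    """Check a candidate time point against the three conditions.

    partial_to_full: proportion of partial responders at the time point
        who attain full response at the end of treatment.
    non_to_full: proportion of non-responders at the time point who
        attain full response at the end of treatment.
    full_response_sustained: analyst's judgement for condition 3
        (assumption: no numeric cut-off is applied in this sketch).
    """
    condition_1 = partial_to_full > PARTIAL_TO_FULL_MIN
    condition_2 = non_to_full < NON_TO_FULL_MAX
    return condition_1 and condition_2 and full_response_sustained
```

Both numeric comparisons are strict, so a time point at which exactly 40% of partial responders or exactly 25% of non-responders go on to full response does not qualify.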
To identify an early decision point, for each assessment point up to week 10 (ie, weeks 1, 2, 4, 6, 8, 10), participants were divided into three groups: full responders, partial responders and non-responders. Then, for condition #1, we examined the proportion of partial responders at each assessment point who attained full response at week 12; and for condition #3, we examined the proportion of full responders at each assessment point who maintained full response at week 12. For condition #2, as in our previous work,4 we considered partial responders and non-responders separately at each assessment point up to week 10 (weeks 1, 2, 4, 6, 8, 10) and, for each of these assessment points, we calculated the proportion of participants who attained full response after various additional lengths of treatment up to week 12. For example, participants assessed at week 1 had additional lengths of treatment of 1, 3, 5, 7, 9 and 11 weeks, whereas participants assessed at week 8 had additional lengths of treatment of 2 and 4 weeks, in each case adding up to 12 weeks in total. Finally, for each length of treatment, we calculated the weighted mean proportion of participants who attained full response, with the number of participants within each group (partial vs non-responder) at each assessment point used as weights.
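The bookkeeping in this step can be illustrated as follows: the first function lists the additional lengths of treatment available from a given assessment point, and the second computes a weighted mean proportion with group sizes as weights. The function names are hypothetical, and the numbers in the usage note are made up for illustration; they are not study results.

```python
from typing import List, Sequence

ASSESSMENT_WEEKS = [1, 2, 4, 6, 8, 10, 12]  # MADRS assessment schedule


def additional_lengths(week: int) -> List[int]:
    """Additional weeks of treatment between `week` and each later visit."""
    return [later - week for later in ASSESSMENT_WEEKS if later > week]


def weighted_mean_proportion(
    proportions: Sequence[float], group_sizes: Sequence[int]
) -> float:
    """Mean of `proportions`, weighted by the number of participants
    in the corresponding group at each assessment point."""
    if len(proportions) != len(group_sizes):
        raise ValueError("proportions and group_sizes must have equal length")
    total = sum(group_sizes)
    if total == 0:
        raise ValueError("at least one group must be non-empty")
    return sum(p * n for p, n in zip(proportions, group_sizes)) / total
```

For instance, `additional_lengths(1)` gives 1, 3, 5, 7, 9 and 11 weeks and `additional_lengths(8)` gives 2 and 4 weeks, matching the examples in the text. With made-up proportions of 0.5 and 0.25 in groups of 100 and 300 participants, the weighted mean is 0.3125 rather than the unweighted 0.375.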
Generalised estimating equations for comparing the effect of added lengths of treatment in partial responders and non-responders at the early decision point
After selecting the early decision point that met all three conditions, a repeated-measures model with generalised estimating equations (GEE) was fitted to statistically compare the proportions of partial responders and non-responders at this early decision point who attained full response with each added length of treatment.4 We used the GEE approach because GEE, which is a population-average model, accounts for within-group non-independence of observations and estimates the average response of the population within a group. As we were interested in estimating group effects, rather than modelling subject-specific effects, GEE was deemed appropriate.24 Group (partial responders vs non-responders at the early decision point) and time effects (additional lengths of treatment in weeks) on attainment of full response were examined. The same comparison was repeated for each assessment point after the early decision point to determine whether added lengths of treatment would have a different effect in differentiating partial responders and non-responders at later assessment points.
Decision trees for predicting treatment response after 12 weeks of treatment at the early decision point
In a previous analysis, we identified demographic and clinical predictors of remission in all IRL-GRey participants, using a priori the change in MADRS after 2 weeks of treatment as one of the potential predictors.11 Having at least one previous adequate antidepressant trial, baseline MADRS score, and improvement in MADRS score after 2 weeks were identified as significant predictors.11 Our analysis expanded on these findings by identifying predictors of treatment response in participants who did not attain full response (ie, partial and non-responders) by the early decision point. We did this because a clinician would not change antidepressant treatment in a patient who has already attained full treatment response. For this analysis, we considered all the baseline patient characteristics discussed above plus response status (partial or non-response) at the early decision point. The Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis checklist25 for reporting multivariable models can be found in online supplemental material 1.
We judged that our sample size was adequate for examining 14 predictors based on the rule of thumb of 10 events per variable (EPV).26 A post hoc sample size calculation was also performed using a method proposed by Riley et al for binary outcomes (ie, logistic regression).27 Details of this sample size calculation can be found in online supplemental material 2.
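The EPV rule of thumb amounts to simple arithmetic, sketched below. Only the rule itself is shown; the method of Riley et al is not reproduced here, and the function names are hypothetical.

```python
def min_events_required(n_predictors: int, epv: int = 10) -> int:
    """Minimum number of outcome events under an events-per-variable rule."""
    if n_predictors < 1 or epv < 1:
        raise ValueError("n_predictors and epv must be positive")
    return n_predictors * epv


def events_per_variable(n_events: int, n_predictors: int) -> float:
    """Observed EPV for a given number of events and candidate predictors."""
    if n_predictors < 1:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors
```

With 14 candidate predictors and an EPV of 10, `min_events_required(14)` gives 140, so the rule of thumb is satisfied when at least 140 participants experience the outcome event.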
We used ROC V.5.07 (https://web.stanford.edu/%7Eyesavage/ROC.html), which performs an ROC curve analysis in which the programme searches all the predictor variables and identifies the optimal predictor variable that best predicts the outcome of interest using signal detection theory, weighting the importance of sensitivity and specificity.28 Predictors were placed in a decision tree where the strongest predictor divides the sample into two subsamples, and the process is repeated until the weakest predictor satisfying the stopping rule of p<0.05 is found. This approach is useful for analyses where predictors are likely to have high collinearity.28 We developed two decision trees using sensitivity cutoffs of 0.3 (‘low’) and 0.7 (‘high’). A low sensitivity decision tree minimises false positives (ie, falsely identifying a participant as someone who would attain eventual treatment response when they would not). This type of tree should be used for patients who are at high risk, such as inpatients or patients at risk of suicide. A high sensitivity decision tree minimises false negatives (ie, falsely identifying a participant as someone who would not attain eventual treatment response when they would). This type of tree should be used for patients in whom clinicians want to minimise medication changes, such as outpatients who have had multiple unsuccessful trials or frail outpatients.10 The negative predictive value (NPV, ie, ability to predict eventual non-response) of the leftmost terminal node, representing the NPV of the combination of predictors identifying the subgroup of participants who are most likely to not reach eventual treatment response, is presented as a measure of model performance. The NPV, positive predictive value (PPV) and accuracy of the overall decision tree are presented for general information. Decision trees trained on the complete dataset are presented.
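For readers less familiar with these performance measures, the sketch below computes NPV, PPV and accuracy from the four cells of a confusion matrix, taking 'positive' to mean eventual full response and 'negative' to mean eventual non-response, as in the text. The class and function names are illustrative, and the counts in the usage note are made up; the decision trees themselves were produced by the ROC programme, which this sketch does not reproduce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # predicted response, responded
    fp: int  # predicted response, did not respond
    tn: int  # predicted non-response, did not respond
    fn: int  # predicted non-response, responded


def _ratio(numerator: int, denominator: int) -> float:
    if denominator == 0:
        raise ValueError("undefined: no cases in the denominator")
    return numerator / denominator


def npv(c: Confusion) -> float:
    """Share of predicted non-responders who truly did not respond."""
    return _ratio(c.tn, c.tn + c.fn)


def ppv(c: Confusion) -> float:
    """Share of predicted responders who truly responded."""
    return _ratio(c.tp, c.tp + c.fp)


def accuracy(c: Confusion) -> float:
    """Share of all participants classified correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
```

With made-up counts `Confusion(tp=40, fp=10, tn=30, fn=20)`, the NPV is 0.6, the PPV is 0.8 and the accuracy is 0.7. The NPV of a single terminal node can be obtained in the same way by restricting the counts to the participants who fall in that node.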
A fivefold cross-validation was performed to test the performance of the two decision trees. We report the average of the five test NPVs of the leftmost terminal nodes; the average of the five test NPVs of the overall trees; test PPVs of the overall trees; test accuracies of the overall trees; and the predictors of the five training decision trees for low and high sensitivity cutoffs.
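The structure of a fivefold cross-validation can be sketched as below: participants are split into five disjoint test folds, a tree is trained on the remaining four folds each time, and the test metrics are averaged over the five folds. This is a generic illustration with hypothetical function names and a fixed random seed; it does not reproduce how the folds were actually drawn or how the ROC programme fitted each training tree.

```python
import random
from statistics import mean
from typing import List, Sequence, Tuple


def kfold_indices(
    n: int, k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs with disjoint test folds."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        splits.append((train, fold))
    return splits


def mean_test_metric(per_fold_values: Sequence[float]) -> float:
    """Average a test metric (eg, NPV) over the cross-validation folds."""
    return mean(per_fold_values)
```

Each participant appears in exactly one test fold, so every test metric is computed on data that the corresponding training tree has not seen.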