Abstract
Objective To use the decision tree (DT) method to develop prediction models for incident type 2 diabetes (T2D) and to explore interactions between the predictor variables in those models.
Design Prospective cohort study.
Setting Tehran Lipid and Glucose Study (TLGS).
Methods A total of 6647 participants (43.4% men) aged >20 years who were free of T2D at the baseline examinations (1999–2001 and 2002–2005) were followed until 2012. Two series of models (with and without 2-hour postchallenge plasma glucose (2h-PCPG)) were developed using three types of DT algorithms. Model performance was assessed using sensitivity, specificity, area under the ROC curve (AUC), geometric mean (G-Mean) and F-Measure.
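As an illustration of the threshold-based performance measures named above, the following minimal sketch computes sensitivity, specificity, G-Mean and F-Measure from the four cells of a confusion matrix. It uses the standard textbook definitions (G-Mean as the square root of sensitivity times specificity, F-Measure as the harmonic mean of precision and sensitivity), which are assumed here rather than taken from the study; the function name and the counts in the usage line are hypothetical.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return threshold-based metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    sensitivity = tp / (tp + fn)   # proportion of incident cases detected
    specificity = tn / (tn + fp)   # proportion of non-cases correctly ruled out
    precision = tp / (tp + fp)     # proportion of predicted cases that are true cases
    # G-Mean balances performance on cases and non-cases.
    g_mean = math.sqrt(sensitivity * specificity)
    # F-Measure (F1): harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
        "f_measure": f_measure,
    }


# Illustrative counts only; these are not data from the study.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

G-Mean is often preferred over overall accuracy when the outcome is uncommon, because a model that predicts "no T2D" for everyone scores a G-Mean of zero.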
Primary outcome measure The primary outcome was T2D, defined as fasting plasma glucose (FPG) ≥7 mmol/L, 2h-PCPG ≥11.1 mmol/L or use of antidiabetic medication.
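The outcome definition above can be written as a simple rule. The sketch below is illustrative only: the function and parameter names are hypothetical, and treating a missing 2h-PCPG measurement as "criterion not met" is an assumption, not something stated in the study.

```python
def has_t2d(fpg_mmol_l, pcpg_2h_mmol_l=None, on_antidiabetic_medication=False):
    """Apply the three-part T2D definition to one participant.

    fpg_mmol_l: fasting plasma glucose in mmol/L.
    pcpg_2h_mmol_l: 2-hour postchallenge plasma glucose in mmol/L,
        or None if not measured (assumed to mean the criterion is not met).
    on_antidiabetic_medication: True if the participant takes such medication.
    """
    if on_antidiabetic_medication:
        return True
    if fpg_mmol_l >= 7.0:
        return True
    if pcpg_2h_mmol_l is not None and pcpg_2h_mmol_l >= 11.1:
        return True
    return False


# Hypothetical participant: normal FPG but a raised 2h-PCPG meets the definition.
case = has_t2d(fpg_mmol_l=6.2, pcpg_2h_mmol_l=11.4)
```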
Results During a median follow-up of 9.5 years, 729 new cases of T2D were identified. The Quick Unbiased Efficient Statistical Tree (QUEST) algorithm had the highest sensitivity and G-Mean among all the models for both men and women. The models that included 2h-PCPG had a sensitivity and G-Mean of 78% and 0.75 for men and 78% and 0.78 for women. Both models achieved good discrimination, with an AUC above 0.78. FPG, 2h-PCPG, waist-to-height ratio (WHtR) and mean arterial blood pressure (MAP) were the most important predictors of incident T2D in both genders. Among men, those with an FPG≤4.9 mmol/L and 2h-PCPG≤7.7 mmol/L had the lowest risk, and those with an FPG>5.3 mmol/L and 2h-PCPG>4.4 mmol/L had the highest risk of incident T2D. Among women, those with an FPG≤5.2 mmol/L and WHtR≤0.55 had the lowest risk, and those with an FPG>5.2 mmol/L and WHtR>0.56 had the highest risk of incident T2D.
Conclusions Our study emphasises the utility of DT for exploring interactions between predictor variables.
- Diabetes
- Interaction
- Decision tree
- Data Mining
- Prediction
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Footnotes
Contributors FA and FH designed the study protocol, and participated in the coordination and management of the study. AR performed the statistical analysis and wrote the manuscript. EH, JS and OP participated in the statistical analysis and interpretation of data. All authors read and approved the final manuscript.
Funding This study was supported by grant number 121 from the National Research Council of the Islamic Republic of Iran.
Disclaimer The funding source had no role in the study design; in the collection, analysis and interpretation of data; in the writing of the manuscript; or in the decision to submit the manuscript for publication.
Competing interests None declared.
Patient consent Obtained.
Ethics approval This study was approved by the Ethical Committee of the Research Institute for Endocrine Sciences.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement No additional data are available.