Protocol for a systematic review on the methodological and reporting quality of prediction model studies using machine learning techniques

Constanza L Andaur Navarro; Johanna A A G Damen; Toshihiko Takada; Steven W J Nijman; Paula Dhiman; Jie Ma; Gary S Collins; Ram Bajpai; Richard D Riley; Karel GM Moons; Lotty Hooft

doi:10.1136/bmjopen-2020-038832


Constanza L Andaur Navarro¹,² (http://orcid.org/0000-0002-7745-2887),
Johanna A A G Damen¹,² (http://orcid.org/0000-0001-7401-4593),
Toshihiko Takada¹ (http://orcid.org/0000-0002-8032-6224),
Steven W J Nijman¹ (http://orcid.org/0000-0001-6798-2078),
Paula Dhiman³ (http://orcid.org/0000-0002-0989-0623),
Jie Ma³ (http://orcid.org/0000-0002-3900-1903),
Gary S Collins³ (http://orcid.org/0000-0002-2772-2316),
Ram Bajpai⁴ (http://orcid.org/0000-0002-1227-2703),
Richard D Riley⁴,
Karel GM Moons¹,²,
Lotty Hooft¹,² (http://orcid.org/0000-0002-7950-2980)

¹Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
²Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
³Center for Statistics in Medicine, University of Oxford, Oxford, UK
⁴School of Primary, Community and Social Care, Keele University, Keele, UK

Correspondence to Constanza L Andaur Navarro; c.l.andaurnavarro@umcutrecht.nl

Abstract

Introduction Studies addressing the development and/or validation of diagnostic and prognostic prediction models are abundant in most clinical domains. Systematic reviews have shown that the methodological and reporting quality of prediction model studies is suboptimal. Due to the increasing availability of larger, routinely collected and complex medical data, and the rising application of Artificial Intelligence (AI) or machine learning (ML) techniques, the number of prediction model studies is expected to increase even further. Prediction models developed using AI or ML techniques are often labelled as a ‘black box’ and little is known about their methodological and reporting quality. Therefore, this comprehensive systematic review aims to evaluate the reporting quality, the methodological conduct, and the risk of bias of prediction model studies that applied ML techniques for model development and/or validation.

Methods and analysis A search will be performed in PubMed to identify studies developing and/or validating prediction models using any ML methodology and across all medical fields. Studies will be included if they were published between January 2018 and December 2019, predict patient-related outcomes, use any study design or data source, and are available in English. Screening of search results and data extraction from included articles will be performed by two independent reviewers. The primary outcomes of this systematic review are: (1) the adherence of ML-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement, and (2) the risk of bias in such studies as assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). A narrative synthesis will be conducted for all included studies. Findings will be stratified by study type, medical field and prevalent ML methods, and will inform necessary extensions or updates of TRIPOD and PROBAST to better address prediction model studies that used AI or ML techniques.

Ethics and dissemination Ethical approval is not required for this study because only available published data will be analysed. Findings will be disseminated through peer-reviewed publications and scientific conferences.

Systematic review registration PROSPERO, CRD42019161764.

epidemiology
statistics & research methods
preventive medicine


This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.


Strengths and limitations of this study

This protocol makes transparent the methods and definitions used in our review, including those applied to prediction model studies that use artificial intelligence or machine learning.
The systematic review will provide an overview and critical appraisal of the methodological and reporting quality, and risk of bias of prediction model studies using machine learning.
The findings of this review will provide the needed evidence for the development of tailored methodological and reporting guidelines for prediction model studies based on machine learning techniques.
We will build a sensitive search strategy by using terms related to machine learning techniques, as well as conventional prediction techniques.
Language restriction to English might exclude additional studies published in other languages.

Introduction

Clinical prediction models aim to estimate the individualised probability that a particular outcome, for example, condition or disease, is present (diagnostic models) or whether a specific outcome will occur in the future (prognostic models).1–4 Studies addressing the development, validation and updating of prediction models are abundant in most clinical domains. For example, in cardiovascular disease, more than 350 prediction models have been developed and only a few have been validated.5 Moreover, systematic reviews have shown that, within different medical domains, the methodological and reporting quality of prediction model studies is suboptimal.6–10 Due to the increasing availability of larger, routinely collected and complex medical data, and the rising application of Artificial Intelligence (AI) or machine learning (ML) techniques for clinical prediction, the number of prediction model studies is expected to increase even further.

ML can be described as techniques that directly and automatically learn from data without being explicitly programmed for that task, and often without any prior assumption.11–13 Thus, ML relies on patterns and inferences from the data itself. A perceived advantage of ML over conventional statistical techniques is its ability to analyse ‘big’, non-linear and high-dimensional data, and thus its ability to model complex associations and scenarios. Due to the novelty, diversity, flexibility and complexity of ML techniques, ML-based prediction model studies are often considered uninterpretable by many users. Inadequate reporting of, for example, data sources, study design, modelling processes, number of predictors and other data assumptions, makes prediction models developed with ML techniques and published in medical journals difficult for other researchers to interpret and validate, creating barriers to their use in daily clinical practice.

Complete reporting is essential to judge the validity of any prediction model as it facilitates: study replication, independent validation of the prediction model, risk of bias assessments, interpretation of the results, meta-analysis of prediction models, and the judgement of the value and applicability of such a model in real clinical settings for individualised predictions.14 While complete reporting reveals the strengths and limitations of a prediction model, it also enhances the use and implementation of prediction models in clinical practice. The ‘Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD)’ statement has been available since 2015, providing a checklist of 22 items considered essential for informative reporting of diagnostic or prognostic prediction model studies.15 16 Similarly, the Prediction model Risk Of Bias ASsessment Tool (PROBAST) was published in 2019 to guide the critical appraisal of prediction model studies.17 18 PROBAST provides signalling questions to facilitate both the applicability and risk of bias assessment of prediction model studies across four domains: participants, predictors, outcome and analysis. This assessment can only be correctly implemented if prediction model studies are properly reported. Although TRIPOD and PROBAST both cover all types of prediction modelling studies, including those using ML techniques, their focus is on regression-based modelling. The challenges and necessity for reporting and quality assessment guidelines in the AI/ML field have been addressed by several authors and this has led to initiatives, such as Consolidated Standards of Reporting Trials-AI (for randomised controlled trials), and Standard Protocol Items: Recommendations for Interventional Trials-AI (for clinical trial protocols). Similarly, for prediction model studies using ML, TRIPOD-ML and PROBAST-ML have been announced.19–21

To improve the quality, transparency and usability of ML-based prediction models in medicine, it is important to explore the current use and reporting of ML techniques in prediction model studies, to evaluate the methodological conduct and risk of bias using PROBAST, and assess the adherence to TRIPOD by performing a comprehensive systematic review.3 15–18 22

Study aim

The primary aim of this systematic review is to evaluate the reporting and the methodological conduct of studies reporting on prediction models developed with supervised ML techniques, across all medical fields. Specific objectives are to:

Evaluate the reporting quality of prediction models developed using ML techniques based on TRIPOD.
Assess the methodological quality and the risks of bias in prediction model development or validation studies using ML techniques based on PROBAST.
Identify key and emerging concepts for the development of tailored adaptations or extensions of both TRIPOD and PROBAST.

Methods and analysis

Our systematic review protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO) on 19 December 2019 (CRD42019161764). This protocol was prepared using the Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols (PRISMA-P) 2015 statement.23

Eligibility criteria

Articles will be eligible for this review when describing primary studies on the development and/or validation of a multivariable diagnostic or prognostic prediction model with at least two predictors, using any supervised ML methodology within all medical fields, and published between January 2018 and December 2019. This last inclusion criterion is to obtain the most contemporary sample of articles that would reflect the current practices of applied methods in the ML prediction model field. We will include studies with any study design and data source, all patient-related health outcomes and all outcome formats, restricted to studies in humans. Further details about inclusion criteria are given in table 1.

Table 1. Definition of inclusion criteria

Articles will be excluded from this review when reporting models that make predictions for enhancing the reading of images or signals (rather than for prediction of health outcomes in individuals), or use only genetic or molecular markers as candidate predictors. Furthermore, prognostic factor studies, secondary research, conference abstracts and studies for which no full text is available will also be excluded. The search will be restricted to articles available in English only. Further details about exclusion criteria are given in table 2.

Table 2. Definition of exclusion criteria
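As an illustration only, the eligibility criteria described above could be encoded as a simple screening check. The sketch below is not part of the protocol: the `Candidate` record, its field names and the wording of the reasons are hypothetical, and it covers only the criteria stated in the text (the full definitions are in tables 1 and 2).

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Publication window stated in the protocol.
WINDOW_START = date(2018, 1, 1)
WINDOW_END = date(2019, 12, 31)


@dataclass
class Candidate:
    """Hypothetical screening record; field names are illustrative only."""
    published: date
    language: str
    n_predictors: int
    uses_supervised_ml: bool
    predicts_health_outcome: bool  # False if the model only enhances image/signal reading
    only_genetic_or_molecular_predictors: bool
    is_primary_study: bool         # False for secondary research or conference abstracts
    full_text_available: bool
    in_humans: bool


def exclusion_reasons(c: Candidate) -> List[str]:
    """Return every criterion the record fails; an empty list means eligible."""
    reasons = []
    if not (WINDOW_START <= c.published <= WINDOW_END):
        reasons.append("published outside January 2018 to December 2019")
    if c.language != "English":
        reasons.append("not available in English")
    if c.n_predictors < 2:
        reasons.append("fewer than two predictors")
    if not c.uses_supervised_ml:
        reasons.append("no supervised ML technique")
    if not c.predicts_health_outcome:
        reasons.append("does not predict a patient-related health outcome")
    if c.only_genetic_or_molecular_predictors:
        reasons.append("only genetic or molecular markers as candidate predictors")
    if not c.is_primary_study:
        reasons.append("not a primary study")
    if not c.full_text_available:
        reasons.append("no full text available")
    if not c.in_humans:
        reasons.append("not a study in humans")
    return reasons


example = Candidate(date(2018, 6, 1), "English", 5, True, True, False, True, True, True)
print(exclusion_reasons(example))
```

Returning all failed criteria, rather than a single yes/no, keeps a record of why each article was excluded, which is the information a flowchart of the selection process needs.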

Information sources

A literature search will be systematically applied in one major publicly available electronic medical literature database (PubMed), covering 1 January 2018 to 31 December 2019.

Search strategy

The search strategy was built using keywords including ML-related terms (eg, ‘supervised learning’, ‘support vector machine’, ‘neural network’), prediction-related terms24 (eg, ‘risk’, ‘prognosis’) and several performance measures for prediction modelling (eg, ‘AUC’, ‘O:E ratio’). For search refinement, we selected 30 articles aligned with our inclusion/exclusion criteria to create a ‘golden bullet’ set. This set was analysed using SWIFT-Review to obtain the most frequent words in the included articles by topic modelling.25 In MedlineRanker, the analysis of the included and excluded golden bullet articles allowed us to obtain the most discriminative words to be considered in the search strategy.26 The final search strategy is presented in online supplemental file 1.
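To make the idea of keyword blocks concrete, the following minimal sketch assembles a boolean query from the example terms quoted above. How the blocks are combined (OR within a block, AND between blocks) is an assumption made for illustration, and the helper names `or_block` and `build_query` are hypothetical; the actual strategy is the one given in online supplemental file 1.

```python
def or_block(terms):
    """Join the terms of one concept with OR, quoting multi-word phrases."""
    quoted = ['"{}"'.format(t) if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*blocks):
    """Combine concept blocks with AND (an assumption, not the published strategy)."""
    return " AND ".join(or_block(b) for b in blocks)


# Example terms taken from the text; the real term lists are much longer.
ml_terms = ["supervised learning", "support vector machine", "neural network"]
prediction_terms = ["risk", "prognosis"]
performance_terms = ["AUC", "O:E ratio"]

query = build_query(ml_terms, prediction_terms, performance_terms)
print(query)
```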


Study records

Data management

Study record information, including title and abstract, from the searched online database will be imported into EndNote Citation Manager and Rayyan systematic review software.27 These platforms will track and back up all activities during the literature review process. Once eligible studies are identified, full-text articles will be downloaded for full-text screening and data extraction. Data items (below) will be extracted from the final included studies for review using Research Electronic Data Capture (REDCap) software.28

Selection process

Two researchers, from a group of seven (CLAN, TT, SWJN, PD, JM, RB, JAAGD), will independently screen the titles and abstracts to identify eligible studies according to the eligibility criteria. Two independent researchers, drawn from the same seven reviewers, will review the full text of potentially eligible articles; one researcher (CLAN) will screen all articles and six researchers (TT, SWJN, PD, JM, RB, JAAGD) will collectively screen a portion of the same articles for agreement. Disagreements between reviewers will be resolved by consensus or, if necessary, by consultation with a third investigator (JAAGD). The study flow will be presented in a PRISMA flowchart.29
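The double-screening step can be pictured with a small sketch that compares two reviewers' decisions and lists the records needing consensus or a third investigator. The function name, the record identifiers and the use of simple percentage agreement are assumptions for illustration; the protocol does not specify an agreement statistic.

```python
def compare_screening(reviewer_a, reviewer_b):
    """Compare include/exclude decisions (True/False) keyed by record ID.

    Returns the proportion of shared records with identical decisions and
    the sorted IDs on which the two reviewers disagree.
    """
    shared = sorted(set(reviewer_a) & set(reviewer_b))
    conflicts = [rid for rid in shared if reviewer_a[rid] != reviewer_b[rid]]
    agreement = (len(shared) - len(conflicts)) / len(shared) if shared else None
    return agreement, conflicts


# Hypothetical decisions for four records.
decisions_a = {"r1": True, "r2": False, "r3": True, "r4": False}
decisions_b = {"r1": True, "r2": True, "r3": True, "r4": False}

agreement, conflicts = compare_screening(decisions_a, decisions_b)
print(agreement, conflicts)  # 0.75 ['r2']
```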

Data collection process

We will perform a double data extraction for all included articles. Two reviewers will independently extract data from each article using a standardised data extraction form. One researcher (CLAN) will extract data from all articles and six other researchers (TT, SWJN, PD, JM, RB, JAAGD) will collectively extract data from the same articles. The data extraction form will be piloted on five papers and amended, if necessary. Disagreements in data extraction will be discussed between the two reviewers, and adjudicated by a third reviewer (KGMM, GSC, RDR or LH), if necessary. The authors of the articles will be contacted for further information and clarification, if needed. Data and records will be maintained by the lead investigator (CLAN) and stored on a shared secure platform for access by all investigators (REDCap).

Data items

Data to be extracted will be informed by TRIPOD using the TRIPOD adherence guidance, PROBAST and the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS).15–18 22 30 Additional items specifically relating to ML techniques for prediction model purposes will also be extracted.

Extracted data will include study design for the development and validation of the model, outcomes to be predicted, setting, the intended use of the prediction model, study population, data source, patient characteristics, total study sample size, number of individuals with the outcome, number of predictors (candidate and final), internal validation type, predictive performance measures (discrimination and calibration), number of models developed and the details of the ML technique used to develop each model (eg, technique, preprocessing, data cleaning, optimisation algorithm, predictor selection, penalisation techniques, hyperparameters, code, data availability and so on). The data extraction form will contain instructions for the reviewers on how to assess the models presented in the articles. For example, the number of models developed will be based on how many ML techniques were used, including if several hyperparameters are tuned. We will limit data extraction to 10 models per article. The number of predictors will be counted based on what is reported in the article and/or supplemental file. If not stated, the number of predictors will be reported as unclear. The final data extraction form is presented in online supplemental file 2.
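Two of the extraction rules above (at most 10 models per article, and ‘unclear’ when the number of predictors is not stated) can be sketched as follows. The `ModelRecord` structure and its field names are hypothetical, and keeping the first 10 models in reported order is an assumption, since the text does not say how models are chosen when more than 10 are reported.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_MODELS = 10  # limit on models extracted per article, as stated in the text


@dataclass
class ModelRecord:
    """Hypothetical extraction record for one model; fields are illustrative."""
    technique: str
    n_candidate_predictors: Optional[int] = None  # None when not stated
    n_final_predictors: Optional[int] = None      # None when not stated


def predictors_label(n: Optional[int]) -> str:
    """Report the count as given, or 'unclear' when the article does not state it."""
    return "unclear" if n is None else str(n)


def models_to_extract(models: List[ModelRecord]) -> List[ModelRecord]:
    """Keep at most MAX_MODELS models (first in reported order, by assumption)."""
    return models[:MAX_MODELS]


reported = [ModelRecord("random forest", 25, 12), ModelRecord("neural network")]
for m in models_to_extract(reported):
    print(m.technique, predictors_label(m.n_candidate_predictors))
```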


Outcomes

The primary outcomes of this systematic review are the adherence to the TRIPOD reporting guideline and the risk of bias assessed using PROBAST.17 18 22
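One way to express adherence to a reporting checklist is the percentage of applicable items that a study reports. The sketch below illustrates that idea only: the three-level scoring (‘yes’, ‘no’, ‘not applicable’), the item labels and the function name are assumptions, and the actual scoring rules are those of the TRIPOD adherence guidance cited above.

```python
def adherence_percentage(item_scores):
    """Percentage of applicable items scored 'yes'.

    item_scores maps an item label to 'yes', 'no' or 'not applicable'.
    Returns None when no item is applicable.
    """
    applicable = [s for s in item_scores.values() if s != "not applicable"]
    if not applicable:
        return None
    return 100 * sum(s == "yes" for s in applicable) / len(applicable)


# Hypothetical scores for five items of one study.
scores = {
    "item A": "yes",
    "item B": "no",
    "item C": "yes",
    "item D": "yes",
    "item E": "not applicable",
}
print(adherence_percentage(scores))  # 75.0
```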

Assessment of risk of bias

The risk of bias of individual studies is one of our outcomes of interest and will be assessed using PROBAST.17 18
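For orientation, PROBAST combines the four domain judgements into an overall judgement: low when all domains are low, high when at least one domain is high, and unclear otherwise. A minimal sketch of that rule follows; the function name and the dictionary layout are illustrative choices, not part of the tool.

```python
DOMAINS = ("participants", "predictors", "outcome", "analysis")
RATINGS = ("low", "high", "unclear")


def overall_risk_of_bias(judgements):
    """Combine per-domain ratings into an overall risk of bias judgement."""
    ratings = [judgements[d] for d in DOMAINS]
    for r in ratings:
        if r not in RATINGS:
            raise ValueError("unknown rating: {!r}".format(r))
    if "high" in ratings:
        return "high"     # any high-risk domain makes the study high risk
    if "unclear" in ratings:
        return "unclear"  # no high domain, but at least one unclear
    return "low"          # all four domains rated low


study = {"participants": "low", "predictors": "low",
         "outcome": "unclear", "analysis": "high"}
print(overall_risk_of_bias(study))  # high
```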

Data synthesis

We will conduct a narrative synthesis of the extracted data. Data will be summarised using descriptive statistics and visual plots. Numbers and percentages will be used to describe categorical data about the reporting, methodological conduct and risks of bias of the studies. The distribution of continuous data, such as sample size and the number of predictors, will be assessed and described using mean and SD for normally distributed data and using median and 25th and 75th percentiles for non-normally distributed data. The risk of bias assessment will be summarised and graphically presented for each PROBAST domain and as an overall risk of bias judgement. Results will be stratified by study type (development with internal validation and/or external validation), medical field and prevalent ML techniques.
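The descriptive summaries described above could be produced along the following lines. This is a sketch using only the Python standard library: the function names are hypothetical, whether a variable is treated as normally distributed is passed in by the caller (the text does not say how normality will be judged), and percentiles use the default method of `statistics.quantiles`, which may differ slightly from other software.

```python
import statistics
from collections import Counter


def summarise_continuous(values, normally_distributed):
    """Mean and SD for normal data; median with 25th/75th percentiles otherwise."""
    if normally_distributed:
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    p25, median, p75 = statistics.quantiles(values, n=4)
    return {"median": median, "p25": p25, "p75": p75}


def summarise_categorical(values):
    """Number and percentage for each category."""
    total = len(values)
    return {k: (n, round(100 * n / total, 1)) for k, n in Counter(values).items()}


# Hypothetical extracted data.
n_predictors = [1, 2, 3, 4, 5, 6, 7]
overall_rob = ["low", "high", "high", "unclear"]

print(summarise_continuous(n_predictors, normally_distributed=False))
print(summarise_categorical(overall_rob))
```

Stratified results (by study type, medical field or ML technique) would follow by applying the same functions to each subgroup of studies.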

Meta-bias(es)

Meta-bias will not be investigated in this study.

Confidence in cumulative evidence

The strength of the body of evidence will not be assessed in this study.

Amendments

Protocol amendments will be listed and made available on the PROSPERO registration. Date, description and rationale will be given for each amendment.

Ethics and dissemination

Ethical approval is not required for this study because only available published data will be analysed. The findings of this systematic review will be published in an open-access journal to ensure access for all stakeholders and disseminated in various scientific conferences.

Patient and public involvement

Not applicable.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Discussion

ML has been increasingly recognised as a powerful tool to improve healthcare by enabling healthcare professionals to make decisions based on the increasingly available and diverse sources of (bio)medical data. In particular, ML-based prediction algorithms, which are considered key to unlocking these data sources, are intended to better inform real-time clinical decisions, support early warning systems and provide superhuman imaging diagnostics.31 However, published research about this topic rarely provides adequate information about the final predictive model, its estimates and its performance. Even scarcer is research where the prediction model is accessible to patients and healthcare professionals alike. Hence, ML-based prediction model studies are often seen as uninterpretable. This aspect of ML techniques is especially problematic in medical diagnosis and prognosis, hampering the judgement of quality, clinical acceptance and implementation.

At present, there is a limited number of systematic reviews regarding the reporting and methodological quality of ML-based prediction model studies and their risks of bias.32–34 In this systematic review, we will review across all medical fields, the current use of ML techniques in prediction model development, validation and updating studies, the methodological conduct and risks of bias using PROBAST, and the adherence to the reporting guideline using TRIPOD. Particularly, we will assess the extent to which risks of bias and reporting of ML-based prediction model studies match the current recommendations from TRIPOD and PROBAST,22 and the implications of these results to update or extend them to TRIPOD-ML and PROBAST-ML.

Our findings should be considered in light of some limitations. ML is a recently developed concept that does not yet have a clear scope. Therefore, a sensitive search strategy is hard to build, which may result in a large number of abstracts to screen at the initial stages. Additionally, we are only able to include articles in English, which will under-represent research available in other languages.

Acknowledgments

The authors would like to thank and acknowledge the support of Rene Spijker, information specialist.

References

1. Moons KGM, Royston P, Vergouwe Y, et al. Prognosis and prognostic research: what, why, and how? BMJ 2009;338:b375. doi:10.1136/bmj.b375 pmid:http://www.ncbi.nlm.nih.gov/pubmed/19237405
2. Steyerberg EW, Moons KGM, van der Windt DA, et al. Prognosis research strategy (PROGRESS) 3: prognostic model research. PLoS Med 2013;10:e1001381. doi:10.1371/journal.pmed.1001381 pmid:http://www.ncbi.nlm.nih.gov/pubmed/23393430
3. Riley RD, van der Windt D, Croft P, et al. Prognosis research in healthcare: concepts, methods, and impact. First ed. Oxford University Press, 2019.
4. Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. Second ed. Cham, Switzerland: Springer, 2019.
5. Damen JAAG, Hooft L, Schuit E, et al. Prediction models for cardiovascular disease risk in the general population: systematic review. BMJ 2016;353:i2416. doi:10.1136/bmj.i2416 pmid:http://www.ncbi.nlm.nih.gov/pubmed/27184143
6. Heus P, Damen JAAG, Pajouheshnia R, et al. Poor reporting of multivariable prediction model studies: towards a targeted implementation strategy of the TRIPOD statement. BMC Med 2018;16:120. doi:10.1186/s12916-018-1099-2 pmid:http://www.ncbi.nlm.nih.gov/pubmed/30021577
7. Collins GS, de Groot JA, Dutton S, et al. External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol 2014;14:40. doi:10.1186/1471-2288-14-40 pmid:http://www.ncbi.nlm.nih.gov/pubmed/24645774
8. Bouwmeester W, Zuithoff NPA, Mallett S, et al. Reporting and methods in clinical prediction research: a systematic review. PLoS Med 2012;9:e1001221. doi:10.1371/journal.pmed.1001221 pmid:http://www.ncbi.nlm.nih.gov/pubmed/22629234
9. Collins GS, Mallett S, Omar O, et al. Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting. BMC Med 2011;9:103. doi:10.1186/1741-7015-9-103 pmid:http://www.ncbi.nlm.nih.gov/pubmed/21902820
10. Wen Z, Guo Y, Xu B, et al. Developing risk prediction models for postoperative pancreatic fistula: a systematic review of methodology and reporting quality. Indian J Surg 2016;78:136–43. doi:10.1007/s12262-015-1439-9 pmid:http://www.ncbi.nlm.nih.gov/pubmed/27303124
11. Mitchell TM. Machine learning. New York, NY: McGraw Hill, 1997.
12. Bi Q, Goodman KE, Kaminsky J, et al. What is machine learning? A primer for the epidemiologist. Am J Epidemiol 2019;188:2222–39. doi:10.1093/aje/kwz189 pmid:http://www.ncbi.nlm.nih.gov/pubmed/31509183
13. Sidey-Gibbons JAM, Sidey-Gibbons CJ. Machine learning in medicine: a practical introduction. BMC Med Res Methodol 2019;19:64. doi:10.1186/s12874-019-0681-4 pmid:http://www.ncbi.nlm.nih.gov/pubmed/30890124
14. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15:361–87. doi:10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4 pmid:http://www.ncbi.nlm.nih.gov/pubmed/8668867
15. Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med 2015;162.
16. Moons KGM, Altman DG, Reitsma JB, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med 2015;162:W1. doi:10.7326/M14-0698 pmid:http://www.ncbi.nlm.nih.gov/pubmed/25560730
17. Wolff RF, Moons KGM, Riley RD, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med 2019;170:51–8. doi:10.7326/M18-1376 pmid:http://www.ncbi.nlm.nih.gov/pubmed/30596875
18. Moons KGM, Wolff RF, Riley RD, et al. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med 2019;170:W1. doi:10.7326/M18-1377 pmid:http://www.ncbi.nlm.nih.gov/pubmed/30596876
19. Liu Y, Chen P-HC, Krause J, et al. How to read articles that use machine learning: users' guides to the medical literature. JAMA 2019;322:1806–16. doi:10.1001/jama.2019.16489 pmid:http://www.ncbi.nlm.nih.gov/pubmed/31714992
20. CONSORT-AI and SPIRIT-AI Steering Group. Reporting guidelines for clinical trials evaluating artificial intelligence interventions are needed. Nat Med 2019;25:1467–8. doi:10.1038/s41591-019-0603-3 pmid:http://www.ncbi.nlm.nih.gov/pubmed/31551578
21. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. The Lancet 2019;393:1577–9. doi:10.1016/S0140-6736(19)30037-6
22. Heus P, Damen JAAG, Pajouheshnia R, et al. Uniformity in measuring adherence to reporting guidelines: the example of TRIPOD for assessing completeness of reporting of prediction model studies. BMJ Open 2019;9:e025611. doi:10.1136/bmjopen-2018-025611 pmid:http://www.ncbi.nlm.nih.gov/pubmed/31023756
23. Moher D, Shamseer L, Clarke M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev 2015;4:1. doi:10.1186/2046-4053-4-1 pmid:http://www.ncbi.nlm.nih.gov/pubmed/25554246
24. Geersing G-J, Bouwmeester W, Zuithoff P, et al. Search filters for finding prognostic and diagnostic prediction studies in MEDLINE to enhance systematic reviews. PLoS One 2012;7:e32844. doi:10.1371/journal.pone.0032844 pmid:http://www.ncbi.nlm.nih.gov/pubmed/22393453
25. Howard BE, Phillips J, Miller K, et al. SWIFT-Review: a text-mining workbench for systematic review. Syst Rev 2016;5:87. doi:10.1186/s13643-016-0263-z pmid:http://www.ncbi.nlm.nih.gov/pubmed/27216467
26. Fontaine J-F, Barbosa-Silva A, Schaefer M, et al. MedlineRanker: flexible ranking of biomedical literature. Nucleic Acids Res 2009;37:W141–6. doi:10.1093/nar/gkp353 pmid:http://www.ncbi.nlm.nih.gov/pubmed/19429696
27. Ouzzani M, Hammady H, Fedorowicz Z, et al. Rayyan-a web and mobile app for systematic reviews. Syst Rev 2016;5:210. doi:10.1186/s13643-016-0384-4 pmid:http://www.ncbi.nlm.nih.gov/pubmed/27919275
28. Harris PA, Taylor R, Thielke R, et al. Research electronic data capture (REDCap)--a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009;42:377–81. doi:10.1016/j.jbi.2008.08.010 pmid:http://www.ncbi.nlm.nih.gov/pubmed/18929686
29. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann Intern Med 2009;339.
30. Moons KGM, de Groot JAH, Bouwmeester W, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med 2014;11:e1001744. doi:10.1371/journal.pmed.1001744 pmid:http://www.ncbi.nlm.nih.gov/pubmed/25314315
31. Chen JH, Asch SM. Machine Learning and Prediction in Medicine - Beyond the peak of inflated expectations. N Engl J Med 2018;376.
32. Christodoulou E, Ma J, Collins GS, et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol 2019;110:12–22. doi:10.1016/j.jclinepi.2019.02.004 pmid:http://www.ncbi.nlm.nih.gov/pubmed/30763612
33. Shillan D, Sterne JAC, Champneys A, et al. Use of machine learning to analyse routinely collected intensive care unit data: a systematic review. Crit Care 2019;23:284. doi:10.1186/s13054-019-2564-9 pmid:http://www.ncbi.nlm.nih.gov/pubmed/31439010
34. Wang W, Kiik M, Peek N, et al. A systematic review of machine learning models for predicting outcomes of stroke with structured data. PLoS One 2020;15:e0234722. doi:10.1371/journal.pone.0234722 pmid:http://www.ncbi.nlm.nih.gov/pubmed/32530947

Footnotes

Contributors The study concept and design were conceived by CLAN, JAAGD, KGMM, LH, PD, GSC and RDR. CLAN, JAAGD, TT, SWJN, PD, JM and RB will conduct article screening and data extraction. CLAN will perform data analysis. All authors drafted this manuscript, revised it for important content and have provided the final approval of this version. CLAN, the corresponding author, is the guarantor of the review.
Funding GSC is funded by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC) and by Cancer Research UK programme grant (C49297/A27294). PD is funded by the NIHR Oxford BRC.
Disclaimer The funder was not involved in the development of this protocol.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

