Article Text
Abstract
Introduction Cardiovascular disease (CVD) causes a quarter of all deaths in the UK,(1) and the NHS Long Term Plan emphasises that earlier detection and treatment of cardiovascular, renal and metabolic risk factors is a priority.(2) We trained, tested and prospectively implemented a machine learning algorithm in primary care electronic health record (EHR) data to identify individuals at higher risk of incident cardio-renal-metabolic diseases and cardiovascular death.(3, 4)
Methods We used UK primary care EHR data from 2 081 139 individuals aged ≥30 years (Jan 2, 1998 to Nov 30, 2018), randomly divided into training (80%) and testing (20%) datasets. We trained a random forest classifier using age, sex, ethnicity and comorbidities. We calculated the cumulative incidence rate for ten cardio-renal-metabolic diseases and death; for the analysis of each disease, we excluded individuals who had a preceding diagnosis of that disease. For each outcome, Fine and Gray’s models with death as a competing risk were fitted to compare individuals at higher and lower predicted risk.
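The cumulative incidence calculation with death as a competing risk can be sketched as follows. This is a minimal illustration using the standard Aalen-Johansen estimator, not the authors' code; the function name, the event coding (0 = censored, 1 = outcome of interest, 2 = competing death) and the toy data are assumptions made for the example. Fitting Fine and Gray's regression models would additionally require a dedicated statistics library.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Aalen-Johansen cumulative incidence for `cause` under competing risks.

    times  : follow-up time for each individual
    events : 0 = censored, 1 = outcome of interest, 2 = competing death
             (hypothetical coding chosen for this sketch)
    Returns a list of (time, cumulative incidence) at each event time.
    """
    n_at_risk = len(times)

    # Count how many events of each type occur at each distinct time.
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    surv = 1.0  # probability of being free of any event just before time t
    cif = 0.0   # cumulative incidence of the cause of interest
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts[cause]
        d_any = sum(c for e, c in counts.items() if e != 0)
        if n_at_risk > 0 and d_any > 0:
            # Increment = P(event-free so far) * cause-specific hazard at t.
            cif += surv * d_cause / n_at_risk
            surv *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        # Everyone observed at t (events and censored) leaves the risk set.
        n_at_risk -= sum(counts.values())
    return curve


# Toy data: five individuals, two outcome events, one competing death.
toy_curve = cumulative_incidence([1, 2, 3, 4, 5], [1, 2, 0, 1, 0])
```

Unlike one minus a Kaplan-Meier estimate that treats deaths as censoring, this estimator does not assume that individuals who died could still have developed the disease.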
We implemented the algorithm in a pilot interventional, non-randomised, single-arm study (OPTIMISE) across six primary care sites. Consenting individuals aged ≥30 years at higher predicted risk received community-based cardio-renal-metabolic phenotyping and an assessment of whether their current treatment adhered to guidelines.
Results In the testing dataset (n = 416 228), individuals at higher predicted risk had higher long-term risk of heart failure (HR 12.54, 95% CI 12.08–13.01), aortic stenosis (9.98, 9.16–10.87), AF (8.75, 8.44–9.06), stroke/TIA (8.07, 7.80–8.34), chronic kidney disease (CKD) (6.85, 6.70–7.00), peripheral vascular disease (6.62, 6.28–6.98), valvular heart disease (6.49, 6.14–6.85), MI (5.02, 4.82–5.22), diabetes (2.05, 2.00–2.10) and COPD (2.02, 2.00–2.05) (figure 1). This cohort were also at higher risk of death (10.45, 10.23–10.68), accounting for 74% of cardiovascular deaths (8582 of 11676) during 10-year follow-up.
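As a reading aid for the hazard ratios above, the sketch below recovers the approximate standard error on the log scale from a reported interval. It assumes the intervals are Wald-type 95% intervals that are symmetric on the log scale (z = 1.96), which the abstract does not state; under that assumption the geometric mean of the two limits should reproduce the point estimate. The function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return (standard error of log HR, geometric midpoint of the CI).

    Assumes a Wald-type interval symmetric on the log scale; this is an
    assumption of the sketch, not something the abstract reports.
    """
    log_se = (math.log(upper) - math.log(lower)) / (2 * z)
    midpoint = math.sqrt(lower * upper)
    return log_se, midpoint


# (HR, lower limit, upper limit) as reported in the Results.
reported = {
    "heart failure": (12.54, 12.08, 13.01),
    "diabetes": (2.05, 2.00, 2.10),
    "death": (10.45, 10.23, 10.68),
}

for outcome, (hr, lower, upper) in reported.items():
    log_se, midpoint = log_scale_summary(lower, upper)
    print(f"{outcome}: HR {hr}, log-scale SE {log_se:.4f}, CI midpoint {midpoint:.2f}")
```

For these three outcomes the geometric midpoint agrees with the reported hazard ratio to within rounding, which is consistent with the assumption but does not confirm how the intervals were derived.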
Of 82 higher risk patients in the pilot clinical implementation (mean age 71.6 years (SD 7.5), 50% women), 78.0% had hypertension and 37.8% had type 2 diabetes (table 1). Of higher risk patients with hypertension, 58.5% (31/53) of those aged <80 years had a systolic blood pressure (SBP) >140 mmHg, and 54.5% (6/11) of those aged ≥80 years had an SBP >150 mmHg. Of those with type 2 diabetes and co-existent CVD, only 23.1% (3/13) were on SGLT2 inhibitor therapy. Of higher risk patients on statin therapy, 37.0% (20/54) had LDL-cholesterol >1.8 mmol/L, and 23.1% (3/13) of patients with previous CVD had an LDL-cholesterol >2.0 mmol/L (table 2).
Furthermore, 19.5% (16/82) of the higher risk cohort had undiagnosed moderate or high risk CKD. Those with unrecognised CKD were often not on a statin (41.7%; 5/12), ACE-i/ARB therapy with co-existent hypertension (61.5%; 8/13), or an SGLT2 inhibitor with co-existent diabetes (50.0% (3/6), 83.3% (5/6), respectively). Almost half of the cohort (49%) were found to be obese, and 17% (14/82) were eligible for GLP-1 RA therapy.
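The percentages reported in the Results can be re-derived from the stated numerators and denominators. The short check below is a sketch for readers who want to verify the arithmetic; the labels are paraphrases and the helper name is hypothetical, while all counts and percentages are taken from the text above.

```python
def pct(numerator, denominator, decimals=1):
    """Percentage rounded to the number of decimals used in the abstract."""
    return round(100 * numerator / denominator, decimals)


# (label, numerator, denominator, reported percentage, decimals reported)
reported = [
    ("cardiovascular deaths in higher risk group", 8582, 11676, 74, 0),
    ("SBP >140 mmHg, hypertension, aged <80", 31, 53, 58.5, 1),
    ("SBP >150 mmHg, hypertension, aged >=80", 6, 11, 54.5, 1),
    ("on SGLT2 inhibitor, diabetes with CVD", 3, 13, 23.1, 1),
    ("LDL-cholesterol >1.8 mmol/L on statin", 20, 54, 37.0, 1),
    ("LDL-cholesterol >2.0 mmol/L, previous CVD", 3, 13, 23.1, 1),
    ("undiagnosed moderate or high risk CKD", 16, 82, 19.5, 1),
    ("unrecognised CKD, not on statin", 5, 12, 41.7, 1),
    ("unrecognised CKD, hypertension, no ACE-i/ARB", 8, 13, 61.5, 1),
    ("eligible for GLP-1 RA therapy", 14, 82, 17, 0),
]

# Collect any entry whose recomputed value differs from the reported one.
mismatches = [
    (label, pct(num, den, dec), rep)
    for label, num, den, rep, dec in reported
    if pct(num, den, dec) != rep
]
print("mismatches:", mismatches)
```

Several denominators are small (for example 11, 12 and 13), so the corresponding percentages carry considerable uncertainty even though the arithmetic is consistent.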
Conclusions Machine learning can identify people at higher risk of cardio-renal-metabolic diseases and death in UK primary care EHR data. On prospective evaluation, higher risk individuals had unrecorded and undertreated cardio-renal-metabolic diseases, which are actionable targets for integrated multi-disciplinary preventative care.
Table 1 Baseline characteristics of higher risk participants
Table 2 Baseline investigations and medications for higher risk participants
Figure 1 Kaplan-Meier plots for the ten cardio-renal-metabolic-pulmonary outcomes
Conflict of Interest None