Augmented versus artificial intelligence for stratification of patients with myositis
=====================================================================================

* Michael Mahler
* Brenden Rossin
* Olga Kubassova

* polymyositis
* dermatomyositis
* autoimmune diseases
* autoimmunity

With interest we read the recent article by Pinal-Fernandez and Mammen,1 which comments on the paper by Spielmann *et al*2 and, to a lesser extent, on the contribution by Mariampillai *et al*,3 4 and raises concerns about the artificial intelligence (AI)-driven approach used to define subgroups of patients with idiopathic inflammatory myopathy (IIM). To illustrate this, Pinal-Fernandez and Mammen constructed a library of 1000 observations, drawing the four variables from a multivariate normal distribution, and found clustering similar to that in the original paper by Spielmann *et al*.2 We share some of the concerns about unsupervised learning techniques raised by Pinal-Fernandez and Mammen.1 In this letter, we would like to highlight several aspects related to AI-driven methodologies.

Machine learning (ML) is a subset of AI that enables a computer to make decisions based on large datasets. When applied to clustering, it will always give an ‘optimal’ solution for the number of clusters ‘present’ in a dataset. However, it is left to the human user's judgement to determine whether those clusters truly exist. An ML algorithm determines a number of clusters by separating the dataset into subgroups through an optimisation process that (1) maximises the separation between clusters and (2) minimises, within each cluster, the distance from each point to the cluster centre. In essence, such an algorithm seeks a number of clusters for which each individual cluster is tight and clearly distinguishable from the others.
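The point that a clustering algorithm always returns a partition can be illustrated with a minimal sketch in the spirit of the simulation described by Pinal-Fernandez and Mammen. This is not their code: the zero mean vector, identity covariance, random seed, range of cluster numbers and the plain k-means (Lloyd's algorithm) implementation are all assumptions chosen for illustration, and `kmeans` and `mean_silhouette` are hypothetical helper names. The data contain a single population with no true subgroups, yet every run reports clusters.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd's algorithm (illustrative helper, not from the letter)."""
    # Initialise the centres with k distinct observations chosen at random.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centre (Euclidean distance).
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each centre to the mean of its points; keep it if it has none.
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres


def mean_silhouette(X, labels):
    """Mean silhouette width; requires at least two non-empty clusters."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    # sums[i, c] = total distance from point i to all points in cluster c.
    sums = np.stack([D[:, labels == c].sum(axis=1) for c in clusters], axis=1)
    counts = np.array([(labels == c).sum() for c in clusters])
    own = np.searchsorted(clusters, labels)
    idx = np.arange(len(X))
    # a: mean distance to the other members of the point's own cluster.
    a = sums[idx, own] / np.maximum(counts[own] - 1, 1)
    # b: smallest mean distance to the members of any other cluster.
    means = sums / counts
    means[idx, own] = np.inf
    b = means.min(axis=1)
    s = (b - a) / np.maximum(a, b)
    s[counts[own] == 1] = 0.0  # convention for single-member clusters
    return s.mean()


# Assumed setup: 1000 observations of 4 variables from ONE multivariate
# normal population (zero mean, identity covariance), so no real subgroups.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=np.zeros(4), cov=np.eye(4), size=1000)

results = {}
for k in range(2, 6):
    labels, centres = kmeans(X, k, rng)
    wss = ((X - centres[labels]) ** 2).sum()  # within-cluster sum of squares
    sil = mean_silhouette(X, labels)
    results[k] = (len(np.unique(labels)), wss, sil)
    print(f"k={k}: clusters={results[k][0]}, WSS={wss:.1f}, silhouette={sil:.3f}")
```

Because the observations come from a single population, any partition printed here is a product of the optimisation rather than of the data. The silhouette ranges from -1 to 1, with values near 1 indicating tight, well-separated clusters, so a low value is one quantitative hint that the reported clusters may not be meaningful.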
In any dataset, the algorithms will present an optimal solution to those or similar criteria, but that does not mean the resulting clusters are truly significant or meaningful. Visualising the clusters with dimensionality reduction techniques such as principal component analysis or t-distributed stochastic neighbour embedding is vital for judging this, alongside more quantitative methods such as comparing intracluster variation, intercluster variation and silhouette scoring. That is why researchers using ML should ideally be ‘bilingual’ and understand both the mathematics and algorithms and the science and clinical meaning behind the results.

To conclude, we emphasise that ML undoubtedly has the potential to improve the stratification of patients with IIM, provided that certain concepts of data science are followed, as also pointed out by a task force of the European League Against Rheumatism for big data and AI.5 ML relies on large, standardised and curated datasets, which in turn require large patient cohorts. Due to the rarity of IIM, larger patient cohorts (such as MyoNet/EuroMyositis)6 are required to generate quality data. Once larger and curated datasets are available, the ML approach is a powerful alternative to human judgement and can improve future classification criteria for IIM.4 7 8 Today, we argue for the use of ML alongside expert decision, thus relying on augmented judgement when making the final decision on patient stratification, especially when building AI-based models. Augmented intelligence has the potential to improve patient stratification in IIM.

## Footnotes

* Contributors: All authors participated in the writing of the letter.
* Competing interests: None declared.
* Patient consent for publication: Not required.
* Provenance and peer review: Not commissioned; internally peer reviewed.

## References

1. Pinal-Fernandez I, Mammen AL. On using machine learning algorithms to define clinically meaningful patient subgroups. Ann Rheum Dis 2020;79:e128.
[doi:10.1136/annrheumdis-2019-215852](http://dx.doi.org/10.1136/annrheumdis-2019-215852) [pmid:31227486](http://www.ncbi.nlm.nih.gov/pubmed/31227486)
2. Spielmann L, Nespola B, Séverac F, et al. Anti-Ku syndrome with elevated CK and anti-Ku syndrome with anti-dsDNA are two distinct entities with different outcomes. Ann Rheum Dis 2019;78:1101–6. [doi:10.1136/annrheumdis-2018-214439](http://dx.doi.org/10.1136/annrheumdis-2018-214439)
3. Mariampillai K, Granger B, Amelin D, et al. Development of a new classification system for idiopathic inflammatory myopathies based on clinical manifestations and myositis-specific autoantibodies. JAMA Neurol 2018;75:1528. [doi:10.1001/jamaneurol.2018.2598](http://dx.doi.org/10.1001/jamaneurol.2018.2598)
4. Vulsteke J-B, De Langhe E, Mahler M. Autoantibodies at the Center of (sub)Classification—Issues of Detection. JAMA Neurol 2019;76:867. [doi:10.1001/jamaneurol.2019.0440](http://dx.doi.org/10.1001/jamaneurol.2019.0440)
5. Gossec L, Kedra J, Servy H, et al. EULAR points to consider for the use of big data in rheumatic and musculoskeletal diseases. Ann Rheum Dis 2020;79:69–76. [doi:10.1136/annrheumdis-2019-215694](http://dx.doi.org/10.1136/annrheumdis-2019-215694) [pmid:31229952](http://www.ncbi.nlm.nih.gov/pubmed/31229952)
6. Lilleker JB, Vencovsky J, Wang G, et al. The EuroMyositis registry: an international collaborative tool to facilitate myositis research. Ann Rheum Dis 2018;77:30–9. [doi:10.1136/annrheumdis-2017-211868](http://dx.doi.org/10.1136/annrheumdis-2017-211868)
7. Lundberg IE, Bottai M, Tjärnlund A. Response to: 'Performance of the 2017 European League Against Rheumatism/American College of Rheumatology classification criteria for adult and juvenile idiopathic inflammatory myopathies in clinical practice' by Hočevar et al. Ann Rheum Dis 2018;77:e91. [doi:10.1136/annrheumdis-2017-212786](http://dx.doi.org/10.1136/annrheumdis-2017-212786)
8. Malaviya AN. 2017 EULAR/ACR classification criteria for adult and juvenile idiopathic inflammatory myopathies and their major subgroups: little emphasis on autoantibodies, why? Ann Rheum Dis 2018;77:e77. [doi:10.1136/annrheumdis-2017-212701](http://dx.doi.org/10.1136/annrheumdis-2017-212701)