Review

Opportunities and challenges for real-world studies on chronic inflammatory joint diseases through data enrichment and collaboration between national registers: the Nordic example

Abstract

There are increasing needs for detailed real-world data on rheumatic diseases and their treatments. Clinical register data are essential sources of information that can be enriched through linkage to additional data sources such as national health data registers. Detailed analyses call for international collaborative observational research to increase the number of patients and the statistical power. Such linkages and collaborations come with legal, logistic and methodological challenges. In collaboration between registers of inflammatory arthritides in Sweden, Denmark, Norway, Finland and Iceland, we plan to enrich, harmonise and standardise individual data repositories to investigate analytical approaches to multisource data, to assess the viability of different logistical approaches to data protection and sharing and to perform collaborative studies on treatment effectiveness, safety and health-economic outcomes. This narrative review summarises the needs and potentials and the challenges that remain to be overcome in order to enable large-scale international collaborative research based on clinical and other types of data.

Key messages

  • There is increasing need for detailed real-world data on rheumatic diseases and their treatments.

  • This need goes beyond clinical data and calls for enrichment of data in the clinical registers through linkages to other data sources and for collaborative observational research across national borders.

  • Such enrichment and collaboration come with legal, logistic and methodological challenges.

  • Through collaboration between rheumatology registers on chronic inflammatory arthritides in Sweden, Denmark, Norway, Finland and Iceland, we hope to address these challenges and to study treatment effectiveness, safety and health-economic outcomes in rheumatoid arthritis, axial spondyloarthritis and psoriatic arthritis.

The need for real-world data from patients with inflammatory joint diseases

With a prevalence of 1%–2% in the general population and lifetime risks of at least 1 in 20, chronic inflammatory joint diseases, including rheumatoid arthritis (RA), ankylosing spondylitis (AS) and other spondyloarthritides (SpA) such as psoriatic arthritis (PsA), represent a significant burden for afflicted individuals, for healthcare and for society at large, whether measured as pain, functional impairment, healthcare resource utilisation or costs.

The therapeutic approaches to chronic inflammatory joint diseases have changed substantially over the last two decades. A growing number of treatment options has enabled increasingly ambitious treatment goals, but has also led to complex treatment patterns and concerns regarding costs. To determine the optimal treatment in the clinical setting, including its value for the individual and for society, studies assessing effectiveness, safety and long-term outcomes of different treatment options in different treatment contexts are necessary. A better understanding of the heterogeneities, comorbidities and societal outcomes of the treated diseases themselves is also required to enable individualised treatment.

Randomised controlled trials (RCTs), while still the gold standard for efficacy studies, often provide insufficient evidence to inform clinical practice, as their size, strict inclusion and exclusion criteria, restricted treatment options and follow-up times typically preclude inferences regarding long-term safety and effectiveness. Furthermore, RCTs have limited power to provide safety evidence with regard to rare events and to the performance of treatments in patients who do not fulfil the entry criteria for RCTs. Thus, some of the clinically most relevant questions are virtually impossible to address in a randomised controlled setting.

For all of the above reasons, we are increasingly facing situations where large-scale observational studies based on real-world data are needed. To this end, clinical rheumatology registers have been established in many countries, either as disease registers, such as an early RA register, or as registers to specifically monitor treatment, such as a biologics register, or as both.1–5 Rheumatology has been at the forefront in establishing population-based regional or national longitudinal clinical disease registers (in the Nordic countries: SRQ/ARTIS, DANBIO, NOR-DMARD, ROB-FIN and ICEBIO). Detailed information about each of these clinical rheumatology registers has been published previously1 6–12 and is summarised in table 1.

Table 1
Overview of the five Nordic clinical rheumatology registers

The need for enriching clinical register data through linkage to other registers

Clinical registers provide a potential for large-scale data at a level of clinical detail (eg, specific clinical metrics, such as Disease Activity Score 28) that is often much higher than that found in administrative or claims data. Similarly, clinical registers may provide a unique source for patient-reported outcome measures (PROMs) data. To be sustainable in the long run, clinical data collection for a register needs to be slim enough to fit (and benefit) clinical practice. Consequently, there is a limit with regard to the amount and nature of information that lends itself to collection in the clinical setting. While clinical registers are ideal for collecting information that is also relevant and available at the clinical visit, they are often not the ideal vehicle for the collection of rare and unexpected events or factors that ‘occur’ outside of rheumatology (ie, may not be known to the rheumatologist, such as cost data) or for events for which there is considerable recall bias. For these latter types of events, obtaining data from other data sources can both bring down the burden of data collection in the clinical register and offer ‘objective’ measurements devoid of subjectivity (eg, cost, or work ability) or recall bias (eg, drug prescription data), at a high level of completeness.

In settings with other national registers available, such as in the Nordic countries, the clinical rheumatology registers may be enriched via linkages to such other population-based and ‘complete’ registers. Examples include registers on cancer incidence, mortality or workforce participation. The existence of personal identifiers enables deterministic record linkage. Table 2 outlines different types of external data sources in each of the Nordic countries that can be linked to the clinical rheumatology registers. As evident from the table, such linkages entail considerable administrative preparation, with approval from multiple authorities and a waiting time varying between several months and more than a year. Importantly, linkage to external register holders often puts additional restrictions on what the linked data may be used for, how they can be accessed and whether (if physically accessible) they may be exported. All of these processes and restrictions vary across countries, to some extent across different public register holders within each country and also over time. Currently, there is no such thing as a ‘push-the-button’ mechanism to link all clinical data to all of the national data sources listed in table 2. Rather, work in this field still relies on practical experience from each of the registers and register holders’ modus operandi.
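As a minimal sketch of what deterministic linkage on a personal identifier amounts to, the following Python example joins a hypothetical clinical register extract to a hypothetical cancer register extract by exact identifier match. All identifiers, field names and values are invented for illustration, and the sketch assumes at most one external record per person; in practice, linkages are performed by or with the register holders under the approvals described above.

```python
# Hypothetical extracts; identifiers, field names and values are invented.
clinical_register = [
    {"person_id": "P001", "diagnosis": "RA", "das28": 4.1},
    {"person_id": "P002", "diagnosis": "PsA", "das28": 3.2},
    {"person_id": "P003", "diagnosis": "AS", "das28": None},
]
cancer_register = [
    {"person_id": "P002", "cancer_year": 2015},
    {"person_id": "P999", "cancer_year": 2012},  # not in the clinical register
]


def link_deterministic(clinical, external, key="person_id"):
    """Left-join external records onto clinical records by exact key match."""
    # Index the external register by identifier (one record per person assumed).
    external_by_id = {row[key]: row for row in external}
    linked = []
    for row in clinical:
        match = external_by_id.get(row[key], {})
        # Keep every clinical record; add external fields where a match exists.
        merged = {**row, **{k: v for k, v in match.items() if k != key}}
        merged["linked"] = bool(match)
        linked.append(merged)
    return linked


linked_rows = link_deterministic(clinical_register, cancer_register)
for row in linked_rows:
    print(row["person_id"], row["linked"], row.get("cancer_year"))
```

Because the join is a left join from the clinical register, patients without a record in the external register are retained and flagged rather than dropped, which matters when the absence of a record is itself informative (eg, no registered cancer).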

Table 2
Examples of enrichment that can be enabled via linkages of the clinical rheumatology registers to other national and population-based registers

Linkage to external data sources offers an additional benefit, namely the possibility to assemble general population or disease-specific comparator cohorts. For instance, for every individual in the clinical register, 10 (or 100 or 1000) general population comparator subjects may be sampled and then subjected to additional register linkages. Such a general population cohort will provide the possibility to contextualise any differences in risk among patients with a certain disease (say, treated with X instead of Y) to any risks associated with merely having (vs not having) the disease at all. Compare, for instance, the large increase in risk of tuberculosis in biologics-treated RA versus biologics-naive RA versus the only moderate risk increase of tuberculosis in biologics-naive RA versus the general population, or any marginal risk increase of malignant lymphomas in tumour necrosis factor inhibitors (TNFi)-treated versus TNFi-naive RA versus the clear increase in lymphoma risk in biologics-naive RA versus the general population.13–15
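The sampling of comparator subjects described above can be sketched as follows. This is an illustration only: matching on birth year and sex is an assumption made here (the text does not specify matching factors), the population data are synthetic, and a given comparator may be drawn for more than one patient.

```python
import random


def sample_comparators(patients, population, k=10, seed=0):
    """Draw up to k population comparators per patient, matched on birth year and sex."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    patient_ids = {p["person_id"] for p in patients}

    # Group eligible population members (non-patients) by matching stratum.
    strata = {}
    for person in population:
        if person["person_id"] in patient_ids:
            continue
        stratum = (person["birth_year"], person["sex"])
        strata.setdefault(stratum, []).append(person)

    matched = {}
    for p in patients:
        pool = strata.get((p["birth_year"], p["sex"]), [])
        # Sample without replacement within each patient's comparator set.
        matched[p["person_id"]] = rng.sample(pool, min(k, len(pool)))
    return matched


# Synthetic general population and two hypothetical index patients.
population = [
    {"person_id": f"G{i:04d}", "birth_year": 1950 + i % 3, "sex": "F" if i % 2 else "M"}
    for i in range(600)
]
patients = [
    {"person_id": "P001", "birth_year": 1950, "sex": "F"},
    {"person_id": "P002", "birth_year": 1952, "sex": "M"},
]

comparators = sample_comparators(patients, population, k=10)
for patient_id, sampled in comparators.items():
    print(patient_id, len(sampled))
```

The sampled comparator identifiers would then be sent for the same register linkages as the patient cohort, so that outcomes are ascertained identically in both groups.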

The need for collaborative observational studies and the need for data harmonisation

In a number of situations, collaboration across registers is necessary. For instance:

  1. Studies of rare treatment exposures.

  2. Studies of rare outcomes.

  3. Studies at a maximal phenotypic resolution, acknowledging that individualised treatment, rather than treatment on average, should focus not only on the treatment but also on the characteristics of the treated disease, a fact that rapidly decreases statistical precision in the specific subset of individuals with those characteristics.

While collaborative studies are increasingly needed, they come with specific challenges, at several levels. A proactive approach to these challenges is absolutely vital for the success and interpretability of the outcome of any collaborative study.

First, there is heterogeneity in the primary clinical data collection in the clinical registers in terms of what is collected and how it is defined and collected.16 Harmonisation at this level can result in changes in the primary data collection, or in its subsequent coding or categorisation. Examples of such harmonisation efforts include the European League Against Rheumatism (EULAR) Task Force on RA data collection in clinical practice.17 It should be pointed out that harmonisation does not mean that all registers need to collect the same and only the same variables, only that core elements of the data collection should be defined in a way that ensures comparability or translatability across registers. Heterogeneity in population background risks between countries also has to be taken into account. A good example is the recent collaborative analyses on malignant melanomas and lymphomas under the umbrella of EULAR.18

Second, enrichment of the raw data in a clinical register through linkage to external registers should be comparable. Since such external data sources (eg, a national cancer register) are seldom amenable to changes in their primary data collection, harmonisation at this level will largely be about harmonising algorithms with which these data are curated. For instance, in a multicountry drug safety study of myocardial infarction using linkage of clinical RA treatment data to hospital data on myocardial infarction, harmonisation may be about defining what is meant by a ‘myocardial infarction’ in each of these hospital registers. ‘Myocardial infarction’ may, for instance, comprise various combinations of unstable angina, ST-segment elevation and non ST-segment elevation infarctions and include or exclude sudden cardiac death.
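One way to make such an outcome algorithm explicit is to share a single code list that every site applies identically to its own hospital data. The sketch below is a hypothetical illustration: the ICD-10 code prefixes and the ‘strict’ and ‘broad’ labels are choices made here for the example, not definitions taken from any of the registers, and an actual definition would have to be agreed on and validated by all collaborators.

```python
# Illustrative ICD-10 code prefixes (dots removed); not an agreed definition.
MI_DEFINITIONS = {
    "strict": ("I21", "I22"),          # acute and subsequent myocardial infarction
    "broad": ("I21", "I22", "I200"),   # additionally counts unstable angina (I20.0)
}


def is_myocardial_infarction(icd10_code, definition="strict"):
    """Return True if a discharge code falls under the chosen shared definition."""
    # Normalise formatting differences between registers (case, dots, spaces).
    normalised = icd10_code.replace(".", "").strip().upper()
    # str.startswith accepts a tuple of prefixes.
    return normalised.startswith(MI_DEFINITIONS[definition])


# Hypothetical discharge codes as they might appear in different hospital registers.
hospital_codes = ["I21.4", "I20.0", "i22.0", "I25.1"]

strict_count = sum(is_myocardial_infarction(c, "strict") for c in hospital_codes)
broad_count = sum(is_myocardial_infarction(c, "broad") for c in hospital_codes)
print(strict_count, broad_count)
```

Running the same records through both definitions shows how much the event count depends on the definition alone, which is the kind of discrepancy that harmonisation at this level is meant to remove.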

Third, the analytical protocols also need to be harmonised. In the above example of myocardial infarction, such harmonisation will ensure that, for instance, the risk windows during which each study subject is considered to be at risk for a myocardial infarction following a specific antirheumatic treatment are the same across all participating sites or countries, or that adjustment for demographics and comorbidities is performed in a comparable manner across sites or countries. Harmonisation at this level will require a reasonably detailed understanding of the data to be included and must therefore be a joint effort across all collaborators.

Even with perfect harmonisation, not all data sources may provide information on all the desired variables, a fact that may effectively preclude identical analyses from being performed. For instance, say that Register I holds information on covariates A, B and C, Register II holds information on covariates A, B, D but not C, and Register III holds information on A, C, E but neither B nor D (figure 1). To run one and the same model across these three registers would mean a model only containing variable A. Within each register, however, more elaborate models (each including three covariates) can be run. The trade-off here is whether it is preferable to let each register come up with its own ‘best’ model and apply meta-analytic techniques to weigh these ‘best’ estimates together, even if the collation of relative risks across registers will no longer mean combining risk estimates from identical models, or whether a joint analysis based on fewer but identical covariates is the better choice. In situations where A, B, C, D and E above all represent aspects of the same item (say, treatment response, with A=EULAR DAS28 response, B=ACR response and so on), one way forward may be to create a new variable (‘response’) and have each register categorise individuals into responders or non-responders according to the response metric captured in each register. Another analytical challenge occurs when the relative importance of a covariate, such as obesity, on an outcome, such as cardiovascular risk, varies across registers.

Figure 1
Illustration of the challenge residing in only partial overlaps in the primary data collection across registers, with only variable A as common across registers.
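The covariate overlap in the example of Registers I, II and III can be worked out mechanically, which becomes useful once the number of registers and variables grows. The sketch below simply encodes the example from the text as sets; the register names and covariate letters are those of the example, not real variables.

```python
# Covariates held by each register, as in the example in the text.
register_covariates = {
    "Register I": {"A", "B", "C"},
    "Register II": {"A", "B", "D"},
    "Register III": {"A", "C", "E"},
}

# Covariates available everywhere: the only ones an identical joint model can use.
common = set.intersection(*register_covariates.values())

# Covariates each register lacks relative to the union of all covariates.
all_covariates = set.union(*register_covariates.values())
missing = {
    name: sorted(all_covariates - covariates)
    for name, covariates in register_covariates.items()
}

print("Common to all registers:", sorted(common))
for name, lacking in missing.items():
    print(name, "lacks", lacking)
```

An overview of this kind makes the trade-off concrete before any modelling starts: the identical model is limited to the common set, whereas register-specific models can each use their full covariate set at the cost of combining estimates from non-identical models.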

Finally, treatment channelling, or confounding by indication, is an important aspect of all observational comparative effectiveness or safety research, collaborative or not. It reflects the fact that treatment allocation in clinical practice is not a random process, but determined by known and unknown factors related to the patient, to his or her rheumatic disease and other medical history, the treating physician and the treatment context. While there is no single method that effectively quantifies and eliminates confounding by indication, judicious analyses informed by hands-on experience from the very clinical practice that gave rise to the data, at country level or at regional level within each country, and by access to individual-level data beyond clinical data (such as data on socioeconomy) can help demonstrate the extent to which channelling is present. Different analytical techniques can be used to reduce its impact. When confounding by indication is likely to differ in magnitude (or even direction) across countries, harmonised but parallel analyses provide an opportunity to assess its importance. They also add a consideration to the analysis plan, as some variables may be essential to adjust for in some countries but of less importance in others. Epidemiologists need to consider differential selection bias by country when analysing multiregional data. The best method to combine data from various registries will therefore depend on the research question, the outcome, the presence of effect modification by country and the need for higher statistical power.

Technical, logistic and legal challenges in collaborative studies: can and should data travel?

After agreeing on a specific research question, collaboration between different registers can conceptually include different approaches (figure 2):

Figure 2
Schematic presentation of enrichment through linkages of clinical rheumatology register data to other national data sources within each of the five Nordic countries and various approaches to collaboration across the five Nordic countries.

  1. Analyses based on exports of harmonised, anonymised or de-identified, individual-patient level data from each register to a central database, where they are collated and analysed jointly as one data set.

  2. Fully federated analyses, in which the curated individual-level data are analysed from a central unit and as one virtual data set, yet do not leave the local servers where they are stored.

  3. Separate but harmonised analyses of curated data, conducted in parallel at each register. The curated data sets are analysed individually in each country based on a harmonised analysis protocol, with the results presented both individually per register and pooled through meta-analytic techniques.

There are advantages and drawbacks with each of the above alternatives. Currently, and as outlined above, cross-border transfer of data (option 1) is often accompanied by uncertainties or bureaucratic bottlenecks regarding legal and logistic aspects of handling of the data and export across borders. Often, these are questions beyond the control of the individual researcher. Still, there is a series of examples that demonstrate the feasibility of this approach, at least for collaborations built exclusively around clinical register data.19–22 It should be pointed out that even anonymised or de-identified data may, by virtue of their richness, be personal data and have to be treated accordingly. The European Union (EU) General Data Protection Regulation does not seem to substantially alter the underlying premises for performing research based on register data or register linkages or the movement of data within EU countries, at least not from a northern European point of view, but this remains an important issue for close monitoring.23

While intuitively appealing, fully federated analyses (option 2) are linked to issues of whether, from a legal point of view, providing external access to the data is any different from exporting the very same data to the analysing party. If the analyses are run ‘in the cloud’, then some sort of data export must de facto have occurred. There are, however, promising technical solutions in operation that rely on transfer of scripts and interim results (aggregate-level data or parameter estimates) only, and thereby circumvent the challenge of actual or virtual data access and transfer.24 25 Running models that are based on iterative model-building across several data sets may, however, be time-consuming and limited by the information-transfer capacities in the network used.
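The principle of transferring only scripts and aggregate results can be illustrated with a deliberately simple, single-pass example: each site runs the same function on its own data and returns event counts and person-years, and the central unit combines these into a pooled incidence rate. This is a hand-written sketch with invented data and hypothetical function names, not the interface of any existing federated analysis platform; iterative model fitting would require repeated exchanges of this kind.

```python
def local_summary(records):
    """Runs at each register: returns aggregates only, never individual rows."""
    return {
        "events": sum(r["event"] for r in records),
        "person_years": sum(r["follow_up_years"] for r in records),
    }


def pooled_rate_per_1000(summaries):
    """Runs centrally: combines site-level aggregates into one incidence rate."""
    events = sum(s["events"] for s in summaries)
    person_years = sum(s["person_years"] for s in summaries)
    return 1000 * events / person_years


# Invented individual-level data that stay on each site's own server.
site_a = [
    {"event": 1, "follow_up_years": 2.0},
    {"event": 0, "follow_up_years": 3.0},
]
site_b = [
    {"event": 0, "follow_up_years": 4.0},
    {"event": 1, "follow_up_years": 1.0},
]

# Only these small dictionaries are transferred to the central unit.
summaries = [local_summary(site_a), local_summary(site_b)]
rate = pooled_rate_per_1000(summaries)
print(summaries, rate)
```

Even aggregates can be disclosive when cells are small, so a real implementation would typically add checks such as minimum cell counts before a site releases its summary.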

By contrast, separate but harmonised analyses (option 3) put particular demands on harmonisation not only of the raw data but also of the statistical analysis plan to be executed at each register. Since the absence of pooling of individual-level data precludes the potential for adjusted analyses across data sets, other means to accommodate important risk determinants across data sets must be implemented (eg, stratification and standardisation, as exemplified in a recently published study26). The analytical ‘output’ to be pooled through meta-analytic approaches may vary from rates to actual relative risk estimates. Since all programming and analysis would need to be run separately at each collaborating site, this option may also require more total work hours than the other options, where at least the central analysis would only need to be run by one designated statistician.
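As one example of such meta-analytic pooling, the sketch below combines register-specific relative risks using fixed-effect inverse-variance weighting on the log scale, with standard errors recovered from the reported 95% confidence intervals. The choice of a fixed-effect model is an assumption made here for brevity (a random-effects model may be more appropriate when heterogeneity across countries is expected), and the input estimates are invented.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(estimates):
    """Pool (relative_risk, ci_lower, ci_upper) tuples by inverse-variance weighting."""
    weights = []
    weighted_logs = []
    for rr, lower, upper in estimates:
        # Standard error of log(RR), recovered from the width of the 95% CI.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1 / se**2
        weights.append(weight)
        weighted_logs.append(weight * math.log(rr))

    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Invented register-specific estimates: (relative risk, lower 95% CI, upper 95% CI).
register_estimates = [
    (1.2, 0.9, 1.6),
    (1.5, 1.0, 2.25),
    (0.9, 0.6, 1.35),
]

pooled = pool_fixed_effect(register_estimates)
print("Pooled RR %.2f (95%% CI %.2f to %.2f)" % pooled)
```

Presenting the register-specific estimates alongside the pooled one, as described for option 3, keeps any between-country differences visible rather than averaging them away.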

A Nordic initiative to facilitate collaboration across enriched rheumatology registers

With largely similar healthcare structures, the existence of national registers on medical and societal outcomes (cancer, hospitalisations, pregnancies, sick leave, other cost reimbursements and so on) and the possibility to link information across registers, the premises for collaborative large-scale register-based research including enriched clinical data are good across the Nordic countries, though not without challenges such as those outlined above.

We have initiated a collaboration between the Nordic Rheumatology registers which aims to establish a standing network across Sweden, Denmark, Finland, Norway and Iceland for register research on patients with RA, AS, SpA and PsA. Taking a pragmatic and research-question–based approach, we will exploit various approaches to harmonise (1) primary data collection, (2) data management and (3) analytical protocols, and to address legal, logistic and technical challenges involved in data handling and in data sharing, both with regard to the clinical register data and to linked data from other data sources (figure 2).

The research questions that can be addressed through a collaboration such as ours can largely be divided into two categories: (1) questions that can be addressed using the clinical registers only (eg, a clinical effectiveness study of response to treatment X vs response to treatment Y) and (2) questions that can typically only be addressed using clinical data enriched with data from other sources (eg, long-term malignancy risks with treatment Z). A logical first step before embarking on specific comparative effectiveness/safety projects is to characterise and compare the patient populations across countries27 and to assess the relative uptake of different therapies in each country.27 28 Even with the similarities across the Nordic countries, heterogeneity is to be expected. Harmonisation of clinical input data is therefore a prerequisite, both regarding the definition and the collection, as well as of the study protocol. For each specific subproject launched within our collaboration, a prespecified statistical analysis protocol must be agreed on, and exposure, outcome and covariates need to be clearly defined. Besides study-specific definitions and analysis plans, this work will eventually result in a library of ‘standard’ and generic definitions for data harmonisation regardless of specific research question.

So far, our collaboration has begun to demonstrate similarities and differences across the Nordic countries with regard to biological therapies used in AS/SpA28 and PsA29 and in the choice of biological therapies in patients with a history of cancer.30 Ongoing projects include studies of infection risks with newer types of biologics in RA, risks for demyelinating events with TNFi31 and birth outcomes.32

Conclusion

There is an increasing need for detailed real-world data on rheumatic diseases and their treatments in large patient populations. Collaboration across large, population-based clinical rheumatology registers in settings that allow for enrichment through linkages to additional sources of information represents a powerful next step in the generation of real-world evidence and can be of great value for patients, clinicians, regulators, pharmaceutical companies and other healthcare providers. In this regard, the premises for register-based collaborative clinical research in the field of chronic inflammatory diseases in the Nordic countries are particularly promising. Such collaboration comes, however, with legal, logistic and methodological challenges. In a collaboration between rheumatology registers on chronic inflammatory arthritides in the Nordic countries, we hope to enrich, harmonise and standardise the resultant data repositories to investigate analytical approaches to data coming from different sources, to assess the merits of different logistical approaches to data protection and sharing and to perform collaborative studies on treatment effectiveness, safety and health-economic outcomes.

  • Contributors: All authors contributed significantly in the preparation of the review and are part of this international collaboration.

  • Funding: The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests: None declared.

  • Patient consent: Not required.

  • Provenance and peer review: Not commissioned; externally peer reviewed.

  • Data sharing statement: No additional data are available.

  • Correction notice: This article has been corrected since it was first published. The 13th author’s name has been corrected to ‘Bjorn Gudbjornsson.’

  1. Glintborg B, Gudbjornsson B, Krogh NS, et al. Impact of different infliximab dose regimens on treatment response and drug survival in 462 patients with psoriatic arthritis: results from the nationwide registries DANBIO and ICEBIO. Rheumatology 2014; 53:2100–9.
  2. Lie E, van der Heijde D, Uhlig T, et al. Effectiveness of switching between TNF inhibitors in ankylosing spondylitis: data from the NOR-DMARD register. Ann Rheum Dis 2011; 70:157–63.
  3. Ljung L, Rantapää-Dahlqvist S, Jacobsson LT, et al. Response to biological treatment and subsequent risk of coronary events in rheumatoid arthritis. Ann Rheum Dis 2016; 75:2087–94.
  4. Aaltonen KJ, Virkki LM, Jämsen E, et al. Do biologic drugs affect the need for and outcome of joint replacements in patients with rheumatoid arthritis? A register-based study. Semin Arthritis Rheum 2013; 43:55–62.
  5. Twigg S, Hensor EMA, Freeston J, et al. Fatigue, older age, higher body mass index and female gender predict worse disability in early rheumatoid arthritis despite treatment to target: a comparison of two observational cohort studies from the United Kingdom. Arthritis Care Res (Hoboken) 2017;
  6. Neovius M, Simard J, Sundström A, et al. Generalisability of clinical registers used for drug safety and comparative effectiveness research: coverage of the Swedish Biologics Register. Ann Rheum Dis 2011; 70:516–9.
  7. Kvien TK, Heiberg EMA, Lie E, et al. A Norwegian DMARD register: prescriptions of DMARDs and biological agents to patients with inflammatory rheumatic diseases. Clin Exp Rheumatol 2005; 23:S188–94.
  8. Konttinen L, Honkanen V, Uotila T, et al. Biological treatment in rheumatic diseases: results from a longitudinal surveillance: adverse events. Rheumatol Int 2006; 26:916–22.
  9. Heinonen AV, Aaltonen KJ, Joensuu JT, et al. Effectiveness and drug survival of TNF inhibitors in the treatment of ankylosing spondylitis: a prospective cohort study. J Rheumatol 2015; 42:2339–46.
  10. Aaltonen KJ, Ylikylä S, Tuulikki Joensuu J, et al. Efficacy and effectiveness of tumour necrosis factor inhibitors in the treatment of rheumatoid arthritis in randomized controlled trials and routine clinical practice. Rheumatology 2017; 56:kew467–35.
  11. Aaltonen K, Heinonen A, Joensuu J, et al. Effectiveness and drug survival of TNF-inhibitors in the treatment of psoriatic arthritis: A prospective cohort study. Semin Arthritis Rheum 2017; 46:732–9.
  12. Ibfelt EH, Jensen DV, Hetland ML, et al. The Danish nationwide clinical register for patients with rheumatoid arthritis: DANBIO. Clin Epidemiol 2016; 8:737–42.
  13. Mercer LK, Regierer AC, Mariette X, et al. Spectrum of lymphomas across different drug treatment groups in rheumatoid arthritis: a European registries collaborative project. Ann Rheum Dis 2017; 76:2025–30.
  14. Arkema EV, Jonsson J, Baecklund E, et al. Are patients with rheumatoid arthritis still at an increased risk of tuberculosis and what is the role of biological treatments? Ann Rheum Dis 2015; 74:1212–7.
  15. Hellgren K, Dreyer L, Arkema EV, et al. Cancer risk in patients with spondyloarthritis treated with TNF inhibitors: a collaborative study from the ARTIS and DANBIO registers. Ann Rheum Dis 2017; 76:105–11.
  16. Radner H, Dixon W, Hyrich K, et al. Consistency and utility of data items across European rheumatoid arthritis clinical cohorts and registers. Arthritis Care Res 2015; 67:1219–29.
  17. Radner H, Nikiphorou E, Chatzidionysiou K, et al. Towards development of a minimum core dataset and standards of data collection for observational rheumatoid arthritis research – a EULAR initiative. Ann Rheum Dis 2016; 75:406.3–7.
  18. Mercer LK, Askling J, Raaschou P, et al. Risk of invasive melanoma in patients with rheumatoid arthritis treated with biologics: results from a collaborative project of 11 European biologic registers. Ann Rheum Dis 2017; 76:386–91.
  19. Chatzidionysiou K, Lie E, Nasonov E, et al. Highest clinical effectiveness of rituximab in autoantibody-positive patients with rheumatoid arthritis and in those for whom no more than one previous TNF antagonist has failed: pooled data from 10 European registries. Ann Rheum Dis 2011; 70:1575–80.
  20. Chatzidionysiou K, Lie E, Nasonov E, et al. Effectiveness of disease-modifying antirheumatic drug co-therapy with methotrexate and leflunomide in rituximab-treated rheumatoid arthritis patients: results of a 1-year follow-up study from the CERERRA collaboration. Ann Rheum Dis 2012; 71:374–7.
  21. Gabay C, Riek M, Hetland ML, et al. Effectiveness of tocilizumab with and without synthetic disease-modifying antirheumatic drugs in rheumatoid arthritis: results from a European collaborative study. Ann Rheum Dis 2016; 75:1336–42.
  22. Finckh A, Neto D, Iannone F, et al. The impact of patient heterogeneity and socioeconomic factors on abatacept retention in rheumatoid arthritis across nine European countries. RMD Open 2015; 1:e000040.
  23. van Vliet-Ostaptchouk JV, Nuotio ML, Slagter SN, et al. The prevalence of metabolic syndrome and metabolically healthy obesity in Europe: a collaborative analysis of ten large cohort studies. BMC Endocr Disord 2014; 14:9.
  24. Wolfson M, Wallace SE, Masca N, et al. DataSHIELD: resolving a conflict in contemporary bioscience – performing a pooled analysis of individual-level data without sharing the data. Int J Epidemiol 2010; 39:1372–82.
  25. Yamanaka H, Askling J, Berglind N, et al. Infection rates in patients from five rheumatoid arthritis (RA) registries: contextualising an RA clinical trial programme. RMD Open 2017; 3:e000498.
  26. Glintborg BLU, Aaltonen K, Kristianslund E, et al. First line biological treatment in ankylosing spondylitis - prescription rates, baseline demographics and disease activity. A collaboration between biological registers in the five Nordic countries. ACR 2017;
  27. Chatzidionysiou K, Askling J, Eriksson J, et al. Effectiveness of TNF inhibitor switch in RA: results from the national Swedish register. Ann Rheum Dis 2015; 74:890–6.
  28. Jorgensen TDL, Gudbjornsson B, Hetland M, et al. Prescription patterns of tumour necrosis factor inhibitor and ustekinumab in psoriatic arthritis: a Nordic population-based cohort study. Ann Rheum Dis 2016; 76:686.2–86.
  29. Chatzidionysiou KAK, Nordström D, Gudbjornsson B, et al. How do we use biologics in patients with a history of malignancy? An assessment of treatment patterns using Scandinavian registers. EULAR 2017;
  30. Dreyer L, Magyari M, Laursen B, et al. Risk of multiple sclerosis during tumour necrosis factor inhibitor treatment for arthritis: a population-based study from DANBIO and the Danish Multiple Sclerosis Registry. Ann Rheum Dis 2016; 75:785–6.
  31. Bröms G, Granath F, Ekbom A, et al. Low risk of birth defects for infants whose mothers are treated with anti-tumor necrosis factor agents during pregnancy. Clin Gastroenterol Hepatol 2016; 14:234–41.
  32. Ibfelt EH, Sørensen J, Jensen DV, et al. Validity and completeness of rheumatoid arthritis diagnoses in the nationwide DANBIO clinical register and the Danish National Patient Registry. Clin Epidemiol 2017; 9:627–32.

  • Received: 30 January 2018
  • Accepted: 15 March 2018
  • First published: 12 April 2018