The use of network meta-analysis in updating WHO living maternal and perinatal health recommendations

Abstract

Drawing on two recent examples of WHO living guidelines in maternal and perinatal health, this paper elucidates a pragmatic, stepwise approach to using network meta-analysis (NMA) in guideline development in the presence of multiple treatment options. NMA has important advantages. These include the ability to compare multiple interventions in a single coherent analysis, provide direct estimates of the relative effects of all available interventions, infer indirect effect estimates for interventions not directly compared and generate rankings of the available treatment options. It can be difficult to harness these advantages in the face of a lack of current guidance on using NMA evidence in guideline development, with several challenges emerging. Challenges include the choice of conceptual approach, the volume and complexity of the evidence, the contribution of treatment rankings, and the fact that the preferable treatment is not always obvious. This paper describes a layered approach to resolving these challenges, which supports systematic guideline decision-making and development of trustworthy clinical guidelines when multiple treatment options are available.

Summary box

  • While network meta-analysis (NMA) can provide streamlined and methodologically coherent synthesis of multiple treatments for guideline development, incorporating evidence from NMA into guideline Evidence-to-Decision (EtD) frameworks is not always straightforward.

  • NMA can generate a large volume of complex evidence that is both conceptually challenging and not easily presented using standard summary approaches, while some appealing features of NMA (such as treatment rankings) are potentially misleading for guideline panel members.

  • We elucidate how, for two sets of WHO guideline updates using effectiveness evidence from two NMAs, we adopted a layered approach to conceptualising and communicating the effectiveness evidence to guideline panel members.

  • We describe an approach that other guideline technical teams may adopt when preparing the ‘effects of interventions’ domain of EtD frameworks using NMA, which can facilitate interpretation, aid decision-making and support development of trustworthy recommendations.

Introduction

Network meta-analysis (NMA) is a technique used in systematic reviews to compare multiple treatments for a single condition.1 2 It can also produce rankings of treatments for different outcomes.3 In recent years, NMA has provided effectiveness evidence for the development of several WHO maternal and perinatal health recommendations.4 5 We describe the process of using NMA for developing recommendations in the context of Grading of Recommendations Assessment, Development and Evaluation (GRADE) Evidence-to-Decision (EtD) frameworks.6 7

While NMA has advantages over conventional pairwise meta-analysis for guideline development, several challenges emerged. These concerned the choice of conceptual approach, volume and complexity of the evidence, the contribution of treatment rankings, and the fact that the preferable treatment is not always obvious. We sought to achieve clarity while maintaining transparency, by adopting a pragmatic, stepwise approach. For two sets of recommendations, we provided guideline decision-makers with an accessible package of information designed to aid interpretation of the evidence.

WHO living guideline development using GRADE EtD framework

WHO guidelines are typically intended for global use, and are developed to rigorous methodological standards.8 The scientific evidence supporting a WHO recommendation is synthesised using the GRADE approach, including the use of structured EtD frameworks.6 7 These inform Guideline Development Group (GDG) deliberations and allow systematic and transparent use of available evidence to formulate recommendations.7 NMA can provide evidence that informs the EtD domain ‘effects of interventions’ (which includes the size of the desirable effects, size of the undesirable effects, certainty of the evidence of effects, and balance of desirable and undesirable effects). Other EtD domains (such as acceptability and feasibility) are informed by other processes.

Since 2017, WHO’s Department of Sexual and Reproductive Health and Research has adopted a ‘living guideline’ approach to updating maternal and perinatal health recommendations. Individual recommendations rather than whole guidelines are prioritised for updating by an independent Executive Guideline Steering Group, on the basis of emerging evidence.9 WHO has released more than 40 updated recommendations using this approach, including 11 informed by commissioned Cochrane reviews using NMA: ten 2018 recommendations on uterotonics for preventing postpartum haemorrhage (see box 1); and one 2022 recommendation on tocolytic therapy for improving preterm birth outcomes (see box 2).4 5

Box 1

Uterotonics for preventing postpartum haemorrhage: 2018 recommendation update

Globally, nearly a quarter of maternal deaths are associated with postpartum haemorrhage (PPH), and uterine atony is the most common cause of PPH. Uterotonic agents work by increasing contractility of the uterus; when administered to all women prophylactically after birth, they can reduce postpartum blood loss. WHO published updated recommendations on the use of uterotonics for preventing PPH in 2018, following publication of a major new trial which found that heat-stable carbetocin was non-inferior to oxytocin for the prevention of PPH.4 These recommendations used effectiveness evidence from a Cochrane systematic review and network meta-analysis (NMA) that included 196 trials involving 134 414 women.11 This NMA included seven different uterotonic agents (oxytocin, misoprostol, carbetocin, ergometrine, injectable prostaglandins (eg, carboprost), syntometrine (oxytocin plus ergometrine) and oxytocin plus misoprostol), as well as placebo/no treatment. The updated recommendations considered 18 outcomes that the Guideline Development Group agreed were critical and important for global guideline decision-making. These included outcomes prioritised when this guideline was previously updated in 2012 (which were identified through consultation with international stakeholders), plus three additional outcomes selected to reflect a relevant core outcome set published in the intervening period and to ensure that the final recommendations would be woman-centred.

Box 2

Tocolytics for delaying preterm birth: 2022 recommendation update

Preterm birth (before 37 completed weeks of pregnancy) is the single largest cause of neonatal death worldwide. The earlier babies are born, the greater the risk of respiratory, infectious, metabolic and neurological morbidities. Tocolytic drugs can inhibit or arrest contractions of the uterus, and thus can prolong pregnancy. This allows more time for in-utero fetal maturation, administration of antenatal corticosteroids and other medications that can improve preterm newborn outcomes, and also provide time for transferring a woman to a higher level of care. WHO published updated tocolytic recommendations in 2022 in the context of new, important evidence on the use of antenatal corticosteroids, whose effects are closely linked to the use of tocolysis.5 For the recommendation update, evidence on the effectiveness and safety of tocolytics was provided by a new Cochrane systematic review and network meta-analysis.12 This review identified 122 individually randomised trials, involving 13 697 women. The trials included comparisons of six different classes of tocolytic drugs (betamimetics, cyclo-oxygenase inhibitors, calcium channel blockers, magnesium sulphate, oxytocin receptor antagonists and nitric oxide donors), combinations of tocolytics, and placebo or no treatment with a tocolytic. The recommendation update considered 31 outcomes that the Guideline Development Group agreed were critical and important for global guideline decision-making, including those prioritised in the previous iteration of this recommendation (following consultation with international stakeholders), plus two additional outcomes to ensure that the final recommendations would be woman-centred.

Advantages of evidence synthesis using NMA

NMA has several advantages over conventional pairwise meta-analysis. When sufficient trial data are available, it can provide direct estimates of the relative effects of all available interventions for a common set of outcomes. Indirect effect estimates can also be obtained when the relative effectiveness of two interventions is inferred through a common comparator. Network effect estimates combine the entirety of the available direct and indirect evidence connecting two interventions to yield an estimate that draws on all connected sources of evidence.1 2 NMA also enables comparison of interventions that have not been directly compared, such as newer agents with placebo in the absence of placebo-controlled trials.10
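The relationship between direct, indirect and network estimates can be illustrated with a minimal sketch for a single closed loop of three treatments (A, B and a common comparator C). This is not the method used in the Cochrane reviews, which fitted full network models; it assumes a simple adjusted indirect comparison on the log risk ratio scale followed by fixed-effect inverse-variance pooling, and all numbers and function names are hypothetical.

```python
import math

def indirect_estimate(log_ac, se_ac, log_bc, se_bc):
    """Indirect A-vs-B effect inferred through the common comparator C."""
    log_ab = log_ac - log_bc                  # difference of log risk ratios
    se_ab = math.sqrt(se_ac**2 + se_bc**2)    # variances add, so precision falls
    return log_ab, se_ab

def pool_inverse_variance(estimates):
    """Fixed-effect inverse-variance pooling of (log effect, SE) pairs."""
    weights = [1.0 / se**2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

def to_ratio_with_ci(log_est, se, z=1.96):
    """Back-transform to a risk ratio with an approximate 95% CI."""
    return tuple(math.exp(v) for v in (log_est, log_est - z * se, log_est + z * se))

# Hypothetical inputs: (log risk ratio, standard error)
direct_ab = (math.log(0.80), 0.20)   # head-to-head trials of A vs B
a_vs_c = (math.log(0.60), 0.15)      # trials of A vs C
b_vs_c = (math.log(0.75), 0.15)      # trials of B vs C

indirect_ab = indirect_estimate(*a_vs_c, *b_vs_c)
network_ab = pool_inverse_variance([direct_ab, indirect_ab])

for label, (est, se) in [("direct", direct_ab),
                         ("indirect", indirect_ab),
                         ("network", network_ab)]:
    rr, low, high = to_ratio_with_ci(est, se)
    print(f"{label:9s} RR {rr:.2f} (95% CI {low:.2f} to {high:.2f})")
```

In this illustration the network estimate has a smaller standard error than either the direct or the indirect estimate, which is the sense in which indirect evidence can improve precision when the two sources agree.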

NMA can also generate rankings of available treatments.3 The Cochrane reviews that underpinned the 2018 and 2022 updates11 12 expressed rankings as the Surface area Under the Cumulative RAnking line (SUCRA). For each outcome, this indicates the cumulative probability of being the best agent, second best and so forth. SUCRAs are attractive because they are the only graphical option that allows simultaneous comparison of all treatments. Furthermore, they simplify the information about the relative effect of each treatment into a single number.3 The closer the SUCRA value is to 100%, the more likely it is that the treatment is in the top rank or one of the top ranks.13 For instance, figure 1 includes three interventions with high SUCRAs of 72%–76% (combinations of tocolytics, calcium channel blockers and nitric oxide donors).
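As a rough illustration of how a SUCRA value summarises a treatment's rank probabilities for one outcome, the sketch below averages the cumulative ranking probabilities over the first a − 1 ranks, where a is the number of treatments. The rank probabilities are hypothetical and the function names are ours; the equivalent expression in terms of the mean rank is included as a cross-check.

```python
def sucra(rank_probs):
    """SUCRA from rank probabilities, where rank_probs[j] is the
    probability that the treatment holds rank j + 1 (rank 1 = best)."""
    n = len(rank_probs)
    if n < 2:
        raise ValueError("at least two ranks are needed")
    if abs(sum(rank_probs) - 1.0) > 1e-6:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:      # the last cumulative value is always 1
        cumulative += p
        total += cumulative
    return total / (n - 1)

def mean_rank(rank_probs):
    """Expected rank of the treatment (1 = best)."""
    return sum((j + 1) * p for j, p in enumerate(rank_probs))

# Hypothetical rank probabilities for one treatment among four
probs = [0.50, 0.30, 0.15, 0.05]
n = len(probs)

print(f"SUCRA: {sucra(probs):.0%}")
print(f"Mean rank: {mean_rank(probs):.2f}")
# Equivalent form: SUCRA = (a - mean rank) / (a - 1)
print(f"Cross-check: {(n - mean_rank(probs)) / (n - 1):.0%}")
```

A treatment that is certain to be best has a SUCRA of 100% and one that is certain to be worst has a SUCRA of 0%. The calculation uses only rank probabilities, so it carries no information about the size of the differences between treatments or the certainty of the underlying evidence.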

Figure 1
Example of NMA summary of findings table from Evidence-to-Decision framework on tocolytics for delaying preterm birth.32

Advantages of using NMA evidence in guideline development

GRADE EtD frameworks were designed to integrate evidence from a standard pairwise review.6 7 We have previously encountered indications with multiple treatment options, and hence multiple pairwise systematic reviews requiring an EtD framework per comparison, as in the previous iterations of the uterotonic and tocolytics recommendations.14 15 This was cumbersome and difficult for GDG members to interpret. It was also challenging to assess the impact of methodological differences between reviews (eg, eligibility criteria, reported outcomes and handling of subgroup analyses). As searches were conducted at different times (sometimes years apart) and review methods have improved over time, methodological approaches may have differed. Also, some treatments appeared in more than one review, causing confusion as to which was more ‘correct’. While not insurmountable, such issues risk introducing subtle but potentially important sources of bias into guideline decision-making.16

Generating one EtD framework including effectiveness evidence from only one NMA of multiple treatments resolves several of these difficulties, as it is a single evidence base, assembled with a standardised methodology. This is easier for GDG members to understand and uses optimal analytical methods.10 16 Also, the use of evidence from NMA is particularly efficient because each guideline update requires only one (though often large) systematic review.

Challenges of using evidence from NMA in EtD frameworks

Using evidence from NMA brings new challenges to guideline development. NMA does not currently appear in the WHO handbook for guideline development,8 although there are a number of case studies available, notably the WHO living guideline on drug treatments for COVID-19.17 Similarly, there is no official GRADE guidance on incorporating evidence from NMA into EtD frameworks, or on producing NMA evidence tables, although there are references available.18 19

For guideline methodologists, the absence of established conventions on these issues presents five significant challenges.

The usefulness of NMA evidence rests on conceptual decisions about the choice of question and reference agent

Guideline developers must decide the most logical and useful conceptual approach to structuring the evidence for decision-makers and end users, including decisions about the PICO (population, intervention, comparison, outcomes) question(s) and, relatedly, the choice of reference agent. Ideally, the PICO question(s) for the recommendation aligns with the systematic review PICO question(s). NMA can make it easier to address multiple treatment comparisons or even multiple PICOs; however, the best question strategy may be hard to determine when a single guideline must speak to different healthcare contexts. For example, the 2018 WHO uterotonics recommendations aimed to address high-income, middle-income and low-income countries, across which the number of available uterotonics varies from seven in higher-resource settings to one or two in limited-resource settings. Even though we had NMA evidence and would ultimately need to compare all options with one another, in the absence of widespread access to all treatments, the appropriate conceptual starting point was not a multiple treatment comparison.4

NMA generates evidence on all possible pairwise comparisons of the included interventions, which are potentially numerous. To interpret NMA results, a ‘reference’ intervention against which all other interventions are compared is chosen. The choice of reference is often placebo or no treatment, or the most commonly used comparator treatment, that is, the best-connected intervention in the network.1 While it has an important bearing on the validity and utility of the resulting recommendation, the best choice of reference agent may not always be obvious. For the uterotonics guideline, the currently recommended treatment and best choice of reference agent was oxytocin; however, placebo or no treatment was also a relevant comparator in contexts where oxytocin was not available.
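To give a sense of scale, the sketch below counts the comparisons implied by the eight nodes of the uterotonics network (box 1) and shows one simple way of identifying a best-connected node. The trial counts per direct comparison are invented for illustration and do not reproduce the Cochrane network. Treating 'best connected' as 'linked by direct evidence to the largest number of other nodes' is our simplifying assumption.

```python
from collections import Counter
from itertools import combinations

interventions = [
    "oxytocin", "misoprostol", "carbetocin", "ergometrine",
    "injectable prostaglandins", "syntometrine",
    "oxytocin plus misoprostol", "placebo/no treatment",
]
n_outcomes = 18  # outcomes considered in the 2018 update (box 1)

all_pairs = list(combinations(interventions, 2))
versus_reference = len(interventions) - 1

print(f"All pairwise comparisons per outcome: {len(all_pairs)}")
print(f"Comparisons versus one reference per outcome: {versus_reference}")
print(f"Versus reference, across outcomes: {versus_reference * n_outcomes}")

# HYPOTHETICAL numbers of trials informing each direct comparison
direct_comparisons = {
    ("oxytocin", "misoprostol"): 40,
    ("oxytocin", "carbetocin"): 20,
    ("oxytocin", "ergometrine"): 5,
    ("oxytocin", "syntometrine"): 15,
    ("oxytocin", "oxytocin plus misoprostol"): 8,
    ("oxytocin", "placebo/no treatment"): 6,
    ("misoprostol", "placebo/no treatment"): 10,
    ("misoprostol", "injectable prostaglandins"): 3,
    ("ergometrine", "placebo/no treatment"): 2,
    ("carbetocin", "syntometrine"): 4,
}

def connectivity(direct):
    """Count, for each node, how many other nodes it is directly compared with."""
    links = Counter()
    for (a, b), n_trials in direct.items():
        if n_trials > 0:
            links[a] += 1
            links[b] += 1
    return links

links = connectivity(direct_comparisons)
best_connected = links.most_common(1)[0][0]
print(f"Best-connected node in this hypothetical network: {best_connected}")
```

Presenting only the comparisons against a single reference reduces the number of effect estimates per outcome from 28 to 7, which is why the choice of reference has such a bearing on what the panel sees.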

A large volume of streamlined evidence is still a large volume of evidence

Effectiveness evidence generated by NMA is more coherent and streamlined than referring to multiple (potentially conflicting) systematic reviews. It also leverages indirect evidence, which can improve effect estimates. However, the GDG may still need to consider a large volume of evidence. The uterotonics question included evidence on 7 interventions plus placebo or no uterotonic treatment, and 18 outcomes (see box 1); the tocolytics question included 7 interventions plus placebo or no tocolytic treatment, and 31 outcomes (see box 2). We generated a summary of findings table for each outcome, presenting effect estimates for all interventions versus the reference. This is a lot of information for the panel to assimilate, even when synthesised and streamlined in this way.

NMA findings can be difficult for guideline panels to interpret

With multiple interventions and many outcomes comes complexity. It can be hard to discern which, if any, intervention is clinically superior, especially when results vary across outcomes.20 Panel members found it challenging to understand how much weight to place on low or very low certainty evidence, and how to make decisions where findings for different outcomes appeared contradictory. For example, in figure 2 several tocolytic interventions were likely more effective at delaying birth by 7 days compared with placebo, while there was probably no difference between the intervention and placebo groups in the mean time between therapy and birth. These differences occur because the two outcomes included data from different trials. Although these issues may be familiar to GDG members considering evidence from pairwise systematic reviews, their impact is multiplied in line with the complexity of NMA.

Figure 2
Excerpt from summary table of anticipated absolute effects of tocolytics versus reference (placebo/no treatment) for delaying preterm birth.32

Treatment rankings are problematic and potentially misleading

For the 2022 tocolytic guideline, we presented the treatment rankings. Rankings are produced on a per-outcome basis, and thus one drug can have different rankings for different outcomes. For instance, the SUCRA for the tocolytic intervention nitric oxide donors ranged from 1% to 100% (see table 1).

Table 1
Example of different treatment rankings for a single intervention.32

Methodologists advise caution in interpreting rankings.2 13 A higher ranking cannot be relied on to consistently identify better treatments. Differences in rank might not be clinically significant, while the relative importance of an outcome may vary. Treatments can have a higher rank without evidence that they have a better effect,2 and rankings based on low or very low certainty evidence are not conclusive.13 While effect estimates usually include a conventional level of significance such as a p value or CI, SUCRA rankings do not (although other approaches, such as the use of median ranking, can).

NMA is not a panacea for GDG decision-making

While NMA has advantages, it is not a panacea for all the challenges the GDG faces when navigating effectiveness evidence. NMA does not automatically provide ‘the answer’ as to the clinically superior intervention. Given the significant resources invested in conducting NMA, and the promise inherent in using up-to-date methods, GDG members may find this disappointing or frustrating.

NMA provides a coherent picture of the available evidence, but the evidence may be incomplete or ‘patchy’. The indirect evidence may help address any gaps but cannot resolve them all. Also, assessors may have low confidence in the evidence that is available. For example, the tocolytics NMA identified 122 trials of 13 697 women, but most outcomes had low or very low certainty evidence.

Pragmatic solutions

We adopted a pragmatic approach to addressing these inter-related challenges. Where necessary we broke down the guideline decision-making process into multiple stages. We had two guiding principles: first, to make all the effectiveness evidence available in a structured manner so that any detail was accessible and readily retrievable; and second, to facilitate interpretation and interrogation of the evidence by providing multiple navigable layers, from simple to complex. We aimed to produce materials that maximised clarity by using carefully considered layout, graphical elements and consistent colour-coding. We describe our solutions in the following sections.

Starting point: conceptual approach

In the example described earlier (section Challenges of using evidence from NMA in EtD frameworks) concerning the best approach to take to the 2018 uterotonics guideline, the GDG needed to develop recommendations that addressed the variable availability of uterotonic agents across high-resource, medium-resource and low-resource settings. Given this variation, in order to be useful to policy-makers and clinicians in all contexts it was crucial that the guideline first establish to what extent each agent was better or worse than placebo or no treatment. To support the GDG in tackling this issue thoroughly and systematically, although all effectiveness evidence was drawn from a single NMA, we divided the decision-making process into two sets of PICO questions. In a first phase of GDG meetings, we asked whether each intervention improved outcomes, developing individual EtD frameworks for each intervention versus placebo or no treatment. These EtDs are available in a series of web annexes that accompany the published guideline.21–26 In addition to supporting the GDG in their decision-making, this deconstructed approach had the additional benefit of providing a clear evidence base for policy-makers in diverse global contexts where only one or two treatments are available. Second, speaking to settings where all interventions are available, we prepared a single EtD that compared all interventions to a reference (oxytocin), which is available in a further web annex27 and is also included in online supplemental appendix A. While this entailed multiple GDG meetings, it meant we could be confident in the completeness and relevance of the resultant recommendations.

Oxytocin was chosen as the reference because it was the current standard of care, and the most frequently investigated uterotonic across all outcomes. For the 2022 tocolytics update (EtD included in online supplemental appendix B), placebo or no tocolytic treatment was the chosen reference because there was no standard tocolytic treatment, tocolytics were not recommended for women at risk of imminent preterm birth, and in the NMA, placebo/no tocolytic treatment was the best-connected node across most outcomes.

Evidence foundation: NMA summary of findings tables

Appraisal of the evidence on effectiveness and safety of interventions is usually captured in GRADE evidence profiles (see box 3), and as such these tables provide the basis for all GDG decision-making on the desirable and undesirable effects and certainty of the evidence. Evidence profiles have a standard format that is familiar to guideline decision-makers, being automatically generated using GRADEpro software.28 GRADEpro does not currently support NMA results (although one pilot project has trialled a new (not publicly released) GRADEpro module to support multiple intervention comparison19). Therefore, we undertook the process manually, using Excel and Word.29 30 Since we undertook this process, further detailed guidance has been published providing practical strategies for GRADE-assessing NMA effect estimates, which seeks to reduce the significant workload involved. This approach may be especially beneficial for developers of guidelines based on NMA with a large number of interventions.31 As NMA generates three relevant effect estimates for each outcome (direct, indirect and network estimates), and the certainty of evidence may differ for each, it would be hard to include a complete breakdown of all GRADE assessments in the standard ‘evidence profile’ format. Our results tables therefore more closely resembled adapted ‘summary of findings’ tables.

Box 3

GRADE certainty assessments

A key step in the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Evidence-to-Decision process involves rating the ‘certainty’ of effect estimates for each outcome from high to very low certainty. Certainty captures how confident assessors are that the evidence describes the true intervention effect. To determine the certainty level of evidence on the effectiveness and safety of interventions, evidence for each comparison and outcome is assessed against predefined criteria (study design, risk of bias, inconsistency, indirectness, imprecision and publication bias). Assessments are summarised for guideline decision-makers in ‘evidence profiles’ (detailed ‘summary of findings’ tables that include explicit judgements for each GRADE criterion so that panel members can interrogate and reach agreement on these assessments).40 A modified method for assessing the certainty of effect estimates generated by network meta-analysis has been described by the GRADE Working Group.29 30

While there is no established format for NMA evidence tables, guidance is available and there are examples in recent literature. Yepes-Nuñez et al have explored the optimal presentation of NMA results in summary of findings tables.18 Taking as their starting point published expert guidance on the aspects of NMA that should be included (relative and absolute effects; GRADE certainty; rank probabilities; NMA geometry), the authors develop a template intended to facilitate understanding and enhance decision-making. While the paper does not constitute official GRADE guidance, we adopted many aspects of the suggested approach.

For our two sets of WHO recommendations, we included one NMA summary of findings table for each outcome, detailing effectiveness evidence for all interventions versus a reference comparator (see boxes 1 and 2, and example figure 1). All effect estimates are detailed alongside their GRADE certainty ratings, anticipated absolute effects for the ‘headline’ estimate (as natural frequency) and the ranking (as SUCRA). The PICO, network diagram and SUCRA graph are also provided. The network estimate is usually the ‘headline’ result, although this is not always the case (eg, where no indirect evidence is available). Our summary of findings tables differed slightly from the table described by Yepes-Nuñez et al.18 We retained effect estimates and GRADE assessments for the indirect and direct evidence, alongside the network estimate, consistent with previous GRADE Working Group advice.29 30 In line with the source Cochrane review, we used SUCRA rather than median rank to express treatment rankings, and so included the SUCRA graph. We also included explanations of network diagrams and SUCRA graphs. Like Yepes-Nuñez et al, we listed the interventions in the same order (and not by order of rank), reflecting our caution about the reliability of rankings, a challenge discussed in more detail in later sections.
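One way to picture a single row of such a table is as a small record holding the direct, indirect and network estimates with their certainty ratings, plus a rule for choosing the headline estimate. The sketch below is illustrative only. The class and field names are hypothetical, the numbers are invented, and the headline rule (use the direct estimate when no indirect evidence exists, otherwise the network estimate) is a simplification of the judgement described above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Estimate:
    rr: float          # risk ratio versus the reference
    ci_low: float      # lower bound of the 95% CI
    ci_high: float     # upper bound of the 95% CI
    certainty: str     # GRADE rating: high, moderate, low or very low

@dataclass(frozen=True)
class SummaryRow:
    intervention: str
    direct: Optional[Estimate]
    indirect: Optional[Estimate]
    network: Optional[Estimate]
    sucra: Optional[float] = None   # ranking for this outcome, 0 to 1

    def headline(self) -> Tuple[str, Estimate]:
        """Pick the estimate to present first (simplified, assumed rule)."""
        if self.indirect is None and self.direct is not None:
            return "direct", self.direct
        if self.network is not None:
            return "network", self.network
        raise ValueError(f"no usable estimate for {self.intervention}")

# Hypothetical rows for one outcome
row_full = SummaryRow(
    intervention="treatment A",
    direct=Estimate(0.85, 0.70, 1.03, "low"),
    indirect=Estimate(0.78, 0.60, 1.01, "low"),
    network=Estimate(0.82, 0.70, 0.96, "moderate"),
    sucra=0.74,
)
row_direct_only = SummaryRow(
    intervention="treatment B",
    direct=Estimate(0.95, 0.80, 1.13, "low"),
    indirect=None,
    network=None,
)

for row in (row_full, row_direct_only):
    source, est = row.headline()
    print(f"{row.intervention}: {source} RR {est.rr:.2f} "
          f"({est.ci_low:.2f} to {est.ci_high:.2f}), {est.certainty} certainty")
```

Keeping all three estimates in the record, rather than only the headline, mirrors the decision to retain the direct and indirect evidence in the tables so that panel members can interrogate the certainty assessments.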

Summary of findings: collating all outcomes

The 2018 and 2022 EtD frameworks included 18 and 31 detailed summary of findings tables, respectively (see boxes 1 and 2, and figure 1). To facilitate comprehension of this large body of evidence, we included a collated summary of intervention effects in the main EtD framework. The full summary of findings tables were included as an appendix (later published in web annexes to the main guideline, available as online supplemental appendices A and B).27 32

The collated summaries comprised colour-coded tables. Figure 3 shows the summary from the 2018 uterotonics update. Because GDG assessments of the balance of desirable and undesirable effects must factor in the magnitude of the effect, we modified the design for the 2022 tocolytics update (see figure 2) to include anticipated absolute effects (natural frequency per 1000 women, with 95% CI) for each intervention versus the reference. The colours signal benefit (green), harm (orange/red) or no difference (grey). The shades (darker to lighter) reflect the certainty of the evidence, with a neutral yellow signalling very low certainty. The narrative interpretation of each result is included for clarity, and to increase accessibility for colour-blind readers, using language that reflects guidance published by the GRADE Working Group.33
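The logic of a single cell in such a collated table can be sketched as follows. This is an illustration under stated assumptions rather than the procedure we followed (our tables were prepared manually). The direction of effect is supplied as a judgement, not derived by a rule; the shade names and narrative wording are our approximations; and the baseline risk and risk ratio are hypothetical.

```python
# Colour scheme described in the text; shade names are assumed labels
COLOUR = {"benefit": "green", "harm": "red", "no difference": "grey"}
SHADE = {"high": "dark", "moderate": "medium", "low": "light"}

# Narrative wording loosely modelled on GRADE-style statements (assumed)
VERB = {"high": "results in", "moderate": "probably results in",
        "low": "may result in"}
OBJECT = {"benefit": "benefit", "harm": "harm",
          "no difference": "little to no difference"}

def per_1000(baseline_risk, ratio):
    """Anticipated events per 1000 women given a baseline risk and risk ratio."""
    return round(min(baseline_risk * ratio, 1.0) * 1000)

def summary_cell(direction, certainty, baseline_risk, rr, rr_low, rr_high):
    """Build the colour, absolute effect and narrative for one table cell."""
    if direction not in COLOUR:
        raise ValueError(f"unknown direction: {direction}")
    absolute = (f"{per_1000(baseline_risk, rr)} per 1000 "
                f"(95% CI {per_1000(baseline_risk, rr_low)} "
                f"to {per_1000(baseline_risk, rr_high)})")
    if certainty == "very low":
        # Neutral yellow regardless of the apparent direction of effect
        return {"colour": "yellow", "absolute": absolute,
                "narrative": "the evidence is very uncertain"}
    if certainty not in SHADE:
        raise ValueError(f"unknown certainty: {certainty}")
    return {"colour": f"{SHADE[certainty]} {COLOUR[direction]}",
            "absolute": absolute,
            "narrative": f"{VERB[certainty]} {OBJECT[direction]}"}

# Hypothetical example: 300 per 1000 with the reference, RR 0.80 (0.70 to 0.92)
cell = summary_cell("benefit", "moderate", 0.30, 0.80, 0.70, 0.92)
print(cell)
```

Pairing each colour with a narrative statement means the interpretation does not depend on colour perception alone, and overriding the colour for very low certainty evidence keeps an apparently large effect from drawing more attention than it warrants.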

Figure 3
Summary table of anticipated treatment effects (beneficial outcomes) of uterotonic agents versus reference agent (oxytocin) for preventing postpartum haemorrhage.27

These summaries brought together evidence on the size of the desirable and undesirable effects and the certainty of the evidence in a highly digestible format, enabling the panel to begin to make assessments of the balance of effects for all interventions. Although differing in some details, developers of other WHO guidelines have produced similar tabular collations of NMA results, and the authors have observed that this approach is ‘optimal’.17 34 This adaptable format provides an accessible overview and interpretative aid for guideline decision-makers.

Summary of findings: communicating treatment rankings

For the 2022 tocolytics recommendation update, we included treatment rankings in the summary of findings tables (figure 1) and provided a collated array in an appendix (figure 4). As noted earlier, the interpretation of rankings is fraught with difficulty. As indicated by the colour-coding and narrative interpretation in figure 4, the certainty of the evidence varied widely, and we had little confidence in some ostensibly ‘high ranking’ treatments for some outcomes (signalled by the yellow boxes). Recent GRADE guidance acknowledges these challenges, and offers an approach to drawing conclusions from NMA that takes into account primarily the effect estimates and certainty of the evidence, and secondarily the rankings; however, we found this approach overly cumbersome to implement given the number of outcomes involved.20 35 The collated summary of SUCRA rankings provided another prism through which panel members could assess the limitations of the available evidence.

Figure 4
Excerpt from summary table of treatment rankings from Evidence-to-Decision framework on tocolytics for delaying preterm birth.32

Making judgements

Guideline panel members study the EtD before the group meets. The highly visual collated summary supported identification of treatments that signalled potentially important benefits across priority outcomes, and red flags (eg, in side effect profiles). During meetings, the GDG used the collated summary to identify a shortlist of potential candidate treatments based on their benefit/risk profiles. The panel identified the most promising three options based on key outcomes, and ruled out others that were clearly unhelpful due to having the fewest benefits and/or worst side effects. When discussing the safety and effectiveness of the shortlisted options, the panel considered whether the differences between them were clinically meaningful. This discussion focused on the magnitude of the effect and certainty of the evidence, as well as outcome importance, and was less dependent on SUCRA rankings.

In the EtD frameworks that compared multiple treatments, we modified judgement tables for all EtD domains to include rows summarising judgements for all treatments (online supplemental appendices A and B).27 32 Final judgements depended on further discussion of all EtD domains (and not just effectiveness). For example, while the tocolytics recommendation highlights nifedipine, the accompanying remarks made by the GDG note that oxytocin receptor antagonists and nitric oxide donors can be effective in prolonging pregnancy, but are not available in many countries and can be more costly.5

Strengths and limitations of our approach

This stepwise, layered approach enabled us to organise a productive guideline decision-making process. Both GDGs were able to navigate complex effectiveness evidence in readiness for consideration of other EtD domains. Presenting top layers of the most salient, simplified evidence did not obscure detail, but rather enabled precise and transparent signalling of areas of uncertainty or inconsistency. Ultimately, this meant that GDG discussions could relatively systematically move towards conclusions about the balance of effects of the available interventions.

We have described our approach to handling evidence on the effect of interventions from NMA, but we have not attempted here to address the challenges posed to other EtD domains by the availability of multiple interventions. Other authors have recently highlighted that NMA does not address EtD domains beyond those concerned with effectiveness evidence, and have described the ongoing development of a solution to this wider issue.19

One further limitation in our approach may be the considerable resources involved in preparing the NMA (for author teams) and the EtDs (for guideline technical teams). When commissioning a systematic review, guideline development teams must weigh up whether this investment is likely to pay off, that is, whether NMA is warranted as opposed to relying on pairwise meta-analysis. For both the examples presented, the resource investment in EtD preparation improved usability and efficiency for the GDG.

Recently, several WHO guidelines have been developed using WHO-INTEGRATE, a modified version of the GRADE EtD framework that places greater emphasis on WHO’s distinctive norms and values.36 The process described in this paper will be informative for appraisal of effectiveness evidence from NMA for clinical guideline development using the standard GRADE approach or such adaptations.

Possible improvements for future guideline updates

There may be ways to improve this approach, either by modifying or adapting the components we have described (summary of findings tables, collated summary of findings and treatment rankings) or by adding further components to the package. For instance, a modified summary of findings table could incorporate alternative ranking methods or colour-coded narrative interpretation of results. It would also be helpful if software were available for creating NMA GRADE summary of findings tables, as doing this manually is time-consuming and risks copy-paste errors.
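As an illustration of how even simple tooling could reduce manual transcription, the following is a minimal sketch in Python of generating summary of findings rows from structured estimates. It is not an existing tool and not the method used for the WHO recommendations. The class name `NetworkEstimate`, its fields, the table layout and all numerical values are assumptions made for illustration only; the values are placeholders, not results from either NMA.

```python
from dataclasses import dataclass

# The four GRADE certainty levels; any other label is rejected.
CERTAINTY_LEVELS = ("high", "moderate", "low", "very low")


@dataclass(frozen=True)
class NetworkEstimate:
    """One row of a summary of findings table (field names are hypothetical)."""
    treatment: str
    comparator: str
    risk_ratio: float
    ci_lower: float
    ci_upper: float
    certainty: str

    def __post_init__(self):
        # Basic checks that catch common transcription slips.
        if self.certainty not in CERTAINTY_LEVELS:
            raise ValueError(f"unknown certainty rating: {self.certainty!r}")
        if not (0 < self.ci_lower <= self.risk_ratio <= self.ci_upper):
            raise ValueError("expected 0 < ci_lower <= risk_ratio <= ci_upper")


def format_row(e):
    """Render one estimate as a single pipe-delimited text row."""
    effect = f"RR {e.risk_ratio:.2f} ({e.ci_lower:.2f} to {e.ci_upper:.2f})"
    return f"{e.treatment} vs {e.comparator} | {effect} | {e.certainty}"


def build_table(estimates):
    """Build a plain-text table with one header line and one line per estimate."""
    header = "Comparison | Relative effect (95% CI) | Certainty"
    return "\n".join([header] + [format_row(e) for e in estimates])


# Placeholder values for illustration only; not results from any review.
rows = [
    NetworkEstimate("Treatment A", "placebo", 0.80, 0.65, 0.98, "moderate"),
    NetworkEstimate("Treatment B", "placebo", 1.05, 0.90, 1.22, "low"),
]
table = build_table(rows)
print(table)
```

Because each row is built from a single validated record, an estimate, its confidence interval and its certainty rating cannot drift apart between tables, which is the kind of copy-paste error that manual preparation risks.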

While it is no longer a novel method of meta-analysis, NMA remains at the cutting edge of evidence synthesis for clinical guideline development. Efforts are ongoing to improve the ability of clinicians and decision-makers to fully and accurately use NMA findings. Guideline technical teams could explore the potential of novel visual approaches to augment evidence summaries and enhance interpretation.37–39

Conclusions

Guideline development involves difficult decisions about the best conceptual starting points and ultimate recommendations. NMA offers guideline developers many advantages over standard meta-analysis when multiple treatments are available. However, NMA may not resolve all difficulties, and can create distinctive challenges. These challenges are not insurmountable, and we have provided some solutions that are characterised by a stepwise approach to conceptualising, presenting and interpreting effectiveness evidence. Although this approach involved significant preparation by the technical team, it enabled guideline panel members to develop trustworthy recommendations. Developers of future clinical guidelines may build on this approach, potentially incorporating innovations such as novel depictions of NMA results or more advanced software aids. We anticipate evolving consensus in this area and look forward to further advances.

  • Handling editor: Seye Abimbola

  • Contributors: This article was conceived by OTO and JPV. MJW drafted and revised the paper, with supervision by JPV and incorporating feedback from OTO and IDG. JAR helped to edit the final version of the paper. JPV is guarantor. OTO, JPV, IDG and MJW developed the 2018 uterotonics EtD frameworks discussed. OTO, JPV, IDG, JAR and DC developed the 2022 tocolytics EtD framework discussed. All authors critically reviewed the manuscript and approved the final version. The corresponding author (MJW) attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

  • Funding: The authors received no specific funding to develop this article. Both the tocolytics and uterotonics recommendations were developed with financial support from USAID and the UNDP/UNFPA/UNICEF/WHO/World Bank Special Programme of Research, Development and Research Training in Human Reproduction (HRP), a co-sponsored programme executed by the World Health Organization. The donors did not participate in any decision related to the recommendation development process, including the composition of research questions, membership of the recommendation development groups, conducting and interpretation of systematic reviews, or formulation of the recommendations. The views of the funding bodies did not influence the content of the recommendations.

  • Competing interests: None declared.

  • Patient and public involvement: Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review: Not commissioned; externally peer reviewed.

  • Supplemental material: This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Data availability statement

Data sharing not applicable as no datasets were generated and/or analysed for this study.

Ethics statements

Patient consent for publication: Not applicable.

Ethics approval: Not applicable.

  1. Rouse B, Chaimani A, Li T, et al. Network meta-analysis: an introduction for clinicians. Intern Emerg Med 2017; 12:103–11.
  2. Dias S, Caldwell DM. Network meta-analysis explained. Arch Dis Child Fetal Neonatal Ed 2019; 104:F8–12.
  3. Salanti G, Ades AE, Ioannidis JPA, et al. Graphical methods and numerical summaries for presenting results from multiple-treatment meta-analysis: an overview and tutorial. J Clin Epidemiol 2011; 64:163–71.
  4. WHO recommendations: uterotonics for the prevention of postpartum haemorrhage. Geneva: World Health Organization 2018. [Accessed 16 Nov 2023]
  5. WHO recommendation on tocolytic therapy for improving preterm birth outcomes. Geneva: World Health Organization 2022. [Accessed 16 Nov 2023]
  6. Alonso-Coello P, Oxman AD, Moberg J, et al. GRADE evidence to decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 1: introduction. BMJ 2016; 353.
  7. Alonso-Coello P, Oxman AD, Moberg J, et al. GRADE evidence to decision (EtD) frameworks: a systematic and transparent approach to making well informed healthcare choices. 2: clinical practice guidelines. BMJ 2016; 353.
  8. World Health Organization. WHO handbook for guideline development. Geneva: World Health Organization 2014.
  9. Vogel JP, Dowswell T, Lewin S, et al. Developing and applying a “living guidelines” approach to WHO recommendations on maternal and perinatal health. BMJ Glob Health 2019; 4.
  10. Kanters S, Ford N, Druyts E, et al. Use of network meta-analysis in clinical guidelines. Bull World Health Organ 2016; 94:782–4.
  11. Gallos ID, Papadopoulou A, Man R, et al. Uterotonic agents for preventing postpartum haemorrhage: a network meta-analysis. Cochrane Database Syst Rev 2018; 12.
  12. Wilson A, Hodgetts-Morton VA, Marson EJ, et al. Tocolytics for delaying preterm birth: a network meta-analysis (0924). Cochrane Database Syst Rev 2022; 8.
  13. Mbuagbaw L, Rochwerg B, Jaeschke R, et al. Approaches to interpreting and choosing the best treatments in network meta-analyses. Syst Rev 2017; 6:79.
  14. WHO recommendations for the prevention and treatment of postpartum haemorrhage: evidence base. Geneva: World Health Organization 2012. [Accessed 21 Apr 2023]
  15. WHO recommendations on interventions to improve preterm birth outcomes: evidence base. Geneva: World Health Organization 2015. [Accessed 21 Apr 2023]
  16. Vogel JP, Williams M, Gallos I, et al. WHO recommendations on uterotonics for postpartum haemorrhage prevention: what works, and which one. BMJ Glob Health 2019; 4.
  17. Siemieniuk RA, Bartoszko JJ, Zeraatkar D, et al. Drug treatments for COVID-19: living systematic review and network meta-analysis. BMJ 2020; 370.
  18. Yepes-Nuñez JJ, Li S-A, Guyatt G, et al. Development of the summary of findings table for network meta-analysis. J Clin Epidemiol 2019; 115:1–13.
  19. Piggott T, Brozek J, Nowak A, et al. Using GRADE evidence to decision frameworks to choose from multiple interventions. J Clin Epidemiol 2021; 130:117–24.
  20. Brignardello-Petersen R, Florez ID, Izcovich A, et al. GRADE approach to drawing conclusions from a network meta-analysis using a minimally contextualised framework. BMJ 2020; 371.
  21. WHO recommendations: Uterotonics for the prevention of postpartum haemorrhage. Web annex 1: Oxytocin versus placebo or no treatment. Geneva: World Health Organization 2018. [Accessed 13 Dec 2022]
  22. WHO recommendations: Uterotonics for the prevention of postpartum haemorrhage. Web annex 2: Carbetocin versus placebo or no treatment. Geneva: World Health Organization 2018. [Accessed 13 Dec 2022]
  23. WHO recommendations: Uterotonics for the prevention of postpartum haemorrhage. Web annex 3: Misoprostol versus placebo or no treatment. Geneva: World Health Organization 2018. [Accessed 13 Dec 2022]
  24. WHO recommendations: Uterotonics for the prevention of postpartum haemorrhage. Web annex 4: Ergometrine / Methylergometrine versus placebo or no treatment. Geneva: World Health Organization 2018. [Accessed 13 Dec 2022]
  25. WHO recommendations: Uterotonics for the prevention of postpartum haemorrhage. Web annex 5: Oxytocin and Ergometrine versus placebo or no treatment. Geneva: World Health Organization 2018. [Accessed 13 Dec 2022]
  26. WHO recommendations: Uterotonics for the prevention of postpartum haemorrhage. Web annex 6: Injectable prostaglandins versus placebo or no treatment. Geneva: World Health Organization 2018. [Accessed 13 Dec 2023]
  27. WHO recommendations: Uterotonics for the prevention of postpartum haemorrhage. Web annex 7: Choice of uterotonic agents. Geneva: World Health Organization 2018. [Accessed 21 Apr 2023]
  28. GRADEpro GDT. GRADEpro Guideline Development Tool [Software]. McMaster University and Evidence Prime 2022.
  29. Puhan MA, Schünemann HJ, Murad MH, et al. A GRADE working group approach for rating the quality of treatment effect estimates from network meta-analysis. BMJ 2014; 349.
  30. Brignardello-Petersen R, Bonner A, Alexander PE, et al. Advances in the GRADE approach to rate the certainty in estimates from a network meta-analysis. J Clin Epidemiol 2018; 93:36–44.
  31. Izcovich A, Chu DK, Mustafa RA, et al. A guide and pragmatic considerations for applying GRADE to network meta-analysis. BMJ 2023; 381.
  32. WHO recommendation on tocolytic therapy for improving preterm birth outcomes. Web annex: Evidence-to-decision frameworks. Geneva: World Health Organization 2022. [Accessed 13 Dec 2022]
  33. Santesso N, Glenton C, Dahm P, et al. GRADE guidelines 26: informative statements to communicate the findings of systematic reviews of interventions. J Clin Epidemiol 2020; 119:126–35.
  34. Chu DK, Brignardello-Petersen R, Guyatt GH, et al. Method’s corner: Allergist’s guide to network meta-analysis. Pediatr Allergy Immunol 2022; 33.
  35. Brignardello-Petersen R, Izcovich A, Rochwerg B, et al. GRADE approach to drawing conclusions from a network meta-analysis using a partially contextualised framework. BMJ 2020; 371.
  36. Rehfuess EA, Stratil JM, Scheel IB, et al. The WHO-INTEGRATE evidence to decision framework version 1.0: integrating WHO norms and values and a complexity perspective. BMJ Glob Health 2019; 4.
  37. Seide SE, Jensen K, Kieser M, et al. Utilizing radar graphs in the visualization of simulation and estimation results in network meta-analysis. Res Synth Methods 2021; 12:96–105.
  38. Daly CH, Mbuagbaw L, Thabane L, et al. Spie charts for quantifying treatment effectiveness and safety in multiple outcome network meta-analysis: a proof-of-concept study. BMC Med Res Methodol 2020; 20:266.
  39. Ostinelli EG, Efthimiou O, Naci H, et al. Vitruvian plot: a visualisation tool for multiple outcomes in network meta-analysis. Evid Based Ment Health 2022; 25:e65–70.
  40. Guyatt G, Oxman AD, Akl EA, et al. GRADE guidelines: 1. Introduction - GRADE evidence profiles and summary of findings tables. J Clin Epidemiol 2011; 64:383–94.

  • Received: 12 June 2023
  • Accepted: 21 October 2023
  • First published: 6 December 2023