Recommendations for musculoskeletal ultrasonography by rheumatologists: Setting global standards for best practice by expert consensus
Abstract
Objective
To establish an expert consensus of best practice for rheumatologists performing musculoskeletal ultrasonography (MUS).
Methods
A panel of worldwide experts in MUS was identified by literature review, membership of teaching faculty, and peer recommendation. They were invited to take part in a 4-stage Delphi process employing 2 iterative rounds to establish a consensus of specific indications, anatomic areas, and knowledge and skills required by rheumatologists performing MUS.
Results
Experts in MUS were identified (n = 57; 37 radiologists, 20 rheumatologists). Successive rounds of this rigorous Delphi exercise enabled group consensus to be achieved in 30 of the proposed 37 categories comprising 8 of 13 indications (inflammatory arthritis, tendon pathology, effusion, bursitis, monitoring disease activity, monitoring disease progression, guided aspiration, and injection), 8 of 10 anatomic areas (hand, wrist, elbow, shoulder, hip, knee, ankle and heel, and forefoot), and 14 categories of knowledge and skills (physics, anatomy, pathology, equipment, clinical application and relevance, indications and limitations, artifact, machine function and operation, patient and probe position, planes and system of examination, image optimization, dynamic assessment, color Doppler, and power Doppler).
Conclusion
We have produced the first expert-derived, interdisciplinary consensus of recommendations for rheumatologists performing MUS. This represents a significant advance that will not only direct future rheumatology MUS practice, but will also facilitate informed educational development. It is an important step toward the introduction of a specific training curriculum and assessment process to ensure competent rheumatologist ultrasonographers.
INTRODUCTION
There is increasing evidence that patients benefit when rheumatologists perform musculoskeletal ultrasonography (MUS) (1-7). This imaging technique is a valuable complementary clinical tool that enables clinicians to improve the accuracy of their diagnoses and management decisions, which is essential for maintaining the highest standards of patient care (8). It is also of considerable utility to the researcher, both for improving understanding of the pathophysiology of rheumatic diseases and as an objective outcome measure for clinical research studies (9-11). These benefits have encouraged an increasing number of rheumatologists to purchase ultrasound machines and carry out MUS examinations as part of their routine practice (12), a trend with important educational implications, particularly with regard to training and competency assessment. The current situation is a challenge that needs to be addressed by the rheumatology and radiology communities (13, 14).
Although rheumatologic ultrasonography is an expanding area, the published information on training is limited (15, 16). A review of the training and assessment regimens of current MUS practitioners confirms a wide variety of approaches to training and minimal exposure to competency assessment (17). Courses in MUS are popular, but these aim to introduce concepts and aid understanding rather than provide formal training. Guidelines are available on image acquisition, equipment, and practice standards (18, 19), but there is little information, and no published agreement, on any other fundamental issue regarding education in this area. For example, there are no guidelines to direct a rheumatologist as to the knowledge and skills required to perform an adequate MUS assessment, or even as to the indications for which, or the anatomic areas in which, it is appropriate to perform such an examination. Similarly, there are no published training requirements, nor any established standards or outcomes that rheumatologists must achieve to be deemed proficient in this technique. In addition, there is no objective assessment process to ensure that competency standards are achieved and maintained, and there are no requirements for life-long learning, continuous assessment, or revalidation. Overall, there is currently insufficient basic information available with which to make recommendations for training, define suitable standards, or determine the nature of assessment. Answers to these basic questions are essential to advance this important area and to enable the development of a structured rheumatology MUS curriculum and a valid, criterion-referenced system of competency assessment. Appropriately trained clinicians are essential for competent patient diagnosis and management. Without this, patients will not receive the maximum benefit from this imaging modality, and their safety may be compromised by inappropriate examination or misdiagnosis at the hands of inadequately trained operators.
We have used consensus-defining methodology to obtain a wide range of information, insights, and opinions from expert practitioners in the field of MUS, with the aim of establishing an informed consensus of appropriate indications, anatomic areas, and knowledge and skills required by rheumatologists who perform MUS. This important information will enable us to develop and evaluate competency standards and devise educational outcomes, which will form the basis for the development of a specific training curriculum and assessment process for rheumatologists in MUS.
MATERIALS AND METHODS
Study design.
We employed a 4-stage consensus-defining approach utilizing the Delphi technique (20) and a panel of international experts in MUS (Figure 1). This process involved an initial stage of identification of areas considered to be important for rheumatologists; two iterative rounds of questionnaire completion, feedback, and reflection; and a final stage of analysis.

Figure 1. Method summary. MUS = musculoskeletal ultrasonography.
Delphi process.
The Delphi process is a research technique originally developed in the 1950s by the Rand Corporation; it takes its name from the Delphic oracle's skills of interpretation and foresight (20). It is a structured multistage process involving a series of iterative rounds that aims to combine expert opinion into a group consensus on a given subject. It can be usefully applied to areas of controversy, or areas in which published data are inadequate, and involves a series of questionnaires interspersed with controlled feedback (21-23). The Delphi approach was chosen because it represents an established method of obtaining the considered opinion of experts, determining levels of agreement, and evaluating the degree of consensus. It has been widely used in various areas of health care research, including the development of guidelines for best practice and the content of curricula, and in the definition of professional roles and clinical protocols (24, 25). This methodology has a number of specific advantages that were useful for our purposes: it allows a large number of experts to be sampled without geographic constraints; it is relatively inexpensive and straightforward to manage, with quantitative and qualitative data collected via a self-administered questionnaire; and anonymity between panelists avoids dominance by the opinions of certain individuals. This overcomes a potential problem of group dynamics that can occur with decision-making committees and encourages true opinion, with less pressure to conform to the collective viewpoint of the committee, profession, or establishment. Regular participation in the iterative rounds, with controlled feedback, allows individuals to change or modify their opinions and encourages panel members to become more involved and motivated to participate in the development of the project; this enhances ownership and improves the likelihood of acceptance of the findings. Potential disadvantages include a lack of data regarding the minimum sample size, difficulty in ensuring representative sampling of experts, no formal definition of what actually constitutes a consensus, potentially reduced response rates in later rounds, and a possible lack of accountability because of anonymity within the group.
Identification of the expert panel.
The first stage involved selection of an appropriate panel of experts in MUS. Experts were identified in three ways: by a Medline literature review to identify authors of relevant peer-reviewed publications; by identifying members of the teaching faculty of established MUS training courses; and by consulting committee members from the European League Against Rheumatism and British Society of Skeletal Radiologists MUS working groups and the Musculoskeletal Ultrasound Society to identify individuals regarded as experts by their peers (17). This group of informed, experienced professionals, each with a track record of ongoing practice, research, and teaching, consisted of international experts in the field selected as a heterogeneous but representative sample of specialist practitioners in MUS.
Questionnaire design.
The construction of an appropriate data-collection instrument involved a number of steps. A Medline database literature review was conducted to identify areas in which MUS had been utilized and that were relevant to rheumatologic practice. Data from a preceding study, which identified the current MUS practice of the same group of experts, were reviewed (17). In addition, local experts were interviewed and asked to list areas that they perceived to be important. This information was collated and analyzed to produce a 4-page questionnaire that was divided into 4 sections: 1) indications for which a rheumatologist could perform a MUS examination (13 categories); 2) anatomic areas in which it would be appropriate for rheumatologists to undertake MUS (10 categories); 3) the knowledge and skills that a rheumatologist should acquire as part of their training in MUS (14 categories); and 4) a blank section in which the experts were asked to detail any areas not listed that they thought were important, together with any other opinions or comments. Panelists were asked to rate their level of agreement or disagreement for each category according to a Likert scale (see below).
A small local pilot study was then undertaken to assess the function of the questionnaire, to exclude any difficulties with comprehension or wording, to review its layout, and to assess the feasibility of administration. This resulted in minor modifications and completed stage 1 of the Delphi process.
The finalized questionnaire was then distributed to our entire expert panel by postal mail as part of stage 2 of the Delphi process. After 4, 7, and 10 weeks, written, e-mail, and personal telephone reminders were made to the nonresponders.
Stage 3 involved summarizing the ratings given by the respondents in the previous round and including these in a further version of the stage-1 questionnaire. Following each question, the percentage distribution of the group responses was listed together with a reminder of the respondent's own previous score. This feedback information was presented to each expert who was asked to study the group replies and indicate whether their individual opinion remained unchanged or had been modified in light of the responses made by the other members of the panel. This questionnaire was recirculated to each panelist by electronic or postal mail. Subsequent written, e-mail, and personal telephone reminders were made to the nonresponders after 4, 7, 10, and 14 weeks. At each round, a letter explaining the purpose of the exercise accompanied the questionnaire. Two such iterative phases were planned, but if opinion remained disparate and our consensus criteria were not satisfied, an additional round would be conducted. The results were analyzed as part of stage 4 for levels of agreement and degree of consensus.
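To make the feedback step concrete, the sketch below compiles the group's percentage distribution of Likert ratings for one category and pairs it with each panelist's own previous score. It is a minimal illustration only: the data layout and panelist identifiers are hypothetical, and the study itself administered this feedback on paper and by e-mail rather than in software.

```python
from collections import Counter

def feedback_for_category(round1_scores):
    """round1_scores: dict mapping panelist id -> Likert rating (1-5)."""
    counts = Counter(round1_scores.values())
    n = len(round1_scores)
    # Percentage of the panel giving each rating, as listed after each question.
    distribution = {rating: round(100 * counts.get(rating, 0) / n)
                    for rating in range(1, 6)}
    # Each panelist receives the anonymized group distribution together with
    # a reminder of their own round-1 score before re-rating the category.
    return {panelist: {"group_distribution_pct": distribution,
                       "own_previous_score": score}
            for panelist, score in round1_scores.items()}

print(feedback_for_category({"expert_01": 5, "expert_02": 4, "expert_03": 2}))
```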
Statistical advice was taken at each stage of the questionnaire design and was provided by a statistician specializing in educational research at the Medical Education Unit, University of Leeds.
Analysis.
We aimed to establish whether experts in MUS agreed with the issues under consideration and to evaluate levels of agreement between experts, with the aim of developing a universally applicable consensus of opinion regarding appropriate practice. Agreement was assessed on 2 levels: 1) does the expert respondent agree with the issue under consideration? and 2) does each expert respondent agree with the opinion of other experts on a given issue (the consensus element)? A 1–5 Likert scale was used to assess opinion in response to the statements contained in the questionnaire, in which 1 indicated “definitely no/definitely not important,” 2 indicated “probably no/probably not important,” 3 indicated “no opinion,” 4 indicated “probably yes/probably important,” and 5 indicated “definitely yes/essential.” Results were expressed as the cumulative percentage of respondents scoring an item either 4 (probably yes/probably important) or 5 (definitely yes/essential) (total cumulative agreement). Group agreement with the issue under consideration was defined as total cumulative agreement ≥70% after the second Delphi round. The net change in agreement between Delphi rounds was used as a measure of the level of agreement between the panel members (net total cumulative agreement). Group consensus was defined as a net change of less than ±10%. If both of these parameters were satisfied, group consensus agreement was established and the category was defined as appropriate. The results are presented as the total cumulative agreement after each Delphi round and the net total cumulative agreement; they have also been broken down by specialty background. Data evaluation and statistical analysis were carried out by the authors using SPSS version 10 (SPSS, Chicago, IL) under the direction of a statistician specializing in medical education research. Nonparametric statistical tests were used to assess levels of significance (independent samples were compared using the Mann-Whitney test; related samples were compared using the Wilcoxon signed rank test).
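As a concrete illustration of these definitions, the following sketch computes total cumulative agreement and applies both criteria (group agreement ≥70% after round 2; group consensus, net change within ±10%) to hypothetical Likert ratings. The raw scores are invented for illustration; the study reports only aggregated percentages and performed its analysis in SPSS rather than Python.

```python
def total_cumulative_agreement(scores):
    """Percentage of respondents rating an item 4 ('probably yes/probably
    important') or 5 ('definitely yes/essential')."""
    return 100.0 * sum(1 for s in scores if s >= 4) / len(scores)

def classify(round1_scores, round2_scores,
             agreement_threshold=70.0, consensus_limit=10.0):
    """Group agreement: total cumulative agreement >= 70% after round 2.
    Group consensus: net change between rounds within +/-10%.
    A category is defined as appropriate only if both are satisfied."""
    tca1 = total_cumulative_agreement(round1_scores)
    tca2 = total_cumulative_agreement(round2_scores)
    net = tca2 - tca1
    return {"delphi_1_tca": round(tca1),
            "delphi_2_tca": round(tca2),
            "net_change": round(net),
            "appropriate": tca2 >= agreement_threshold
                           and abs(net) < consensus_limit}

# Illustrative ratings for 38 respondents in each round (invented values).
round1 = [5] * 24 + [4] * 5 + [3] * 4 + [2] * 3 + [1] * 2   # 29/38 = 76%
round2 = [5] * 26 + [4] * 5 + [3] * 3 + [2] * 3 + [1] * 1   # 31/38 = 82%
print(classify(round1, round2))
# Between-round and between-specialty comparisons such as those reported in
# this study could be made with scipy.stats.wilcoxon and scipy.stats.mannwhitneyu.
```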
RESULTS
Composition of experts.
Fifty-seven international experts in MUS were identified who satisfied our selection criteria (Table 1). Eleven were based in the United Kingdom, 28 in mainland Europe (Austria, Belgium, Denmark, Finland, France, Germany, The Netherlands, Italy, Spain, and Switzerland), 13 in North America, and 5 elsewhere (Australia, Asia, South America). This group was composed of 20 rheumatologists and 37 radiologists. Their overall median duration of MUS practice was 8–9 years (interquartile range [IQR] 4–19 years) and individuals performed a median of 4–5 MUS sessions per week (IQR 2–7) on a median of 21–25 patients (IQR 11–35) (17).
Table 1. Specialty and location of the expert panel (n = 57)

| Location | Rheumatologists | Radiologists |
|---|---|---|
| UK | 6 | 5 |
| Mainland Europe | 14 | 14 |
| North America | 0 | 13 |
| Other (Asia, Australia, South America) | 0 | 5 |
| Total | 20 | 37 |
Response rate.
The overall response rate from the first Delphi round was 70% (40 of 57). This included a 90% response (18 of 20) from rheumatologists and a 59% response (22 of 37) from radiologists. One reply was anonymous and 1 was incomplete; these were not used. The overall response rate from the second Delphi round was 95% (36 of 38). The relative proportion of these respondents per specialty was 100% (18 of 18) for rheumatologists and 90% (18 of 20) for radiologists. Nonresponders to the first questionnaire were not included in the second round.
Group consensus agreement of appropriate indications for a rheumatologist to perform MUS.
After the second Delphi exercise, the total cumulative agreement scores for monitoring disease activity (89%), monitoring disease progression (87%), bursitis (84%), effusion (84%), inflammatory arthritis (81%), guided aspiration (81%), and guided injection (81%) were all easily above the threshold value of 70%, signifying group agreement (Table 2). The net level of agreement for almost all these indications increased between rounds, implying a positive feeling within the group that these indications were indeed appropriate. The greatest change was +8% for monitoring disease progression, and consequently all categories remained comfortably within our ±10% criteria for consensus agreement. Tendon pathology showed less total cumulative agreement, with a negative net change from 78% to 75% between rounds, but this still fulfilled our criteria to be included as an appropriate indication. Degenerative arthritis (46%), soft tissue mass (46%), nerve lesions (51%), ligament injury (51%), and muscle injury (57%) all scored well below our 70% round-2 limit, signifying insufficient group agreement. The corresponding net total cumulative agreement scores were all negative, implying increasing agreement that these were inappropriate indications for a rheumatologist to perform a MUS examination. Considering the overall change of all scores between rounds, the only indications where there was a statistically significant difference in responses were nerve lesions (P = 0.01), ligament injury (P = 0.02), and monitoring disease progression (P = 0.02). Therefore, appropriate indications for a rheumatologist to perform a MUS examination as defined by expert group consensus are inflammatory arthritis, tendon pathology, joint or soft tissue effusion, bursitis, monitoring disease activity, monitoring disease progression, guided aspiration, and guided injection.
Table 2. Group consensus agreement of appropriate indications for a rheumatologist to perform MUS

| Indications | Delphi 1 total cumulative agreement, % | Delphi 2 total cumulative agreement, % | Net total cumulative agreement, % |
|---|---|---|---|
| Inflammatory arthritis | | | |
| Total | 76 | 81 | +5 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 55 | 63 | +8 |
| Tendon pathology | | | |
| Total | 78 | 75 | −3 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 56 | 50 | −6 |
| Degenerative arthritis* | | | |
| Total | 47 | 46 | −1 |
| Rheumatologist | 78 | 67 | −11 |
| Radiologist | 20 | 26 | +6 |
| Effusion (joint/soft tissue) | | | |
| Total | 82 | 84 | +2 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 65 | 68 | +3 |
| Bursitis | | | |
| Total | 82 | 84 | +2 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 65 | 68 | +3 |
| Muscle injury* | | | |
| Total | 58 | 57 | −1 |
| Rheumatologist | 94 | 90 | −4 |
| Radiologist | 25 | 26 | +1 |
| Ligament injury* | | | |
| Total | 58 | 51 | −7 |
| Rheumatologist | 94 | 83 | −11 |
| Radiologist | 25 | 21 | −4 |
| Soft tissue mass* | | | |
| Total | 47 | 46 | −1 |
| Rheumatologist | 72 | 72 | 0 |
| Radiologist | 25 | 21 | −4 |
| Nerve lesions* | | | |
| Total | 55 | 51 | −4 |
| Rheumatologist | 99 | 89 | −10 |
| Radiologist | 25 | 16 | −9 |
| Monitoring disease activity | | | |
| Total | 82 | 89 | +7 |
| Rheumatologist | 94 | 100 | +6 |
| Radiologist | 70 | 79 | +9 |
| Monitoring disease progression | | | |
| Total | 79 | 87 | +8 |
| Rheumatologist | 94 | 100 | +6 |
| Radiologist | 65 | 74 | +9 |
| Guided aspiration | | | |
| Total | 82 | 81 | −1 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 65 | 63 | −2 |
| Guided injection | | | |
| Total | 82 | 81 | −1 |
| Rheumatologist | 95 | 100 | +5 |
| Radiologist | 70 | 68 | −2 |

* Categories defined as inappropriate for rheumatologists.
Group consensus agreement of appropriate anatomic areas for a rheumatologist to perform MUS.
Group agreement was established for the hand (81%), wrist (81%), elbow (77%), hip (75%), knee (75%), ankle and heel (75%), forefoot (75%), and shoulder (72%) as appropriate regions for rheumatologists to perform MUS (Table 3). Net agreement was within our defined limits between rounds for all these sites, with the greatest change being +6% in the hand and wrist. Hence, our criteria for group consensus agreement were satisfied in each of these anatomic areas. Soft tissue (56%) and groin (56%) both scored well below our 70% threshold for group agreement, and even though net total cumulative agreement showed little change between rounds, these areas were consequently considered inappropriate for a rheumatologist to perform a MUS examination. The overall change in expert responses between rounds did not reach statistical significance in any anatomic area. Therefore, the appropriate anatomic areas for a rheumatologist to perform a MUS examination, as defined by expert group consensus, are the hand, wrist, elbow, shoulder, hip, knee, ankle and heel, and forefoot.
Table 3. Group consensus agreement of appropriate anatomic areas for a rheumatologist to perform MUS

| Anatomic areas | Delphi 1 total cumulative agreement, % | Delphi 2 total cumulative agreement, % | Net total cumulative agreement, % |
|---|---|---|---|
| Hand | | | |
| Total | 75 | 81 | +6 |
| Rheumatologist | 95 | 100 | +5 |
| Radiologist | 56 | 61 | +5 |
| Wrist | | | |
| Total | 75 | 81 | +6 |
| Rheumatologist | 94 | 100 | +6 |
| Radiologist | 56 | 61 | +5 |
| Elbow | | | |
| Total | 75 | 77 | +2 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 50 | 56 | +6 |
| Shoulder | | | |
| Total | 70 | 72 | +2 |
| Rheumatologist | 94 | 95 | +1 |
| Radiologist | 45 | 50 | +5 |
| Hip | | | |
| Total | 75 | 75 | 0 |
| Rheumatologist | 94 | 94 | 0 |
| Radiologist | 56 | 56 | 0 |
| Groin* | | | |
| Total | 53 | 56 | +3 |
| Rheumatologist | 83 | 78 | −5 |
| Radiologist | 22 | 33 | +11 |
| Knee | | | |
| Total | 75 | 75 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 50 | 50 | 0 |
| Ankle and heel | | | |
| Total | 75 | 75 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 50 | 50 | 0 |
| Forefoot | | | |
| Total | 75 | 75 | 0 |
| Rheumatologist | 95 | 100 | +5 |
| Radiologist | 56 | 50 | −6 |
| Soft tissue* | | | |
| Total | 53 | 56 | +3 |
| Rheumatologist | 78 | 78 | 0 |
| Radiologist | 28 | 33 | +5 |

* Categories defined as inappropriate for rheumatologists.
Group consensus agreement of appropriate knowledge and skills that a rheumatologist requires to perform MUS.
All knowledge and skill categories satisfied our criteria for group agreement, with total cumulative agreement scores ranging from 86% to 100% (Table 4). Net agreement was either positive, with a maximum score of +6% for color Doppler, or did not change between rounds. There were no negative net agreement scores. Consensus agreement was consequently established for all 14 knowledge and skill categories. Statistically significant differences in the overall scoring between rounds were seen in the categories of color Doppler (P = 0.01), machine function and operation (P = 0.046), and image optimization (P = 0.046). Therefore, the appropriate knowledge and skills that a rheumatologist requires to perform a MUS examination, as defined by expert consensus agreement, are ultrasound physics, anatomy, pathology, equipment, clinical application and relevance, indications and limitations, artifact, machine function and operation, patient and probe position, planes and system of examination, image optimization, dynamic assessment, color Doppler, and power Doppler (Table 5).
Table 4. Group consensus agreement of appropriate knowledge and skills that a rheumatologist requires to perform MUS*

| Knowledge and skills | Delphi 1 total cumulative agreement, % | Delphi 2 total cumulative agreement, % | Net total cumulative agreement, % |
|---|---|---|---|
| MUS physics | | | |
| Total | 86 | 86 | 0 |
| Rheumatologist | 78 | 83 | +5 |
| Radiologist | 95 | 89 | −6 |
| Anatomy | | | |
| Total | 100 | 100 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 100 | 100 | 0 |
| Pathology | | | |
| Total | 100 | 100 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 100 | 100 | 0 |
| MUS equipment | | | |
| Total | 92 | 94 | +2 |
| Rheumatologist | 95 | 94 | −1 |
| Radiologist | 89 | 95 | +6 |
| Clinical application/relevance | | | |
| Total | 100 | 100 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 100 | 100 | 0 |
| Indications/limitations of MUS | | | |
| Total | 100 | 100 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 100 | 100 | 0 |
| MUS artifact | | | |
| Total | 100 | 100 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 100 | 100 | 0 |
| Machine function/operation | | | |
| Total | 94 | 97 | +3 |
| Rheumatologist | 85 | 100 | +15 |
| Radiologist | 94 | 95 | +1 |
| Patient/probe position | | | |
| Total | 100 | 100 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 100 | 100 | 0 |
| Planes and system of examination | | | |
| Total | 97 | 97 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 95 | 100 | +5 |
| Image optimization | | | |
| Total | 97 | 97 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 95 | 95 | 0 |
| Dynamic assessment | | | |
| Total | 97 | 97 | 0 |
| Rheumatologist | 100 | 100 | 0 |
| Radiologist | 95 | 95 | 0 |
| Color Doppler | | | |
| Total | 86 | 92 | +6 |
| Rheumatologist | 94 | 100 | +6 |
| Radiologist | 79 | 83 | +4 |
| Power Doppler | | | |
| Total | 86 | 89 | +3 |
| Rheumatologist | 90 | 94 | +4 |
| Radiologist | 84 | 83 | −1 |

* MUS = musculoskeletal ultrasonography.
Table 5. Categories defined as appropriate for rheumatologists performing MUS*

| Indications | Anatomic areas | Knowledge and skills |
|---|---|---|
| Inflammatory arthritis | Hand | MUS physics |
| Tendon pathology | Wrist | Anatomy |
| Effusion (joint/soft tissue) | Elbow | Pathology |
| Bursitis | Shoulder | MUS equipment |
| Monitoring disease activity | Hip | Clinical application/relevance |
| Monitoring disease progression | Knee | Indications/limitations of MUS |
| Guided aspiration (joint/soft tissue fluid collection) | Ankle and heel | MUS artifact |
| Guided injection | Forefoot | Machine function/operation |
| | | Patient/probe position |
| | | Planes and system of examination |
| | | Image optimization |
| | | Dynamic assessment |
| | | Color Doppler |
| | | Power Doppler |

* MUS = musculoskeletal ultrasonography.
Differences between specialty background.
When we divided responses by specialty background (i.e., rheumatologist or radiologist), some interesting differences were observed. Total cumulative agreement scores for radiologists were consistently lower than those of rheumatologists for all indications and anatomic areas and most knowledge and skill categories, although the differences were smallest in this latter section (radiologists: mean 67%, range 16–100%; rheumatologists: mean 95%, range 67–100%). Indeed, statistically significant differences were demonstrated between the responses of the 2 specialty groups for all indication and anatomic area categories after both Delphi rounds, except for “monitoring disease activity” in round 1 (P = 0.06). No statistically significant differences were seen between responses for any of the knowledge and skill categories after either round. In the categories where consensus agreement was satisfied, the mean overall difference in total cumulative agreement scores between specialties after round 2 was 22% (range 0–50%), compared with 55% (range 40–73%) in the categories defined as inappropriate for rheumatologists. The mean cumulative agreement scores for rheumatologists and radiologists were 99% (range 83–100%) versus 77% (range 50–100%) for the accepted categories and 79% (range 67–89%) versus 25% (range 16–33%) for the rejected categories. Rheumatologists scored 100% total cumulative agreement for 25 of the 30 appropriate categories, the exceptions being shoulder (95%), hip (94%), equipment (94%), power Doppler (94%), and physics (83%). However, they scored markedly lower for all 7 inappropriate categories: degenerative arthritis (67%), soft tissue mass (72%), groin (78%), soft tissue (78%), ligament injury (83%), nerve lesions (89%), and muscle injury (90%). When rating indications and anatomic areas, rheumatologists were more likely to score a category as “definitely important/essential” rather than “probably important,” whereas radiologists were much more likely to score a category as “probably important” rather than “definitely important/essential.” Little difference was noted in the knowledge and skill categories.
Considering change of opinion between Delphi rounds, the mean net total cumulative agreement was greater among radiologists (mean +1.5%, range −9 to +11) than among rheumatologists (mean +0.6%, range −11 to +15). These changes were statistically significant only among the rheumatologists' scores, in the indication category of nerve lesions (P = 0.046) and the knowledge and skill category of color Doppler (P = 0.03). None of the responses made by the radiologists changed to a statistically significant degree between Delphi rounds 1 and 2. In the categories where consensus agreement was satisfied, the mean overall net change in cumulative agreement score was +1.7%, whereas in the categories defined as inappropriate for rheumatologists it was −1.8%. When divided by specialty, the mean net total cumulative agreement score in the accepted categories was +2.2% among rheumatologists compared with +1.6% for radiologists. In the rejected categories, the mean net change was −6.2% among rheumatologists and +1.0% for radiologists.
DISCUSSION
There are currently insufficient published data to make recommendations for the training and assessment of rheumatologists performing MUS or to guide a rheumatologist as to what is appropriate MUS practice. This is the first study to provide information in a number of these fundamental areas; the results will facilitate much-needed informed educational development in this field and will direct future rheumatology MUS practice. We have produced expert-derived consensus guidelines of appropriate indications, anatomic areas, and knowledge and skills required by rheumatologists who perform MUS. This is a substantial step forward in this important and developing area of rheumatologic imaging.
We identified 57 international experts in MUS comprising 20 rheumatologists and 37 radiologists, all of whom have a track record of teaching, research, and active MUS practice over a number of years. The response rate was particularly good among the rheumatologists and there was excellent retention of all respondents between rounds, implying a high level of motivation and interest among expert practitioners.
Extensive preliminary research enabled us to develop a focused questionnaire containing 37 categories of possible areas of importance for rheumatologists undertaking MUS, divided into 4 sections comprising indications, anatomic areas, knowledge and skills, and free text. Applying our strict criteria of group and consensus agreement, we classified 30 categories as appropriate for rheumatologists and 7 as inappropriate (Tables 5 and 6).
Table 6. Categories defined as inappropriate for rheumatologists performing MUS

| Indications | Anatomic areas | Knowledge and skills |
|---|---|---|
| Degenerative arthritis | Groin | |
| Muscle injury | Soft tissue | |
| Ligament injury | | |
| Soft tissue mass | | |
| Nerve lesions | | |
A number of interesting observations can be made from these data. Two distinct groups of conditions were identified within the list of possible indications. The first group had total cumulative agreement scores ranging from 81% to 89%, easily above our threshold value of 70% signifying group agreement. The net change in cumulative agreement between rounds for all of these categories satisfied our criteria for group consensus, with the scores generally becoming more positive, indicating increasing agreement within the expert group that these indications were indeed appropriate. The second group of indications all had cumulative agreement scores ranging from 46% to 57%, well below the cutoff value of 70% signifying group agreement. Although the net change in cumulative agreement was relatively small and satisfied our criteria for consensus agreement, the change was in a negative direction, implying that the group became more definite in its opinion that these categories should indeed be rejected. The inappropriate indications for rheumatologist MUS comprise degenerative arthritis, muscle and ligament injury, soft tissue masses, and nerve lesions. In our opinion, although MUS may be an appropriate first-line investigation for some of these indications (though one could argue that, at present, a radiograph may be the more appropriate investigation for degenerative arthritis), it is likely that correlation of MUS findings with those of other imaging modalities, e.g., magnetic resonance imaging, will be required. This additional imaging would be expected to be interpreted by a radiologist, suggesting that radiologists may be the more appropriate specialists to undertake any initial MUS examination for these indications.
The category of tendon pathology showed a slightly different pattern of scores. It did satisfy our criteria for inclusion as an appropriate indication, with a total cumulative agreement score of 75%, although this value was lower than those of the other accepted indications, and the net change between rounds of −3% suggests some uncertainty within the group. This observation may be linked to the scores for the shoulder category in the anatomic areas section, which reached 72% total cumulative agreement, just above the limit for group agreement, with a net change of +2%. These 2 categories showed among the largest differences in total cumulative agreement scores between radiologists and rheumatologists (50% versus 100% for tendon pathology; 50% versus 95% for the shoulder). One of the most common indications for MUS in traditional orthopedic radiology practice is examination of the rotator cuff tendons of the shoulder. The shoulder is a recognized controversial area in MUS because it is one of the most difficult areas to scan proficiently, with possibly the steepest learning curve. For this reason, and because of the possible requirement for correlation of MUS findings with those of other imaging modalities, some believe that it is inappropriate for anyone other than a radiologist to perform a MUS examination of the shoulder. Although one may speculate about a possible association between the tendon pathology and shoulder categories, further work is ongoing to formally establish any relationship between the indication and anatomy categories and to determine linkage with pathology. This will enable us to determine, for example, the anatomic areas in which tendon examination is appropriate and which pathologic processes should be identified by rheumatologists using MUS.
In the anatomic areas section, agreement was greatest for the hand and wrist categories, with total cumulative agreement scores of 81% and net changes of +6%, implying increasingly positive consensus in these areas. The issue of the shoulder has been dealt with above, but it is perhaps a little surprising that the other anatomic areas, although satisfying the criteria for group consensus agreement, did not score higher. The relatively low scores given by the radiologists seem to account for this, with total cumulative agreement of only 50% in the knee, ankle and heel, and forefoot, with no change between rounds in the former categories and a reduction in net agreement in the latter. This is in stark contrast to the opinions given by the rheumatologists, who scored 100% total cumulative agreement in all of these categories.
Group consensus was established in all categories of knowledge and skills, with all criteria comfortably satisfied. All net total cumulative agreement scores remained unchanged or increased, implying positive consensus agreement. No statistically significant differences were seen in the responses given by each specialty group for any of the knowledge and skill categories, and levels of total and specialty cumulative agreement were high (83–100%). This implies that our experts were virtually unanimous in their opinions regarding these attributes, and so group consensus in this section was readily established.
Overall, there was relatively little change in individual or total cumulative agreement scores between Delphi rounds, despite experts being presented with the group results and offered time to reflect on and change their original answers. This implies that the experts were confident in their own opinions, which may reflect the fact that they are experienced practitioners who have developed firm views that are unlikely to be changed by the collective opinion of the group. The small overall change in scores between rounds implies stable and reliable individual and group opinion and corroborates the accuracy of these data. Regardless of the total cumulative agreement score, all categories fulfilled our criteria for group consensus because the net total cumulative agreement scores were within the defined limits of ±10%. This implies that the Delphi process succeeded in obtaining a reliable group consensus after 2 rounds and that further questioning was unnecessary.
The differences in scores between the 2 specialties of rheumatology and radiology are interesting. The radiologists' total cumulative agreement scores are consistently below those of the rheumatologists, with statistically significant differences in all indication and anatomic area categories. This may reflect both the enthusiasm of rheumatologists to perform MUS examinations for all indications and in all anatomic areas and a more cautious approach by radiologists, exercising an initial degree of control over how much MUS is appropriate for a rheumatologist to undertake. However, even though the radiologists' scores are lower, the trend and relative change in their results are similar to those of the rheumatologists. For example, among the 7 categories identified as inappropriate, the mean total cumulative agreement is much lower than in the accepted categories (rheumatologists 79% [range 67–89%] versus 99% [range 83–100%]; radiologists 25% [range 16–33%] versus 77% [range 50–100%]). In addition, in the categories where consensus agreement was satisfied, the mean overall net change in cumulative agreement score was +1.7%, implying positive consensus, whereas in the categories defined as inappropriate for rheumatologists it was −1.8%, implying a more negative consensus. When divided by specialty, the trend is similar, with a mean net total cumulative agreement score in the accepted categories of +2.2% among rheumatologists compared with +1.6% for radiologists; in the rejected categories, the mean net change was −6.2% among rheumatologists and +1.0% for radiologists. These data therefore suggest relative agreement between the 2 specialties.
The Delphi consensus-defining methodology was chosen because it represents a recognized method of obtaining considered opinions from knowledgeable, informed professionals and is particularly suited to providing insights into areas in which there are currently limited published data, such as rheumatologist-performed MUS. It proved to be an effective technique that allowed us to determine expert consensus agreement relating to future rheumatologist MUS practice. Explicit criteria were applied to the selection of experts to ensure that they were representative of the wider MUS specialist community. Likewise, strict definitions of group agreement and consensus were adopted. A wide variety of resources was used to construct the initial questionnaire to eliminate any potential bias from the authors. The 2 iterative phases allowed the experts to interact with the questionnaire, reflect on their initial judgments, gather any required information, and alter their responses based on feedback from their peers. Opportunities for repeated consideration also provided insight into the degree of cooperation and acceptance among panelists regarding the role of rheumatologists in MUS. The excellent retention of respondents between rounds implies a high level of motivation and ownership among our expert panelists, which increases the likelihood of acceptance, dissemination, and implementation of our findings. This rigorous approach was necessary to maximize the validity of the process and to ensure its relevance, credibility, applicability, and transferability to rheumatology MUS practice. Repeat testing of this study's findings against observed practice in the future will help to further reinforce their validity and reliability. Although there are potential disadvantages to the Delphi method, the alternative would have been an anecdotal or subjective approach, which clearly would have been far less satisfactory.
We have obtained the first interdisciplinary consensus agreement among expert practitioners on recommendations for best practice for rheumatologists performing MUS. This important information will not only direct future rheumatology MUS practice and research, but will also facilitate informed educational development in this rapidly evolving field. These data will be used to develop precise learning outcomes and competency standards that will enable the introduction of a specific training curriculum and assessment process to ensure competent rheumatologist ultrasonographers.
Acknowledgements
We would like to acknowledge the contribution of our panel of experts for their assistance with this project. We would also like to thank Godfrey Pell for his statistical advice.