NotesFAQContact Us
Collection
Advanced
Search Tips
Back to results
Peer reviewed Peer reviewed
Direct linkDirect link
ERIC Number: EJ1447297
Record Type: Journal
Publication Date: 2024-Nov
Pages: 13
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1759-2879
EISSN: EISSN-1759-2887
Zero- and Few-Shot Prompting of Generative Large Language Models Provides Weak Assessment of Risk of Bias in Clinical Trials
Simon Šuster; Timothy Baldwin; Karin Verspoor
Research Synthesis Methods, v15 n6 p988-1000 2024
Existing systems for automating the assessment of risk-of-bias (RoB) in medical studies are supervised approaches that require substantial training data to work well. However, recent revisions to RoB guidelines have resulted in a scarcity of available training data. In this study, we investigate the effectiveness of generative large language models (LLMs) for assessing RoB. Their application requires little or no training data and, if successful, could serve as a valuable tool to assist human experts during the construction of systematic reviews. Following Cochrane's latest guidelines (RoB2) designed for human reviewers, we prepare instructions that are fed as input to LLMs, which then infer the risk associated with a trial publication. We distinguish between two modelling tasks: directly predicting RoB2 from text; and employing decomposition, in which a RoB2 decision is made after the LLM responds to a series of signalling questions. We curate new testing data sets and evaluate the performance of four general- and medical-domain LLMs. The results fall short of expectations, with LLMs seldom surpassing trivial baselines. On the direct RoB2 prediction test set (n = 5993), LLMs perform akin to the baselines (F1: 0.1-0.2). In the decomposition task setup (n = 28,150), similar F1 scores are observed. Our additional comparative evaluation on RoB1 data also reveals results substantially below those of a supervised system. This testifies to the difficulty of solving this task based on (complex) instructions alone. Using LLMs as an assisting technology for assessing RoB2 thus currently seems beyond their reach.
Wiley. Available from: John Wiley & Sons, Inc. 111 River Street, Hoboken, NJ 07030. Tel: 800-835-6770; e-mail: cs-journals@wiley.com; Web site: https://bibliotheek.ehb.be:2191/en-us
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A