Machine learning for screening prioritization in systematic reviews: comparative performance of Abstrackr and EPPI-Reviewer

Cited by: 58
Authors
Tsou, Amy Y. [1 ]
Treadwell, Jonathan R. [1 ]
Erinoff, Eileen [1 ]
Schoelles, Karen [1 ]
Affiliations
[1] ECRI Inst, Evidence Based Practice Ctr, Ctr Clin Excellence & Guidelines, Plymouth Meeting, PA 19462 USA
Funding
Agency for Healthcare Research and Quality (US)
Keywords
Machine learning; Citation screening; Text-mining; Abstrackr; EPPI-Reviewer; Screening prioritization; Methodology; Screening burden; Efficiency
DOI
10.1186/s13643-020-01324-7
CLC number
R5 [Internal Medicine]
Subject classification codes
1002; 100201
Abstract
Background
Improving the speed of systematic review (SR) development is key to supporting evidence-based medicine. Machine learning tools that semi-automate citation screening might improve efficiency. Few studies have assessed the use of screening prioritization functionality or compared two tools head to head. In this project, we compared the performance of two machine learning tools for potential use in citation screening.
Methods
Using 9 evidence reports previously completed by the ECRI Institute Evidence-based Practice Center team, we compared the performance of Abstrackr and EPPI-Reviewer, two off-the-shelf citation screening tools, in identifying relevant citations. Screening prioritization functionality was tested for 3 large reports and 6 small reports on a range of clinical topics. Large report topics were imaging for pancreatic cancer, indoor allergen reduction, and inguinal hernia repair. We trained Abstrackr and EPPI-Reviewer and screened all citations in 10% increments. In Task 1, we entered whether an abstract was ordered for full-text screening; in Task 2, we entered whether an abstract was included in the final report. For both tasks, screening continued until all studies ordered and included for the actual reports were identified. We assessed the potential reduction in hypothetical screening burden (the proportion of citations screened to identify all included studies) offered by each tool for all 9 reports.
Results
For the 3 large reports, both EPPI-Reviewer and Abstrackr performed well, with potential reductions in screening burden of 4 to 49% (Abstrackr) and 9 to 60% (EPPI-Reviewer). Both tools performed markedly worse on 1 large report (inguinal hernia), possibly because of its heterogeneous key questions. Based on McNemar's test for paired proportions in the 3 large reports, EPPI-Reviewer outperformed Abstrackr in identifying articles ordered for full-text review, but Abstrackr performed better in 2 of 3 reports in identifying articles included in the final report. For the small reports, both tools provided benefits, but EPPI-Reviewer generally outperformed Abstrackr in both tasks, although these results were often not statistically significant.
Conclusions
Abstrackr and EPPI-Reviewer performed well, but prioritization accuracy varied greatly across reports. Our work suggests screening prioritization functionality is a promising modality, offering efficiency gains without giving up human involvement in the screening process.
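To make the two quantities in the abstract concrete, here is a minimal Python sketch of the screening-burden metric (fraction of a prioritized citation list that must be screened before every relevant study is found) and a McNemar chi-square test for paired proportions. The function names, the continuity correction, and the example counts are illustrative assumptions, not taken from the paper.

```python
import math

def screening_burden(ranked_labels):
    """Fraction of a prioritized citation list that must be screened to
    find every relevant study. ranked_labels is in screening order,
    with 1 = relevant and 0 = irrelevant."""
    total = len(ranked_labels)
    relevant = sum(ranked_labels)
    if relevant == 0:
        return 0.0  # nothing relevant to find
    seen = 0
    for position, label in enumerate(ranked_labels, start=1):
        seen += label
        if seen == relevant:
            return position / total
    return 1.0

def mcnemar(b, c):
    """McNemar chi-square test (with continuity correction) for paired
    proportions; b and c are the two discordant-pair counts (e.g.
    citations found by one tool but not the other at a given cutoff).
    Assumes b + c > 0. Returns (statistic, two-sided p-value)."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p
```

For example, if the 2 relevant citations in a 10-item list are ranked 1st and 3rd, `screening_burden` returns 0.3: only 30% of the list needs screening. Hypothetical discordant counts of 10 and 2 give a McNemar p-value below 0.05, the kind of paired comparison the abstract reports between the two tools.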
Pages: 14