SWIFT-Active Screener: Accelerated document screening through active learning and integrated recall estimation

Cited by: 85
Authors
Howard, Brian E. [1]
Phillips, Jason [1]
Tandon, Arpit [1]
Maharana, Adyasha [1]
Elmore, Rebecca [1]
Mav, Deepak [1]
Sedykh, Alex [1]
Thayer, Kristina [3]
Merrick, B. Alex [2]
Walker, Vickie [2]
Rooney, Andrew [2]
Shah, Ruchir R. [1]
Affiliations
[1] Sciome LLC, 2 Davis Dr, Durham, NC 27709, USA
[2] NIEHS, NTP, 111 TW Alexander Dr RTP, Research Triangle Park, NC 27709, USA
[3] US EPA, Integrated Risk Information System (IRIS) Division, 109 TW Alexander Dr RTP, Research Triangle Park, NC 27709, USA
Keywords
Systematic review; Evidence mapping; Active learning; Machine learning; Document screening; Recall estimation; Workload; Performance
DOI
10.1016/j.envint.2020.105623
Chinese Library Classification (CLC) number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Background: In the screening phase of a systematic review, researchers use detailed inclusion/exclusion criteria to decide whether each article in a set of candidate articles is relevant to the research question under consideration. A typical review may require screening thousands or tens of thousands of articles and can consume hundreds of person-hours of labor.

Methods: Here we introduce SWIFT-Active Screener, a web-based, collaborative systematic review software application designed to reduce the screening burden during this resource-intensive phase of the review process. To prioritize articles for review, SWIFT-Active Screener uses active learning, a type of machine learning that incorporates user feedback during screening. Meanwhile, a negative binomial model is employed to estimate the number of relevant articles remaining in the unscreened document list. Using a simulation involving 26 diverse systematic review datasets that were previously screened by reviewers, we evaluated both the document prioritization and recall estimation methods.

Results: On average, 95% of the relevant articles were identified after screening only 40% of the total reference list. In the five document sets with 5,000 or more references, 95% recall was achieved after screening only 34% of the available references, on average. Furthermore, the recall estimator we have proposed provides a useful, conservative estimate of the percentage of relevant documents identified during the screening process.

Conclusion: SWIFT-Active Screener can yield significant time savings compared to traditional screening, and the savings increase with project size. Moreover, the integration of explicit recall estimation during screening addresses an important challenge faced by all machine learning systems for document screening: when to stop screening a prioritized reference list.

The software is currently available in the form of a multi-user, collaborative, online web application.
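
To make the stopping question in the abstract concrete, the following is a minimal sketch of one way a recall estimate could be computed during screening. It is not the estimator published with SWIFT-Active Screener; the function name `estimated_recall`, the `lookback` parameter, and the constant-rate assumption are all illustrative. The idea is that the number of documents screened to find the last `lookback` relevant ones behaves like a negative binomial draw, so `lookback / window` estimates the current rate of relevant documents, which is then projected onto the unscreened remainder.

```python
def estimated_recall(screened_labels, total_docs, lookback=5):
    """Rough recall estimate for a prioritized screening list.

    screened_labels: decisions in screening order (1 = relevant, 0 = not).
    total_docs: size of the full reference list.
    lookback: hypothetical tuning parameter, the number of most recent
        relevant documents used to estimate the current hit rate.
    Returns a float in (0, 1], or None if too few relevant documents
    have been found to form an estimate.
    """
    found = sum(screened_labels)
    remaining = total_docs - len(screened_labels)
    if remaining == 0:
        return 1.0  # everything has been screened
    if found < lookback:
        return None  # not enough evidence yet

    # Walk backwards until `lookback` relevant documents have been seen.
    # Under a constant hit rate p, this window length is negative
    # binomially distributed, and lookback / window estimates p.
    seen = 0
    window = 0
    for label in reversed(screened_labels):
        window += 1
        seen += label
        if seen == lookback:
            break
    rate = lookback / window

    # Assumption: the unscreened remainder is no richer in relevant
    # documents than the recent window, so this tends to overestimate
    # what is left and therefore underestimate recall.
    expected_remaining = rate * remaining
    return found / (found + expected_remaining)


def should_stop(screened_labels, total_docs, target=0.95, lookback=5):
    """Stop once the estimated recall reaches the target."""
    recall = estimated_recall(screened_labels, total_docs, lookback)
    return recall is not None and recall >= target
```

A reviewer-facing tool would call `should_stop` after each batch of screening decisions. If prioritization pushes relevant documents toward the front of the list, the hit rate in the recent window is likely higher than in the remainder, which is why a sketch like this tends to err on the conservative side.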
Pages: 13