Semi-automated title-abstract screening using natural language processing and machine learning

Cited by: 0
Authors
Pilz, Maximilian [1, 2]
Zimmermann, Samuel [1]
Friedrichs, Juliane [3]
Woerdehoff, Enrica [3]
Ronellenfitsch, Ulrich [3]
Kieser, Meinhard [1]
Vey, Johannes A. [1]
Affiliations
[1] Heidelberg Univ, Inst Med Biometry, Heidelberg, Germany
[2] Fraunhofer Inst Ind Math, Dept Optimizat, Kaiserslautern, Germany
[3] Martin Luther Univ Halle Wittenberg, Dept Visceral Vasc & Endocrine Surg, Med Fac, Halle, Saale, Germany
Keywords
Machine learning; Natural language processing; Language models; Systematic review; Meta analysis; Automatization; Title-abstract screening;
DOI
10.1186/s13643-024-02688-w
Chinese Library Classification (CLC) number
R5 [Internal Medicine];
Subject classification code
1002 ; 100201 ;
Abstract
Background: Title-abstract screening in the preparation of a systematic review is a time-consuming task. Modern techniques of natural language processing and machine learning might allow partial automation of title-abstract screening. In particular, clear guidance on how to apply these techniques in practice is highly relevant.
Methods: This paper presents a complete pipeline showing how to use natural language processing techniques to make titles and abstracts usable for machine learning, and how to apply machine learning algorithms to predict whether or not a publication should be forwarded to full-text screening. Guidance for the practical use of the methodology is given.
Results: The appealing performance of the approach is demonstrated on two real-world systematic reviews with meta-analysis.
Conclusions: Natural language processing and machine learning can help to semi-automate title-abstract screening. Project-specific considerations have to be made when applying them in practice.
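The two stages named in the abstract (turning titles and abstracts into features, then predicting whether a record goes forward to full-text screening) can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not state which text representation or classifier they use, so the bag-of-words tokenizer, the multinomial naive Bayes model, the `NaiveBayesScreener` class, the stopword list, and the toy records below are all assumptions chosen to keep the example self-contained.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "and", "of", "in", "the", "to", "with", "for", "on"}


def tokenize(text):
    """NLP step: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


class NaiveBayesScreener:
    """ML step: multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of all tokens seen during training.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each class."""
        tokens = [t for t in tokenize(text) if t in self.vocab]
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / n_docs)
            for t in tokens:
                score += math.log(
                    (self.word_counts[c][t] + 1) / (total + len(self.vocab))
                )
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, made-up records standing in for human-screened titles.
train_texts = [
    "Randomized trial of laparoscopic surgery for colorectal cancer",
    "Randomized controlled trial of surgery outcomes in cancer patients",
    "Case report of a rare skin rash in a child",
    "Narrative review of dietary habits in adolescents",
]
train_labels = ["include", "include", "exclude", "exclude"]

screener = NaiveBayesScreener().fit(train_texts, train_labels)
prediction = screener.predict("Randomized trial of cancer surgery")
print(prediction)
```

In a semi-automated setting, the class scores from `log_scores` would more likely be used to rank or threshold records for human review than to make final decisions, since wrongly excluding a relevant publication is usually the costlier error.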
Pages: 14