Performance of 4 Pre-Trained Sentence Transformer Models in the Semantic Query of a Systematic Review Dataset on Peri-Implantitis

Cited by: 7
Authors
Galli, Carlo [1]
Donos, Nikolaos [2]
Calciolari, Elena [2, 3]
Affiliations
[1] Univ Parma, Dept Med & Surg, Histol & Embryol Lab, Via Volturno 39, I-43126 Parma, Italy
[2] Queen Mary Univ London, Inst Dent, Fac Med & Dent, Ctr Oral Clin Res, London, England
[3] Univ Parma, Dent Sch, Dept Med & Dent, I-43126 Parma, Italy
Keywords
transformers; embeddings; natural language processing; deep learning; systematic reviews; literature search; SURGICAL-TREATMENT; BONE; DEFECTS
DOI
10.3390/info15020068
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Systematic reviews are cumbersome yet essential to the epistemic process of medical science. Finding significant reports, however, is a daunting task because the sheer volume of published literature makes the manual screening of databases time-consuming. The use of Artificial Intelligence could make literature processing faster and more efficient. Sentence transformers are algorithms that generate rich semantic representations of text documents and allow for semantic queries. In the present report, we compared four freely available pre-trained sentence transformer models (all-MiniLM-L6-v2, all-MiniLM-L12-v2, all-mpnet-base-v2, and all-distilroberta-v1) on a convenience sample of 6110 articles from a published systematic review. The authors of this review manually screened the dataset and identified 24 target articles that addressed the Focused Questions (FQ) of the review. We applied the four sentence transformers to the dataset and, using the FQ as a query, performed a semantic similarity search. The models identified similarities between the FQ and the target articles to varying degrees; when the dataset was sorted by semantic similarity using the best-performing model (all-mpnet-base-v2), the target articles could be found within the top 700 of the 6110 papers. Our data indicate that the choice of an appropriate pre-trained model could substantially reduce the number of articles to screen and the time to completion for systematic reviews.
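The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`cosine`, `rank_by_similarity`, `screening_depth`), the use of cosine similarity as the similarity measure, and the toy two-dimensional vectors are all assumptions made for illustration. In practice the vectors would be produced by a sentence transformer model, for example via `SentenceTransformer("all-mpnet-base-v2").encode(texts)` from the sentence-transformers library; that step is omitted here so the example runs without any downloads.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def screening_depth(ranking, target_ids):
    """Number of top-ranked documents one must screen to cover every target."""
    position = {doc: pos for pos, doc in enumerate(ranking, start=1)}
    return max(position[t] for t in target_ids)


# Toy stand-ins for embeddings (hypothetical values, not real model output).
query = [1.0, 0.0]                      # embedding of the focused question
docs = [
    [0.9, 0.1],                         # doc 0: very close to the query
    [0.0, 1.0],                         # doc 1: orthogonal to the query
    [0.7, 0.7],                         # doc 2: moderately similar
    [-1.0, 0.0],                        # doc 3: opposite direction
]
targets = {0, 2}                        # documents a human reviewer marked as relevant

ranking = rank_by_similarity(query, docs)
depth = screening_depth(ranking, targets)
print("ranking:", ranking)
print("documents to screen to reach all targets:", depth)
```

Under these assumptions, `screening_depth` is the toy analogue of the figure reported in the abstract: how far down the sorted list a reviewer has to read before every target article has been seen.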
Pages: 29