Performance of 4 Pre-Trained Sentence Transformer Models in the Semantic Query of a Systematic Review Dataset on Peri-Implantitis

Cited by: 13

Authors:

Galli, Carlo [1]

Donos, Nikolaos [2]

Calciolari, Elena [2,3]

Affiliations:

[1] Univ Parma, Dept Med & Surg, Histol & Embryol Lab, Via Volturno 39, I-43126 Parma, Italy

[2] Queen Mary Univ London, Inst Dent, Fac Med & Dent, Ctr Oral Clin Res, London, England

[3] Univ Parma, Dent Sch, Dept Med & Dent, I-43126 Parma, Italy

Source:

INFORMATION | 2024 / Vol. 15 / Issue 02

Keywords:

transformers; embeddings; natural language processing; deep learning; systematic reviews; literature search; SURGICAL-TREATMENT; BONE; DEFECTS;

DOI:

10.3390/info15020068

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Systematic reviews are cumbersome yet essential to the epistemic process of medical science. Finding significant reports, however, is a daunting task because the sheer volume of published literature makes the manual screening of databases time-consuming. The use of Artificial Intelligence could make literature processing faster and more efficient. Sentence transformers are groundbreaking algorithms that can generate rich semantic representations of text documents and allow for semantic queries. In the present report, we compared four freely available sentence transformer pre-trained models (all-MiniLM-L6-v2, all-MiniLM-L12-v2, all-mpnet-base-v2, and all-distilroberta-v1) on a convenience sample of 6110 articles from a published systematic review. The authors of this review manually screened the dataset and identified 24 target articles that addressed the Focused Questions (FQ) of the review. We applied the four sentence transformers to the dataset and, using the FQ as a query, performed a semantic similarity search on the dataset. The models identified similarities between the FQ and the target articles to a varying degree, and, sorting the dataset by semantic similarities using the best-performing model (all-mpnet-base-v2), the target articles could be found in the top 700 papers out of the 6110-article dataset. Our data indicate that the choice of an appropriate pre-trained model could remarkably reduce the number of articles to screen and the time to completion for systematic reviews.
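
The ranking step described in the abstract (embed each article, embed the FQ, sort by similarity, then see how far down the list the targets fall) can be illustrated with a minimal sketch. This is not the authors' code. To stay self-contained it uses a hypothetical bag-of-words stand-in (`toy_embed`) instead of a real sentence transformer, and the four example documents, the query, and all function names are invented for illustration. With a real model, both the embedding and the cosine step would be replaced together by the model's dense vectors.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a sentence transformer: word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents, embed=toy_embed):
    """Return document indices sorted from most to least similar to the query."""
    q = embed(query)
    scores = [cosine(q, embed(d)) for d in documents]
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)


def screening_depth(ranking, target_indices):
    """How many top-ranked documents must be screened to reach every target."""
    position = {doc: pos for pos, doc in enumerate(ranking, start=1)}
    return max(position[t] for t in target_indices)


# Invented toy corpus; indices 0 and 2 play the role of the target articles.
documents = [
    "surgical treatment of peri-implantitis bone defects",
    "orthodontic brackets and enamel bonding strength",
    "reconstructive surgical therapy of peri-implantitis",
    "caries prevalence in school children",
]
query = "reconstructive surgical treatment of peri-implantitis"

ranking = rank_by_similarity(query, documents)
depth = screening_depth(ranking, target_indices=[0, 2])
print(ranking, depth)
```

In the study's terms, `screening_depth` corresponds to the reported cutoff: with the best-performing model, all 24 target articles fell within the top 700 of 6110 ranked papers.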


Pages: 29
