Natural Language Processing Techniques for Text Classification of Biomedical Documents: A Systematic Review

Cited by: 6

Authors:

YetuYetu Kesiku, Cyrille [1]

Chaves-Villota, Andrea [1]

Garcia-Zapirain, Begonya [1]

Affiliations:

[1] Univ Deusto, eVida Res Grp, Avda Univ 24, Bilbao 48007, Spain

Source:

INFORMATION | 2022 / Vol. 13 / Issue 10

Keywords:

text classification; biomedical document; natural language processing; biomedical text classification challenges;

DOI:

10.3390/info13100499

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Classifying biomedical literature bears on a number of critical questions that physicians are expected to answer, and in many cases these questions are extremely difficult. Classification can support tasks such as diagnosis and treatment, the efficient representation of concepts such as medications, procedure codes, and patient visits, and the quick retrieval of documents or the classification of diseases. Pathologies are identified from clinical notes, among other sources. The goal of this systematic review is to analyze the literature on classification problems for patients' medical texts according to criteria such as the quality of the evaluation metrics used, the machine learning methods applied, and the data sets used, in order to highlight the best-performing methods for this type of problem and to identify the associated challenges. The study covers the period from 1 January 2016 to 10 July 2022. We searched multiple databases and archives of research articles, including Web of Science, Scopus, MDPI, arXiv, IEEE, and ACM, and found 894 articles on text classification, which we then filtered using inclusion and exclusion criteria. Following a thorough review, we selected 33 articles dealing with biomedical text categorization. Our investigation revealed two major issues linked to the methodology and data used for biomedical text classification: first the data-centric challenge, and second the data quality challenge.

References

Pages: 19

59 entries in total

[1] Aattouchi Issam, 2021, E3S Web of Conferences, V319, DOI 10.1051/e3sconf/202131901064

[2] Hierarchical Attentional Hybrid Neural Networks for Document Classification [J]. Abreu, Jader; Fred, Luis; Macedo, David; Zanchettin, Cleber. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: WORKSHOP AND SPECIAL SESSIONS, 2019, 11731: 396-402

[3] A Text Mining Approach in the Classification of Free-Text Cancer Pathology Reports from the South African National Health Laboratory Services [J]. Achilonu, Okechinyere J.; Olago, Victor; Singh, Elvira; Eijkemans, Rene M. J. C.; Nimako, Gideon; Musenge, Eustasius. INFORMATION, 2021, 12 (11)

[4] Unstructured Medical Text Classification Using Linguistic Analysis: A Supervised Deep Learning Approach [J]. Al-Doulat, Ahmad; Obaidat, Islam; Lee, Minwoo. 2019 IEEE/ACS 16TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA 2019), 2019,

[5] Amin-Nejad A., 2020, P 12 LANGUAGE RESOUR

[6] [Anonymous], 2015, International Classification of Diseases

[7] Multimodal Deep Networks for Text and Image-Based Document Classification [J]. Audebert, Nicolas; Herold, Catherine; Slimani, Kuider; Vidal, Cedric. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT I, 2020, 1167: 427-443

[8] Bosc T, 2018, 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), P1522

[9] A Hybrid BERT Model That Incorporates Label Semantics via Adjustive Attention for Multi-Label Text Classification [J]. Cai, Linkun; Song, Yu; Liu, Tao; Zhang, Kunli. IEEE ACCESS, 2020, 8: 152183-152192

[10] Chen Pei-Fu, 2021, JMIR Med Inform, V9, pe23230, DOI 10.2196/23230
