Clinical validation of an artificial intelligence algorithm for classifying tuberculosis and pulmonary findings in chest radiographs

Citations: 0

Authors:

de Camargo, Thiago Fellipe Ortiz [1,2]

Ribeiro, Guilherme Alberto Sousa [1,2]

da Silva, Maria Carolina Bueno [1]

da Silva, Luan Oliveira [1]

Torres, Pedro Paulo Teixeira e Silva [1]

da Silva, Denise do Socorro Rodrigues [3]

de Santos, Mayler Olombrada Nunes [4]

Salibe Filho, William [5]

Rosa, Marcela Emer Egypto [1]

Novaes, Magdala de Araujo [6]

Massarutto, Thiago Augusto [7]

Landi Junior, Osvaldo [7]

Yanata, Elaine [1]

Reis, Marcio Rodrigues da Cunha [8]

Szarf, Gilberto [1]

Netto, Pedro Vieira Santana [1]

de Paiva, Joselisa Peres Queiroz [1]

Affiliations:

[1] Hosp Israelita Albert Einstein, Image Res Ctr, Sao Paulo, Brazil

[2] Univ Fed Goias, Elect Mech & Comp Engn Sch, Goiania, Go, Brazil

[3] Clemente Ferreira Inst, Infectol Div, Sao Paulo, Brazil

[4] Hosp Israelita Albert Einstein, Aparecida Goiania Municipal Hosp, Sao Paulo, Go, Brazil

[5] Heart Inst, Pulm Div, Sao Paulo, Brazil

[6] Univ Fed Pernambuco, Med Sci Ctr, Recife, PE, Brazil

[7] Diagnost Imaging Res & Study Inst Fdn, Sao Paulo, Brazil

[8] Fed Inst Goias, Studies & Res Sci & Technol Grp, Goias, GO, Brazil

Source:

FRONTIERS IN ARTIFICIAL INTELLIGENCE | 2025 / Volume 8

Keywords:

chest X-rays; artificial intelligence; deep learning; clinical validation; convolutional neural network; CONFIDENCE; CANCER;

DOI:

10.3389/frai.2025.1512910

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Background: Chest X-ray (CXR) interpretation is critical in diagnosing various lung diseases. However, non-specialist physicians are often the first to read these images and frequently face challenges in interpreting them accurately. Artificial intelligence (AI) algorithms could be of great help, but using real-world data is crucial to ensure their effectiveness in diverse healthcare settings. This study evaluates a deep learning algorithm designed for CXR interpretation, focusing on its utility for physicians who are not specialists in thoracic radiology.

Purpose: To assess the performance of a convolutional neural network (CNN)-based AI algorithm in interpreting CXRs and compare it with a team of physicians, including thoracic radiologists, who served as the gold standard.

Methods: A retrospective study from January 2021 to July 2023 evaluated an algorithm with three independent models for Lung Abnormality, Radiological Findings, and Tuberculosis. The algorithm's performance was measured using accuracy, sensitivity, and specificity. Two groups of physicians validated the model: one with varying specialties and experience levels in interpreting chest radiographs (Group A) and another of board-certified thoracic radiologists (Group B). The study also assessed the agreement between the two groups on the algorithm's heatmap and its influence on their decisions.

Results: In the internal validation, the Lung Abnormality and Tuberculosis models achieved an AUC of 0.94, while the Radiological Findings model yielded a mean AUC of 0.84. During the external validation, using the ground truth generated by board-certified thoracic radiologists, the algorithm achieved better sensitivity than physicians with varying experience levels in 6 out of 11 classes. Furthermore, Group A physicians demonstrated higher agreement with the algorithm in identifying markings in specific lung regions than Group B (37.56% for Group A vs. 21.75% for Group B). Additionally, physicians declared that the algorithm did not influence their decisions in 93% of the cases.

Conclusion: This retrospective clinical validation study assesses an AI algorithm's effectiveness in interpreting chest X-rays (CXRs). The results show that the algorithm's performance is comparable to that of Group A physicians, using the gold-standard analysis (Group B) as the reference. Notably, both groups reported minimal influence of the algorithm on their decisions in most cases.
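The Methods name accuracy, sensitivity, and specificity as the performance measures. As a minimal sketch of how these three quantities are usually computed for one binary class against a reference standard, the following may help; the function name `binary_metrics` and the example labels are hypothetical illustrations, not the study's code or data.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for binary labels.

    y_true holds the reference labels (1 = finding present, 0 = absent);
    y_pred holds the predicted labels. Both are illustrative inputs.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return NaN when a metric is undefined (for example, no positives).
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Made-up labels for demonstration only; not taken from the study.
    reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(reference, predicted))
```

In a multi-class setting such as the 11 classes mentioned in the Results, one common approach is to apply the same computation per class in a one-vs-rest fashion; whether the study did so is not stated in the abstract.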

Pages: 14
