Comparative Analysis of Deep Learning Models for Breast Cancer Classification on Multimodal Data

Times Cited: 0
Authors
Hussain, Sadam [1 ]
Ali, Mansoor [1 ]
Ali Pirzado, Farman [1 ]
Ahmed, Masroor [1 ]
Gerardo Tamez-Pena, Jose [2 ]
Affiliations
[1] Tecnol Monterrey, Sch Engn & Sci, Monterrey, Nuevo Leon, Mexico
[2] Tecnol Monterrey, Sch Med & Hlth Sci, Monterrey, Nuevo Leon, Mexico
Source
PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON VISION-LANGUAGE MODELS FOR BIOMEDICAL APPLICATIONS, VLM4BIO 2024 | 2024
Keywords
Breast Cancer; Feature Fusion; Multi-modal Classification; Deep Learning; Vision Transformer; COMPUTER-AIDED DETECTION; MAMMOGRAMS; DIAGNOSIS;
DOI
10.1145/3689096.3689462
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Rising breast cancer incidence and mortality represent significant global challenges for women. Deep learning has demonstrated superior diagnostic performance in breast cancer classification compared to human experts. However, most deep learning methods have relied on unimodal features, potentially limiting the performance of diagnostic models. Additionally, most studies conducted so far have used a single view of digital mammograms, which significantly reduces model performance due to limited overall perspective and generalizability. To address these limitations, we collected a multiview multimodal dataset comprising digital mammograms in four views, two craniocaudal (CC) and two mediolateral oblique (MLO), one of each per breast, together with textual data extracted from radiological reports. We propose a multimodal deep learning architecture for breast cancer classification, utilizing images (digital mammograms) and textual data (radiological reports) from our new in-house dataset. In addition, various augmentation techniques are applied to both the imaging and textual data to enlarge the training set. In our investigation, we explored the performance of six state-of-the-art (SOTA) deep learning architectures, VGG16, VGG19, ResNet34, MobileNetV3, EfficientNetB7, and a vision transformer (ViT), as imaging feature extractors. For textual feature extraction, we employed an artificial neural network (ANN). The imaging and textual features were then fused using early and late fusion strategies and fed into an ANN classifier for breast cancer classification. We evaluated various combinations of feature extractors with the ANN classifier, finding that VGG19 paired with the ANN achieved the highest accuracy, 0.951. In terms of precision, the VGG19+ANN combination again surpassed the other SOTA CNN and attention-based architectures, achieving a score of 0.95.
The best sensitivity score of 0.893 was recorded by VGG16+ANN, followed by VGG19+ANN with 0.884. The highest F1 score of 0.922 was achieved by VGG19+ANN. VGG16+ANN achieved the best area under the curve (AUC) score of 0.929, closely followed by VGG19+ANN with a score of 0.915.
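The early and late fusion strategies mentioned in the abstract can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the feature dimensions, weighting, and probability values are invented, and a real pipeline would obtain the embeddings from the trained VGG/ViT and ANN branches rather than random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-extracted features for one exam (dimensions are illustrative):
# a CNN/ViT backbone would yield the image embedding, an ANN the report embedding.
img_feat = rng.standard_normal(512)   # e.g. pooled mammogram features
txt_feat = rng.standard_normal(64)    # e.g. encoded radiological-report features

def early_fusion(img, txt):
    """Concatenate modality features before a single shared classifier."""
    return np.concatenate([img, txt])

def late_fusion(p_img, p_txt, w=0.5):
    """Weighted average of per-modality class probabilities
    produced by two separately trained classifiers."""
    return w * p_img + (1.0 - w) * p_txt

fused = early_fusion(img_feat, txt_feat)
print(fused.shape)  # (576,) -> input size of the downstream ANN classifier

# Late fusion of two hypothetical softmax outputs (benign, malignant):
p = late_fusion(np.array([0.3, 0.7]), np.array([0.5, 0.5]))
print(p)  # [0.4 0.6]
```

Early fusion lets the classifier learn cross-modal interactions but fixes one joint input size; late fusion keeps the branches independent and only combines their decisions, which is simpler when modalities are trained separately.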
Pages: 31-39 (9 pages)