Comparative Analysis of Deep Learning Models for Breast Cancer Classification on Multimodal Data

Times Cited: 0
Authors
Hussain, Sadam [1 ]
Ali, Mansoor [1 ]
Ali Pirzado, Farman [1 ]
Ahmed, Masroor [1 ]
Gerardo Tamez-Pena, Jose [2 ]
Affiliations
[1] Tecnol Monterrey, Sch Engn & Sci, Monterrey, Nuevo Leon, Mexico
[2] Tecnol Monterrey, Sch Med & Hlth Sci, Monterrey, Nuevo Leon, Mexico
Source
PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON VISION-LANGUAGE MODELS FOR BIOMEDICAL APPLICATIONS, VLM4BIO 2024 | 2024
Keywords
Breast Cancer; Feature Fusion; Multi-modal Classification; Deep Learning; Vision Transformer; COMPUTER-AIDED DETECTION; MAMMOGRAMS; DIAGNOSIS;
DOI
10.1145/3689096.3689462
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Rising breast cancer incidence and mortality represent significant global challenges for women. Deep learning has demonstrated superior diagnostic performance in breast cancer classification compared to human experts. However, most deep learning methods have relied on unimodal features, potentially limiting the performance of diagnostic models. Additionally, most studies conducted so far have used a single view of digital mammograms, which significantly reduces model performance due to limited overall perspective and generalizability. To address these limitations, we collected a multiview multimodal dataset comprising digital mammograms in four views, two craniocaudal (CC) and two mediolateral oblique (MLO), one of each per breast, together with textual data extracted from radiological reports. We propose a multimodal deep learning architecture for breast cancer classification, utilizing images (digital mammograms) and textual data (radiological reports) from our new in-house dataset. In addition, various augmentation techniques are applied to both the imaging and textual data to enlarge the training set. In our investigation, we explored the performance of six state-of-the-art (SOTA) deep learning architectures, VGG16, VGG19, ResNet34, MobileNetV3, EfficientNetB7, and a vision transformer (ViT), as imaging feature extractors. For textual feature extraction, we employed an artificial neural network (ANN). The imaging and textual features were then fused using early and late fusion strategies and fed into an ANN classifier for breast cancer classification. We evaluated various combinations of feature extractors with the ANN classifier, finding that VGG19 paired with the ANN achieved the highest accuracy, 0.951. In terms of precision, the VGG19+ANN combination again surpassed the other SOTA CNN and attention-based architectures, achieving a score of 0.95.
The best sensitivity score of 0.893 was recorded by VGG16+ANN, followed by VGG19+ANN with 0.884. The highest F1 score of 0.922 was achieved by VGG19+ANN. VGG16+ANN achieved the best area under the curve (AUC) score of 0.929, closely followed by VGG19+ANN with a score of 0.915.
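The early and late fusion strategies mentioned in the abstract can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the feature dimensions, weighting, and probability values are invented, and a real pipeline would obtain the embeddings from the trained VGG/ViT and ANN branches rather than random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-extracted features for one exam (dimensions are illustrative):
# a CNN/ViT backbone would yield the image embedding, an ANN the report embedding.
img_feat = rng.standard_normal(512)   # e.g. pooled mammogram features
txt_feat = rng.standard_normal(64)    # e.g. encoded radiological-report features

def early_fusion(img, txt):
    """Concatenate modality features before a single shared classifier."""
    return np.concatenate([img, txt])

def late_fusion(p_img, p_txt, w=0.5):
    """Weighted average of per-modality class probabilities
    produced by two separately trained classifiers."""
    return w * p_img + (1.0 - w) * p_txt

fused = early_fusion(img_feat, txt_feat)
print(fused.shape)  # (576,) -> input size of the downstream ANN classifier

# Late fusion of two hypothetical softmax outputs (benign, malignant):
p = late_fusion(np.array([0.3, 0.7]), np.array([0.5, 0.5]))
print(p)  # [0.4 0.6]
```

Early fusion lets the classifier learn cross-modal interactions but fixes one joint input size; late fusion keeps the branches independent and only combines their decisions, which is simpler when modalities are trained separately.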
Pages: 31-39 (9 pages)