Applying Data Mining Techniques to Improve Breast Cancer Diagnosis

Cited by: 41
Authors
Diz, Joana [1]
Marreiros, Goreti [2]
Freitas, Alberto [1, 3]
Affiliations
[1] Univ Porto, Fac Med, CINTESIS Ctr Hlth Technol & Serv Res, Oporto, Portugal
[2] Polytech Porto, Inst Engn, GECAD Res Grp Intelligent Engn & Comp Adv Innovat, Oporto, Portugal
[3] Univ Porto, Fac Med, CIDES Dept Hlth Informat & Decis Sci, Oporto, Portugal
Keywords
Breast cancer diagnosis; Features extraction; Data mining techniques; CLINICAL-DATA; CLASSIFICATION; DENSITY;
DOI
10.1007/s10916-016-0561-y
Chinese Library Classification
R19 [Health care organization and services (health services administration)];
Abstract
In breast cancer research, new computer-aided diagnosis systems are increasingly being developed with the aim of reducing false positives in diagnostic tests. In this work, we present a data mining based approach that might support oncologists in breast cancer classification and diagnosis. The study compares two breast cancer datasets and seeks the best methods for predicting benign/malignant lesions, classifying breast density, and identifying findings (mass/microcalcification distinction). To carry out these tasks, two texture feature extraction matrices were implemented in Matlab, and the extracted features were classified using data mining algorithms in WEKA. Results showed good accuracy for each task: 89.3 to 64.7 % for benign/malignant; 75.8 to 78.3 % for dense/fatty tissue; 71.0 to 83.1 % for finding identification. Among the classifiers tested, Naive Bayes was the best at identifying mass texture, and Random Forests was the first or second best classifier for the majority of tested groups.
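As a rough illustration of what a texture feature extraction matrix can look like, the sketch below computes a gray-level co-occurrence matrix and two features derived from it. The abstract does not name the two matrices the authors used, and their implementation was in Matlab, so the choice of a co-occurrence matrix, the function names (`glcm`, `contrast`, `homogeneity`), and the toy `patch` values are all assumptions made for illustration only. Feature vectors of this kind would then be passed to a classifier such as Naive Bayes or Random Forests.

```python
from collections import Counter


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix (illustrative assumption).

    Counts how often gray level i has gray level j at offset (dx, dy),
    then divides by the number of pixel pairs so entries sum to 1.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def contrast(p):
    """Weights each pair by squared gray-level difference (high = coarse)."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def homogeneity(p):
    """Weights each pair by closeness of gray levels (high = smooth)."""
    n = len(p)
    return sum(p[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))


# Hypothetical 4x4 region of interest quantized to 4 gray levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(patch, levels=4)
features = [contrast(p), homogeneity(p)]  # one feature vector per region
```

In practice, features would be computed over several offsets and directions and combined into a longer vector per region before classification.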
Pages: 7