Correlation based Feature Selection impact on the classification of breast cancer patients response to neoadjuvant chemotherapy

Cited by: 0
Authors
Rosati, S. [1]
Gianfreda, C. M. [1]
Balestra, G. [1]
Martincich, L. [2]
Giannini, V. [2, 3]
Regge, D. [2, 3]
Affiliations
[1] Politecn Torino, Dept Elect & Telecommun, Turin, Italy
[2] IRCCS, Candiolo Canc Inst FPO, Dept Radiol, Candiolo, TO, Italy
[3] Univ Turin, Dept Surg Sci, Turin, Italy
Source
2018 IEEE INTERNATIONAL SYMPOSIUM ON MEDICAL MEASUREMENTS AND APPLICATIONS (MEMEA) | 2018
Keywords
neoadjuvant chemotherapy; breast cancer; texture features; feature selection; correlation; decision tree; linear model regression; SYSTEM;
DOI
Not available
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
The availability of a large number of variables does not always lead to better classification performance, as some of them can be redundant, irrelevant or a source of noise. For this reason, a Feature Selection (FS) step is often applied to high-dimensional datasets. FS based on correlation relies on the idea that "good feature subsets contain features highly correlated with the class yet uncorrelated with each other". However, the main problem with this kind of approach is defining the threshold above which two variables are considered correlated. In this study, we evaluated the impact of different thresholds on the performance of two classifiers trained to predict the response to neoadjuvant chemotherapy (graded from 1 to 5) of 44 patients with breast cancer. First, 27 texture features were computed on the largest slices of the segmented tumor on the pretreatment dynamic contrast-enhanced MRI. Then, we applied an FS algorithm that identifies the pairs of variables whose linear correlation coefficient exceeds a given threshold in absolute value and removes, for each pair, the variable less correlated with the response to neoadjuvant chemotherapy. We tested correlation thresholds ranging from 1 down to 0.8 in steps of 0.01, and we used each resulting subset to construct a Decision Tree (DT) classifier and a Linear Regression Model (LRM). Our results showed that removing highly correlated variables (absolute value of the correlation coefficient >0.97) reduced the DT performance by about 10%. Although the LRM was not able to reach acceptable results in terms of chemotherapy response prediction (accuracy=40.9%), its intrinsic linearity made it more stable with respect to the removal of linear redundancy.
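The correlation-based filtering step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `correlation_filter`, the greedy pair-visiting order, the tie-breaking rule, the use of Pearson correlation against the numeric response grade, and the synthetic demo data are all assumptions made for illustration only.

```python
import numpy as np

def correlation_filter(X, y, threshold):
    """Return the column indices of X kept after pairwise correlation filtering.

    X: (n_samples, n_features) feature matrix; y: (n_samples,) numeric response.
    Assumes no feature column is constant (its correlation would be undefined).
    """
    n_features = X.shape[1]
    # Absolute feature-to-feature linear correlation coefficients.
    ff = np.abs(np.corrcoef(X, rowvar=False))
    # Absolute correlation of each feature with the response.
    fy = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)])

    keep = np.ones(n_features, dtype=bool)
    for i in range(n_features):
        for j in range(i + 1, n_features):
            # Assumption: pairs involving an already removed feature are skipped.
            if keep[i] and keep[j] and ff[i, j] > threshold:
                # Remove the member of the pair less correlated with the response
                # (assumption: on a tie, the second feature is removed).
                keep[i if fy[i] < fy[j] else j] = False
    return np.flatnonzero(keep)

# Synthetic demo data (hypothetical, not from the study): 44 samples, 3 features.
k = np.arange(44)
f0 = np.linspace(0.0, 1.0, 44)
f1 = f0 + 0.001 * np.cos(k)   # nearly a copy of f0, so highly correlated with it
f2 = np.sin(k)                # only weakly related to f0 and f1
X_demo = np.column_stack([f0, f1, f2])
y_demo = f0.copy()            # stand-in for the response grade

# Sweep thresholds from 1.00 down to 0.80 in steps of 0.01, as in the abstract.
thresholds = np.linspace(1.0, 0.8, 21)
subsets = {round(float(t), 2): correlation_filter(X_demo, y_demo, t) for t in thresholds}
kept_demo = subsets[0.9]
```

In a full experiment, each subset in `subsets` would then be used to train and evaluate the two models; that step is omitted here because the abstract does not specify the training or validation protocol.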
Pages: 753 - 757
Page count: 5
Related papers
18 in total
  • [1] Supervised, Unsupervised, and Semi-Supervised Feature Selection: A Review on Gene Selection
    Ang, Jun Chin
    Mirzal, Andri
    Haron, Habibollah
    Hamed, Haza Nuzly Abdull
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2016, 13 (05) : 971 - 989
  • [2] [Anonymous], 2016, J MECH MED BIOL
  • [3] Bühlmann P, 2011, SPRINGER SER STAT, P1, DOI 10.1007/978-3-642-20192-9
  • [4] A THEORETICAL COMPARISON OF TEXTURE ALGORITHMS
    CONNERS, RW
    HARLOW, CA
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1980, 2 (03) : 204 - 222
  • [5] A computer-aided diagnosis (CAD) scheme for pretreatment prediction of pathological response to neoadjuvant therapy using dynamic contrast-enhanced MRI texture features
    Giannini, Valentina
    Mazzetti, Simone
    Marmo, Agnese
    Montemurro, Filippo
    Regge, Daniele
    Martincich, Laura
    [J]. BRITISH JOURNAL OF RADIOLOGY, 2017, 90 (1077)
  • [6] A fully automatic computer aided diagnosis system for peripheral zone prostate cancer detection using multi-parametric magnetic
    Giannini, Valentina
    Mazzetti, Simone
    Vignati, Anna
    Russo, Filippo
    Bollito, Enrico
    Porpiglia, Francesco
    Stasi, Michele
    Regge, Daniele
    [J]. COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2015, 46 : 219 - 226
  • [7] Radiomics: Images Are More than Pictures, They Are Data
    Gillies, Robert J.
    Kinahan, Paul E.
    Hricak, Hedvig
    [J]. RADIOLOGY, 2016, 278 (02) : 563 - 577
  • [8] Hall M. A., 1999, Proceedings of the Twelfth International Florida AI Research Society Conference, P235
  • [9] Han J, 2012, MOR KAUF D, P1
  • [10] TEXTURAL FEATURES FOR IMAGE CLASSIFICATION
    HARALICK, RM
    SHANMUGAM, K
    DINSTEIN, I
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1973, SMC3 (06) : 610 - 621