Predicting pathological response to neoadjuvant chemotherapy in breast cancer patients based on imbalanced clinical data

Cited by: 11
Authors
Gao, Ting [1]
Hao, Yaguang [1]
Zhang, Haipeng [2]
Hu, Lina [1]
Li, Hongzhi [1]
Li, Hui [1]
Hu, LiHong [1]
Han, Bing [2]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun 130117, Jilin, Peoples R China
[2] Jilin Univ, Hosp 1, Changchun 130021, Jilin, Peoples R China
Keywords
Breast cancer; Neoadjuvant chemotherapy; k-nearest neighbor; Pathological response; Imbalanced clinical data; SURVIVAL; EXPRESSION
DOI
10.1007/s00779-018-1144-3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Neoadjuvant chemotherapy (NAC) may help some breast cancer patients with subsequent surgery or radiotherapy. However, NAC carries certain risks. To lower these risks, machine-learning methods can be used to assist the diagnosis of breast tumors based on clinical data. This study investigated ensemble machine-learning models for predicting the pathological response to NAC in breast cancer patients using actual clinical data. An ensemble k-nearest neighbor (EKNN) model was chosen to predict pathological response. The imbalanced clinical data of patients who received NAC were reviewed retrospectively, and 11 clinicopathological variables were selected from all features to establish a succinct EKNN model. Clinical data from 259 breast cancer patients were included: 36 cases of pathological complete response, 157 cases of partial response, and 66 cases of stable disease. The training and testing sets for each single k-nearest neighbor (KNN) model contained 27 and 9 patients, respectively. To address the class imbalance, the EKNN ensemble was designed so that the number of samples for each class in a base learner is set equal to the minimum class size, 36. The EKNN model achieved a classification accuracy of 81.48% and a Kappa coefficient of 0.72 for pathological response after NAC, indicating better robustness and generalization than the average single KNN model (average accuracy 62.22%, Kappa coefficient 0.43). Based on actual clinical data, important clinicopathological variables were selected, and the imbalance problem was well handled by the EKNN model. The model improved the robustness and generalization of pathological-response prediction with imbalanced clinical data. The results suggest that ensemble machine learning has possible practical applications in assisting cancer stage diagnosis and precision medicine.
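The balanced-ensemble idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the value of k, the number of base learners, the distance metric, or how base-learner outputs are combined, so the choices below (k = 3, 15 learners, Euclidean distance, majority vote) are assumptions, as are all function names. The 2-D feature values are made-up toy data; only the class sizes (36, 157, 66) come from the abstract. A small Cohen's kappa helper is included because the abstract reports that statistic.

```python
import math
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def balanced_subsample(X, y, rng):
    """Draw the same number of samples from every class (the smallest class size)."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    n_min = min(len(idx) for idx in by_class.values())
    chosen = []
    for label in sorted(by_class):
        chosen.extend(rng.sample(by_class[label], n_min))
    return [X[i] for i in chosen], [y[i] for i in chosen]


def eknn_predict(X, y, x, n_learners=15, k=3, seed=0):
    """Majority vote over KNN base learners, each fit on a balanced subsample.

    n_learners, k, and the voting rule are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_learners):
        sub_X, sub_y = balanced_subsample(X, y, rng)
        votes[knn_predict(sub_X, sub_y, x, k)] += 1
    return votes.most_common(1)[0][0]


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e). Undefined when p_e equals 1."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy, made-up 2-D features; only the class sizes follow the abstract.
data_rng = random.Random(42)
centers = {"pCR": (0.0, 0.0), "PR": (5.0, 5.0), "SD": (10.0, 0.0)}
sizes = {"pCR": 36, "PR": 157, "SD": 66}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(sizes[label]):
        X.append((cx + data_rng.gauss(0, 1), cy + data_rng.gauss(0, 1)))
        y.append(label)

print(eknn_predict(X, y, (0.2, -0.1)))
```

Because every base learner sees 36 samples per class, the majority class (partial response) cannot dominate the neighborhood of a query point simply by being more numerous, which is the motivation the abstract gives for the ensemble design.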
Pages: 1039-1047
Number of pages: 9