FeAture Explorer (FAE): A tool for developing and comparing radiomics models

Cited by: 227
Authors
Song, Yang [1]
Zhang, Jing [1]
Zhang, Yu-Dong [2]
Hou, Ying [2]
Yan, Xu [3]
Wang, Yida [1]
Zhou, Minxiong [4]
Yao, Ye-Feng [1]
Yang, Guang [1]
Affiliations
[1] East China Normal Univ, Shanghai Key Lab Magnet Resonance, Shanghai, Peoples R China
[2] Nanjing Med Univ, Affiliated Hosp 1, Dept Radiol, Nanjing, Peoples R China
[3] Siemens Healthcare, MR Sci Mkt, Shanghai, Peoples R China
[4] Shanghai Univ Med & Hlth Sci, Shanghai, Peoples R China
Keywords
DOI
10.1371/journal.pone.0237587
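Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code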
07 ; 0710 ; 09 ;
Abstract
In radiomics studies, researchers usually need to develop a supervised machine learning model that maps image features onto a clinical conclusion. A classical machine learning pipeline consists of several steps, including normalization, feature selection, and classification, and finding an optimal combination of methods for these steps is often tedious. We designed an open-source software package named FeAture Explorer (FAE). It is written in Python and uses the NumPy, pandas, and scikit-learn modules. FAE can be used to extract image features, preprocess the feature matrix, develop different models automatically, and evaluate them with common clinical statistics. FAE features a user-friendly graphical user interface that radiologists and researchers can use to build many different pipelines and to compare their results visually. To demonstrate the effectiveness of FAE, we developed a candidate model to distinguish clinically significant prostate cancer (CS PCa) from non-CS PCa using the PROSTATEx dataset. We used FAE to try out different combinations of feature selectors and classifiers, to compare the area under the receiver operating characteristic curve (AUC) of the resulting models on the validation dataset, and to evaluate the chosen model on independent test data. The final model, with analysis of variance as the feature selector and linear discriminant analysis as the classifier, was selected and evaluated conveniently with FAE. Its AUC was 0.838, 0.814, and 0.824 on the training, validation, and test datasets, respectively. FAE allows researchers to build radiomics models and evaluate them using an independent testing dataset. It also provides easy model comparison and result visualization. We believe FAE can be a convenient tool for radiomics studies and other medical studies involving supervised machine learning.
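The pipeline comparison described in the abstract (normalization, then feature selection, then classification, with candidate models ranked by validation AUC and the chosen one checked on held-out test data) can be sketched with plain scikit-learn. This is a minimal illustration under stated assumptions, not FAE's own API: the function name `build_and_compare`, the synthetic data from `make_classification`, the choice of candidate selectors and classifiers, and the split sizes are all hypothetical and are not taken from the paper.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_and_compare(random_state=0):
    """Fit every selector/classifier combination and rank by validation AUC.

    Hypothetical helper for illustration only; it does not call FAE.
    """
    # Synthetic stand-in for a radiomics feature matrix (assumption).
    X, y = make_classification(
        n_samples=300, n_features=40, n_informative=8, random_state=random_state
    )

    # Hold out an independent test set, then split the rest into train/validation.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=random_state
    )

    # Factories return a fresh estimator for each pipeline.
    selectors = {
        "ANOVA": lambda: SelectKBest(f_classif, k=10),
        "RFE": lambda: RFE(LogisticRegression(max_iter=1000), n_features_to_select=10),
    }
    classifiers = {
        "LDA": lambda: LinearDiscriminantAnalysis(),
        "LR": lambda: LogisticRegression(max_iter=1000),
    }

    results = {}
    for (sel_name, make_sel), (clf_name, make_clf) in product(
        selectors.items(), classifiers.items()
    ):
        pipe = Pipeline(
            [
                ("normalize", StandardScaler()),
                ("select", make_sel()),
                ("classify", make_clf()),
            ]
        )
        pipe.fit(X_train, y_train)
        val_auc = roc_auc_score(y_val, pipe.predict_proba(X_val)[:, 1])
        results[(sel_name, clf_name)] = (val_auc, pipe)

    # Choose the model on validation data only; touch the test set once.
    best_key = max(results, key=lambda key: results[key][0])
    best_pipe = results[best_key][1]
    test_auc = roc_auc_score(y_test, best_pipe.predict_proba(X_test)[:, 1])
    return results, best_key, test_auc


if __name__ == "__main__":
    results, best_key, test_auc = build_and_compare()
    for key, (val_auc, _) in sorted(results.items()):
        print(key, f"validation AUC = {val_auc:.3f}")
    print("selected:", best_key, f"test AUC = {test_auc:.3f}")
```

Fitting the scaler and selector inside the `Pipeline` keeps them from seeing validation or test samples, which is the reason the test AUC can be read as an independent estimate.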
Pages: 10