Quantile normalization for combining gene-expression datasets

Cited by: 9
Authors
Pan, Meng [1]
Zhang, Jie [2]
Affiliations
[1] Jinan Univ, Coll Sci & Engn, Dept Optoelect Engn, Guangzhou, Guangdong, Peoples R China
[2] Jinan Univ, Coll Sci & Engn, Dept Phys, Guangzhou, Guangdong, Peoples R China
Keywords
Quantile normalization; gene expression; dataset; combination; prediction; inter-study validation; BREAST-CANCER; UNWANTED VARIATION; CLASS PREDICTION; MICROARRAY DATA; N-GRAM; CLASSIFICATION; VALIDATION;
DOI
10.1080/13102818.2017.1419376
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Our research aimed to improve survival prediction by combining gene-expression datasets and to apply molecular signatures across different datasets. Many methods have been developed to remove unwanted variation among datasets while preserving the variation due to the wanted factors. However, in inter-study validation (ISV), a whole dataset is set aside for testing and the statuses of the wanted factors are assumed unknown for that dataset; thus, regression cannot be used to estimate its unwanted variation. In this study, quantile normalization (QN) was used to remove the unwanted between-dataset variation, after which the adjusted datasets were used for classification. In the ISV setting, datasets combined by QN showed better prediction performance than datasets combined by other methods. Combining datasets using QN could therefore improve prediction performance in ISV studies.
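As an illustration of the technique named in the abstract, the following is a minimal sketch of standard quantile normalization in Python with NumPy. It is not the authors' implementation: the function names `reference_quantiles` and `quantile_normalize`, the genes-by-samples layout, and the option of mapping a held-out dataset onto a reference built from other datasets are assumptions made for this example. Note that the procedure uses only expression values, not outcome labels, which is why it remains applicable when the statuses of the wanted factors are unknown for the test dataset.

```python
import numpy as np

def reference_quantiles(X):
    """Mean of the column-wise sorted values of X (rows = genes, columns = samples)."""
    X = np.asarray(X, dtype=float)
    return np.sort(X, axis=0).mean(axis=1)

def quantile_normalize(X, reference=None):
    """Give every sample (column) the same empirical distribution.

    If `reference` is None it is computed from X itself; otherwise X is
    mapped onto the supplied reference (hypothetical use: a reference
    built from training datasets, applied to a held-out dataset).
    Ties are broken by position, a simplification of this sketch.
    """
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = reference_quantiles(X)
    # 0-based rank of each value within its own column
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    # Replace each value by the reference value at the same rank
    return reference[ranks]

# Toy example: two simulated "datasets" measured on different scales
rng = np.random.default_rng(0)
dataset_a = rng.normal(5.0, 1.0, size=(50, 4))
dataset_b = rng.normal(8.0, 2.0, size=(50, 3))
combined = np.hstack([dataset_a, dataset_b])
normalized = quantile_normalize(combined)
```

After normalization, every column of `normalized` has the same set of values, while the rank order of genes within each sample is preserved; a classifier would then be trained on the normalized columns.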
Pages: 751-758
Number of pages: 8