Quantile normalization for combining gene-expression datasets

Cited by: 9
Authors
Pan, Meng [1 ]
Zhang, Jie [2 ]
Affiliations
[1] Jinan Univ, Coll Sci & Engn, Dept Optoelect Engn, Guangzhou, Guangdong, Peoples R China
[2] Jinan Univ, Coll Sci & Engn, Dept Phys, Guangzhou, Guangdong, Peoples R China
Keywords
Quantile normalization; gene expression; dataset; combination; prediction; inter-study validation; BREAST-CANCER; UNWANTED VARIATION; CLASS PREDICTION; MICROARRAY DATA; N-GRAM; CLASSIFICATION; VALIDATION;
DOI
10.1080/13102818.2017.1419376
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Our research aimed to improve survival prediction by combining gene expression datasets and to apply molecular signatures across different datasets. Many methods have been developed to remove unwanted variation among datasets while preserving the variation due to the wanted factors. However, in inter-study validation (ISV), a whole dataset is set aside for testing and the statuses of its wanted factors are assumed unknown; thus, regression cannot be used to estimate the unwanted variation in that dataset. In this study, quantile normalization (QN) was used to remove the unwanted variation between datasets, after which the adjusted datasets were used for classification. In ISV, datasets combined by QN gave better prediction performance than datasets combined by other methods. Combining datasets using QN could therefore improve prediction performance in ISV studies.
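
As an illustration of the idea summarized in the abstract, the following is a minimal sketch of quantile normalization applied to combined datasets. It is not the authors' implementation. It assumes each dataset is a genes x samples matrix with the same genes in the same row order, and it breaks ties by sort order rather than averaging them. The function names `quantile_normalize` and `combine_datasets` are hypothetical. No outcome labels are needed, which is why the approach remains usable when the wanted factors of the test dataset are unknown.

```python
import numpy as np


def quantile_normalize(X):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Assumption: ties are broken by sort order, not averaged.
    """
    X = np.asarray(X, dtype=float)
    # Row indices that sort each column in ascending order.
    order = np.argsort(X, axis=0, kind="stable")
    sorted_X = np.take_along_axis(X, order, axis=0)
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = sorted_X.mean(axis=1)
    # Rank of every entry within its own column (0 = smallest).
    ranks = np.argsort(order, axis=0, kind="stable")
    # Replace each value by the reference value at its rank.
    return reference[ranks]


def combine_datasets(datasets):
    """Stack datasets sample-wise and force all samples onto one distribution.

    Assumption: every dataset has the same genes in the same row order.
    """
    combined = np.hstack(datasets)
    return quantile_normalize(combined)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(loc=8.0, scale=1.0, size=(100, 5))   # synthetic study A
    test = rng.normal(loc=10.0, scale=2.0, size=(100, 4))   # synthetic study B
    merged = combine_datasets([train, test])
    print(merged.shape)
```

After normalization every sample shares the same empirical distribution, so a classifier trained on the columns from one study can be applied to the columns from the held-out study.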
Pages: 751-758
Page count: 8