Selecting significant genes by randomization test for cancer classification using gene expression data

Cited by: 26

Authors:

Mao, Zhiyi [1,2]

Cai, Wensheng [1,2]

Shao, Xueguang [1,2]

Affiliations:

[1] Nankai Univ, State Key Lab Med Chem Biol, Coll Chem, Tianjin 300071, Peoples R China

[2] Nankai Univ, Res Ctr Analyt Sci, Coll Chem, Tianjin 300071, Peoples R China

Source:

JOURNAL OF BIOMEDICAL INFORMATICS | 2013 / Vol. 46 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

Gene expression data; Randomization test; Partial least squares discriminant analysis; Gene selection; Cancer classification; PARTIAL LEAST-SQUARES; MICROARRAY DATA; MULTIVARIATE CALIBRATION; MOLECULAR CLASSIFICATION; DISCRIMINANT-ANALYSIS; TUMOR CLASSIFICATION; VARIABLE SELECTION; LUNG-CANCER; PREDICTION; COMPONENT;

DOI:

10.1016/j.jbi.2013.03.009

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Gene selection is an important task in bioinformatics studies, because the accuracy of cancer classification generally depends on genes that are biologically relevant to the classification problem. In this work, a randomization test (RT) is used as a gene selection method for gene expression data. In the method, a statistic derived from the statistics of the regression coefficients in a series of partial least squares discriminant analysis (PLSDA) models is used to evaluate the significance of each gene. Informative genes are selected for classifying the four gene expression datasets of prostate cancer, lung cancer, leukemia and non-small cell lung cancer (NSCLC), and the rationality of the results is validated by multiple linear regression (MLR) modeling and principal component analysis (PCA). With the selected genes, satisfactory results can be obtained. (C) 2013 Elsevier Inc. All rights reserved.
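
The abstract does not spell out the exact statistic, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a two-class problem with labels coded as +1/-1, fits a single-response PLS model with a small NIPALS routine, and uses a permutation p-value on the absolute regression coefficient of each gene as a stand-in statistic. The function names `pls1_coefficients` and `randomization_test`, the number of components, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Regression coefficients of a single-response PLS model (NIPALS)."""
    Xk = X - X.mean(axis=0)          # mean-center the expression matrix
    yk = y - y.mean()                # mean-center the class labels
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                # weight vector: covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:             # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                   # scores
        tt = t @ t
        p = Xk.T @ t / tt            # X loadings
        qk = (yk @ t) / tt           # y loading
        Xk = Xk - np.outer(t, p)     # deflate X
        yk = yk - qk * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(qk)
    if not W:
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def randomization_test(X, y, n_components=2, n_permutations=200, seed=0):
    """Permutation p-value per gene (assumed statistic: |coefficient|)."""
    rng = np.random.default_rng(seed)
    observed = np.abs(pls1_coefficients(X, y, n_components))
    exceed = np.zeros(X.shape[1])
    for _ in range(n_permutations):
        permuted = rng.permutation(y)            # break the gene/label link
        null = np.abs(pls1_coefficients(X, permuted, n_components))
        exceed += null >= observed
    return (exceed + 1) / (n_permutations + 1)   # add-one correction


# Synthetic demo: 60 samples, 50 genes, only the first 3 carry class signal.
demo_rng = np.random.default_rng(42)
y_demo = np.repeat([1.0, -1.0], 30)
X_demo = demo_rng.normal(size=(60, 50))
X_demo[:, :3] += 2.0 * y_demo[:, None]
p_values = randomization_test(X_demo, y_demo)
print("informative genes:", np.round(p_values[:3], 3))
print("median of the rest:", round(float(np.median(p_values[3:])), 3))
```

Genes with small p-values would be the candidates to keep. In practice the number of PLS components and the significance threshold would need to be chosen by validation, which this sketch leaves out.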


Pages: 594-601

Number of pages: 8

