Outlier analysis for gene expression data

Cited by: 9
Authors
Yan, C [1]
Chen, GL [1]
Shen, YF [1]
Affiliations
[1] Univ Sci & Technol China, Natl High Performance Computat Ctr, Hefei 230027, Peoples R China
Keywords
gene expression data; outlier analysis; cell-based algorithm; genetic algorithm
DOI
10.1007/BF02944782
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Rapid advances in technologies that generate gene expression arrays provide a global view of the transcription levels of hundreds of thousands of genes simultaneously. Outlier detection in gene data is important but is made difficult by high dimensionality: because data are sparse in high-dimensional space, almost every point looks like a relatively good outlier under traditional distance-based definitions, so finding outliers in high-dimensional data is more complex. This paper discusses some basic outlier analysis algorithms and presents a new genetic algorithm. The algorithm searches for the best dimension projections using a revised cell-based algorithm and provides explanations for the solutions it finds. It can solve the outlier detection problem for gene expression data and for other high-dimensional data as well.
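To make the idea of cell-based outlier detection on a dimension projection concrete, here is a minimal Python sketch. It is not the revised cell-based algorithm or the genetic search from the paper, whose details are not given in this record. It assumes a plain equal-width grid, and the function name sparse_cell_outliers, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def sparse_cell_outliers(points, dims, bins, max_count):
    """Return indices of points lying in sparsely populated grid cells.

    points: list of equal-length numeric sequences
    dims: indices of the dimensions that form the projection (assumed given)
    bins: number of equal-width intervals per projected dimension
    max_count: a cell holding at most this many points counts as sparse
    """
    # Per-dimension range, used to scale each value into a bin index.
    lows = {d: min(p[d] for p in points) for d in dims}
    highs = {d: max(p[d] for p in points) for d in dims}

    def cell_of(p):
        key = []
        for d in dims:
            span = highs[d] - lows[d]
            if span == 0:
                key.append(0)  # constant dimension: everything in one bin
            else:
                idx = int((p[d] - lows[d]) / span * bins)
                key.append(min(idx, bins - 1))  # clamp the maximum value
        return tuple(key)

    cells = [cell_of(p) for p in points]
    counts = Counter(cells)
    return [i for i, c in enumerate(cells) if counts[c] <= max_count]


# Toy data: dimensions 0 and 1 form two clusters plus one stray point;
# dimension 2 is unrelated noise.
data = [
    (0.10, 0.20, 5.0),
    (0.20, 0.10, 1.0),
    (0.15, 0.25, 9.0),
    (0.90, 0.95, 3.0),
    (0.85, 0.90, 7.0),
    (0.95, 0.85, 2.0),
    (0.10, 0.90, 4.0),
]

print(sparse_cell_outliers(data, dims=(0, 1), bins=2, max_count=1))  # [6]
```

On this toy data, projecting onto dimensions 0 and 1 isolates only the stray point, while gridding all three dimensions with the same settings spreads the points over more cells and flags additional ones. This is a small-scale version of the sparsity effect the abstract describes, and it suggests why searching for good low-dimensional projections can be more informative than working in the full space.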
Pages: 13-21
Number of pages: 9