Shrinkage-based similarity metric for cluster analysis of microarray data

Cited by: 24
Authors
Cherepinsky, V
Feng, JW
Rejali, M
Mishra, B
Affiliations
[1] NYU, Courant Inst Math Sci, New York, NY 10012 USA
[2] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
DOI
10.1073/pnas.1633770100
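Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code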
07 ; 0710 ; 09 ;
Abstract
The current standard correlation coefficient used in the analysis of microarray data was introduced by M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein [(1998) Proc. Natl. Acad. Sci. USA 95, 14863-14868]. Its formulation is rather arbitrary. We give a mathematically rigorous correlation coefficient of two data vectors based on James-Stein shrinkage estimators. We adopt the assumptions described by Eisen et al., together with the fact that the data can be treated as transformed into normal distributions. While Eisen et al. use zero as an estimator for the expression vector mean mu, we start with the assumption that, for each gene, mu is itself a zero-mean normal random variable [with a priori distribution N(0, tau^2)], and use Bayesian analysis to obtain the a posteriori distribution of mu in terms of the data. The shrunk estimator for mu differs from the mean of the data vectors and ultimately leads to a statistically robust estimator for correlation coefficients. To evaluate the effectiveness of shrinkage, we conducted in silico experiments and also compared similarity metrics on a biological example by using the data set from Eisen et al. For the latter, we classified genes involved in the regulation of yeast cell-cycle functions by computing clusters based on various definitions of correlation coefficients and contrasting them against clusters based on the activators known in the literature. The estimated false positives and false negatives from this study indicate that using the shrinkage metric improves the accuracy of the analysis.
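The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on two assumptions that the abstract does not state: that the similarity takes the form of a correlation computed about an offset `gamma * mean`, and that `gamma` is supplied by the caller rather than estimated from the data. The function names and the parameters `tau2`, `sigma2`, and `gamma` are hypothetical. The only fact used beyond the abstract is the standard normal-normal result that, for a prior mu ~ N(0, tau^2) and n observations with variance sigma^2, the posterior mean of mu is the sample mean multiplied by tau^2 / (tau^2 + sigma^2 / n).

```python
import math


def shrinkage_factor(tau2, sigma2, n):
    """Weight on the sample mean in the posterior mean of mu.

    Assumes mu ~ N(0, tau2) and n observations x_i | mu ~ N(mu, sigma2);
    this is the textbook conjugate-normal result, not the paper's estimator.
    """
    return tau2 / (tau2 + sigma2 / n)


def offset_correlation(x, y, gamma):
    """Correlation of x and y about the offsets gamma * mean(x), gamma * mean(y).

    Illustrative form (an assumption): gamma = 1 gives the Pearson coefficient,
    gamma = 0 gives the zero-offset coefficient, and values in between shrink
    the estimated mean toward zero.
    """
    n = len(x)
    off_x = gamma * sum(x) / n
    off_y = gamma * sum(y) / n
    dx = [v - off_x for v in x]
    dy = [v - off_y for v in y]
    # Root-mean-square deviation of each vector about its offset
    phi_x = math.sqrt(sum(d * d for d in dx) / n)
    phi_y = math.sqrt(sum(d * d for d in dy) / n)
    return sum(a * b for a, b in zip(dx, dy)) / (n * phi_x * phi_y)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    y = [3.0, 2.0, 1.0]
    for gamma in (0.0, shrinkage_factor(1.0, 1.0, 1), 1.0):
        print(gamma, offset_correlation(x, y, gamma))
```

With these toy vectors the two extreme offsets give very different similarities, which is the sensitivity that a data-driven choice of the offset is meant to address.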
Pages: 9668-9673
Number of pages: 6
Related papers
50 in total
  • [31] Shrinkage-based diagonal Hotelling's tests for high-dimensional small sample size data
    Dong, Kai
    Pang, Herbert
    Tong, Tiejun
    Genton, Marc G.
    JOURNAL OF MULTIVARIATE ANALYSIS, 2016, 143 : 127 - 142
  • [32] Cluster structure inference based on clustering stability with applications to microarray data analysis
    Giurcaneanu, CD
    Tabus, I
    EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2004, 2004 (01) : 64 - 80
  • [33] Novel Method for Lung Tumour Detection Using Wavelet Shrinkage-based Double Classifier Analysis
    Vijila Rani, K.
    Joseph Jawhar, S.
    IETE JOURNAL OF RESEARCH, 2021, 67 (04) : 514 - 531
  • [34] Model-based cluster analysis of microarray gene-expression data
    Wei Pan
    Jizhen Lin
    Chap T Le
    Genome Biology, 3 (2)
  • [35] Cluster Structure Inference Based on Clustering Stability with Applications to Microarray Data Analysis
    Ciprian Doru Giurcăneanu
    Ioan Tăbuş
    EURASIP Journal on Advances in Signal Processing, 2004
  • [36] Cluster Structure Inference Based on Clustering Stability with Applications to Microarray Data Analysis
    Giurcăneanu, C.D. (cipriand@cs.tut.fi), 1600, Hindawi Publishing Corporation (2004):
  • [37] Model-based cluster analysis of microarray gene-expression data
    Pan, Wei
    Lin, Jizhen
    Le, Chap T.
    GENOME BIOLOGY, 2002, 3 (02):
  • [38] SHAPE TRANSFORMATION BASED SIMILARITY METRIC FOR HYPERSPECTRAL DATA
    Deshpande, Shailesh
    Manish, Kausik H.
    Balamuralidhar, P.
    2022 12TH WORKSHOP ON HYPERSPECTRAL IMAGING AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2022,
  • [39] Analysing breast cancer microarrays from African Americans using shrinkage-based discriminant analysis
    Pang H.
    Ebisu K.
    Watanabe E.
    Sue L.Y.
    Tong T.
    Human Genomics, 5 (1) : 5 - 16
  • [40] Thermal Shrinkage-Based Model for Predicting the Voids During Solidification of Lead
    Gudibande, Niranjan
    Iyer, Kannan
    NUCLEAR TECHNOLOGY, 2016, 196 (03) : 674 - 683