Semi-supervised gene shaving method for predicting low variation biological pathways from genome-wide data

Cited by: 2
Authors
Zhu, Dongxiao [1, 2]
Affiliations
[1] Univ New Orleans, Dept Comp Sci, New Orleans, LA 70148 USA
[2] Childrens Hosp, Res Inst Children, New Orleans, LA 70118 USA
Source
BMC BIOINFORMATICS | 2009 / Volume 10
Keywords
SINGULAR-VALUE DECOMPOSITION; EXPRESSION; SET; INFORMATION; PATTERNS; NETWORK;
DOI
10.1186/1471-2105-10-S1-S54
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The gene shaving algorithm and many other clustering algorithms identify gene clusters showing high variation across samples. However, gene expression in many signaling pathways shows only modest and concordant changes that these methods fail to identify. The increasing availability of prior knowledge about signaling pathways provides a new opportunity to solve this problem. Results: We propose a semi-supervised gene clustering algorithm that extends and generalizes the original gene shaving algorithm so that prior knowledge of signaling pathways can be incorporated. Unlike other methods, our method identifies gene clusters showing concerted and modest expression variation as well as strong expression correlation. Using available pathway gene sets as prior knowledge, whether complete or incomplete, our algorithm is capable of forming tightly regulated gene clusters showing modest variation across samples. We demonstrate the advantages of our algorithm over the original gene shaving algorithm using two microarray data sets. The stability of the gene clusters was assessed using a jackknife approach. Conclusion: Our algorithm represents one of the first clustering algorithms specifically designed to identify signaling pathways of low and concordant gene expression variation. The discriminating power is achieved by manufacturing a principal component enriched by signaling pathways.
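The abstract does not spell out the shaving step it builds on. As a rough illustration only, the Python sketch below shows the original, unsupervised gene shaving iteration as it is commonly described: compute the leading principal component of the current gene set, discard the fraction of genes with the smallest absolute inner product with it, and repeat to obtain a nested sequence of clusters. It is not the semi-supervised algorithm of this paper, and it omits cluster size selection and orthogonalization. The function names, the `frac` parameter, and the genes-by-samples layout of `X` are assumptions made for this example.

```python
import numpy as np


def shave_once(X, frac=0.1):
    """One shaving step on a genes x samples matrix.

    Returns the sorted row indices of the genes that are kept.
    Hypothetical helper for illustration; not taken from the paper.
    """
    n_genes = X.shape[0]
    # Center each gene (row) across samples.
    Xc = X - X.mean(axis=1, keepdims=True)
    # The first right singular vector is the leading principal
    # component of the rows, expressed over the samples.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    pc1 = vt[0]
    # Absolute inner product of every gene with the leading component.
    score = np.abs(Xc @ pc1)
    # Keep roughly (1 - frac) of the genes, but always drop at least one
    # and always keep at least one.
    n_keep = max(1, min(n_genes - 1, int(n_genes * (1 - frac))))
    keep = np.argsort(score)[::-1][:n_keep]
    return np.sort(keep)


def shave_sequence(X, frac=0.1):
    """Nested sequence of gene index sets, from all genes down to one."""
    idx = np.arange(X.shape[0])
    clusters = [idx]
    while idx.size > 1:
        idx = idx[shave_once(X[idx], frac)]
        clusters.append(idx)
    return clusters


if __name__ == "__main__":
    # Synthetic data (an assumption for the demo): 10 genes share a
    # strong signal, 40 genes are low-amplitude noise.
    rng = np.random.default_rng(0)
    signal = np.sin(np.linspace(0.0, 2.0 * np.pi, 20))
    X = 0.1 * rng.standard_normal((50, 20))
    X[:10] += 5.0 * signal
    sizes = [c.size for c in shave_sequence(X)]
    print(sizes)
```

Because each step ranks genes by how strongly they load on the highest-variance direction, this unsupervised version favors high-variation clusters, which is the limitation the abstract sets out to address.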
Pages: 12