Semi-supervised gene shaving method for predicting low variation biological pathways from genome-wide data

Cited by: 2
Authors
Zhu, Dongxiao [1, 2]
Affiliations
[1] Univ New Orleans, Dept Comp Sci, New Orleans, LA 70148 USA
[2] Childrens Hosp, Res Inst Children, New Orleans, LA 70118 USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
SINGULAR-VALUE DECOMPOSITION; EXPRESSION; SET; INFORMATION; PATTERNS; NETWORK;
DOI
10.1186/1471-2105-10-S1-S54
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The gene shaving algorithm and many other clustering algorithms identify gene clusters showing high variation across samples. However, gene expression in many signaling pathways shows only modest and concordant changes that these methods fail to identify. The increasing availability of prior knowledge about signaling pathways provides a new opportunity to solve this problem. Results: We propose an innovative semi-supervised gene clustering algorithm in which the original gene shaving algorithm is extended and generalized so that prior knowledge of signaling pathways can be incorporated. Unlike other methods, our method identifies gene clusters showing concerted and modest expression variation as well as strong expression correlation. Using available pathway gene sets as prior knowledge, whether complete or incomplete, our algorithm is capable of forming tightly regulated gene clusters showing modest variation across samples. We demonstrate the advantages of our algorithm over the original gene shaving algorithm using two microarray data sets. The stability of the gene clusters was assessed using a jackknife approach. Conclusion: Our algorithm represents one of the first clustering algorithms designed specifically to identify signaling pathways with low and concordant gene expression variation. The discriminating power is achieved by constructing a principal component enriched by signaling pathways.
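For orientation, the following is a minimal sketch of the shaving loop used by the original (unsupervised) gene shaving idea that the abstract builds on: compute the leading principal component of the row-centered expression matrix, drop the fraction of genes least aligned with it, and repeat to obtain a nested sequence of clusters. It is not the semi-supervised extension proposed in the paper, whose details the abstract does not give. The function names, the power-iteration routine, and the `alpha` shaving fraction are illustrative assumptions, and the cluster-size selection and orthogonalization steps of the full procedure are omitted.

```python
import math
import random


def center_rows(X):
    """Subtract each gene's mean across samples (rows are genes)."""
    centered = []
    for row in X:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    return centered


def leading_direction(X, n_iter=500, seed=0):
    """Approximate the leading right singular vector of X by power iteration.

    A seeded random start is used because a constant start vector is
    orthogonal to every row of a row-centered matrix.
    """
    rng = random.Random(seed)
    n_samples = len(X[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(row, v)) for row in X]  # X v
        w = [sum(scores[i] * X[i][j] for i in range(len(X)))
             for j in range(n_samples)]                             # X^T (X v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def shave(X, alpha=0.1):
    """Return nested lists of gene indices, largest cluster first.

    alpha is the (assumed) fraction of genes shaved off per iteration;
    at least one gene is removed each time.
    """
    Xc = center_rows(X)
    active = list(range(len(Xc)))
    nested = [list(active)]
    while len(active) > 1:
        v = leading_direction([Xc[i] for i in active])
        # Absolute inner product with the leading component, per gene.
        strength = {i: abs(sum(a * b for a, b in zip(Xc[i], v)))
                    for i in active}
        n_drop = max(1, int(alpha * len(active)))
        ranked = sorted(active, key=lambda i: strength[i], reverse=True)
        active = sorted(ranked[:len(active) - n_drop])
        nested.append(list(active))
    return nested
```

Because genes are ranked by the magnitude of their projection onto the leading component, this loop favors high-variance genes, which is the limitation the abstract describes for low-variation pathways.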
Pages: 12
Related papers
50 in total
  • [21] Pathway analysis of genome-wide association study and transcriptome data highlights new biological pathways in colorectal cancer
    Baoku Quan
    Xingsi Qi
    Zhihui Yu
    Yongshuai Jiang
    Mingzhi Liao
    Guangyu Wang
    Rennan Feng
    Liangcai Zhang
    Zugen Chen
    Qinghua Jiang
    Guiyou Liu
    Molecular Genetics and Genomics, 2015, 290 : 603 - 610
  • [22] Predicting cancer subtypes from microarray data using semi-supervised fuzzy C-means algorithm
    Deepthi, P. S.
    Thampi, Sabu M.
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2017, 32 (04) : 2797 - 2805
  • [23] Semi-supervised fuzzy K-NN for cancer classification from microarray gene expression data
    Halder, Anindya
    Misra, Subhashis
    2014 FIRST INTERNATIONAL CONFERENCE ON AUTOMATION, CONTROL, ENERGY & SYSTEMS (ACES-14), 2014, : 266 - 270
  • [24] Semi-supervised Ensemble Learning for Efficient Cancer Sample Classification from miRNA Gene Expression Data
    Dikme Chisil B. Marak
    Anindya Halder
    Ansuman Kumar
    New Generation Computing, 2021, 39 : 487 - 513
  • [25] Semi-supervised Ensemble Learning for Efficient Cancer Sample Classification from miRNA Gene Expression Data
    Marak, Dikme Chisil B.
    Halder, Anindya
    Kumar, Ansuman
    NEW GENERATION COMPUTING, 2021, 39 (3-4) : 487 - 513
  • [26] pcaGoPromoter - An R Package for Biological and Regulatory Interpretation of Principal Components in Genome-Wide Gene Expression Data
    Hansen, Morten
    Gerds, Thomas Alexander
    Nielsen, Ole Haagen
    Seidelin, Jakob Benedict
    Troelsen, Jesper Thorvald
    Olsen, Jorgen
    PLOS ONE, 2012, 7 (02):
  • [27] A loop-counting method for covariate-corrected low-rank biclustering of gene expression and genome-wide association study data
    Rangan, Aaditya V.
    McGrouther, Caroline C.
    Kelsoe, John
    Schork, Nicholas
    Stahl, Eli
    Zhu, Qian
    Krishnan, Arjun
    Yao, Vicky
    Troyanskaya, Olga
    Bilaloglu, Seda
    Raghavan, Preeti
    Bergen, Sarah
    Jureus, Anders
    Landen, Mikael
    Disorders, Bipolar
    PLOS COMPUTATIONAL BIOLOGY, 2018, 14 (05)
  • [28] Prior biological knowledge-based approaches for the analysis of genome-wide expression profiles using gene sets and pathways
    Wu, Michael C.
    Lin, Xihong
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2009, 18 (06) : 577 - 593
  • [29] Integrating genome-wide association studies and gene expression data highlights dysregulated multiple sclerosis risk pathways
    Liu, Guiyou
    Zhang, Fang
    Jiang, Yongshuai
    Hu, Yang
    Gong, Zhongying
    Liu, Shoufeng
    Chen, Xiuju
    Jiang, Qinghua
    Hao, Junwei
    MULTIPLE SCLEROSIS JOURNAL, 2017, 23 (02) : 205 - 212
  • [30] Integrity of genome-wide genotype data from low passage lymphoblastoid cell lines
    McCarthy, Nina S.
    Allan, Spencer M.
    Chandler, David
    Jablensky, Assen
    Morar, Bharti
    GENOMICS DATA, 2016, 9 : 18 - 21