Long non-coding RNAs function annotation: a global prediction method based on bi-colored networks

Cited by: 151
Authors
Guo, Xingli [1, 2, 3]
Gao, Lin [1]
Liao, Qi [2, 4]
Xiao, Hui [2]
Ma, Xiaoke [1]
Yang, Xiaofei [1]
Luo, Haitao [2]
Zhao, Guoguang [2, 5]
Bu, Dechao [2, 5]
Jiao, Fei [6]
Shao, Qixiang [7]
Chen, RunSheng [8, 9]
Zhao, Yi [2]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Shaanxi, Peoples R China
[2] Chinese Acad Sci, Bioinformat Res Grp, Key Lab Intelligent Informat Proc, Adv Comp Res Ctr, Inst Comp Technol, Beijing 100190, Peoples R China
[3] Xidian Univ, Sch Software Engn, Xian 710071, Shaanxi, Peoples R China
[4] Ningbo Univ, Sch Med, Inst Biochem & Mol Biol, Ningbo 315211, Zhejiang, Peoples R China
[5] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[6] Binzhou Med Coll, Dept Biochem & Mol Biol, Yantai 264003, Shandong, Peoples R China
[7] Jiangsu Univ, Sch Med Sci & Lab Med, Dept Immunol, Zhenjiang 212013, Jiangsu, Peoples R China
[8] Chinese Acad Sci, Natl Lab Biomacromol, Inst Biophys, Beijing 100101, Peoples R China
[9] Chinese Acad Sci, Bioinformat Lab, Inst Biophys, Beijing 100101, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
LARGE-SCALE PREDICTION; PROTEIN FUNCTION; GENE; EXPRESSION; PLURIPOTENCY; REVEALS; TRANSCRIPTOMES; IDENTIFICATION; CONSERVATION; ASSOCIATIONS;
DOI
10.1093/nar/gks967
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Mounting evidence demonstrates that long non-coding RNAs (lncRNAs) play key roles in diverse biological processes, and there is a critical need to annotate the functions of the growing number of available lncRNAs. In this article, we apply a global network-based strategy to this problem for the first time. We develop a global function predictor based on a bi-colored network, the long non-coding RNA global function predictor ('lnc-GFP'), which predicts probable functions for lncRNAs at large scale by integrating gene expression data and protein interaction data. The performance of lnc-GFP is evaluated on both protein-coding and lncRNA genes. Cross-validation tests on protein-coding genes with known function annotations indicate that the method can achieve a precision of up to 95% with a suitable parameter setting. Of the 1713 lncRNAs in the bi-colored network, the 1625 (94.9%) that lie in the maximum connected component are all functionally characterized. For lncRNAs expressed in mouse embryonic stem cells and neuronal cells, the putative functions inferred by our method closely match those reported in the literature.
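The abstract does not spell out how functions are propagated across the network, so the following is only a minimal sketch of one common global strategy, a random walk with restart from annotated protein-coding genes. It is an assumed stand-in and not necessarily the lnc-GFP procedure. The function name `propagate_scores`, the restart weight `alpha`, the toy gene names, and the edge weights are all hypothetical; "bi-colored" is taken here to mean a network whose nodes are either protein-coding genes or lncRNAs.

```python
from collections import defaultdict


def propagate_scores(edges, seeds, alpha=0.5, n_iter=200, tol=1e-9):
    """Score every node for one function by random walk with restart.

    edges: iterable of (u, v, weight) undirected edges (assumed to come
           from co-expression or protein interaction data).
    seeds: nodes already annotated with the function of interest.
    alpha: restart weight (hypothetical parameter, not from the source).
    """
    # Build a symmetric weighted adjacency map.
    nbrs = defaultdict(dict)
    for u, v, w in edges:
        nbrs[u][v] = w
        nbrs[v][u] = w

    degree = {n: sum(ws.values()) for n, ws in nbrs.items()}
    prior = {n: 1.0 if n in seeds else 0.0 for n in nbrs}
    score = dict(prior)

    for _ in range(n_iter):
        new = {}
        for n in nbrs:
            # Each neighbor m passes on its score in proportion to the
            # weight of the edge (m, n) relative to m's total weight.
            spread = sum(w * score[m] / degree[m] for m, w in nbrs[n].items())
            new[n] = alpha * prior[n] + (1.0 - alpha) * spread
        delta = max(abs(new[n] - score[n]) for n in nbrs)
        score = new
        if delta < tol:
            break
    return score


# Toy bi-colored network: P* are protein-coding genes, L* are lncRNAs.
toy_edges = [
    ("P1", "P2", 1.0),  # assumed protein interaction
    ("P2", "L1", 0.8),  # assumed co-expression
    ("P3", "P4", 1.0),  # assumed protein interaction
    ("P3", "L2", 0.9),  # assumed co-expression
]
scores = propagate_scores(toy_edges, seeds={"P1", "P2"})
```

In this toy example `L1` receives a positive score because it is linked to an annotated gene, while `L2` sits in a component with no annotated gene and stays at zero. A global method of this kind can therefore only score nodes that are connected, directly or indirectly, to annotated genes.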
Pages: 13