Text Mining in Cancer Gene and Pathway Prioritization

被引:18
作者
Luo, Yuan [1 ]
Riedlinger, Gregory [2 ]
Szolovits, Peter [1 ]
机构
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[2] Massachusetts Gen Hosp, Dept Pathol, Boston, MA 02114 USA
关键词
gene prioritization; text mining; cancer omics; pathway prioritization; machine learning;
D O I
10.4137/CIN.S13874
中图分类号
R73 [肿瘤学];
学科分类号
100214 ;
摘要
Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes.
引用
收藏
页码:69 / 79
页数:11
相关论文
共 125 条
[1]   Speeding disease gene discovery by sequence based candidate prioritization [J].
Adie, EA ;
Adams, RR ;
Evans, KL ;
Porteous, DJ ;
Pickard, BS .
BMC BIOINFORMATICS, 2005, 6 (1)
[2]   SUSPECTS: enabling fast and effective prioritization of positional candidates [J].
Adie, EA ;
Adams, RR ;
Evans, KL ;
Porteous, DJ ;
Pickard, BS .
BIOINFORMATICS, 2006, 22 (06) :773-774
[3]   Gene prioritization through genomic data fusion [J].
Aerts, S ;
Lambrechts, D ;
Maity, S ;
Van Loo, P ;
Coessens, B ;
De Smet, F ;
Tranchevent, LC ;
De Moor, B ;
Marynen, P ;
Hassan, B ;
Carmeliet, P ;
Moreau, Y .
NATURE BIOTECHNOLOGY, 2006, 24 (05) :537-544
[4]   TOUCAN 2: the all-inclusive open source workbench for regulatory sequence analysis [J].
Aerts, S ;
Van Loo, P ;
Thijs, G ;
Mayer, H ;
de Martin, R ;
Moreau, Y ;
De Moor, B .
NUCLEIC ACIDS RESEARCH, 2005, 33 :W393-W396
[5]   The Biomolecular Interaction Network Database and related tools 2005 update [J].
Alfarano, C ;
Andrade, CE ;
Anthony, K ;
Bahroos, N ;
Bajec, M ;
Bantoft, K ;
Betel, D ;
Bobechko, B ;
Boutilier, K ;
Burgess, E ;
Buzadzija, K ;
Cavero, R ;
D'Abreo, C ;
Donaldson, I ;
Dorairajoo, D ;
Dumontier, MJ ;
Dumontier, MR ;
Earles, V ;
Farrall, R ;
Feldman, H ;
Garderman, E ;
Gong, Y ;
Gonzaga, R ;
Grytsan, V ;
Gryz, E ;
Gu, V ;
Haldorsen, E ;
Halupa, A ;
Haw, R ;
Hrvojic, A ;
Hurrell, L ;
Isserlin, R ;
Jack, F ;
Juma, F ;
Khan, A ;
Kon, T ;
Konopinsky, S ;
Le, V ;
Lee, E ;
Ling, S ;
Magidin, M ;
Moniakis, J ;
Montojo, J ;
Moore, S ;
Muskat, B ;
Ng, I ;
Paraiso, JP ;
Parker, B ;
Pintilie, G ;
Pirone, R .
NUCLEIC ACIDS RESEARCH, 2005, 33 :D418-D424
[6]   OMA 2011: orthology inference among 1000 complete genomes [J].
Altenhoff, Adrian M. ;
Schneider, Adrian ;
Gonnet, Gaston H. ;
Dessimoz, Christophe .
NUCLEIC ACIDS RESEARCH, 2011, 39 :D289-D294
[7]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[8]   McKusick's Online Mendelian Inheritance in Man (OMIM®) [J].
Amberger, Joanna ;
Bocchini, Carol A. ;
Scott, Alan F. ;
Hamosh, Ada .
NUCLEIC ACIDS RESEARCH, 2009, 37 :D793-D796
[9]   Gene Ontology: tool for the unification of biology [J].
Ashburner, M ;
Ball, CA ;
Blake, JA ;
Botstein, D ;
Butler, H ;
Cherry, JM ;
Davis, AP ;
Dolinski, K ;
Dwight, SS ;
Eppig, JT ;
Harris, MA ;
Hill, DP ;
Issel-Tarver, L ;
Kasarskis, A ;
Lewis, S ;
Matese, JC ;
Richardson, JE ;
Ringwald, M ;
Rubin, GM ;
Sherlock, G .
NATURE GENETICS, 2000, 25 (01) :25-29
[10]   The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 [J].
Bairoch, A ;
Apweiler, R .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :45-48