Text Mining in Cancer Gene and Pathway Prioritization

Cited by: 18
Authors
Luo, Yuan [1]
Riedlinger, Gregory [2]
Szolovits, Peter [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
[2] Massachusetts Gen Hosp, Dept Pathol, Boston, MA 02114 USA
Keywords
gene prioritization; text mining; cancer omics; pathway prioritization; machine learning
DOI
10.4137/CIN.S13874
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Prioritization of cancer-implicated genes has received growing attention as an effective way to reduce wet-lab cost: computational analysis ranks candidate genes according to the likelihood that experimental verification will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expression, functional annotations, gene regulation, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view of cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing pathways may be more desirable than prioritizing only genes.
Pages: 69-79
Number of pages: 11