Support vector machine model of developmental brain gene expression data for prioritization of Autism risk gene candidates

Cited by: 49
Authors
Cogill, S. [1]
Wang, L. [1]
Affiliations
[1] Clemson Univ, Dept Biochem & Genet, Clemson, SC 29634 USA
Keywords
LONG NONCODING RNAS; SPECTRUM DISORDERS; PREDICTION; KNOWLEDGEBASE; IMPLICATE; EVOLUTION; CHILDREN; INSIGHTS; GENCODE; DNA;
DOI
10.1093/bioinformatics/btw498
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders with clinical heterogeneity and a substantial polygenic component. High-throughput methods for ASD risk gene identification produce numerous candidate genes that are time-consuming and expensive to validate. Prioritization methods can identify high-confidence candidates. Previous ASD gene prioritization methods have focused on a priori knowledge, which excludes genes with little functional annotation or no protein product, such as long non-coding RNAs (lncRNAs).
Results: We have developed a support vector machine (SVM) model, trained using brain developmental gene expression data, for the classification and prioritization of ASD risk genes. The selected feature model had a mean accuracy of 76.7%, a mean specificity of 77.2% and a mean sensitivity of 74.4%. Gene lists composed of an ASD risk gene and its adjacent genes were ranked using the model's decision function output. The known ASD risk genes were ranked on average in the 77.4th, 78.4th and 80.7th percentiles for sets of 101, 201 and 401 genes, respectively. Of 10,840 lncRNA genes, 63 were classified as ASD-associated candidates with a confidence greater than 0.95. Genes previously associated with brain development and neurodevelopmental disorders were prioritized highly within the lncRNA gene list.
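The percentile ranking described in the abstract can be illustrated with a minimal sketch. It assumes that a gene's percentile is the share of the other genes in its list whose decision-function score is strictly lower; the abstract does not state the exact definition used in the paper. The function name `percentile_rank`, the gene labels and the scores are all hypothetical.

```python
def percentile_rank(scores, target):
    """Return the percentage of the other genes scoring strictly below the target.

    `scores` maps gene name -> classifier decision-function output.
    This definition of percentile is an assumption, not taken from the paper.
    """
    target_score = scores[target]
    others = [s for gene, s in scores.items() if gene != target]
    if not others:
        raise ValueError("the gene list needs at least one other gene")
    below = sum(1 for s in others if s < target_score)
    return 100.0 * below / len(others)


# Hypothetical decision-function outputs for one risk gene and four adjacent genes.
scores = {
    "RISK_GENE": 0.8,
    "NEIGHBOUR_A": -1.2,
    "NEIGHBOUR_B": 0.3,
    "NEIGHBOUR_C": 1.1,
    "NEIGHBOUR_D": -0.4,
}

rank = percentile_rank(scores, "RISK_GENE")
print(f"{rank:.1f}")
```

In this toy list, three of the four adjacent genes score below the risk gene. Under the same assumption, a set of 101 genes would compare the risk gene against its 100 neighbours.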
Pages: 3611-3618
Page count: 8