An evaluation of supervised methods for identifying differentially methylated regions in Illumina methylation arrays

Times cited: 76
Authors
Mallik, Saurav [1]
Odom, Gabriel J. [1]
Gao, Zhen [2]
Gomez, Lissette [3]
Chen, Xi [1]
Wang, Lily [1, 4]
Affiliations
[1] Univ Miami, Miller Sch Med, Dept Publ Hlth Sci, Div Biostat, Miami, FL 33136 USA
[2] Sylvester Comprehens Canc Ctr, Miami, FL USA
[3] Univ Miami, John P Hussman Inst Human Genom, Miller Sch Med, Miami, FL 33136 USA
[4] Univ Miami, Miller Sch Med, Dr John T Macdonald Fdn, Dept Human Genet, Miami, FL 33136 USA
Funding
National Institutes of Health (NIH), USA;
Keywords
DMR identification; DNA methylation; epigenome-wide association studies; software comparison; epigenetics; disease; cancer;
DOI
10.1093/bib/bby085
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Epigenome-wide association studies (EWASs) have become increasingly popular for studying DNA methylation (DNAm) variations in complex diseases. The Illumina methylation arrays provide an economical, high-throughput and comprehensive platform for measuring methylation status in EWASs. A number of software tools have been developed for identifying disease-associated differentially methylated regions (DMRs) in the epigenome. However, in practice, we found these tools typically had multiple parameter settings that needed to be specified and the performance of the software tools under different parameters was often unclear. To help users better understand and choose optimal parameter settings when using DNAm analysis tools, we conducted a comprehensive evaluation of 4 popular DMR analysis tools under 60 different parameter settings. In addition to evaluating power, precision, area under precision-recall curve, Matthews correlation coefficient, F1 score and type I error rate, we also compared several additional characteristics of the analysis results, including the size of the DMRs, overlap between the methods and execution time. The results showed that none of the software tools performed best under their default parameter settings, and power varied widely when parameters were changed. Overall, the precision of these software tools was good. In contrast, all methods lacked power when effect size was consistent but small. Across all simulation scenarios, comb-p consistently had the best sensitivity as well as good control of false-positive rate.
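The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function name `dmr_metrics` and the example counts are hypothetical, and it assumes each candidate region in a simulation has already been labelled as a true positive, false positive, true negative or false negative. The false-positive rate FP / (FP + TN) is used here as a stand-in for the type I error rate, and the area under the precision-recall curve is omitted because it requires ranked scores rather than counts.

```python
import math


def dmr_metrics(tp, fp, tn, fn):
    """Compute summary metrics from confusion counts (hypothetical helper)."""

    def ratio(num, den):
        # Return 0.0 when the denominator is zero instead of raising.
        return num / den if den else 0.0

    power = ratio(tp, tp + fn)          # sensitivity / recall
    precision = ratio(tp, tp + fp)      # fraction of called regions that are true
    f1 = ratio(2 * precision * power, precision + power)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)   # Matthews correlation coefficient
    fpr = ratio(fp, fp + tn)            # stand-in for the type I error rate
    return {"power": power, "precision": precision,
            "f1": f1, "mcc": mcc, "fpr": fpr}


# Made-up counts for illustration only; not results from the paper.
example = dmr_metrics(tp=40, fp=10, tn=940, fn=10)
print(example)
```

Reporting the Matthews correlation coefficient alongside F1 is useful in this setting because true DMRs are rare relative to null regions, and the coefficient accounts for all four cells of the confusion table.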
Pages: 2224-2235
Page count: 12