Intra-Cluster Distance Minimization in DNA Methylation Analysis Using an Advanced Tabu-Based Iterative k-Medoids Clustering Algorithm (T-CLUST)

Cited by: 9
Authors
Damgacioglu, Haluk [1]
Celik, Emrah [2]
Celik, Nurcin [1]
Affiliations
[1] Univ Miami, Dept Ind Engn, Coral Gables, FL 33146 USA
[2] Univ Miami, Dept Mech & Aerosp, Coral Gables, FL 33146 USA
Keywords
Biomarker identification; clustering; DNA methylation analysis; k-medoids clustering; outlier detection; CPG-ISLAND METHYLATION; CANCER; MODEL; CLASSIFICATION; EPIGENETICS; DISCOVERY; RECEPTOR
DOI
10.1109/TCBB.2018.2886006
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Recent advances in DNA methylation profiling have paved the way for understanding the underlying epigenetic mechanisms of various diseases such as cancer. While conventional distance-based clustering algorithms (e.g., hierarchical and k-means clustering) have been heavily used in such profiling owing to their speed in high-throughput analysis, these methods commonly converge to suboptimal solutions and/or trivial clusters due to their greedy search nature. Hence, methodologies are needed to improve the quality of the clusters formed by these algorithms without sacrificing their speed. In this study, we introduce three related algorithms for a complete high-throughput methylation analysis: a variance-based dimension reduction algorithm to handle the high dimensionality of the data, an outlier detection algorithm to identify outliers in the data, and an advanced Tabu-based iterative k-medoids clustering algorithm (T-CLUST) to reduce the impact of initial solutions on the performance of the conventional k-medoids algorithm. The performance of the proposed algorithms is demonstrated on nine different real DNA methylation datasets obtained from the Gene Expression Omnibus DataSets database. The accuracy of the cluster identification obtained by our proposed algorithms is higher than that of hierarchical and k-means clustering, as well as the conventional methods. The algorithms are implemented in MATLAB and are available at: http://www.coe.miami.edu/simlab/tclust.html.
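The abstract names two ideas that a small example can make concrete: keeping only the highest-variance features, and using a tabu list so that a k-medoids swap search does not stall at the first local optimum. The sketch below is a generic illustration written for this record, not the authors' T-CLUST (their MATLAB implementation is at the URL above). The function names, the tabu tenure, the iteration limit, and the aspiration rule are all assumptions, and the outlier detection step is omitted because the abstract does not describe it.

```python
import random


def top_variance_features(rows, keep):
    """Return the indices of the `keep` columns with the largest variance."""
    n, dims = len(rows), len(rows[0])
    variances = []
    for j in range(dims):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        variances.append(sum((x - mean) ** 2 for x in col) / n)
    order = sorted(range(dims), key=lambda j: variances[j], reverse=True)
    return sorted(order[:keep])


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def tabu_k_medoids(rows, k, iterations=50, tenure=5, seed=0):
    """Swap-based k-medoids search with a simple tabu list (illustrative only)."""
    rng = random.Random(seed)
    n = len(rows)
    dist = [[euclidean(a, b) for b in rows] for a in rows]
    current = rng.sample(range(n), k)
    best, best_cost = list(current), total_cost(dist, current)
    tabu = {}  # point index -> iteration before which it may not become a medoid again
    for it in range(iterations):
        move, move_cost = None, float("inf")
        for pos in range(k):
            for cand in range(n):
                if cand in current:
                    continue
                trial = current[:pos] + [cand] + current[pos + 1:]
                cost = total_cost(dist, trial)
                is_tabu = tabu.get(cand, -1) > it
                # Aspiration (assumed rule): a tabu move is allowed only if it beats the best so far.
                if is_tabu and cost >= best_cost:
                    continue
                if cost < move_cost:
                    move, move_cost = (pos, cand), cost
        if move is None:
            break
        pos, cand = move
        tabu[current[pos]] = it + tenure  # the removed medoid is barred for a while
        current[pos] = cand  # always move, even if the cost gets worse
        if move_cost < best_cost:
            best, best_cost = list(current), move_cost
    labels = [min(range(k), key=lambda c: dist[i][best[c]]) for i in range(n)]
    return best, labels, best_cost


# Toy data: columns 0 and 1 separate two groups; column 2 is constant.
rows = [[0.0, 0.1, 5.0], [0.1, 0.0, 5.0], [0.2, 0.1, 5.0],
        [10.0, 10.1, 5.0], [10.1, 10.0, 5.0], [10.2, 10.1, 5.0]]
cols = top_variance_features(rows, keep=2)
reduced = [[r[j] for j in cols] for r in rows]
medoids, labels, cost = tabu_k_medoids(reduced, k=2)
```

Because the search always takes the best admissible swap, including worsening ones, it can leave a local optimum, while the separately stored best solution is what gets returned. This is the general mechanism by which a tabu strategy can reduce dependence on the initial medoids.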
Pages: 1241-1252
Page count: 12