Integrating different data types by regularized unsupervised multiple kernel learning with application to cancer subtype discovery

Cited by: 125
Authors
Speicher, Nora K. [1 ,2 ]
Pfeifer, Nico [1 ]
Affiliations
[1] Max Planck Inst Informat, Dept Computat Biol & Appl Algorithm, D-66123 Saarbrucken, Germany
[2] Univ Saarland, Saarbrucken Grad Sch Comp Sci, D-66123 Saarbrucken, Germany
DOI
10.1093/bioinformatics/btv244
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Despite ongoing cancer research, available therapies are still limited in number and effectiveness, and making treatment decisions for individual patients remains a hard problem. Established subtypes, which help guide these decisions, are mainly based on individual data types. However, the analysis of multidimensional patient data involving measurements of various molecular features could reveal intrinsic characteristics of the tumor. Large-scale projects accumulate this kind of data for various cancer types, but we still lack computational methods to reliably integrate this information in a meaningful manner. Therefore, we apply and extend current approaches to multiple kernel learning for dimensionality reduction. First, we add a regularization term to avoid overfitting during the optimization procedure. Second, we show that one can use several kernels per data type, which relieves the user of having to choose the best kernel functions and kernel parameters for each data type beforehand.

Results: We have identified biologically meaningful subgroups for five different cancer types. Survival analysis has revealed significant differences between the survival times of the identified subtypes, with P values comparable to or even better than those of state-of-the-art methods. Moreover, our resulting subtypes reflect combined patterns from the different data sources, and we demonstrate that input kernel matrices carrying little information have less impact on the integrated kernel matrix. Our subtypes show different responses to specific therapies, which could eventually assist in treatment decision making.
Pages: 268-275
Number of pages: 8