A simple model-based approach to variable selection in classification and clustering

Cited by: 3
Authors
Partovi Nia, Vahid [1 ,2 ]
Davison, Anthony C. [3 ]
Affiliations
[1] Polytech Montreal, GERAD Res Ctr, Montreal, PQ J3T 1J4, Canada
[2] Polytech Montreal, Dept Math & Ind Engn, Montreal, PQ J3T 1J4, Canada
[3] Ecole Polytech Fed Lausanne, EPFL FSB MATHAA STAT, CH-1015 Lausanne, Switzerland
Funding
Swiss National Science Foundation; Natural Sciences and Engineering Research Council of Canada
Keywords
Classification; Clustering; high-dimensional data; hierarchical partitioning; Laplace distribution; mixture model; variable selection; EXPRESSION; BAYES
DOI
10.1002/cjs.11241
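Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract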
Clustering and classification of replicated data is often performed using classical techniques that inappropriately treat the data as unreplicated, or by complex modern ones that are computationally demanding. In this paper, we introduce a simple approach based on a spike-and-slab mixture model that is fast, automatic, allows classification, clustering and variable selection in a single framework, and can handle replicated or unreplicated data. Simulation shows that our approach compares well with other recently proposed methods. The ideas are illustrated by application to microarray and metabolomic data. The Canadian Journal of Statistics 43: 157-175; 2015 (c) 2015 Statistical Society of Canada
Pages: 157 - 175
Number of pages: 19
Related papers
50 in total
  • [21] Variable selection for model-based clustering using the integrated complete-data likelihood
    Matthieu Marbac
    Mohammed Sedki
    Statistics and Computing, 2017, 27 : 1049 - 1063
  • [22] Variable Selection for Skewed Model-Based Clustering: Application to the Identification of Novel Sleep Phenotypes
    Wallace, Meredith L.
    Buysse, Daniel J.
    Germain, Anne
    Hall, Martica H.
    Iyengar, Satish
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (521) : 95 - 110
  • [23] A model-based approach to sequence clustering
    Binsztok, H
    Artières, T
    Gallinari, P
    ECAI 2004: 16TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 110 : 420 - 424
  • [24] Model-based clustering and classification of functional data
    Chamroukhi, Faicel
    Nguyen, Hien D.
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2019, 9 (04)
  • [25] Model-based Clustering and Classification for Data Science
    Unwin, Antony
    INTERNATIONAL STATISTICAL REVIEW, 2020, 88 (01) : 263 - 264
  • [26] Special issue on "Model-based clustering and classification"
    Bock, Hans-Hermann
    Ingrassia, Salvatore
    Vermunt, Jeroen K.
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2013, 7 (03) : 237 - 240
  • [27] On Model-Based Clustering, Classification, and Discriminant Analysis
    McNicholas, Paul D.
    JIRSS-JOURNAL OF THE IRANIAN STATISTICAL SOCIETY, 2011, 10 (02) : 181 - 199
  • [28] Variable selection in model-based discriminant analysis
    Maugis, C.
    Celeux, G.
    Martin-Magniette, M-L
    JOURNAL OF MULTIVARIATE ANALYSIS, 2011, 102 (10) : 1374 - 1387
  • [29] Model-based clustering of high-dimensional data: Variable selection versus facet determination
    Poon, Leonard K. M.
    Zhang, Nevin L.
    Liu, Tengfei
    Liu, April H.
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2013, 54 (01) : 196 - 215
  • [30] Variable selection for model-based high-dimensional clustering and its application to microarray data
    Wang, Sijian
    Zhu, Ji
    BIOMETRICS, 2008, 64 (02) : 440 - 448