An adaptive optimization method for estimating the number of components in a Gaussian mixture model

Cited by: 1
Authors
Sun, Shuping [1]
Tong, Yaonan [1]
Zhang, Biqiang [2]
Yang, Bowen [3]
He, Peiguang [2]
Song, Wei [1]
Yang, Wenbo [2]
Wu, Yilin [4]
Liu, Guangyu [2]
Affiliations
[1] Hunan Inst Sci & Technol, Sch Informat Sci & Engn, Yueyang 414006, Peoples R China
[2] Nanyang Inst Technol, Dept Informat Engn, Nanyang 473004, Peoples R China
[3] Univ Chinese Acad Sci UCAS, Sch Integrated Circuits, Beijing 101400, Peoples R China
[4] Nanyang Inst Technol, Dept Intelligent Mfg, Nanyang 473004, Peoples R China
Keywords
GMM; MIGMM; χ² distribution; Mahalanobis distance; Adaptive optimal number; Adaptive interval; IMPROVED EM ALGORITHM; INFORMATION CRITERION; ORDER SELECTION; CREDIT RISK; K-MEANS; IDENTIFICATION; PREDICTION; APPROXIMATION; SYSTEMS
DOI
10.1016/j.jocs.2022.101874
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This study addresses the determination of the number of components ($M$) in a Gaussian mixture model (GMM). It proposes a novel method for adaptively locating an optimal value of $M$ when a GMM is fitted to a given dataset; the method avoids the underfitting and overfitting caused by an unreasonable, manually specified interval. The major contributions are as follows: (1) An adaptive interval for $M$, denoted $M \in [M_{Min}^{Ada}, M_{Max}^{Ada}]$, is determined via an adjustable parameter $\beta$, based on two procedures of a novel method, the modified incremental Gaussian mixture model (MIGMM). (2) Using several typical criteria, the optimal number of components within the adaptive interval $[M_{Min}^{Ada}, M_{Max}^{Ada}]$, denoted $M_{Opt}^{Ada}$, is ultimately determined. Regarding the adaptive interval, extensive experiments with typical synthetic datasets show that $[M_{Min}^{Ada}, M_{Max}^{Ada}]$ is determined with the parameter values $[\beta_{Min} = 10^{-11}, \beta_{Max} = 10^{-2}]$. The performance of the $M_{Opt}^{Ada}$ determination based on several typical criteria is evaluated on both synthetic and real-world datasets.
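The second contribution, choosing the number of components within a given interval by a model-selection criterion, can be illustrated with a minimal sketch. It is not the authors' MIGMM procedure: the interval [1, 6] is picked by hand rather than derived from β, BIC stands in for the "typical criteria" (an assumption), the data are one-dimensional synthetic samples, and the helper names `fit_gmm_1d`, `bic`, and `select_m` are hypothetical.

```python
import math
import random

def fit_gmm_1d(xs, m, iters=100, var_floor=1e-6):
    """Fit a 1-D GMM with m components by plain EM.

    Returns the log-likelihood computed in the last E-step.
    """
    n = len(xs)
    s = sorted(xs)
    # Deterministic init: means at evenly spaced quantiles, shared variance.
    mus = [s[int((k + 0.5) * n / m)] for k in range(m)]
    mean = sum(xs) / n
    var0 = max(sum((x - mean) ** 2 for x in xs) / n, var_floor)
    vars_ = [var0] * m
    ws = [1.0 / m] * m
    loglik = -math.inf
    for _ in range(iters):
        # E-step: responsibilities via log-sum-exp for numerical stability.
        resp = []
        loglik = 0.0
        for x in xs:
            logs = [math.log(ws[k])
                    - 0.5 * math.log(2.0 * math.pi * vars_[k])
                    - 0.5 * (x - mus[k]) ** 2 / vars_[k]
                    for k in range(m)]
            mx = max(logs)
            tot = mx + math.log(sum(math.exp(v - mx) for v in logs))
            loglik += tot
            resp.append([math.exp(v - tot) for v in logs])
        # M-step: update weights, means, and (floored) variances.
        for k in range(m):
            nk = sum(r[k] for r in resp) + 1e-12
            ws[k] = nk / n
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            vars_[k] = max(
                sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                var_floor)
    return loglik

def bic(loglik, m, n):
    # A 1-D GMM has 3m - 1 free parameters (m means, m variances, m - 1 weights).
    return (3 * m - 1) * math.log(n) - 2.0 * loglik

def select_m(xs, m_min, m_max):
    """Return the M in [m_min, m_max] with the lowest BIC, plus all scores."""
    scores = {m: bic(fit_gmm_1d(xs, m), m, len(xs))
              for m in range(m_min, m_max + 1)}
    return min(scores, key=scores.get), scores

# Synthetic data (assumption): three well-separated 1-D clusters.
rng = random.Random(0)
data = [rng.gauss(mu, 0.5) for mu in (-5.0, 0.0, 5.0) for _ in range(100)]

best, scores = select_m(data, 1, 6)  # hand-picked interval, not from MIGMM
print("BIC by M:", {m: round(v, 1) for m, v in scores.items()})
print("selected M:", best)
```

Too narrow a hand-picked interval can exclude the best value, and too wide a one invites overfitted candidates and extra computation; this is the sensitivity to a manually specified interval that the abstract's adaptive interval is meant to avoid.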
Pages: 15