Certifiably optimal sparse principal component analysis

被引:0
|
作者
Lauren Berk
Dimitris Bertsimas
机构
[1] Massachusetts Institute of Technology,Operations Research Center
来源
Mathematical Programming Computation | 2019年 / 11卷
关键词
Sparse principal component analysis; Principal component analysis; Mixed integer optimization; Sparse eigenvalues; 62H25; 65F15; 65K05; 90C06; 90C26; 90C27;
D O I
暂无
中图分类号
学科分类号
摘要
This paper addresses the sparse principal component analysis (SPCA) problem for covariance matrices in dimension n aiming to find solutions with sparsity k using mixed integer optimization. We propose a tailored branch-and-bound algorithm, Optimal-SPCA, that enables us to solve SPCA to certifiable optimality in seconds for n=100\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n = 100$$\end{document} s, k=10\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k=10$$\end{document} s. This same algorithm can be applied to problems with n=10,000s\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=10{,}000\,\mathrm{s}$$\end{document} or higher to find high-quality feasible solutions in seconds while taking several hours to prove optimality. We apply our methods to a number of real data sets to demonstrate that our approach scales to the same problem sizes attempted by other methods, while providing superior solutions compared to those methods, explaining a higher portion of variance and permitting complete control over the desired sparsity. The software that was reviewed as part of this submission has been given the DOI (digital object identifier) https://doi.org/10.5281/zenodo.2027898.
引用
收藏
页码:381 / 420
页数:39
相关论文
共 50 条
  • [1] Certifiably optimal sparse principal component analysis
    Berk, Lauren
    Bertsimasi, Dimitris
    MATHEMATICAL PROGRAMMING COMPUTATION, 2019, 11 (03) : 381 - 420
  • [2] Optimal solutions for sparse principal component analysis
    d'Aspremont, Alexandre
    Bach, Francis
    El Ghaoui, Laurent
    JOURNAL OF MACHINE LEARNING RESEARCH, 2008, 9 : 1269 - 1294
  • [3] Sparse principal component analysis
    Zou, Hui
    Hastie, Trevor
    Tibshirani, Robert
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2006, 15 (02) : 265 - 286
  • [4] Certifiably optimal sparse inverse covariance estimation
    Bertsimas, Dimitris
    Lamperski, Jourdain
    Pauphilet, Jean
    MATHEMATICAL PROGRAMMING, 2020, 184 (1-2) : 491 - 530
  • [5] Certifiably optimal sparse inverse covariance estimation
    Dimitris Bertsimas
    Jourdain Lamperski
    Jean Pauphilet
    Mathematical Programming, 2020, 184 : 491 - 530
  • [6] Robust sparse principal component analysis
    ZHAO Qian
    MENG DeYu
    XU ZongBen
    Science China(Information Sciences), 2014, 57 (09) : 175 - 188
  • [7] Multilinear Sparse Principal Component Analysis
    Lai, Zhihui
    Xu, Yong
    Chen, Qingcai
    Yang, Jian
    Zhang, David
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2014, 25 (10) : 1942 - 1950
  • [8] Robust Sparse Principal Component Analysis
    Croux, Christophe
    Filzmoser, Peter
    Fritz, Heinrich
    TECHNOMETRICS, 2013, 55 (02) : 202 - 214
  • [9] Robust sparse principal component analysis
    Zhao Qian
    Meng DeYu
    Xu ZongBen
    SCIENCE CHINA-INFORMATION SCIENCES, 2014, 57 (09) : 1 - 14
  • [10] Weighted sparse principal component analysis
    Van Deun, Katrijn
    Thorrez, Lieven
    Coccia, Margherita
    Hasdemir, Dicle
    Westerhuis, Johan A.
    Smilde, Age K.
    Van Mechelen, Iven
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2019, 195