High-dimensional penalized Bernstein support vector classifier

Cited by: 0
Authors
Kharoubi, Rachid [1 ]
Mkhadri, Abdallah [2 ]
Oualkacha, Karim [1 ]
Affiliations
[1] Univ Quebec Montreal, Dept Math, Ave President Kennedy, Montreal, PQ H2X 3Y7, Canada
[2] Univ Cadi Ayyad, Dept Math, Marrakech, Morocco
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
SVM; Classification; Bernstein polynomial; Variables selection; Non asymptotic error bound; CANCER; ALGORITHM; SELECTION; MACHINES;
DOI
10.1007/s00180-023-01448-z
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208; 070103; 0714;
Abstract
The support vector machine (SVM) is a powerful classifier for binary classification with strong prediction accuracy. However, the nondifferentiability of the SVM hinge loss can cause computational difficulties in high-dimensional settings. To overcome this problem, we rely on Bernstein polynomials and propose a new smoothed version of the SVM hinge loss, called the Bernstein support vector classifier (BernSVC), which is well suited to the high-dimensional regime. Because the BernSVC objective function is twice differentiable everywhere, we propose two efficient algorithms for computing the solution of the penalized BernSVC: the first combines coordinate descent with the majorization-minimization principle, and the second is an iterative reweighted least squares (IRLS)-type algorithm. Under standard assumptions, we derive a cone condition and a restricted strong convexity property to establish an upper bound on the error of the weighted-lasso BernSVC estimator. Using a local linear approximation, we extend this result to the penalized BernSVC with the nonconvex penalties SCAD and MCP. Our bound holds with high probability and achieves the so-called fast rate under mild conditions on the design matrix. Simulation studies illustrate the prediction accuracy of BernSVC relative to its competitors and compare the two algorithms in terms of computational time and estimation error. The use of the proposed method is illustrated through the analysis of three large-scale real data examples.
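To make the smoothing idea in the abstract concrete, the sketch below approximates the nondifferentiable hinge loss max(0, 1 − u) by its Bernstein polynomial on a bounded interval. This is a generic illustration only, not the paper's exact BernSVC construction: the function name, the interval [−K, K], and the degree n are all assumptions chosen for demonstration.

```python
import numpy as np
from math import comb

def bernstein_smoothed_hinge(u, n=20, K=2.0):
    """Bernstein-polynomial approximation of hinge(u) = max(0, 1 - u) on [-K, K].

    Illustrative sketch only; the degree n, the interval [-K, K], and the
    actual BernSVC loss in the paper may differ.
    """
    u = np.asarray(u, dtype=float)
    u_clip = np.clip(u, -K, K)            # the hinge is affine outside [-K, K]
    t = (u_clip + K) / (2.0 * K)          # map [-K, K] onto [0, 1]
    nodes = -K + 2.0 * K * np.arange(n + 1) / n
    fvals = np.maximum(0.0, 1.0 - nodes)  # hinge evaluated at the Bernstein nodes
    # B_n(f; t) = sum_k f(x_k) * C(n, k) * t^k * (1 - t)^(n - k)
    out = np.zeros_like(t)
    for k in range(n + 1):
        out += fvals[k] * comb(n, k) * t**k * (1.0 - t)**(n - k)
    return out
```

Because the approximant is a polynomial, it is smooth on (−K, K), which is what permits the second-order (coordinate descent and IRLS-type) updates described in the abstract; uniform convergence to the hinge loss as n grows follows from the classical Bernstein approximation theorem.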
Pages: 1909-1936
Page count: 28
Related papers (50 records)
  • [1] High-dimensional penalized Bernstein support vector classifier
    Rachid Kharoubi
    Abdallah Mkhadri
    Karim Oualkacha
    Computational Statistics, 2024, 39: 1909-1936
  • [2] A novel support vector classifier for longitudinal high-dimensional data and its application to neuroimaging data
    Chen S.
    Dubois Bowman F.
    Statistical Analysis and Data Mining, 2011, 4 (06): 604-611
  • [3] Penalized high-dimensional empirical likelihood
    Tang, Cheng Yong
    Leng, Chenlei
    BIOMETRIKA, 2010, 97 (04): 905-919
  • [4] High-dimensional penalized ARCH processes
    Poignard, Benjamin
    Fermanian, Jean-David
    ECONOMETRIC REVIEWS, 2021, 40 (01): 86-107
  • [5] Rank penalized estimators for high-dimensional matrices
    Klopp, Olga
    ELECTRONIC JOURNAL OF STATISTICS, 2011, 5: 1161-1183
  • [6] Robust support vector machine for high-dimensional imbalanced data
    Nakayama, Yugo
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021, 50 (05): 1524-1540
  • [7] FCM Classifier for High-dimensional Data
    Ichihashi, Hidetomo
    Honda, Katsuhiro
    Notsu, Akira
    Miyamoto, Eri
    2008 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5, 2008: 200-206
  • [8] PENALIZED LINEAR REGRESSION WITH HIGH-DIMENSIONAL PAIRWISE SCREENING
    Gong, Siliang
    Zhang, Kai
    Liu, Yufeng
    STATISTICA SINICA, 2021, 31 (01): 391-420
  • [9] High-dimensional vector semantics
    Andrecut, M.
    INTERNATIONAL JOURNAL OF MODERN PHYSICS C, 2018, 29 (02)
  • [10] Penalized Independence Rule for Testing High-Dimensional Hypotheses
    Shen, Yanfeng
    Lin, Zhengyan
    Zhu, Jun
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2011, 40 (13): 2424-2435