Variational Bayes Ensemble Learning Neural Networks With Compressed Feature Space

Cited by: 0
Authors
Liu, Zihuan [1 ]
Bhattacharya, Shrijita [2 ]
Maiti, Tapabrata [2 ]
Affiliations
[1] Yale Univ, Sch Publ Hlth, Collaborat Ctr Stat Sci, New Haven, CT 06520 USA
[2] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
Funding
U.S. National Science Foundation;
Keywords
Computational modeling; Bayes methods; Artificial neural networks; Uncertainty; Training; Predictive models; Numerical models; Intrinsic dimensionality; model averaging; random compression; variational inference (VI); VARIABLE SELECTION; REGRESSION; MODEL; CLASSIFICATION; INFERENCE; THEOREM;
DOI
10.1109/TNNLS.2022.3172276
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
We consider the problem of nonparametric classification from a high-dimensional input vector (the small-n, large-p problem). To handle the high-dimensional feature space, we propose a random projection (RP) of the feature space followed by training of a neural network (NN) on the compressed feature space. Unlike regularization techniques (lasso, ridge, etc.), which train on the full data, NNs based on a compressed feature space have significantly lower computational complexity and memory storage requirements. Nonetheless, a random-compression-based method is often sensitive to the choice of compression. To address this issue, we adopt a Bayesian model averaging (BMA) approach and leverage the posterior model weights to determine: 1) the uncertainty under each compression and 2) the intrinsic dimensionality of the feature space (the effective dimension of the feature space useful for prediction). The final prediction is improved by averaging models whose projected dimensions are close to the intrinsic dimensionality. Furthermore, we propose a variational approach to the aforementioned BMA that allows simultaneous estimation of both the model weights and the model-specific parameters. Since the proposed variational solution is parallelizable across compressions, it preserves the computational gain of frequentist ensemble techniques while providing the full uncertainty quantification of a Bayesian approach. We establish the asymptotic consistency of the proposed algorithm under suitable characterization of the RPs and the prior parameters. Finally, we provide extensive numerical examples for empirical validation of the proposed method.
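The core idea in the abstract, training one model per random compression and averaging predictions by model weights, can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: it substitutes gradient-descent logistic regression for the NN, and a softmax over in-sample log-likelihoods for the variationally estimated posterior model weights; the candidate projected dimensions and all data are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic small-n, large-p binary classification data.
n, p = 100, 500
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                      # only a few features carry signal
y = (X @ beta + rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))  # clipped for stability

def fit_logistic(Z, y, steps=500, lr=0.1):
    """Plain gradient-descent logistic regression on compressed features."""
    w = np.zeros(Z.shape[1])
    for _ in range(steps):
        w -= lr * Z.T @ (sigmoid(Z @ w) - y) / len(y)
    return w

# One model per candidate projected dimension m (hypothetical grid).
models = []
for m in (2, 5, 20, 50):
    R = rng.normal(scale=1.0 / np.sqrt(m), size=(p, m))  # Gaussian RP matrix
    w = fit_logistic(X @ R, y)
    prob = sigmoid(X @ R @ w)
    loglik = np.sum(y * np.log(prob + 1e-12) + (1 - y) * np.log(1 - prob + 1e-12))
    models.append((R, w, loglik))

# BMA-style weights: softmax of log-likelihoods, a crude stand-in for the
# posterior model weights the paper estimates via variational inference.
logliks = np.array([ll for _, _, ll in models])
weights = np.exp(logliks - logliks.max())
weights /= weights.sum()

# Model-averaged predictive probability for a new input.
x_new = rng.normal(size=p)
p_avg = sum(wt * sigmoid(x_new @ R @ w) for (R, w, _), wt in zip(models, weights))
```

Because each compression's model is fit independently, the loop over projected dimensions parallelizes trivially, which is the computational property the abstract highlights for the variational ensemble.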
Pages: 1379-1385
Page count: 7