Large-Scale Heteroscedastic Regression via Gaussian Process

Cited by: 11
Authors
Liu, Haitao [1 ]
Ong, Yew-Soon [2 ]
Cai, Jianfei [3 ]
Affiliations
[1] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore 637460, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[3] Monash Univ, Dept Data Sci & Artificial Intelligence, Melbourne, Vic 3800, Australia
Funding
National Research Foundation of Singapore;
Keywords
Training; Scalability; Stochastic processes; Kernel; Bayes methods; Standards; Complexity theory; Distributed learning; heteroscedastic GP (HGP); large scale; sparse approximation; stochastic variational inference; LAPLACE APPROXIMATION; VARIATIONAL INFERENCE; FRAMEWORK;
DOI
10.1109/TNNLS.2020.2979188
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Heteroscedastic regression, which accounts for varying noise levels among observations, has many applications in fields such as machine learning and statistics. Here, we focus on heteroscedastic Gaussian process (HGP) regression, which integrates the latent function and the noise function in a unified nonparametric Bayesian framework. Despite its remarkable performance, HGP suffers from cubic time complexity, which severely limits its application to big data. To improve scalability, we first develop a variational sparse inference algorithm, named VSHGP, to handle large-scale data sets. Furthermore, two variants are developed to improve the scalability and capability of VSHGP. The first is stochastic VSHGP (SVSHGP), which derives a factorized evidence lower bound and thus enables efficient stochastic variational inference. The second is distributed VSHGP (DVSHGP), which follows the Bayesian committee machine formalism to distribute computations over multiple local VSHGP experts with many inducing points, and adopts hybrid parameters for the experts to guard against overfitting and to capture local variety. The superiority of DVSHGP and SVSHGP over existing scalable HGP/homoscedastic GP methods is then extensively verified on various data sets.
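To make the modeling idea concrete, the following is a minimal NumPy sketch (not the authors' VSHGP implementation) of heteroscedastic GP regression: each observation contributes its own noise variance to the diagonal of the covariance matrix, in contrast to the single shared noise term of a homoscedastic GP. For simplicity the per-point noise variances are assumed known here, whereas the HGP in the paper places a second GP prior on the log-noise function; the kernel hyperparameters are fixed rather than learned. The exact solve below is the O(n^3) step that VSHGP's sparse approximation avoids.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between row-vector inputs A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

# Toy data whose noise level grows with x (heteroscedastic).
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 50)[:, None]
noise_var = 0.01 + 0.2 * X[:, 0]                     # input-dependent noise
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0.0, np.sqrt(noise_var))

# Heteroscedastic posterior mean: per-point noise on the diagonal.
# The linear solve costs O(n^3), the bottleneck that motivates VSHGP.
K = rbf(X, X)
Ky = K + np.diag(noise_var)
X_test = np.array([[0.5]])
mean = rbf(X_test, X) @ np.linalg.solve(Ky, y)
```

A homoscedastic GP would replace `np.diag(noise_var)` with `sigma2 * np.eye(n)`; the per-point diagonal is what lets low-noise regions pin the fit down while high-noise regions are smoothed more aggressively.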
Pages: 708-721 (14 pages)