Hierarchical Routing Mixture of Experts

Cited by: 2
Authors
Zhao, Wenbo [1 ]
Gao, Yang [1 ]
Memon, Shahan Ali [1 ]
Raj, Bhiksha [1 ]
Singh, Rita [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
SUPPORT VECTOR MACHINES; APPROXIMATION; PREDICTION;
DOI
10.1109/ICPR48806.2021.9412813
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In regression tasks, the data distribution is often too complex to be fitted by a single model. Partition-based models address this by dividing the data and fitting a local model to each part. However, such models partition only the input space and do not exploit the input-output dependency of multimodally distributed data, so strong local models are needed to make good predictions. To address these problems, we propose a binary tree-structured hierarchical routing mixture of experts (HRME) model that has classifiers as non-leaf node experts and simple regression models as leaf node experts. The classifier nodes jointly soft-partition the input-output space based on the natural separateness of multimodal data, which enables simple leaf experts to be effective for prediction. Further, we develop a probabilistic framework for the HRME model and propose a recursive Expectation-Maximization (EM) based algorithm to learn both the tree structure and the expert models. Experiments on a collection of regression tasks validate our method's effectiveness compared to various other regression models.
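To make the routing idea concrete, here is a minimal depth-1 sketch of this family of models, not the paper's implementation: one logistic classifier acts as the non-leaf routing node, two linear regressors act as leaf experts, and the parameters are refined by EM. The synthetic two-regime data, the median-split initialization, and all step sizes and iteration counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data with two input-dependent regimes (not from the paper):
# y follows one line for x < 0 and a different line for x >= 0.
n = 400
x = rng.uniform(-3.0, 3.0, size=n)
y = np.where(x < 0, 2.0 * x + 1.0, -1.5 * x + 4.0) + rng.normal(0.0, 0.3, size=n)
X = np.column_stack([np.ones(n), x])          # design matrix with bias column

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Depth-1 "tree": a logistic classifier as the routing node, two linear
# leaf experts. Initialize the experts from a median split as a heuristic.
w_gate = np.zeros(2)
w_expert = np.zeros((2, 2))
sigma2 = np.ones(2)
for k, mask in enumerate([x < np.median(x), x >= np.median(x)]):
    w_expert[k] = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]

for _ in range(30):
    # E-step: posterior responsibility of each expert per sample, combining
    # the gate's routing probability with each expert's Gaussian likelihood.
    g = sigmoid(X @ w_gate)
    prior = np.column_stack([g, 1.0 - g])
    resid = y[:, None] - X @ w_expert.T
    lik = np.exp(-0.5 * resid**2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
    post = prior * lik
    post /= post.sum(axis=1, keepdims=True)

    # M-step: weighted least squares for each leaf expert, then gradient
    # ascent on the gate's log-likelihood toward the posteriors.
    for k in range(2):
        Xw = post[:, k:k + 1] * X             # responsibility-weighted rows
        w_expert[k] = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        sigma2[k] = (post[:, k] * (y - X @ w_expert[k])**2).sum() / post[:, k].sum()
    for _ in range(50):
        w_gate += 0.5 * X.T @ (post[:, 0] - sigmoid(X @ w_gate)) / n

# Prediction: route each input softly through the gate to the leaf experts.
g = sigmoid(X @ w_gate)
y_hat = g * (X @ w_expert[0]) + (1.0 - g) * (X @ w_expert[1])
mse = np.mean((y - y_hat) ** 2)
```

Because the gate soft-partitions on the input while the responsibilities also use the output likelihood, each leaf expert only has to fit one simple regime; the paper's HRME extends this recursively to a learned binary tree of such routing nodes.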
Pages: 7900-7906 (7 pages)