Hierarchical Routing Mixture of Experts

Cited by: 2
Authors
Zhao, Wenbo [1 ]
Gao, Yang [1 ]
Memon, Shahan Ali [1 ]
Raj, Bhiksha [1 ]
Singh, Rita [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
SUPPORT VECTOR MACHINES; APPROXIMATION; PREDICTION;
DOI
10.1109/ICPR48806.2021.9412813
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In regression tasks, the data distribution is often too complex to be fitted by a single model. Partition-based models address this by dividing the data and fitting a local model to each part. However, such models partition only the input space and do not exploit the input-output dependency of multimodally distributed data, so strong local models are needed to make good predictions. To address these problems, we propose a binary tree-structured hierarchical routing mixture of experts (HRME) model that has classifiers as non-leaf node experts and simple regression models as leaf node experts. The classifier nodes jointly soft-partition the input-output space based on the natural separateness of multimodal data, which enables simple leaf experts to be effective for prediction. Further, we develop a probabilistic framework for the HRME model and propose a recursive Expectation-Maximization (EM) based algorithm to learn both the tree structure and the expert models. Experiments on a collection of regression tasks validate our method's effectiveness compared to various other regression models.
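To make the routing idea concrete, here is a minimal depth-1 sketch of this family of models, not the paper's implementation: one logistic classifier acts as the non-leaf routing node, two linear regressors act as leaf experts, and the parameters are refined by EM. The synthetic two-regime data, the median-split initialization, and all step sizes and iteration counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data with two input-dependent regimes (not from the paper):
# y follows one line for x < 0 and a different line for x >= 0.
n = 400
x = rng.uniform(-3.0, 3.0, size=n)
y = np.where(x < 0, 2.0 * x + 1.0, -1.5 * x + 4.0) + rng.normal(0.0, 0.3, size=n)
X = np.column_stack([np.ones(n), x])          # design matrix with bias column

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Depth-1 "tree": a logistic classifier as the routing node, two linear
# leaf experts. Initialize the experts from a median split as a heuristic.
w_gate = np.zeros(2)
w_expert = np.zeros((2, 2))
sigma2 = np.ones(2)
for k, mask in enumerate([x < np.median(x), x >= np.median(x)]):
    w_expert[k] = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]

for _ in range(30):
    # E-step: posterior responsibility of each expert per sample, combining
    # the gate's routing probability with each expert's Gaussian likelihood.
    g = sigmoid(X @ w_gate)
    prior = np.column_stack([g, 1.0 - g])
    resid = y[:, None] - X @ w_expert.T
    lik = np.exp(-0.5 * resid**2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
    post = prior * lik
    post /= post.sum(axis=1, keepdims=True)

    # M-step: weighted least squares for each leaf expert, then gradient
    # ascent on the gate's log-likelihood toward the posteriors.
    for k in range(2):
        Xw = post[:, k:k + 1] * X             # responsibility-weighted rows
        w_expert[k] = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        sigma2[k] = (post[:, k] * (y - X @ w_expert[k])**2).sum() / post[:, k].sum()
    for _ in range(50):
        w_gate += 0.5 * X.T @ (post[:, 0] - sigmoid(X @ w_gate)) / n

# Prediction: route each input softly through the gate to the leaf experts.
g = sigmoid(X @ w_gate)
y_hat = g * (X @ w_expert[0]) + (1.0 - g) * (X @ w_expert[1])
mse = np.mean((y - y_hat) ** 2)
```

Because the gate soft-partitions on the input while the responsibilities also use the output likelihood, each leaf expert only has to fit one simple regime; the paper's HRME extends this recursively to a learned binary tree of such routing nodes.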
Pages: 7900-7906 (7 pages)