Mixture of Experts for Intelligent Networks: A Large Language Model-enabled Approach

Cited by: 0
Authors
Du, Hongyang [1 ]
Liu, Guangyuan [1 ]
Lin, Yijing [2 ]
Niyato, Dusit [1 ]
Kang, Jiawen [3 ,4 ,5 ]
Xiong, Zehui [6 ]
Kim, Dong In [7 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore
[2] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[3] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
[4] Minist Educ, Key Lab Intelligent Informat Proc & Syst Integrat, Guangzhou 510006, Peoples R China
[5] Guangdong HongKong Macao Joint Lab Smart Discrete, Guangzhou 510006, Peoples R China
[6] Singapore Univ Technol & Design, Pillar Informat Syst Technol & Design, Singapore 487372, Singapore
[7] Sungkyunkwan Univ, Dept Elect & Comp Engn, Suwon 16419, South Korea
Source
20TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC 2024 | 2024
Funding
National Natural Science Foundation of China; National Research Foundation of Singapore;
Keywords
Generative AI (GAI); large language model; mixture of experts; network optimization;
DOI
10.1109/IWCMC61514.2024.10592370
Chinese Library Classification (CLC)
TP301 [Theory and Methods];
Subject Classification Code
081202;
Abstract
Optimizing the diverse tasks of wireless users poses a significant challenge for networking systems because of the expanding range of user requirements. Despite advances in Deep Reinforcement Learning (DRL), the need for customized optimization for each user complicates the development and deployment of numerous DRL models, consuming substantial computational resources and energy and potentially yielding inconsistent outcomes. To address this issue, we propose a novel approach that utilizes a Mixture of Experts (MoE) framework, augmented with Large Language Models (LLMs), to analyze user objectives and constraints, select specialized DRL experts, and weigh each participating expert's decision. Specifically, we develop a gate network to oversee the expert models, allowing a collective of experts to tackle a wide array of new tasks. Furthermore, we substitute the traditional gate network with an LLM, leveraging its advanced reasoning capabilities to manage expert selection for joint decisions. The proposed method reduces the need to train a new DRL model for each unique optimization problem, decreasing energy consumption and AI model implementation costs. The LLM-enabled MoE approach is validated through a general maze navigation task and a specific network service provider utility maximization task, demonstrating its effectiveness and practical applicability in optimizing complex networking systems.
Pages: 531-536
Page count: 6