Attention-based variable-size feature compression module for edge inference

Cited by: 2

Authors:

Li, Shibao [1]

Ma, Chenxu [1]

Zhang, Yunwu [1]

Li, Longfei [1]

Wang, Chengzhi [1]

Cui, Xuerong [1]

Liu, Jianhang [2]

Affiliations:

[1] China Univ Petr East China, Coll Oceanog & Space Informat, Qingdao 266580, Peoples R China

[2] China Univ Petr East China, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China

Source:

JOURNAL OF SUPERCOMPUTING | 2024 / Volume 80 / Issue 06

Funding:

National Natural Science Foundation of China;

Keywords:

Edge AI; Edge inference; Feature compression; Attention mechanism; Semantic segmentation; Network;

DOI:

10.1007/s11227-023-05779-y

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Artificial intelligence has made significant breakthroughs in many fields, and the broad deployment of edge devices provides opportunities to develop and apply various intelligent models in edge networks. Edge device-server co-inference systems have gradually become the mainstream approach in edge intelligent computing. However, existing feature-processing methods in the edge inference framework do not consider whether individual features are important, so the processed features remain redundant, which reduces inference efficiency. In this paper, we propose a novel attention-based variable-size feature compression module that improves the inference efficiency of edge systems by exploiting the varying importance of the input data. First, a multi-scale attention mechanism is introduced that operates jointly across the channel and spatial dimensions to compute importance weights from the intermediate output features of the edge device. These weights are then used to assign different transmission probabilities, filtering out irrelevant feature data and prioritizing task-relevant information. Second, a new loss function and a progressive model training strategy are designed to optimize the proposed module, enabling the model to adapt gradually to the reduced feature data. Finally, experimental results on the CIFAR-10 and ImageNet datasets demonstrate the effectiveness of the proposed solution, showing a significant reduction in the data output volume of edge devices and in communication overhead, with minimal loss in model accuracy.
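The filtering step described in the abstract (importance weights mapped to transmission probabilities) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the hand-crafted channel and spatial scores stand in for the learned multi-scale attention, and the names `importance_weights`, `compress`, and `keep_budget` are hypothetical.

```python
import random


def importance_weights(feat):
    """Return a weight in [0, 1] for every element of a C x H x W feature map.

    ASSUMPTION: mean absolute activation is used as a stand-in for the
    learned channel and spatial attention described in the abstract.
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    # Channel score: mean absolute activation of each channel.
    chan = [sum(abs(v) for row in ch for v in row) / (H * W) for ch in feat]
    # Spatial score: mean absolute activation across channels per position.
    spat = [[sum(abs(feat[c][i][j]) for c in range(C)) / C for j in range(W)]
            for i in range(H)]
    # Joint channel-spatial score, normalised by its maximum.
    raw = [[[chan[c] * spat[i][j] for j in range(W)] for i in range(H)]
           for c in range(C)]
    peak = max(v for ch in raw for row in ch for v in row) or 1.0
    return [[[v / peak for v in row] for row in ch] for ch in raw]


def compress(feat, keep_budget=1.0, seed=0):
    """Sample each element for transmission with probability tied to its weight.

    Returns a variable-size list of (channel, row, col, value) tuples.
    `keep_budget` is a hypothetical knob that scales all probabilities.
    """
    rng = random.Random(seed)
    weights = importance_weights(feat)
    packet = []
    for c, ch in enumerate(feat):
        for i, row in enumerate(ch):
            for j, v in enumerate(row):
                p = min(1.0, keep_budget * weights[c][i][j])
                if rng.random() < p:  # p == 0 never sends, p == 1 always sends
                    packet.append((c, i, j, v))
    return packet


if __name__ == "__main__":
    feature_map = [[[0.0, 0.0], [0.0, 4.0]],
                   [[0.0, 0.0], [0.0, 0.0]]]
    print(compress(feature_map))
```

The size of the returned packet depends on the input, which is the sense in which the compressed output is variable-size; the training-side components mentioned in the abstract (the loss and the progressive training strategy) are not modeled here.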

Citation

Pages: 8469-8484

Number of pages: 16

References: 44 in total

[1] Anwar S, 2015, INT CONF ACOUST SPEE, P1131, DOI 10.1109/ICASSP.2015.7178146

[2] Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848

[3] Edge Intelligence: The Confluence of Edge Computing and Artificial Intelligence [J].

Deng, Shuiguang; Zhao, Hailiang; Fang, Weijia; Yin, Jianwei; Dustdar, Schahram; Zomaya, Albert Y.

IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (08): 7457-7469

[4] Ding C, 2022, P IEEE T MOB COMP, P1

[5] Towards Transmission-Friendly and Robust CNN Models over Cloud and Device [J].

Ding, Chuntao; Lu, Zhichao; Juefei-Xu, Felix; Boddeti, Vishnu Naresh; Li, Yidong; Cao, Jiannong.

IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (10): 6176-6189

[6] JointDNN: An Efficient Training and Inference Engine for Intelligent Mobile Cloud Computing Services [J].

Eshratifar, Amir Erfan; Abrishami, Mohammad Saeed; Pedram, Massoud.

IEEE TRANSACTIONS ON MOBILE COMPUTING, 2021, 20 (02): 565-576

[7] Gao H, 2023, P IEEE T VEH TECHN, P1

[8] Neural Collaborative Learning for User Preference Discovery From Biased Behavior Sequences [J].

Gao, Honghao; Wu, Yinchen; Xu, Yueshen; Li, Rui; Jiang, Zhiping.

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, 11 (04): 5068-5078

[9] CAMRL: A Joint Method of Channel Attention and Multidimensional Regression Loss for 3D Object Detection in Automated Vehicles [J].

Gao, Honghao; Fang, Danqing; Xiao, Junsheng; Hussain, Walayat; Kim, Jung Yoon.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (08): 8831-8845

[10] A Mutually Supervised Graph Attention Network for Few-Shot Segmentation: The Perspective of Fully Utilizing Limited Samples [J].

Gao, Honghao; Xiao, Junsheng; Yin, Yuyu; Liu, Tong; Shi, Jiangang.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04): 4826-4838
