Latency-Aware Unified Dynamic Networks for Efficient Image Recognition

Cited by: 1
Authors
Han, Yizeng [1]
Liu, Zeyu [2]
Yuan, Zhihang [3]
Pu, Yifan [1]
Wang, Chaofei [1]
Song, Shiji [1]
Huang, Gao [4]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[3] Houmo AI, Beijing 100088, Peoples R China
[4] Tsinghua Univ, Beijing Acad Artificial Intelligence, Dept Automat, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Dynamic networks; efficient inference; convolutional neural networks; vision transformers
DOI
10.1109/TPAMI.2024.3393530
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dynamic networks have become a pivotal area of study in deep learning due to their ability to selectively activate computing units (such as layers or channels) or dynamically allocate computation to information-rich regions. This capability significantly curtails unnecessary computation by adapting to varying inputs. Despite these advantages, the practical efficiency of dynamic models often falls short of what their theoretical computation suggests. This discrepancy arises from three primary challenges: 1) the lack of a unified framework across different dynamic inference paradigms, owing to the fragmented research landscape; 2) an excessive focus on algorithm design at the expense of scheduling strategies, which are essential for optimizing resource utilization on hardware; and 3) the complexity of latency evaluation, since most current libraries cater to static operators. To tackle these issues, we introduce Latency-Aware Unified Dynamic Networks (LAUDNet), a general framework that integrates three fundamental dynamic paradigms (spatially-adaptive computation, layer skipping, and channel skipping) into a single unified formulation. LAUDNet not only refines algorithmic design but also enhances scheduling optimization with the aid of a latency predictor, which efficiently and accurately predicts the inference latency of dynamic operators on specific hardware setups. Our empirical assessments across multiple vision tasks (image classification, object detection, and instance segmentation) confirm that LAUDNet significantly narrows the gap between theoretical and practical efficiency. For instance, LAUDNet cuts the practical latency of its static counterpart, ResNet-101, by over 50% on hardware platforms such as V100, RTX 3090, and TX2 GPUs. Additionally, LAUDNet achieves a better accuracy-efficiency trade-off than other methods.
Pages: 7760-7774
Number of pages: 15