Latency-Aware Unified Dynamic Networks for Efficient Image Recognition

Cited by: 1
Authors
Han, Yizeng [1]
Liu, Zeyu [2]
Yuan, Zhihang [3]
Pu, Yifan [1]
Wang, Chaofei [1]
Song, Shiji [1]
Huang, Gao [4]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[3] Houmo AI, Beijing 100088, Peoples R China
[4] Tsinghua Univ, Beijing Acad Artificial Intelligence, Dept Automat, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Dynamic networks; efficient inference; convolutional neural networks; vision transformers
DOI
10.1109/TPAMI.2024.3393530
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dynamic networks have become a pivotal area of study in deep learning due to their ability to selectively activate computing units (such as layers or channels) or dynamically allocate computation to information-rich regions. This capability significantly curtails unnecessary computation by adapting to varying inputs. Despite these advantages, the practical efficiency of dynamic models often falls short of their theoretical efficiency. This discrepancy arises from three primary challenges: 1) the lack of a unified framework across different dynamic inference paradigms, owing to the fragmented research landscape; 2) an excessive focus on algorithm design at the expense of scheduling strategies, which are essential for optimizing resource utilization on hardware; and 3) the complexity of latency evaluation, since most current libraries cater to static operators. To tackle these issues, we introduce Latency-Aware Unified Dynamic Networks (LAUDNet), a general framework that integrates three fundamental dynamic paradigms (spatially-adaptive computation, layer skipping, and channel skipping) into a single unified formulation. LAUDNet not only refines algorithmic design but also enhances scheduling optimization with the aid of a latency predictor. This predictor efficiently and accurately predicts the inference latency of dynamic operators on specific hardware setups. Our empirical assessments across multiple vision tasks (image classification, object detection, and instance segmentation) confirm that LAUDNet significantly narrows the gap between theoretical and practical efficiency. For instance, LAUDNet cuts the practical latency of its static counterpart, ResNet-101, by over 50% on hardware platforms such as V100, RTX 3090, and TX2 GPUs. Additionally, LAUDNet achieves a better accuracy-efficiency trade-off than other methods.
Pages: 7760-7774
Page count: 15