Attentive Temporal Pyramid Network for Dynamic Scene Classification

Cited by: 0
Authors
Huang, Yuanjun [1 ,2 ,3 ,4 ]
Cao, Xianbin [1 ,3 ,4 ]
Zhen, Xiantong [1 ,3 ,4 ]
Han, Jungong [2 ]
Affiliations
[1] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
[2] Univ Lancaster, Lancaster LA1 4YW, England
[3] Beihang Univ, Minist Ind & Informat Technol China, Key Lab Adv technol Near Space Informat Syst, Beijing, Peoples R China
[4] Beijing Adv Innovat Ctr Big Data Based Precis Med, Beijing, Peoples R China
Source
THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019
Funding
National Science Foundation (United States);
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dynamic scene classification is an important yet challenging problem, especially in the presence of defective or irrelevant frames caused by unconstrained imaging conditions such as illumination, camera motion, and irrelevant background. In this paper, we propose the attentive temporal pyramid network (ATP-Net), which builds effective representations of dynamic scenes by extracting and aggregating the most informative and discriminative features. ATP-Net uses a temporal pyramid structure with an incorporated attention mechanism to detect the informative features of the frames that are most relevant to a scene. A newly designed kernel aggregation layer, based on kernel approximation, then fuses these frame features into a discriminative holistic representation of the dynamic scene. ATP-Net thus combines the strength of the attention mechanism in selecting the most relevant frame features with the ability of kernels to achieve optimal feature fusion for discriminative representations of dynamic scenes. Extensive experiments and comparisons on three benchmark datasets show that ATP-Net outperforms state-of-the-art methods on all three.
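The pipeline described in the abstract (attention over frames, a temporal pyramid of segments, then a kernel-approximation fusion step) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the attention form, the pyramid depth, or the kernel approximation used. The sketch assumes a linear attention score with a softmax, a pyramid that halves segments at each level, and random Fourier features as the kernel approximation. All function names and parameters (`attentive_pool`, `temporal_pyramid`, `kernel_aggregate`, `levels`, `omegas`, `phases`) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(frames, w):
    """Attention-weighted average of the frame features in one segment.

    Assumption: the attention score of a frame is a dot product with w.
    """
    scores = [sum(wi * xi for wi, xi in zip(w, f)) for f in frames]
    alphas = softmax(scores)
    dim = len(frames[0])
    return [sum(a * f[d] for a, f in zip(alphas, frames)) for d in range(dim)]


def temporal_pyramid(frames, w, levels=3):
    """Pool the video at several temporal scales.

    Assumption: level l splits the frame sequence into 2**l equal segments.
    """
    n = len(frames)
    if n < 2 ** (levels - 1):
        raise ValueError("not enough frames for the requested pyramid depth")
    pooled = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            start, end = p * n // parts, (p + 1) * n // parts
            pooled.append(attentive_pool(frames[start:end], w))
    return pooled


def random_fourier_features(x, omegas, phases):
    """Map x so that inner products approximate a Gaussian kernel."""
    scale = math.sqrt(2.0 / len(omegas))
    return [
        scale * math.cos(sum(o * xi for o, xi in zip(omega, x)) + b)
        for omega, b in zip(omegas, phases)
    ]


def kernel_aggregate(segment_feats, omegas, phases):
    """Fuse segment features by averaging their approximate kernel maps."""
    mapped = [random_fourier_features(f, omegas, phases) for f in segment_feats]
    return [sum(m[j] for m in mapped) / len(mapped) for j in range(len(omegas))]


# Toy data: 8 frames with 4-dimensional features (stand-ins for CNN features).
random.seed(0)
frames = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
w = [random.gauss(0, 1) for _ in range(4)]          # hypothetical attention weights
omegas = [[random.gauss(0, 1) for _ in range(4)] for _ in range(16)]
phases = [random.uniform(0, 2 * math.pi) for _ in range(16)]

segments = temporal_pyramid(frames, w, levels=3)    # 1 + 2 + 4 = 7 segments
video_repr = kernel_aggregate(segments, omegas, phases)
print(len(video_repr))  # 16, the number of random features
```

In a trained network the attention weights and the kernel map would be learned end to end; here they are random values chosen only to show how the pieces connect.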
Pages: 8497 - 8504
Page count: 8
Related papers
50 in total
  • [1] CAM-NET: COMPRESSED ATTENTIVE MULTI-GRANULARITY NETWORK FOR DYNAMIC SCENE CLASSIFICATION
    Li, Yue
    Ding, Wenrui
    Zhu, Yanjun
    Huang, Yuanjun
    Jiang, Yalong
    Zhang, Baochang
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 668 - 672
  • [2] Semantic Information Supplementary Pyramid Network for Dynamic Scene Deblurring
    Liu, Yiming
    Luo, Yifei
    Huang, Wenzhuo
    Qiao, Ying
    Li, Junhui
    Xu, Dahong
    Luo, Duqiang
    IEEE ACCESS, 2020, 8 : 188587 - 188599
  • [3] Feature pyramid attention network for audio-visual scene classification
    Zhou, Liguang
    Zhou, Yuhongze
    Qi, Xiaonan
    Hu, Junjie
    Lam, Tin Lun
    Xu, Yangsheng
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2024,
  • [4] Pyramid Scene Parsing Network
    Zhao, Hengshuang
    Shi, Jianping
    Qi, Xiaojuan
    Wang, Xiaogang
    Jia, Jiaya
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 6230 - 6239
  • [5] Scale attentive network for scene recognition
    Yuan, Xiaohui
    Qiao, Zhinan
    Meyarian, Abolfazl
    NEUROCOMPUTING, 2022, 492 : 612 - 623
  • [6] Dynamic Scene Classification using Spatial and Temporal Cues
    Vasudevan, Arun Balajee
    Muralidharan, Srikanth
    Chintapalli, Shiva Pratheek
    Raman, Shanmuganathan
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2013, : 803 - 810
  • [7] Multi-stream attentive generative adversarial network for dynamic scene deblurring
    Cui, Jinkai
    Li, Weihong
    Gong, Weiguo
    NEUROCOMPUTING, 2020, 383 : 39 - 56
  • [8] Scene classification with context pyramid features
    Jiang Y.
    Wang R.
    Wang C.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2010, 22 (08) : 1366 - 1373
  • [9] DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling
    Yee, Pui Sin
    Lim, Kian Ming
    Lee, Chin Poo
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 193
  • [10] ATReSN-Net: Capturing Attentive Temporal Relations in Semantic Neighborhood for Acoustic Scene Classification
    Zhang, Liwen
    Han, Jiqing
    Shi, Ziqiang
    INTERSPEECH 2020, 2020, : 1181 - 1185