Attentive Temporal Pyramid Network for Dynamic Scene Classification

Cited by: 0
Authors
Huang, Yuanjun [1, 2, 3, 4]
Cao, Xianbin [1, 3, 4]
Zhen, Xiantong [1, 3, 4]
Han, Jungong [2]
Affiliations
[1] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
[2] Univ Lancaster, Lancaster LA1 4YW, England
[3] Beihang Univ, Minist Ind & Informat Technol China, Key Lab Adv Technol Near Space Informat Syst, Beijing, Peoples R China
[4] Beijing Adv Innovat Ctr Big Data Based Precis Med, Beijing, Peoples R China
Source
THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dynamic scene classification is an important yet challenging problem, especially in the presence of defective or irrelevant frames caused by unconstrained imaging conditions such as illumination, camera motion, and irrelevant background. In this paper, we propose the attentive temporal pyramid network (ATP-Net), which builds effective representations of dynamic scenes by extracting and aggregating the most informative and discriminative features. ATP-Net uses a temporal pyramid structure with an incorporated attention mechanism to detect the features of frames that carry the most scene-relevant information. These frame features are then fused by a newly designed kernel aggregation layer, based on kernel approximation, into a discriminative holistic representation of the dynamic scene. ATP-Net thus combines the strength of attention in selecting the most relevant frame features with the ability of kernels to achieve optimal feature fusion. Extensive experiments and comparisons on three benchmark datasets show that ATP-Net outperforms state-of-the-art methods on all three.
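The pipeline described in the abstract (attention over a temporal pyramid of frame features, followed by a kernel-based aggregation) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify the form of the attention, the number of pyramid levels, or which kernel approximation is used. The sketch below assumes dot-product attention against a single query vector, a pyramid with 1, 2, and 4 segments, and random Fourier features as a stand-in kernel approximation; all function and parameter names are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(frames, query):
    """Attention-weighted average of the frame features in one segment.

    Assumption: dot-product scores against one query vector.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def temporal_pyramid(frames, levels):
    """Split the sequence into 1, 2, 4, ... contiguous segments per level."""
    n = len(frames)
    segments = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            start, end = p * n // parts, (p + 1) * n // parts
            if end > start:  # skip empty segments for very short videos
                segments.append(frames[start:end])
    return segments


def random_fourier_features(x, omegas, biases):
    """Random Fourier feature map approximating an RBF kernel.

    Assumption: a stand-in for the paper's unspecified kernel approximation.
    """
    scale = math.sqrt(2.0 / len(omegas))
    return [
        scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(omegas, biases)
    ]


def video_representation(frames, query, omegas, biases, levels=3):
    """Pool each pyramid segment with attention, map it, then average."""
    pooled = [attentive_pool(seg, query) for seg in temporal_pyramid(frames, levels)]
    mapped = [random_fourier_features(p, omegas, biases) for p in pooled]
    return [sum(m[d] for m in mapped) / len(mapped) for d in range(len(omegas))]


# Toy usage with random "frame features" (purely illustrative data).
rng = random.Random(0)
dim, out_dim, n_frames = 4, 8, 8
frames = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_frames)]
query = [rng.gauss(0, 1) for _ in range(dim)]
omegas = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(out_dim)]
biases = [rng.uniform(0, 2 * math.pi) for _ in range(out_dim)]
rep = video_representation(frames, query, omegas, biases)
```

In a trained network the query vector and the projection parameters would be learned rather than sampled; here they are random so that the sketch stays self-contained.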
Pages: 8497-8504
Page count: 8
Related papers
50 in total
  • [41] LARGE-SCALE VIDEO EVENT CLASSIFICATION USING DYNAMIC TEMPORAL PYRAMID MATCHING OF VISUAL SEMANTICS
    Codella, Noel C. F.
    Hua, Gang
    Cao, Liangliang
    Merler, Michele
    Gong, Leiguang
    Hill, Matt
    Smith, John R.
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 2862 - 2866
  • [42] Road marking extraction in UAV imagery using attentive capsule feature pyramid network
    Guan, Haiyan
    Lei, Xiangda
    Yu, Yongtao
    Zhao, Haohao
    Peng, Daifeng
    Marcato Junior, Jose
    Li, Jonathan
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 107
  • [43] Pyramid-attentive GAN for multimodal brain image complementation in Alzheimer's disease classification
    Zhang, Mengyi
    Sun, Lijing
    Kong, Zhaokai
    Zhu, Wenjun
    Yi, Yang
    Yan, Fei
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 89
  • [44] Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network
    Cao, Dongping
    Dang, Jiachen
    Zhong, Yong
    SYMMETRY-BASEL, 2021, 13 (03):
  • [45] A novel pyramid temporal causal network for weather prediction
    Yuan, Minglei
    FRONTIERS IN PLANT SCIENCE, 2023, 14
  • [46] Temporal adaptive feature pyramid network for action detection
    Xiang, Xuezhi
    Yin, Hang
    Qiao, Yulong
    El Saddik, Abdulmotaleb
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 240
  • [47] Continual learning with attentive recurrent neural networks for temporal data classification
    Yin, Shao-Yu
    Huang, Yu
    Chang, Tien-Yu
    Chang, Shih-Fang
    Tseng, Vincent S.
    NEURAL NETWORKS, 2023, 158 : 171 - 187
  • [48] A New Temporal Deconvolutional Pyramid Network for Action Detection
    Ji, Xiangli
    Luo, Guibo
    Zhu, Yuesheng
    COMPUTER VISION - ACCV 2018, PT IV, 2019, 11364 : 696 - 711
  • [49] A CNN-Based Feature Pyramid Segmentation Strategy for Acoustic Scene Classification
    Xi, Ji
    Xie, Yue
    Jiang, Pengxu
    Jiang, Wei
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (08) : 1093 - 1096
  • [50] Temporal Residual Networks for Dynamic Scene Recognition
    Feichtenhofer, Christoph
    Pinz, Axel
    Wildes, Richard P.
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 7435 - 7444