Attentive Temporal Pyramid Network for Dynamic Scene Classification

Cited by: 0

Authors:

Huang, Yuanjun [1,2,3,4]

Cao, Xianbin [1,3,4]

Zhen, Xiantong [1,3,4]

Han, Jungong [2]

Affiliations:

[1] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China

[2] Univ Lancaster, Lancaster LA1 4YW, England

[3] Beihang Univ, Minist Ind & Informat Technol China, Key Lab Adv Technol Near Space Informat Syst, Beijing, Peoples R China

[4] Beijing Adv Innovat Ctr Big Data Based Precis Med, Beijing, Peoples R China

Source:

THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019

Funding:

U.S. National Science Foundation;

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Dynamic scene classification is an important yet challenging problem, especially in the presence of defective or irrelevant frames caused by unconstrained imaging conditions such as illumination, camera motion, and irrelevant background. In this paper, we propose the attentive temporal pyramid network (ATP-Net), which builds effective representations of dynamic scenes by extracting and aggregating the most informative and discriminative features. ATP-Net uses a temporal pyramid structure with an incorporated attention mechanism to detect informative features of the frames that contain the information most relevant to a scene. A newly designed kernel aggregation layer, based on kernel approximation, then fuses these frame features into a discriminative holistic representation of the dynamic scene. ATP-Net thus combines the strength of the attention mechanism in selecting the most relevant frame features with the ability of kernels to achieve optimal feature fusion for discriminative representations of dynamic scenes. Extensive experiments and comparisons on three benchmark datasets show that ATP-Net outperforms state-of-the-art methods on all three.
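
The abstract names three ingredients (attention over frames, a temporal pyramid, and kernel-approximation-based aggregation) but does not define the layers. The following is only an illustrative sketch of how such pieces could fit together, not the authors' ATP-Net. It assumes dot-product attention against a hypothetical `query` vector, a pyramid that splits the clip into 1, 2, 4, ... equal segments, and random Fourier features as one well-known kernel approximation; all function names and parameters are invented for this example.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attentive_pool(frames, query):
    """Attention-weighted average of frame features in one segment.
    Assumption: attention score = dot product with a `query` vector."""
    scores = [sum(a * b for a, b in zip(f, query)) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]

def temporal_pyramid(frames, query, levels=2):
    """Pool the whole clip, then its halves, then its quarters, and so on."""
    n = len(frames)
    if n < 2 ** (levels - 1):
        raise ValueError("too few frames for this many pyramid levels")
    pooled = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            start, end = p * n // parts, (p + 1) * n // parts
            pooled.append(attentive_pool(frames[start:end], query))
    return pooled

def random_fourier_features(x, omegas, biases):
    """Random Fourier feature map approximating an RBF kernel
    (a stand-in for the paper's kernel approximation, which is not specified here)."""
    scale = math.sqrt(2.0 / len(omegas))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(omegas, biases)]

def video_representation(frames, query, omegas, biases, levels=2):
    """Map each pyramid segment through the kernel features, then average."""
    segments = temporal_pyramid(frames, query, levels)
    mapped = [random_fourier_features(s, omegas, biases) for s in segments]
    return [sum(m[d] for m in mapped) / len(mapped) for d in range(len(omegas))]

# Usage with synthetic frame features (8 frames, 4-dimensional each).
rng = random.Random(0)
dim, out_dim = 4, 16
frames = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(8)]
query = [rng.gauss(0, 1) for _ in range(dim)]
omegas = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(out_dim)]
biases = [rng.uniform(0, 2 * math.pi) for _ in range(out_dim)]
rep = video_representation(frames, query, omegas, biases)
print(len(rep))  # 16
```

In a trained network the `query`, `omegas`, and `biases` would be learned or sampled according to the chosen kernel; here they are random placeholders so the sketch runs on its own.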

Pages: 8497-8504

Number of pages: 8