Global and Graph Encoded Local Discriminative Region Representation for Scene Recognition

Cited by: 3
Authors
Lin, Chaowei [1]
Lee, Feifei [1]
Cai, Jiawei [1]
Chen, Hanqing [1]
Chen, Qiu [2]
Affiliations
[1] Univ Shanghai Sci & Technol, Sch Med Instrument & Food Engn, Shanghai 200093, Peoples R China
[2] Kogakuin Univ, Grad Sch Engn, Elect Engn & Elect, Tokyo 1638677, Japan
Source
CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES | 2021 / Vol. 128 / No. 03
Keywords
Scene recognition; Convolutional Neural Networks; multi-head attention; class activation mapping; graph convolutional networks; FEATURE FUSION; SCALE; NETWORK
DOI
10.32604/cmes.2021.014522
Chinese Library Classification
T [Industrial Technology]
Subject classification code
08
Abstract
Scene recognition is a fundamental task in computer vision that generally comprises three stages: feature extraction, feature transformation, and classification. Early research focused mainly on feature extraction, but with the rise of Convolutional Neural Networks (CNNs), a growing number of feature transformation methods based on CNN features have been proposed. In this work, a novel feature transformation algorithm called Graph Encoded Local Discriminative Region Representation (GEDRR) is proposed to find discriminative local representations for scene images and to explore the relationships between the discriminative regions. In addition, we propose a method that uses a multi-head attention module to enhance and fuse convolutional feature maps. Combining these two methods with the global representation, we propose a scene recognition framework called Global and Graph Encoded Local Discriminative Region Representation (G2ELDR2). Experimental results on three scene datasets demonstrate the effectiveness of our model, which outperforms many state-of-the-art methods.
Pages: 985-1006
Page count: 22