Perceptual multi-channel visual feature fusion for scene categorization

Cited by: 14

Authors:

Sun, Xiao [1]

Liu, Zhenguang [2]

Hu, Yuxing [3]

Zhang, Luming [1]

Zimmermann, Roger [2]

Affiliations:

[1] Hefei Univ Technol, Sch Comp & Informat, Hefei, Anhui, Peoples R China

[2] Natl Univ Singapore, Sch Comp, Singapore, Singapore

[3] Tsinghua Univ, Sch Aerosp Engn, Beijing, Peoples R China

Source:

INFORMATION SCIENCES | 2018 / Vol. 429

Keywords:

Image kernel; Feature fusion; Scene categorization; Perception; MACHINE; CLASSIFICATION; MODEL

DOI:

10.1016/j.ins.2017.10.051

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Effectively recognizing sceneries from a variety of categories is an indispensable but challenging technique in computer vision and intelligent systems. In this work, we propose a novel image kernel based on human gaze shifting, aiming at discovering the mechanism by which humans perceive visually/semantically salient regions within a scenery. More specifically, we first design a weakly supervised embedding algorithm which projects the local image features (i.e., graphlets in this work) onto the pre-defined semantic space. Thereby, we describe each graphlet by multiple visual features at both the low level and the high level. It is generally acknowledged that humans attend to only a few regions within a scenery. Thus we formulate a sparsity-constrained graphlet ranking algorithm which incorporates visual clues at both the low level and the high level. According to human visual perception, these top-ranked graphlets are either visually or semantically salient. We sequentially connect them into a path which mimics human gaze shifting. Lastly, a so-called gaze shifting kernel (GSK) is calculated based on the learned paths from a collection of scene images, and a kernel SVM is employed to predict the scene categories. Comprehensive experiments on a series of well-known scene image sets show the competitiveness and robustness of our GSK. We also demonstrate the high consistency of the predicted path with real human gaze shifting paths. © 2017 Published by Elsevier Inc.
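The pipeline outlined in the abstract (rank graphlets, link the top-ranked ones into a path, compare paths with a kernel, classify with a kernel SVM) can be illustrated with a minimal sketch. This is not the authors' GSK formulation: the function names, the use of precomputed saliency scores in place of the sparsity-constrained ranking, and the position-wise averaged RBF similarity are all assumptions made only for illustration. Every path is assumed to have the same length k.

```python
import math

def top_k_path(features, scores, k):
    """Pick the k highest-scoring graphlets and order them by descending score.

    Assumption: `scores` stands in for the output of the ranking step; the
    ordering is a stand-in for the gaze shifting path.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [features[i] for i in order[:k]]

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) similarity between two graphlet feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def path_kernel(path_a, path_b, gamma=1.0):
    """Hypothetical path kernel: mean RBF similarity of graphlets at the
    same position along two equal-length paths."""
    if len(path_a) != len(path_b) or not path_a:
        raise ValueError("paths must be non-empty and of equal length")
    total = sum(rbf(a, b, gamma) for a, b in zip(path_a, path_b))
    return total / len(path_a)

def gram_matrix(paths, gamma=1.0):
    """Kernel (Gram) matrix over a collection of paths, one path per image."""
    return [[path_kernel(p, q, gamma) for q in paths] for p in paths]

# Toy data: two images, three graphlets each, 2-D features.
image_1 = top_k_path([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [0.1, 0.9, 0.5], k=2)
image_2 = top_k_path([[1.0, 0.0], [0.0, 0.0], [1.0, 1.0]], [0.8, 0.2, 0.6], k=2)
K = gram_matrix([image_1, image_2])
```

A Gram matrix built this way could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.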

Pages: 37-48

Page count: 12

50 records in total

[1] mCENTRIST: A Multi-Channel Feature Generation Mechanism for Scene Categorization
Xiao, Yang
Wu, Jianxin
Yuan, Junsong
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2014, 23 (02) : 823 - 836
[2] Multi-Channel Attentive Feature Fusion for Radio Frequency Fingerprinting
Zeng, Yuan
Gong, Yi
Liu, Jiawei
Lin, Shangao
Han, Zidong
Cao, Ruoxiao
Huang, Kaibin
Letaief, Khaled B.
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2024, 23 (05) : 4243 - 4254
[3] Perceptual information hiding based on multi-channel visual masking
Li, Duo
Zhai, Guangtao
Yang, Xiaokang
Hu, Menghan
Liu, Jing
NEUROCOMPUTING, 2017, 269 : 170 - 179
[4] Bearing Fault Diagnosis Method Based on Attention Mechanism and Multi-Channel Feature Fusion
Gao, Hongfeng
Ma, Jie
Zhang, Zhonghang
Cai, Chaozhi
IEEE ACCESS, 2024, 12 : 45011 - 45025
[5] Sentiment Analysis of Weibo Posts on Public Health Emergency with Feature Fusion and Multi-Channel
Pu H.
Wei Z.
Zhanpeng Z.
Yuxin W.
Haoyu F.
Data Analysis and Knowledge Discovery, 2021, 5 (11) : 68 - 79
[6] Scene categorization based on local-global feature fusion and multi-scale multi-spatial resolution encoding
Qin, Jianzhao
Deng, Fuqin
Yung, Nelson H. C.
SIGNAL IMAGE AND VIDEO PROCESSING, 2014, 8 : S145 - S154
[7] Multi-channel biomimetic visual transformation for object feature extraction and recognition of complex scenes
Yu, Lingli
Jin, Mingyue
Zhou, Kaijun
APPLIED INTELLIGENCE, 2020, 50 (03) : 792 - 811
[8] Feature fusion within local region using localized maximum-margin learning for scene categorization
Qin, Jianzhao
Yung, Nelson H. C.
PATTERN RECOGNITION, 2012, 45 (04) : 1671 - 1683
[9] A Feature Fusion Method Based on Multi-Classification Losses for Fine-Grained Visual Categorization
Zhu, Mengmeng
Wan, Shouhong
Jin, Peiquan
Tian, Qijun
2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021 : 6072 - 6074
[10] CENTRIST: A Visual Descriptor for Scene Categorization
Wu, Jianxin
Rehg, James M.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (08) : 1489 - 1501