Perceptual Visual Feature Learning With Applications in Sports Educational Image Understanding

Cited by: 0

Authors:

Liu, Tengsheng [1]

Xu, Minghui [2]

Affiliations:

[1] Wuhan Inst Technol, Dept Phys Educ, Wuhan 430070, Peoples R China

[2] Jinhua Polytech, Key Lab Crop Harvesting Equipment Technol Zhejiang, Jinhua 321017, Peoples R China

Source:

IEEE ACCESS | 2024 / Volume 12

Keywords:

Perceptual; feature fusion; local-global; active learning; deep architecture; SCENE; CLASSIFICATION; SEGMENTATION; MANIFOLD; MODEL;

DOI:

10.1109/ACCESS.2024.3377657

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Effectively understanding the semantics of sophisticated sceneries is a key component of many artificial intelligence (AI) systems. In this article, we optimally fuse multi-channel perceptual visual features for recognizing scenic pictures with complex spatial configurations, focusing on formulating a deep hierarchical model to actively discover human gaze allocation. In detail, to uncover semantically/visually important patches within each scenery, we utilize the BING objectness descriptor to rapidly and accurately localize multi-scale objects or their components. Subsequently, a local-global feature fusion scheme is proposed to dynamically combine the multiple low-level features from multiple scenic patches. To simulate how humans perceive semantically/visually important scenic patches, we design a robust deep active learning (RDAL) paradigm that sequentially derives the gaze shift path (GSP) and hierarchically learns deep GSP features in a unified architecture. Notably, the key advantage of RDAL is its high tolerance to label noise, achieved by adding an elaborately designed sparse penalty. That is, the contaminated and redundant deep GSP features can be implicitly abandoned. Finally, the refined deep GSP features are integrated into a multi-label SVM for recognizing sceneries of different categories. Empirical comparisons showed that: 1) our method performs competitively on six generic scenery sets (average accuracy 2% to 4.3% higher than the second-best performer), and 2) our deep GSP feature is particularly discriminative on our compiled sports educational image set (average accuracy 7.7% higher than the second-best performer).
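
The abstract does not state the form of the sparse penalty used in RDAL. As a generic illustration only, and not the authors' formulation, the sketch below shows how one common sparsity-inducing penalty, the L2,1 norm, can implicitly discard whole features: its proximal operator shrinks every row of a weight matrix and sets low-energy rows exactly to zero. The function name `prox_l21`, the matrix `W`, and the threshold `lam` are hypothetical choices made for this example.

```python
import numpy as np

def prox_l21(W, lam):
    """Row-wise group soft-thresholding.

    This is the proximal operator of lam * ||W||_{2,1}. Each row of W is
    treated as the weights attached to one feature (an assumption made
    for this illustration); rows whose L2 norm is at most lam become zero.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all-zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * W

# Toy weight matrix: three features (rows), two outputs (columns).
W = np.array([[3.0, 4.0],   # row norm 5.0: strong feature
              [0.3, 0.4],   # row norm 0.5: weak feature
              [0.0, 2.0]])  # row norm 2.0: moderate feature

W_sparse = prox_l21(W, lam=1.0)

# Indices of features that survive the shrinkage step.
kept = np.flatnonzero(np.linalg.norm(W_sparse, axis=1) > 0)
print(W_sparse)
print(kept)
```

With `lam=1.0`, the weak second row is zeroed out while the other two rows are only shrunk, which mirrors the idea of abandoning redundant or contaminated features without an explicit selection step.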

Pages: 41168 - 41179

Page count: 12

50 items in total

[1] Perceptual Feature Integration for Sports Dancing Action Scenery Detection and Optimization
Xiang, Lingjun
Gao, Xiang
IEEE ACCESS, 2024, 12 : 122101 - 122113
[2] Profiles of visual perceptual learning in feature space
Shen, Shiqi
Sun, Yueling
Lu, Jiachen
Li, Chu
Chen, Qinglin
Mo, Ce
Fang, Fang
Zhang, Xilin
ISCIENCE, 2024, 27 (03)
[3] High-Order and Interactive Perceptual Feature Learning for Medical Image Retargeting
Ma, Mingjuan
Zhang, Yuehong
IEEE ACCESS, 2025, 13 : 55358 - 55369
[4] Perceptual multi-channel visual feature fusion for scene categorization
Sun, Xiao
Liu, Zhenguang
Hu, Yuxing
Zhang, Luming
Zimmermann, Roger
INFORMATION SCIENCES, 2018, 429 : 37 - 48
[5] Polysemious visual representation based on feature aggregation for large scale image applications
Song, Xinghang
Jiang, Shuqiang
Wang, Shuhui
Li, Liang
Huang, Qingming
MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (02) : 595 - 611
[6] Unsupervised learning of perceptual feature combinations
Tamosiunaite, Minija
Tetzlaff, Christian
Woergoetter, Florentin
PLOS COMPUTATIONAL BIOLOGY, 2024, 20 (03)
[7] Perceptual visual security assessment by fusing local and global feature similarity
Xiong, Jian
Zhu, Xinzhong
Yuan, Jie
Shi, Ran
Gao, Hao
COMPUTERS & ELECTRICAL ENGINEERING, 2021, 91
[8] Visual Understanding via Multi-Feature Shared Learning With Global Consistency
Zhang, Lei
Zhang, David
IEEE TRANSACTIONS ON MULTIMEDIA, 2016, 18 (02) : 247 - 259
[9] Morphological Feature Extraction for Statistical Learning With Applications To Solar Image Data
Stenning, David C.
Lee, Thomas C. M.
van Dyk, David A.
Kashyap, Vinay
Sandell, Julia
Young, C. Alex
STATISTICAL ANALYSIS AND DATA MINING, 2013, 6 (04) : 329 - 345
[10] Perceptual Hashing With Visual Content Understanding for Reduced-Reference Screen Content Image Quality Assessment
Huang, Ziqing
Liu, Shiguang
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (07) : 2808 - 2823