A sketch recognition method based on bi-modal model using cooperative learning paradigm

Cited: 0
Authors
Zhang S. [1 ,2 ]
Wang L. [1 ,3 ]
Cui Z. [1 ]
Wang S. [1 ,4 ]
Affiliations
[1] School of Information Science and Engineering, Yanshan University, Hebei Street West Section, Hebei Province, Qinhuangdao
[2] Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Yanshan University, Hebei Street West Section, Hebei Province, Qinhuangdao
[3] School of Information Technology, Hebei University of Business and Economics, Xuefu Road, Hebei Province, Shijiazhuang
[4] School of Mathematics and Information Science and Technology, Hebei Normal University of Science and Technology, Hebei Street West Section, Hebei Province, Qinhuangdao
Funding
National Natural Science Foundation of China
Keywords
Cooperative learning paradigm; Sketch recognition; Structural point convolution block;
DOI
10.1007/s00521-024-09836-2
Abstract
A static image is an important form for displaying a sketch, capturing its appearance information, while a stroke sequence composed of several points can also express the sketch's shape and contour. It is therefore natural to treat a sketch as point-modal data and image-modal data simultaneously. In this paper, a method based on a bi-modal model using a cooperative learning paradigm is proposed for the sketch recognition task. Specifically, in the point-modal branch, a structural point convolution block is developed that divides local regions appropriately to preserve structural information. In the image-modal branch, a hierarchical residual structure is used to fully extract image-modal features. To reduce the negative impact of noisy samples on recognition performance, a cooperative learning paradigm is designed based on the different perceptual abilities of the two modal branches with respect to noisy samples: when training the two branches, noisy samples are filtered out through information exchange and mutual learning. Extensive experiments on the sketch datasets TU-Berlin and QuickDraw show that the proposed method outperforms most baseline methods and has advantages such as no dependence on additional data or stroke information. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
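The cooperative training of the two branches can be illustrated with a minimal PyTorch sketch, assuming a co-teaching-style small-loss criterion: each branch ranks the batch by its own per-sample loss, keeps the low-loss (likely clean) samples, and hands them to the peer branch for its parameter update. The names here (point_branch, image_branch, keep_rate) are placeholders for illustration, not the authors' implementation.

```python
# Minimal sketch of one cooperative training step for a bi-modal model,
# assuming co-teaching-style small-loss exchange between the two branches.
# The branch modules and the keep-rate schedule are hypothetical placeholders.
import torch
import torch.nn.functional as F


def cooperative_step(point_branch, image_branch, opt_point, opt_image,
                     points, images, labels, keep_rate):
    """Each branch selects its small-loss (likely clean) samples and passes
    them to the *other* branch, which updates only on that filtered subset."""
    logits_p = point_branch(points)   # point-modal predictions
    logits_i = image_branch(images)   # image-modal predictions

    # Per-sample losses used to rank samples by how "clean" they look.
    loss_p = F.cross_entropy(logits_p, labels, reduction="none")
    loss_i = F.cross_entropy(logits_i, labels, reduction="none")

    num_keep = max(1, int(keep_rate * labels.size(0)))
    idx_from_p = torch.argsort(loss_p)[:num_keep]  # clean set chosen by point branch
    idx_from_i = torch.argsort(loss_i)[:num_keep]  # clean set chosen by image branch

    # Information exchange: each branch learns from the peer's clean set.
    opt_point.zero_grad()
    loss_p[idx_from_i].mean().backward()
    opt_point.step()

    opt_image.zero_grad()
    loss_i[idx_from_p].mean().backward()
    opt_image.step()

    return loss_p.mean().item(), loss_i.mean().item()
```

In co-teaching-style schemes the keep rate is typically scheduled to decay from 1.0 toward one minus the estimated noise rate over the early epochs, so both branches first fit the dominant clean patterns before the filtering becomes aggressive; the exact schedule used in the paper is not specified in the abstract.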
Pages: 14275-14290
Number of pages: 15
Related papers
50 items in total
  • [1] Emotion Recognition Based on Meta Bi-Modal Learning Model
    Li Z.
    Sun Y.
    Zhang X.
    Zhou Y.
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (05) : 87 - 105
  • [2] Bi-Modal Bi-Task Emotion Recognition Based on Transformer Architecture
    Song, Yu
    Zhou, Qi
    APPLIED ARTIFICIAL INTELLIGENCE, 2024, 38 (01)
  • [3] Application of bi-modal signal in the classification and recognition of drug addiction degree based on machine learning
    Gu, Xuelin
    Yang, Banghua
    Gao, Shouwei
    Yan, Lin Feng
    Xu, Ding
    Wang, Wen
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2021, 18 (05) : 6926 - 6940
  • [4] Gait Emotion Recognition Using a Bi-modal Deep Neural Network
    Bhatia, Yajurv
    Bari, A. S. M. Hossain
    Gavrilova, Marina
    ADVANCES IN VISUAL COMPUTING, ISVC 2022, PT I, 2022, 13598 : 46 - 60
  • [5] Bi-modal Regression for Apparent Personality Trait Recognition
    Rai, Nishant
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 55 - 60
  • [6] Architectural Synergies in Bi-Modal and Bi-Contrastive Learning
    Gu, Yujia
    Liu, Brian
    Zhang, Tianlong
    Sha, Xinye
    Chen, Shiyong
    IEEE ACCESS, 2024, 12 : 187128 - 187140
  • [7] Bi-Modal Person Recognition on a Mobile Phone: using mobile phone data
    McCool, Chris
    Marcel, Sebastien
    Hadid, Abdenour
    Pietikainen, Matti
    Matejka, Pavel
    Cernocky, Jan
    Poh, Norman
    Kittler, Josef
    Larcher, Anthony
    Levy, Christophe
    Matrouf, Driss
    Bonastre, Jean-Francois
    Tresadern, Phil
    Cootes, Timothy
    2012 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2012, : 635 - 640
  • [8] Study of Wrist Pulse Signals Using a Bi-Modal Gaussian Model
    Rangaprakash, D.
    Dutt, D. Narayana
    2014 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2014, : 2422 - 2425
  • [9] Enhancing Feature Correlation for Bi-Modal Group Emotion Recognition
    Liu, Ningjie
    Fang, Yuchun
    Guo, Yike
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2018, PT II, 2018, 11165 : 24 - 34
  • [10] CORAL: Colored structural representation for bi-modal place recognition
    Pan, Yiyuan
    Xu, Xuecheng
    Li, Weijie
    Cui, Yunxiang
    Wang, Yue
    Xiong, Rong
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 2084 - 2091