A sketch recognition method based on bi-modal model using cooperative learning paradigm

Cited: 0
Authors
Zhang S. [1 ,2 ]
Wang L. [1 ,3 ]
Cui Z. [1 ]
Wang S. [1 ,4 ]
Affiliations
[1] School of Information Science and Engineering, Yanshan University, Hebei Street West Section, Hebei Province, Qinhuangdao
[2] Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Yanshan University, Hebei Street West Section, Hebei Province, Qinhuangdao
[3] School of Information Technology, Hebei University of Business and Economics, Xuefu Road, Hebei Province, Shijiazhuang
[4] School of Mathematics and Information Science and Technology, Hebei Normal University of Science and Technology, Hebei Street West Section, Hebei Province, Qinhuangdao
Funding
National Natural Science Foundation of China
Keywords
Cooperative learning paradigm; Sketch recognition; Structural point convolution block;
DOI
10.1007/s00521-024-09836-2
Abstract
A static image is an important form for displaying a sketch, capturing its appearance information, while a stroke sequence composed of several points can also express the sketch's shape and contour. It is therefore natural to treat a sketch as point-modal data and image-modal data simultaneously. In this paper, a method based on a bi-modal model using a cooperative learning paradigm is proposed for the sketch recognition task. Specifically, in the point-modal branch, a structural point convolution block is developed that divides local regions appropriately to preserve structural information. In the image-modal branch, a hierarchical residual structure is used to fully extract image-modal features. To reduce the negative impact of noisy samples on recognition performance, a cooperative learning paradigm is designed based on the different perceptual abilities of the two modal branches with respect to noisy samples: when training the two branches, noisy samples are filtered out through information exchange and mutual learning. Extensive experiments on the sketch datasets TU-Berlin and QuickDraw show that the proposed method outperforms most baseline methods and has advantages such as no dependence on additional data or stroke information. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
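The cooperative training of the two branches can be illustrated with a minimal PyTorch sketch, assuming a co-teaching-style small-loss criterion: each branch ranks the batch by its own per-sample loss, keeps the low-loss (likely clean) samples, and hands them to the peer branch for its parameter update. The names here (point_branch, image_branch, keep_rate) are placeholders for illustration, not the authors' implementation.

```python
# Minimal sketch of one cooperative training step for a bi-modal model,
# assuming co-teaching-style small-loss exchange between the two branches.
# The branch modules and the keep-rate schedule are hypothetical placeholders.
import torch
import torch.nn.functional as F


def cooperative_step(point_branch, image_branch, opt_point, opt_image,
                     points, images, labels, keep_rate):
    """Each branch selects its small-loss (likely clean) samples and passes
    them to the *other* branch, which updates only on that filtered subset."""
    logits_p = point_branch(points)   # point-modal predictions
    logits_i = image_branch(images)   # image-modal predictions

    # Per-sample losses used to rank samples by how "clean" they look.
    loss_p = F.cross_entropy(logits_p, labels, reduction="none")
    loss_i = F.cross_entropy(logits_i, labels, reduction="none")

    num_keep = max(1, int(keep_rate * labels.size(0)))
    idx_from_p = torch.argsort(loss_p)[:num_keep]  # clean set chosen by point branch
    idx_from_i = torch.argsort(loss_i)[:num_keep]  # clean set chosen by image branch

    # Information exchange: each branch learns from the peer's clean set.
    opt_point.zero_grad()
    loss_p[idx_from_i].mean().backward()
    opt_point.step()

    opt_image.zero_grad()
    loss_i[idx_from_p].mean().backward()
    opt_image.step()

    return loss_p.mean().item(), loss_i.mean().item()
```

In co-teaching-style schemes the keep rate is typically scheduled to decay from 1.0 toward one minus the estimated noise rate over the early epochs, so both branches first fit the dominant clean patterns before the filtering becomes aggressive; the exact schedule used in the paper is not specified in the abstract.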
Pages: 14275-14290
Number of pages: 15
Related papers
50 items in total
  • [1] Emotion Recognition Based on Meta Bi-Modal Learning Model
    Li Z.
    Sun Y.
    Zhang X.
    Zhou Y.
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (05) : 87 - 105
  • [2] Bi-Modal Bi-Task Emotion Recognition Based on Transformer Architecture
    Song, Yu
    Zhou, Qi
    APPLIED ARTIFICIAL INTELLIGENCE, 2024, 38 (01)
  • [3] Application of bi-modal signal in the classification and recognition of drug addiction degree based on machine learning
    Gu, Xuelin
    Yang, Banghua
    Gao, Shouwei
    Yan, Lin Feng
    Xu, Ding
    Wang, Wen
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2021, 18 (05) : 6926 - 6940
  • [4] Gait Emotion Recognition Using a Bi-modal Deep Neural Network
    Bhatia, Yajurv
    Bari, A. S. M. Hossain
    Gavrilova, Marina
    ADVANCES IN VISUAL COMPUTING, ISVC 2022, PT I, 2022, 13598 : 46 - 60
  • [5] Bi-modal Regression for Apparent Personality Trait Recognition
    Rai, Nishant
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 55 - 60
  • [6] Architectural Synergies in Bi-Modal and Bi-Contrastive Learning
    Gu, Yujia
    Liu, Brian
    Zhang, Tianlong
    Sha, Xinye
    Chen, Shiyong
    IEEE ACCESS, 2024, 12 : 187128 - 187140
  • [7] Bi-Modal Person Recognition on a Mobile Phone: using mobile phone data
    McCool, Chris
    Marcel, Sebastien
    Hadid, Abdenour
    Pietikainen, Matti
    Matejka, Pavel
    Cernocky, Jan
    Poh, Norman
    Kittler, Josef
    Larcher, Anthony
    Levy, Christophe
    Matrouf, Driss
    Bonastre, Jean-Francois
    Tresadern, Phil
    Cootes, Timothy
    2012 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2012, : 635 - 640
  • [8] Study of Wrist Pulse Signals Using a Bi-Modal Gaussian Model
    Rangaprakash, D.
    Dutt, D. Narayana
    2014 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2014, : 2422 - 2425
  • [9] Enhancing Feature Correlation for Bi-Modal Group Emotion Recognition
    Liu, Ningjie
    Fang, Yuchun
    Guo, Yike
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2018, PT II, 2018, 11165 : 24 - 34
  • [10] CORAL: Colored structural representation for bi-modal place recognition
    Pan, Yiyuan
    Xu, Xuecheng
    Li, Weijie
    Cui, Yunxiang
    Wang, Yue
    Xiong, Rong
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 2084 - 2091