A Thangka cultural element classification model based on self-supervised contrastive learning and MS Triplet Attention

Cited by: 1

Authors:

Tang, Wenjing [1]

Xie, Qing [1,2]

Affiliations:

[1] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China

[2] Minist Educ, Engn Res Ctr Intelligent Serv Technol Digital Publ, Wuhan, Peoples R China

Source:

VISUAL COMPUTER | 2024 / Vol. 40 / Issue 06

Funding:

National Natural Science Foundation of China;

Keywords:

Tibetan Thangka classification; Sample imbalance problem; Self-supervised contrastive learning; Gradient Harmonizing Mechanism Loss; Attention mechanism;

DOI:

10.1007/s00371-024-03397-0

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

As a significant repository of Buddhist imagery, Thangka images are valuable historical materials for Tibetan studies, covering many domains such as Tibetan history, politics, culture, social life, and even traditional medicine and astronomy. Thangka cultural element images are the essence of Thangka images. Hence, classifying Thangka cultural element images is one of the most important tasks in knowledge representation and mining in the Thangka field, and it is the foundation of the digital protection of Thangka images. However, owing to the limited quantity, high complexity, and intricate textures of Thangka images, existing Thangka image classification is limited to a small number of categories and coarse granularity. This paper therefore proposes a novel dual-branch Thangka cultural element classification model that fuses texture features and is based on the attention mechanism and self-supervised contrastive learning. Specifically, to address the shortage of labeled samples and improve classification performance, the method uses a large amount of unlabeled, irrelevant data to pre-train the feature extractor through self-supervised learning. During the fine-tuning stage of the downstream task, a dual-branch feature extraction structure incorporating texture features is designed, and the proposed MS Triplet Attention is used to integrate important features. Additionally, to address sample imbalance and the large number of difficult samples in the Thangka cultural element dataset, the Gradient Harmonizing Mechanism Loss is adopted and improved with a self-designed adaptive mechanism. Experimental results on the Thangka cultural element dataset show that the proposed method outperforms state-of-the-art methods. The source code of the proposed algorithm and the related datasets are available at https://github.com/WiniTang/MS-BiCLR.


Pages: 3919-3935

Page count: 17

