Towards better long-tailed oracle character recognition with adversarial data augmentation

Cited by: 22

Authors:

Li, Jing [1,4]

Wang, Qiu-Feng [1]

Huang, Kaizhu [2]

Yang, Xi [1]

Zhang, Rui [3]

Goulermas, John Y. [4]

Affiliations:

[1] Xi'an Jiaotong-Liverpool University, School of Advanced Technology, Suzhou, People's Republic of China

[2] Duke Kunshan University, Data Science Research Center, Suzhou, People's Republic of China

[3] Xi'an Jiaotong-Liverpool University, School of Science, Suzhou, People's Republic of China

[4] University of Liverpool, Department of Computer Science, Liverpool, England

Source:

PATTERN RECOGNITION | 2023 / Volume 140

Funding:

National Natural Science Foundation of China;

Keywords:

Oracle character recognition; Long tail; Data imbalance; Data augmentation; Mixup strategy; Generative adversarial networks;

DOI:

10.1016/j.patcog.2023.109534

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deciphering oracle bone script is of great significance to the study of ancient Chinese culture as well as archaeology. Although recent studies on oracle character recognition have made substantial progress, they still suffer from long-tailed data distributions, which cause a noticeable performance drop on the tail classes. To mitigate this issue, we propose a generative adversarial framework to augment oracle characters in the problematic classes. In this framework, the generator produces synthetic data through convex combinations of all the available samples in the corresponding classes, and is further optimized through adversarial learning with the classifier and, simultaneously, the discriminator. Meanwhile, we introduce Repatch to generalize samples in the generator. Since tail classes do not have sufficient data for convex combinations, we propose the TailMix mechanism to generate suitable tail class samples from other classes. Experimental results show that our proposed algorithm obtains remarkable performance in oracle character recognition and achieves new state-of-the-art average (total) accuracy of 86.03% (89.46%), 86.54% (93.86%), and 95.22% (96.17%) on the three datasets Oracle-AYNU, OBC306 and Oracle-20K, respectively. © 2023 Elsevier Ltd. All rights reserved.
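
The convex-combination step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: in the paper the generator is optimized adversarially against the classifier and the discriminator, whereas here the mixing weights are simply drawn at random from a Dirichlet distribution, which is an assumption made only so that the weights are non-negative and sum to 1. The function name `convex_combine`, the parameter `alpha`, and the toy 8x8 arrays are hypothetical.

```python
import numpy as np

def convex_combine(samples, rng, alpha=1.0):
    """Return one synthetic sample as a convex combination of `samples`.

    samples: array of shape (n, H, W), all taken from the same class.
    The weights come from a Dirichlet distribution (an assumption of this
    sketch, not the paper's learned weights), so each weight is >= 0 and
    the weights sum to 1.
    """
    samples = np.asarray(samples, dtype=np.float64)
    n = samples.shape[0]
    weights = rng.dirichlet(np.full(n, alpha))
    # Weighted sum over the sample axis: sum_i weights[i] * samples[i]
    synthetic = np.tensordot(weights, samples, axes=1)
    return synthetic, weights

rng = np.random.default_rng(0)
class_samples = rng.random((5, 8, 8))  # five toy 8x8 "images" of one class
synthetic, weights = convex_combine(class_samples, rng)
```

Because the weights form a convex combination, every pixel of `synthetic` lies between the smallest and largest value of that pixel across the input samples. With very few samples per class, such combinations span only a narrow region, which is the limitation the abstract gives as the motivation for TailMix.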

Pages: 13
