Towards better long-tailed oracle character recognition with adversarial data augmentation

Cited by: 22
Authors
Li, Jing [1, 4]
Wang, Qiu-Feng [1]
Huang, Kaizhu [2]
Yang, Xi [1]
Zhang, Rui [3]
Goulermas, John Y. [4]
Affiliations
[1] Xian Jiaotong Liverpool Univ, Sch Adv Technol, Suzhou, Peoples R China
[2] Duke Kunshan Univ, Data Sci Res Ctr, Suzhou, Peoples R China
[3] Xian Jiaotong Liverpool Univ, Sch Sci, Suzhou, Peoples R China
[4] Univ Liverpool, Dept Comp Sci, Liverpool, England
Funding
National Natural Science Foundation of China;
Keywords
Oracle character recognition; Long tail; Data imbalance; Data augmentation; Mixup strategy; Generative adversarial networks;
DOI
10.1016/j.patcog.2023.109534
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deciphering oracle bone script is of great significance to the study of ancient Chinese culture and to archaeology. Although recent studies on oracle character recognition have made substantial progress, they still suffer from long-tailed data distributions, which cause a noticeable performance drop on the tail classes. To mitigate this issue, we propose a generative adversarial framework to augment oracle characters in the problematic classes. In this framework, the generator produces synthetic data through convex combinations of all the available samples in the corresponding classes, and is further optimized through adversarial learning against the classifier and the discriminator simultaneously. In addition, we introduce Repatch to generalize samples in the generator. Since tail classes do not have sufficient data for convex combinations, we propose the TailMix mechanism to generate suitable tail-class samples from other classes. Experimental results show that our proposed algorithm achieves new state-of-the-art average (total) accuracy of 86.03% (89.46%), 86.54% (93.86%), and 95.22% (96.17%) on the three datasets Oracle-AYNU, OBC306, and Oracle-20K, respectively. (c) 2023 Elsevier Ltd. All rights reserved.
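The convex-combination step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' generator: in the paper the combination is optimized adversarially, whereas here the weights are simply drawn at random from a Dirichlet distribution as a stand-in. The function name `convex_combine`, the `alpha` parameter, and the toy flattened "images" are all hypothetical and chosen only for illustration.

```python
import random


def convex_combine(samples, rng, alpha=1.0):
    """Blend all samples of one class into a single synthetic sample.

    samples: list of equal-length lists of floats (flattened images).
    The weights are non-negative and sum to 1, so the result is a convex
    combination. Drawing them from Dirichlet(alpha, ..., alpha) is an
    assumption of this sketch, not the learned weighting of the paper.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # A Dirichlet draw is a vector of Gamma draws divided by their sum.
    draws = [rng.gammavariate(alpha, 1.0) for _ in samples]
    total = sum(draws)
    weights = [d / total for d in draws]
    length = len(samples[0])
    # Weighted sum of the samples, computed pixel by pixel.
    synthetic = [
        sum(w * s[i] for w, s in zip(weights, samples))
        for i in range(length)
    ]
    return synthetic, weights


# Toy data: three 3-pixel "images" from the same (hypothetical) class.
rng = random.Random(0)
class_samples = [[0.0, 1.0, 0.5], [1.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
synthetic, weights = convex_combine(class_samples, rng)
```

Because the weights form a convex combination, every pixel of `synthetic` lies between the smallest and largest value of that pixel across the class samples. This also suggests why a class with very few samples offers little variety to combine, which is the gap the abstract says TailMix is meant to address.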
Pages: 13