Syntactically Coherent Text Augmentation for Sequence Classification

Cited by: 7
Authors
Pandey, Suraj [1]
Akhtar, Md. Shad [1]
Chakraborty, Tanmoy [1]
Affiliation
[1] Indraprastha Institute of Information Technology Delhi, Department of Computer Science and Engineering, New Delhi 110020, India
Keywords
Generators; Task analysis; Syntactics; Computational modeling; Training; Computer architecture; Data models; Data augmentation; generative adversarial network (GAN); sequence classification; sentiment analysis; network
DOI
10.1109/TCSS.2021.3075774
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this article, we address the problem of data scarcity in sequence classification tasks. We propose AugmentGAN, a simple yet effective generative adversarial network-based text augmentation model that ensures syntactic coherence in the newly generated samples. Given an input with a label, AugmentGAN aims to generate a semantically similar sequence that follows the syntactic structure of the original sample. We conduct an exhaustive task-based evaluation to show the efficacy of AugmentGAN: we employ 12 different datasets across five classification tasks, namely sentiment analysis, emotion recognition, sarcasm detection, intent classification, and spam detection. We observe that, compared to existing text augmentation techniques, AugmentGAN yields improved performance across datasets for all the tasks. AugmentGAN also turns out to be effective for multiple languages, namely English, Hindi, and Bengali.
Pages: 1323-1332
Number of pages: 10