TABAS: Text augmentation based on attention score for text classification model

Cited by: 3
Authors
Yu, Yeong Jae [1 ]
Yoon, Seung Joo [2 ]
Jun, So Young [2 ]
Kim, Jong Woo [1 ]
Affiliations
[1] Hanyang Univ, Sch Business, 222 Wangsimni Ro, Seoul 04763, South Korea
[2] Hanyang Univ, Dept Business Informat, Seoul, South Korea
Source
ICT EXPRESS | 2022, Vol. 8, No. 4
Keywords
Attention mechanism; Data augmentation; Natural language processing; Text classification;
DOI
10.1016/j.icte.2021.11.002
CLC number
TP [automation technology, computer technology];
Subject classification code
0812
Abstract
To improve the performance of text classification, we propose text augmentation based on attention score (TABAS). We recognized that a criterion for selecting replacement words, rather than random selection, was necessary. TABAS therefore uses attention scores to guide text modification, processing only words with matching entity and part-of-speech tags in order to preserve informational content. We verified this approach on two benchmark tasks. The results show that TABAS significantly improves performance for both recurrent and convolutional neural networks. Furthermore, we confirm that it provides a practical way to develop deep-learning models by saving the cost of building additional datasets. (C) 2021 The Author(s). Published by Elsevier B.V. on behalf of The Korean Institute of Communications and Information Sciences.
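The replacement scheme the abstract describes (attention-guided word substitution restricted to candidates sharing the token's part-of-speech/entity tag) can be sketched as below. This is an illustrative reading only, not the authors' implementation: the threshold value, the direction of selection (replacing low-attention tokens), and the `synonyms` lookup table are all assumptions.

```python
# Hypothetical sketch of attention-score-guided augmentation in the spirit of
# TABAS. All parameter names and the replacement policy are assumptions, not
# taken from the paper.

def augment(tokens, attn_scores, pos_tags, synonyms, threshold=0.1):
    """Replace low-attention tokens with alternatives of the same POS tag.

    tokens      : list of words in the sentence
    attn_scores : attention score per token (same length as tokens)
    pos_tags    : part-of-speech tag per token
    synonyms    : dict mapping (word, pos_tag) -> list of candidate words
    threshold   : tokens scoring below this are considered replaceable
    """
    augmented = []
    for tok, score, pos in zip(tokens, attn_scores, pos_tags):
        candidates = synonyms.get((tok, pos), [])
        # Replace only uninformative (low-attention) tokens that have a
        # same-POS candidate; leave every other token untouched.
        if score < threshold and candidates:
            augmented.append(candidates[0])
        else:
            augmented.append(tok)
    return augmented

tokens = ["the", "movie", "was", "great"]
scores = [0.05, 0.40, 0.05, 0.50]
tags = ["DET", "NOUN", "VERB", "ADJ"]
syn = {("the", "DET"): ["a"], ("was", "VERB"): ["seemed"]}
print(augment(tokens, scores, tags, syn))  # ['a', 'movie', 'seemed', 'great']
```

In practice the attention scores would come from a trained classifier's attention layer and the POS/entity tags from a tagger; here both are supplied as toy inputs so the selection logic is visible in isolation.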
Pages: 549-554
Page count: 6
References
27 records in total
[1]  
Anaby-Tavor A., 2019, arXiv, DOI 10.1609/aaai.v34i05.6233
[2]  
Bahdanau D., 2016, arXiv, arXiv:1409.0473
[3]  
Bloice M.D., 2017, arXiv:1708.04680, DOI 10.21105/joss.00432
[4]  
Cho K., 2014, Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), P1724, DOI 10.3115/v1/D14-1179
[5]  
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[6]  
Edunov S, 2018, 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), P489
[7]  
Kim Y., 2014, Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), DOI 10.3115/v1/D14-1181
[8]  
Kobayashi S., 2018, Proc. NAACL-HLT, V2, P452, DOI 10.18653/v1/N18-2072
[9]  
Kumar A, 2016, PR MACH LEARN RES, V48
[10]  
Kumar V., 2020, arXiv