Two New Large Corpora for Vietnamese Aspect-based Sentiment Analysis at Sentence Level

Cited by: 11
Authors
Dang Van Thin [1]
Ngan Luu-Thuy Nguyen [1]
Tri Minh Truong [1]
Lac Si Le [1]
Duy Tin Vo [2, 3]
Affiliations
[1] Vietnam Natl Univ, Univ Informat Technol, Ho Chi Minh City, Vietnam
[2] Lakehead Univ, Dept Comp Sci, Thunder Bay, ON P7B 5E1, Canada
[3] VinAI Res, Hanoi, Vietnam
Keywords
Aspect-based sentiment analysis; deep neural network; multi-task learning; Vietnamese corpora
DOI
10.1145/3446678
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Aspect-based sentiment analysis has been studied in both the research and industrial communities in recent years. For low-resource languages, standard benchmark corpora play an important role in the development of methods. In this article, we introduce two sentence-level benchmark corpora, the largest of their kind, for two tasks in Vietnamese: Aspect Category Detection and Aspect Polarity Classification. Our corpora cover the restaurant and hotel domains and are annotated with high inter-annotator agreement. Their release should push forward the low-resource language processing community. In addition, we implement supervised learning methods based on deep learning architectures and compare the effectiveness of single-task and multi-task approaches. Experimental results on our corpora show that the multi-task approach based on the BERT architecture outperforms both the other neural network architectures and the single-task approach. Our corpora and source code are published at the site given in footnote 1 of the article.
Pages: 22