Zero-shot learning based cross-lingual sentiment analysis for Sanskrit text with insufficient labeled data

Cited by: 0
Authors
Puneet Kumar
Kshitij Pathania
Balasubramanian Raman
Affiliations
[1] Indian Institute of Technology Roorkee, Department of Computer Science and Engineering
[2] Indian Institute of Technology Roorkee, Department of Mathematics
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
Labeled data insufficiency; Cross-lingual sentiment analysis; Sanskrit language analysis; Machine translation
DOI
Not available
CLC number
Subject classification code
Abstract
This paper proposes a novel method for analyzing the sentiment expressed in Sanskrit text. Sanskrit is one of the world’s most ancient languages; however, natural language processing tasks such as machine translation and sentiment analysis have not been explored for it to their full potential because sufficient labeled data is unavailable. We address this issue with a zero-shot learning-based cross-lingual sentiment analysis (CLSA) approach. CLSA uses the resources of a source language to improve sentiment analysis in a target language that has insufficient resources. The proposed work uses a transformer model to translate text from Sanskrit, a language with insufficient labeled data, into English, which has sufficient labeled data for sentiment analysis. A generative adversarial network-based strategy is proposed to evaluate the maturity of the translations. A bidirectional long short-term memory-based model is then used to classify sentiment from the embeddings obtained through the translations. The proposed technique achieves 87.50% accuracy for machine translation and 92.83% accuracy for sentiment classification. The Sanskrit-English translations used in this work were collected through web scraping. Because ground-truth sentiment class labels are absent, a strategy for evaluating the sentiment scores produced by the proposed sentiment analysis model is also presented. A new dataset of Sanskrit texts, along with their English translations and sentiment scores, has been constructed.
Pages: 10096 / 10113
Page count: 17