Prompt-based for Low-Resource Tibetan Text Classification

Cited: 4
Author
An, Bo [1 ]
Affiliation
[1] Chinese Acad Social Sci, Inst Ethnol & Anthropol, South Twenty 7th St, Bldg 6, Zhongguancun Nandajie 2, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Tibetan text classification; prompt learning; deep learning; pre-trained language model;
DOI
10.1145/3603168
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Text classification is a critical and foundational task in Tibetan natural language processing; it plays a crucial role in applications such as sentiment analysis and information extraction. However, the limited availability of annotated data poses a significant challenge to Tibetan natural language processing. This paper proposes a prompt learning-based method for low-resource Tibetan text classification to overcome this challenge. The method uses pre-trained language models to learn text representation and generation capabilities from a large-scale unsupervised Tibetan corpus, enabling few-shot Tibetan text classification. Experimental results demonstrate that the proposed method significantly improves the performance of Tibetan text classification in low-resource scenarios. This work offers a new research direction and method for low-resource language processing, such as Tibetan natural language processing, and will hopefully inspire subsequent work on low-resource languages.
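The abstract describes the general prompt-learning recipe: wrap the input text in a cloze template, let a pre-trained masked language model fill the blank, and map predicted label words back to classes via a verbalizer. The sketch below illustrates that pipeline only; the template wording, the English label words, and the `toy_mlm` stand-in for the paper's pre-trained Tibetan model are all assumptions for illustration, not the paper's actual setup.

```python
# Illustrative sketch of prompt-based (cloze-style) text classification.
# The template, verbalizer entries, and toy scoring function are assumptions;
# the paper would use Tibetan templates and a Tibetan pre-trained model.

TEMPLATE = "{text} This article is about [MASK]."  # cloze-style prompt

# Verbalizer: map each class label to a label word predicted at [MASK].
VERBALIZER = {
    "politics": "politics",
    "economy": "economy",
    "culture": "culture",
}

def build_prompt(text: str) -> str:
    """Wrap the input text in the cloze template."""
    return TEMPLATE.format(text=text)

def classify(text: str, mask_word_scores) -> str:
    """Pick the class whose label word scores highest at the [MASK] slot.

    `mask_word_scores` stands in for a pre-trained masked language model:
    it maps a prompt string to {word: probability} at the [MASK] position.
    """
    scores = mask_word_scores(build_prompt(text))
    return max(VERBALIZER, key=lambda label: scores.get(VERBALIZER[label], 0.0))

def toy_mlm(prompt: str) -> dict:
    """Toy stand-in for an MLM head, so the sketch is self-contained."""
    return {"economy": 0.7, "politics": 0.2, "culture": 0.1}

print(classify("Stock markets rose sharply today.", toy_mlm))  # -> economy
```

Because only the template and verbalizer are task-specific, this setup needs no new classification head, which is what makes it attractive in the few-shot, low-resource setting the paper targets.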
Pages: 13
References (46 total)
  • [1] An Bo, 2022, Journal of Chinese Information Processing
  • [2] [Anonymous], 2012, 24 INT C COMP LING
  • [3] A Survey on Aspect-Based Sentiment Classification
    Brauwers, Gianni
    Frasincar, Flavius
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (04)
  • [4] Cai JJ, 2018, I COMP CONF WAVELET, P123, DOI 10.1109/ICCWAMTIP.2018.8632592
  • [5] Tibetan Text Classification Based on the Feature of Position Weight
    Cao, Hui
    Jia, Huiqiang
    [J]. 2013 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2013), 2013, : 220 - 223
  • [6] Emerging Trends: Word2Vec
    Church, Kenneth Ward
    [J]. NATURAL LANGUAGE ENGINEERING, 2017, 23 (01) : 155 - 162
  • [7] Pre-Training With Whole Word Masking for Chinese BERT
    Cui, Yiming
    Che, Wanxiang
    Liu, Ting
    Qin, Bing
    Yang, Ziqing
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3504 - 3514
  • [8] Plant recolonization in the Himalaya from the southeastern Qinghai-Tibetan Plateau: Geographical isolation contributed to high population differentiation
    Cun, Yu-Zhi
    Wang, Xiao-Quan
    [J]. MOLECULAR PHYLOGENETICS AND EVOLUTION, 2010, 56 (03) : 972 - 982
  • [9] Grave E, 2018, Arxiv, DOI [arXiv:1802.06893, DOI 10.48550/ARXIV.1802.06893, 10.48550/arxiv.1802.06893]
  • [10] Graves A, 2012, STUD COMPUT INTELL, V385, P1, DOI [10.1162/neco.1997.9.1.1, 10.1007/978-3-642-24797-2]