Classification Method for Tibetan Texts Based on In-depth Learning

Cited by: 0
Authors
Wang, Lili [1]
Wang, Hongyuan [1]
Yang, Hongwu [1, 2, 3]
Affiliations
[1] Northwest Normal Univ, Coll Phys & Elect Engn, Lanzhou 730070, Gansu, Peoples R China
[2] Engn Res Ctr Gansu Prov Intelligent Informat Tech, Lanzhou 730070, Gansu, Peoples R China
[3] Natl & Prov Joint Engn Lab Learning Anal Technol, Lanzhou 730070, Gansu, Peoples R China
Source
PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Tibetan text classification; word vector space; deep neural network; machine learning model;
DOI
10.1109/itaic.2019.8785789
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Text classification is a key technology in information retrieval and data mining: it can sort disorganized information and locate the information that is needed. This paper compares three deep neural network models (CNN, RNN, and LSTM) on Tibetan text classification. First, we train a BiLSTM_CRF model to segment the categorized Tibetan text. We then construct a word vector space model and obtain word vectors by removing stop words, calculating word frequency, and extracting feature words. Second, the word vectors are passed to the classification model to train the Tibetan text classifier. Finally, we use the trained classifier to classify Tibetan texts. Experiments show that deep neural networks classify better than traditional text classification methods when the amount of data is large, and that among them the CNN classifier performs best. When the amount of data is small, the SVM model is effective.
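The feature-extraction step named in the abstract (stop-word removal, word-frequency counting, feature-word selection) could look roughly like the following sketch. It is an illustration under stated assumptions, not the authors' implementation: the record does not say how feature words are chosen, so a plain top-k frequency cutoff is assumed here, the function names `build_feature_vocab` and `vectorize` are hypothetical, and the tokens are placeholders standing in for already segmented Tibetan words.

```python
from collections import Counter

def build_feature_vocab(docs, stop_words, top_k):
    """Pick the top_k most frequent non-stop words across all documents.

    Assumption: frequency ranking stands in for the paper's unspecified
    feature-word extraction criterion.
    """
    counts = Counter(
        tok for doc in docs for tok in doc if tok not in stop_words
    )
    return [word for word, _ in counts.most_common(top_k)]

def vectorize(doc, vocab):
    """Map one segmented document to a term-frequency vector over vocab."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]

# Placeholder tokens; real input would come from a word segmenter.
docs = [
    ["w1", "w2", "stop", "w1"],
    ["w2", "w3", "stop", "w2"],
]
stop_words = {"stop"}

vocab = build_feature_vocab(docs, stop_words, top_k=2)
vectors = [vectorize(doc, vocab) for doc in docs]
```

Each resulting vector has one entry per feature word, so it can be fed to any classifier that accepts fixed-length numeric input.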
Pages: 1231-1235
Number of pages: 5