Research on Semantic Prediction Analysis of Tibetan Text Based on Word2Vec

Times cited: 0
Authors
Ding Hai-lan [1]
Yu Hong-zhi [1]
Qi Kun-yu [1]
Affiliations
[1] Northwest Minzu Univ, China Natl Inst Informat Technol, Lanzhou 730030, Gansu, Peoples R China
Source
2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018) | 2019 / Vol. 1187
DOI
10.1088/1742-6596/1187/5/052058
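Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract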
This article applies Google's open-source Word2Vec tool to a word-segmented corpus of the Tibetan text "Sage wedding". Using the context in which each word occurs, Word2Vec maps the words into a K-dimensional space and represents them as word vectors. The tool first builds a vocabulary from the training text and then learns a vector model in which each word is represented by a unique word vector. Word vectors capture many linguistic regularities, so the distance between two vectors reflects the similarity between the corresponding words. The experimental results show that the accuracy and recall of the model trained with Word2Vec are very high.
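The steps named in the abstract (segmented corpus, vocabulary construction, context windows, vector similarity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the placeholder tokens, and the two-dimensional vectors are all hypothetical, and the actual Word2Vec training step that would learn the vectors from the context pairs is omitted.

```python
import math
from collections import Counter


def build_vocab(sentences, min_count=1):
    """Map each word seen at least min_count times to an integer id."""
    counts = Counter(word for sent in sentences for word in sent)
    kept = [(w, n) for w, n in counts.items() if n >= min_count]
    # Most frequent first; ties broken alphabetically for a stable order.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return {word: idx for idx, (word, _) in enumerate(kept)}


def context_pairs(sentences, vocab, window=2):
    """Collect (center_id, context_id) pairs at most `window` words apart."""
    pairs = []
    for sent in sentences:
        ids = [vocab[w] for w in sent if w in vocab]
        for i, center in enumerate(ids):
            start = max(0, i - window)
            stop = min(len(ids), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    pairs.append((center, ids[j]))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=1):
    """Rank every other word by cosine similarity to `word`."""
    scores = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:topn]


# Placeholder segmented sentences (hypothetical, not from the paper's corpus).
sentences = [
    ["king", "marries", "princess"],
    ["king", "greets", "princess"],
]
vocab = build_vocab(sentences)
pairs = context_pairs(sentences, vocab, window=1)

# Hand-written toy vectors standing in for learned K-dimensional vectors (K=2).
vectors = {
    "king": [1.0, 0.0],
    "princess": [0.8, 0.6],
    "greets": [0.0, 1.0],
}
nearest = most_similar("king", vectors, topn=1)
print(vocab)
print(len(pairs))
print(nearest[0][0])
```

In a real run, the context pairs would feed a Word2Vec model (skip-gram or CBOW; the abstract does not say which), and `most_similar` would be applied to the learned vectors rather than to hand-written ones.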
Pages: 7