Research on Semantic Prediction Analysis of Tibetan Text Based on Word2Vec

Cited by: 0

Authors:

Ding Hai-lan ^{[1]}

Yu Hong-zhi ^{[1]}

Qi Kun-yu ^{[1]}

Affiliations:

[1] Northwest Minzu Univ, China Natl Inst Informat Technol, Lanzhou 730030, Gansu, Peoples R China

Source:

2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018) | 2019 / Vol. 1187

Keywords:

DOI:

10.1088/1742-6596/1187/5/052058

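Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code: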

0808 ; 0809 ;

Abstract:

This article feeds the word-segmented corpus of the Tibetan text "Sage wedding" into Google's open-source Word2Vec tool. Using the context in which each word occurs, the tool maps the words of the text into a K-dimensional vector space. Training first builds a vocabulary from the text data and then learns a vector representation for each word, so that every word in the resulting model is represented by a unique word vector. Word vectors capture many linguistic regularities, so the distance between two word vectors reflects the similarity of the corresponding words. The experimental results show that the accuracy and recall rate of the model trained with Word2Vec are very high.
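The two steps the abstract names, building a vocabulary from segmented text and comparing words by the distance between their vectors, can be illustrated with a minimal Python sketch. This is not the authors' code and it does not train Word2Vec: the tokens `w1` to `w4`, the 3-dimensional vectors, and the helper names `build_vocab`, `cosine_similarity`, and `most_similar` are all hypothetical, and cosine similarity is assumed as the similarity measure because the abstract does not specify one.

```python
import math
from collections import Counter


def build_vocab(sentences, min_count=1):
    """Map each word seen at least min_count times to an integer index."""
    counts = Counter(word for sentence in sentences for word in sentence)
    kept = sorted(word for word, count in counts.items() if count >= min_count)
    return {word: index for index, word in enumerate(kept)}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=2):
    """Return the topn other words ranked by cosine similarity to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical pre-segmented corpus: placeholder tokens, not taken from the paper.
corpus = [["w1", "w2", "w3"], ["w2", "w3", "w4"]]
vocab = build_vocab(corpus)

# Hypothetical 3-dimensional vectors standing in for trained K-dimensional ones.
toy_vectors = {
    "w1": [1.0, 0.0, 0.0],
    "w2": [0.9, 0.1, 0.0],
    "w3": [0.0, 1.0, 0.0],
    "w4": [0.0, 0.9, 0.1],
}
neighbours = most_similar("w1", toy_vectors)
```

In a real experiment the vectors would come from a trained model rather than being written by hand; the ranking step would stay the same.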


Pages: 7