Research on Semantic Prediction Analysis of Tibetan Text Based on Word2Vec
Cited by: 0
Authors:
Ding Hai-lan [1]
Yu Hong-zhi [1]
Qi Kun-yu [1]
Affiliation:
[1] Northwest Minzu Univ, China Natl Inst Informat Technol, Lanzhou 730030, Gansu, Peoples R China
Source:
2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018) | 2019 / Vol. 1187
DOI:
10.1088/1742-6596/1187/5/052058
Chinese Library Classification (CLC):
TM [Electrical engineering];
TN [Electronic technology, communication technology];
Discipline classification codes:
0808;
0809;
Abstract:
This article applies Google's open-source Word2Vec tool to a word-segmented corpus of the Tibetan text "Sage wedding". Using the context in which each word occurs, the tool maps the words of the text into a K-dimensional space and represents them as word vectors. Word2Vec first builds a vocabulary from the training text and then learns a vector model in which every word has its own unique word vector. Word vectors capture many linguistic regularities, so the distance between two vectors reflects the similarity between the corresponding words. The experimental results show that the model trained with Word2Vec achieves very high accuracy and recall.
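
The pipeline the abstract describes (build a vocabulary from segmented text, learn K-dimensional word vectors from context, compare words by vector distance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' setup: it is a simplified skip-gram trainer with a full softmax written in plain Python, whereas the Word2Vec tool itself relies on faster approximations such as hierarchical softmax or negative sampling. The toy corpus, the function names, and all hyperparameters (`k`, `window`, `epochs`, `lr`) are hypothetical placeholders and do not come from the "Sage wedding" text or the paper.

```python
import math
import random
from collections import Counter


def build_vocab(sentences, min_count=1):
    """Map each word that occurs at least min_count times to an integer id."""
    counts = Counter(w for s in sentences for w in s)
    words = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate(words)}


def skipgram_pairs(sentences, vocab, window=2):
    """Yield (center_id, context_id) pairs from a symmetric context window."""
    for s in sentences:
        ids = [vocab[w] for w in s if w in vocab]
        for i, center in enumerate(ids):
            lo, hi = max(0, i - window), min(len(ids), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    yield center, ids[j]


def train(sentences, k=8, window=2, epochs=50, lr=0.05, seed=0):
    """Train skip-gram with a full softmax; return (vocab, input vectors)."""
    rng = random.Random(seed)
    vocab = build_vocab(sentences)
    n = len(vocab)
    # Input vectors are the word vectors; output vectors score the contexts.
    w_in = [[rng.uniform(-0.5, 0.5) / k for _ in range(k)] for _ in range(n)]
    w_out = [[0.0] * k for _ in range(n)]
    pairs = list(skipgram_pairs(sentences, vocab, window))
    for _ in range(epochs):
        rng.shuffle(pairs)
        for c, o in pairs:
            v = w_in[c]
            scores = [sum(a * b for a, b in zip(v, u)) for u in w_out]
            m = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            grad_v = [0.0] * k
            for j in range(n):
                # Gradient of the cross-entropy loss w.r.t. score j.
                err = exps[j] / z - (1.0 if j == o else 0.0)
                u = w_out[j]
                for d in range(k):
                    grad_v[d] += err * u[d]
                    u[d] -= lr * err * v[d]
            for d in range(k):
                v[d] -= lr * grad_v[d]
    return vocab, w_in


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical, already-segmented toy corpus (placeholder English tokens).
corpus = [
    ["king", "rules", "land"],
    ["queen", "rules", "land"],
    ["king", "marries", "queen"],
]
vocab, vectors = train(corpus)
sim = cosine(vectors[vocab["king"]], vectors[vocab["queen"]])
print("cosine(king, queen) =", sim)
```

In practice a real Tibetan corpus would first pass through a word segmenter, and a library implementation of Word2Vec would replace the `train` function above; the vocabulary and similarity steps stay conceptually the same.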