Mining Tibetan-Chinese Bilingual Entities from Wikipedia

Cited by: 0
Authors
Jiang, Tao [1]
Yu, Hongzhi [1]
He, Xiangzhen [1]
Meng, Xianghe [1]
Affiliation
[1] Northwest Minzu Univ, Key Lab Natl Language Intelligent Proc Gansu Prov, Lanzhou, Gansu, Peoples R China
Source
2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP) | 2017
Keywords
Tibetan-Chinese bilingual; entity translation; Wikipedia; Tibetan information processing; Cross-Lingual;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Entity translation pairs play an important role in NLP applications such as cross-language information retrieval and machine translation. Named entities and domain entities are key factors affecting the performance of such systems. However, entity translations can hardly be found in existing bilingual dictionaries or parallel corpora. Tibetan Wikipedia contains many Tibetan neologisms and named entities, and this paper proposes a new method for automatically mining Tibetan-Chinese bilingual entity translations from Wikipedia based on language interlinks and page features. We construct extraction patterns for Tibetan-Chinese entity translation pairs, building on previous work, and adopt a multi-feature selection model to discriminate among candidate translation pairs. The results verify that the entity translation mining method can achieve high accuracy.
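
As a rough illustration of the interlink-based idea described in the abstract, the following minimal Python sketch pairs Tibetan page titles with their Chinese counterparts. It is not the authors' implementation: the record layout (`title`, `langlinks`), the function names, and the script-range check used as a stand-in for the paper's unspecified page features are all assumptions, and the sample records are toy data. The language codes `bo` (Tibetan) and `zh` (Chinese) are the ones Wikipedia uses for those editions.

```python
from typing import Dict, List, Tuple


def has_char_in_range(text: str, lo: int, hi: int) -> bool:
    """Return True if any character of text has a code point in [lo, hi]."""
    return any(lo <= ord(ch) <= hi for ch in text)


def is_tibetan(text: str) -> bool:
    # Unicode Tibetan block: U+0F00 to U+0FFF.
    return has_char_in_range(text, 0x0F00, 0x0FFF)


def is_chinese(text: str) -> bool:
    # CJK Unified Ideographs (basic block): U+4E00 to U+9FFF.
    return has_char_in_range(text, 0x4E00, 0x9FFF)


def mine_pairs(pages: List[Dict]) -> List[Tuple[str, str]]:
    """Collect (Tibetan title, Chinese title) candidates from page records.

    Each record is assumed (hypothetically) to look like
    {"title": <Tibetan page title>, "langlinks": {<lang code>: <title>}}.
    A pair is kept only if a "zh" interlink exists and both titles are
    written in the expected script, a simple filter standing in for the
    paper's multi-feature selection model.
    """
    pairs = []
    for page in pages:
        bo_title = page.get("title", "")
        zh_title = page.get("langlinks", {}).get("zh")
        if zh_title and is_tibetan(bo_title) and is_chinese(zh_title):
            pairs.append((bo_title, zh_title))
    return pairs


# Toy records for illustration only, not real Wikipedia dump data.
SAMPLE_PAGES = [
    {"title": "ལྷ་ས", "langlinks": {"zh": "拉萨", "en": "Lhasa"}},
    {"title": "བོད", "langlinks": {"en": "Tibet"}},          # no zh interlink
    {"title": "Wikipedia", "langlinks": {"zh": "维基百科"}},  # title not in Tibetan script
]

if __name__ == "__main__":
    for bo, zh in mine_pairs(SAMPLE_PAGES):
        print(bo, "\t", zh)
```

A real pipeline would read the interlinks from a Wikipedia dump or API and replace the script check with a trained selection model over several features; the sketch only shows where those pieces would plug in.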
Pages: 9-12
Page count: 4