Parsing Modern Standard Arabic using Treebank Resources

Times cited: 0
Authors
Al-Emran, Mostafa [1, 2]
Zaza, Sarween [2]
Shaalan, Khaled [2]
Affiliations
[1] Al Buraimi Univ Coll, Al Buraimi, Oman
[2] British Univ Dubai, Dubai, U Arab Emirates
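Source
2015 INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY RESEARCH (ICTRC) | 2015
Keywords
Statistical Parsing; Treebank; Arabic
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract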
A Treebank is a linguistic resource composed of a large collection of sentences whose syntactic analyses have been manually annotated and verified. Statistical Natural Language Processing (NLP) approaches have successfully used these annotations to develop basic NLP tasks such as tokenization, diacritization, part-of-speech tagging, and parsing, among others. In this paper, we address the problem of exploiting Treebank resources for statistical parsing of Modern Standard Arabic (MSA) sentences. Statistical parsing is significant for NLP tasks that take parsed text as input, such as Information Retrieval and Machine Translation. We conducted an experiment on the Penn Arabic Treebank (PATB); the parsing performance obtained in terms of Precision, Recall, and F-measure was 82.4%, 86.6%, and 84.4%, respectively.
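The three reported scores can be related by a short calculation. The sketch below assumes the F-measure is the balanced F1 score, i.e. the harmonic mean of Precision and Recall; the abstract does not say which variant was used, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's F-measure is the balanced F1 variant.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported in the abstract, in percent.
reported_precision = 82.4
reported_recall = 86.6

print(round(f_measure(reported_precision, reported_recall), 1))
```

Under that assumption, the value computed from the reported Precision and Recall agrees with the reported F-measure of 84.4% to one decimal place.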
Pages: 80-83
Number of pages: 4