Using Annotation Projection for Semantic Role Labeling of Low-Resourced Language: Sinhala

Cited by: 0
Authors
Gunasekara, Sandun [1]
Chathura, Dulanjaya [1]
Jeewantha, Chamoda [1]
Dias, Gihan [1]
Affiliation
[1] Univ Moratuwa, Dept Comp Sci & Engn, Moratuwa, Sri Lanka
Source
2020 International Conference on Asian Language Processing (IALP 2020) | 2020
Keywords
SRL; Semantics; Semantic Role Labeling; Sinhala; Annotation; Projection; Labeller; Roles;
DOI
10.1109/ialp51396.2020.9310468
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present SinSRL, the first-ever semantic role labeller (SRL) for Sinhala, an Indo-European language spoken mainly in Sri Lanka. SinSRL takes parallel text in English (or any other language for which a suitable SRL exists) and Sinhala and outputs semantically annotated Sinhala text. We have enhanced existing tools to address several issues related to the target language. This will also be useful for labeling other Indic languages. In addition, we have manually semantically labeled a small Sinhala-English parallel dataset. The accuracy of our system is similar to that of manually labeled data. Our implementation can be used to generate an SRL dataset which may be used to train a direct semantic role labeller. SinSRL may be easily modified to annotate other low-resource languages for which parallel corpora are available.
Pages: 98-103
Page count: 6
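
The abstract describes annotation projection only at a high level: roles labeled on the English side of a parallel sentence are carried over to the Sinhala side. The following is a minimal sketch of that general idea, not the SinSRL implementation. It assumes the English sentence already has one role label per token and that a word alignment between the two sentences is available; the function name `project_roles`, the label set (`A0`, `A1`, `V`, `O`), and the example alignment are all hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def project_roles(
    source_labels: List[str],
    alignment: List[Tuple[int, int]],
    target_len: int,
) -> List[str]:
    """Copy role labels from source tokens to the target tokens aligned to them.

    source_labels: one label per source (English) token, "O" meaning no role.
    alignment: (source_index, target_index) pairs from a word aligner.
    target_len: number of tokens in the target (Sinhala) sentence.
    """
    target_labels = ["O"] * target_len
    for src_idx, tgt_idx in alignment:
        label = source_labels[src_idx]
        # Keep the first label that reaches a target token; later
        # alignments to the same token do not overwrite it.
        if label != "O" and target_labels[tgt_idx] == "O":
            target_labels[tgt_idx] = label
    return target_labels


# Hypothetical example: English "The boy ate rice" (subject-verb-object)
# aligned to a three-token target sentence in subject-object-verb order.
english_labels = ["A0", "A0", "V", "A1"]
word_alignment = [(0, 0), (1, 0), (2, 2), (3, 1)]
projected = project_roles(english_labels, word_alignment, target_len=3)
print(projected)
```

In this sketch an unaligned target token simply keeps the label `O`; a real system would need additional handling for alignment errors and for word-order and morphology differences between the two languages, which the abstract alludes to but does not detail.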