Universal Semantic Tagging for English and Mandarin Chinese

Citations: 0
Authors
Li, Wenxi [1]
Hou, Yiyang [1]
Ye, Yajie [2]
Liang, Li [1]
Sun, Weiwei [3]
Affiliations
[1] Peking Univ, Dept Chinese Language & Literature, Beijing, Peoples R China
[2] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
[3] Univ Cambridge, Dept Comp Sci & Technol, Cambridge, England
Source
2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021) | 2021
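Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract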
Universal Semantic Tagging aims to provide lightweight unified analysis for all languages at the word level. Though the proposed annotation scheme is conceptually promising, its feasibility has only been examined in four Indo-European languages. This paper extends the annotation scheme to handle Mandarin Chinese and empirically studies the plausibility of unifying meaning representations for multiple languages. We discuss a set of language-specific semantic phenomena, propose new annotation specifications and build a richly annotated corpus. The corpus consists of 1100 English-Chinese parallel sentences, where compositional semantic analysis is available for English, and another 1000 Chinese sentences which have enriched syntactic analysis. By means of the new annotations, we also evaluate a series of neural tagging models to gauge how successful semantic tagging can be: accuracies of 92.7% and 94.6% are obtained for Chinese and English respectively. The English tagging performance is remarkably better than the state-of-the-art by 7.7%.
Pages: 5554-5566
Page count: 13