Corpus (Creation, Annotation, etc.);
Parsing;
Grammar;
Syntax;
Treebank;
Social Media Processing;
Italian;
Normalization;
DOI:
Not available
Chinese Library Classification (CLC) number:
TP39 [Computer applications];
Subject classification code:
081203;
0835;
Abstract:
Lexical normalization is the task of translating non-standard social media data to a standard form. Previous work has shown that this is beneficial for many downstream tasks in multiple languages. However, for Italian, there is no benchmark available for lexical normalization, despite the presence of many benchmarks for other tasks involving social media data. In this paper, we discuss the creation of a lexical normalization dataset for Italian. After two rounds of annotation, a Cohen's kappa score of 78.64 is obtained. During this process, we also analyze the inter-annotator agreement for this task, which is only rarely done on datasets for lexical normalization, and when it is reported, the analysis usually remains shallow. Furthermore, we utilize this dataset to train a lexical normalization model and show that it can be used to improve dependency parsing of social media data. All annotated data and the code to reproduce the results are available at: http://bitbucket.org/robvanderg/normit.
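The agreement measure named in the abstract, Cohen's kappa, can be illustrated with a minimal sketch. This is not the authors' code: the function name `cohen_kappa`, the label values `"norm"` and `"keep"`, and the toy annotations are all hypothetical, chosen only to show how chance-corrected agreement between two annotators is computed. The function returns a value on the usual -1 to 1 scale; the 78.64 in the abstract is presumably the same quantity multiplied by 100, which is an assumption here.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    labels_a and labels_b are equal-length sequences of categorical labels
    (hypothetical example labels: "norm" = token was normalized,
    "keep" = token was left unchanged).
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty item set")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    # Degenerate case: both annotators used one and the same label throughout.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Toy data (invented for illustration, not taken from the dataset).
annotator_a = ["norm", "keep", "keep", "norm", "keep", "keep"]
annotator_b = ["norm", "keep", "norm", "norm", "keep", "keep"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

In this toy case the annotators agree on 5 of 6 tokens, while chance agreement is 0.5, so kappa is about 0.67. Treating each token as a binary "normalize or keep" decision is only one possible way to frame agreement for lexical normalization; the abstract does not say which framing the authors used.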