Integrating graph embedding and neural models for improving transition-based dependency parsing

Cited by: 1
Authors
Le-Hong, Phuong [1 ]
Cambria, Erik [2 ]
Affiliations
[1] Vietnam Natl Univ, Hanoi, Vietnam
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Keywords
Dependency parsing; recurrent neural networks; transformers; transition-based parsing; English; Indonesian; Vietnamese;
DOI
10.1007/s00521-023-09223-3
CLC classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper introduces an effective method, based on a graph embedding model, for improving dependency parsing. The model extracts local and global connectivity patterns between tokens, allowing neural network models to perform better on dependency parsing benchmarks. We propose to incorporate node embeddings trained by a graph embedding algorithm into a bidirectional recurrent neural network scheme. The new model outperforms a baseline that uses a state-of-the-art method on three dependency treebanks covering both low-resource and high-resource natural languages, namely Indonesian, Vietnamese and English. We also show that the popular BERT pretraining technique does not pick up the same kind of signal as graph embeddings. The new parser, together with all trained models, is made available under an open-source license, facilitating community engagement and the advancement of natural language processing research for two low-resource languages with around 300 million users worldwide in total.
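The abstract describes feeding pretrained graph node embeddings into a bidirectional recurrent encoder for transition-based parsing. The following is a minimal sketch of that idea, not the authors' released code: all module names, dimensions, and the set of parser actions are illustrative assumptions.

```python
# Sketch (assumption, not the paper's implementation): concatenate frozen
# graph-embedding vectors with word embeddings, encode the sequence with a
# BiLSTM, and score transition actions from the encoded states.
import torch
import torch.nn as nn


class GraphAugmentedEncoder(nn.Module):
    def __init__(self, vocab_size, word_dim=100, graph_dim=64,
                 hidden_dim=128, num_actions=3):
        super().__init__()
        # Trainable word embeddings.
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # Node embeddings pretrained by a graph embedding algorithm; frozen here.
        self.graph_emb = nn.Embedding(vocab_size, graph_dim)
        self.graph_emb.weight.requires_grad = False
        # Bidirectional recurrent encoder over the concatenated features.
        self.bilstm = nn.LSTM(word_dim + graph_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Scores over transition actions (e.g. SHIFT, LEFT-ARC, RIGHT-ARC).
        self.action_scorer = nn.Linear(2 * hidden_dim, num_actions)

    def load_graph_vectors(self, pretrained):
        # `pretrained`: (vocab_size, graph_dim) tensor of node embeddings.
        self.graph_emb.weight.data.copy_(pretrained)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) tensor of token indices.
        feats = torch.cat([self.word_emb(token_ids),
                           self.graph_emb(token_ids)], dim=-1)
        states, _ = self.bilstm(feats)        # (batch, seq_len, 2*hidden_dim)
        return self.action_scorer(states)     # per-token action scores


# Toy usage: random vectors stand in for real pretrained node embeddings.
model = GraphAugmentedEncoder(vocab_size=1000)
model.load_graph_vectors(torch.randn(1000, 64))
scores = model(torch.randint(0, 1000, (2, 7)))
print(scores.shape)  # torch.Size([2, 7, 3])
```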
Pages: 2999-3016
Page count: 18