Enhancing LSTM-based Word Segmentation Using Unlabeled Data

Cited by: 3

Authors:

Zheng, Bo [1]

Che, Wanxiang [1]

Guo, Jiang [1]

Liu, Ting [1]

Affiliation:

[1] Harbin Inst Technol, Res Ctr Social Comp & Informat Retrieval, Harbin, Heilongjiang, Peoples R China

Source:

CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA, CCL 2017 | 2017 / Vol. 10565

Funding:

National Natural Science Foundation of China;

Keywords:

Word segmentation; Statistics-based features; Neural network; Unlabeled data;

DOI:

10.1007/978-3-319-69005-6_6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405;

Abstract:

Word segmentation is commonly treated as a sequence labeling problem. The traditional approach uses machine learning methods such as conditional random fields with hand-crafted features. Recently, deep learning approaches have achieved state-of-the-art performance on word segmentation, and LSTM networks are a popular choice among them. This paper presents a method for introducing numerical statistics-based features, counted on unlabeled data, into LSTM networks and analyzes how they enhance the performance of the word segmentation model. We add pre-trained character-bigram embeddings, pointwise mutual information, accessor variety, and punctuation variety to our model and compare their performance on different datasets, including three datasets from the CoNLL-2017 shared task and three simplified Chinese datasets. We achieve state-of-the-art performance on two of them and comparable results on the rest.
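The abstract names pointwise mutual information and accessor variety as features counted on unlabeled data, but this record does not give their formulas. The sketch below uses the common textbook definitions as an assumption: PMI(x, y) = log(p(xy) / (p(x) p(y))) over adjacent characters, and accessor variety as the smaller of the numbers of distinct left and right neighbours of a substring. The function names `bigram_pmi` and `accessor_variety` and the `max_len` parameter are hypothetical, and the boundary handling is a simplification; this is not necessarily how the paper computes the features.

```python
import math
from collections import Counter, defaultdict


def bigram_pmi(corpus):
    """Return {(x, y): PMI} for adjacent character pairs in unlabeled text."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        pmi[(x, y)] = math.log(p_xy / (p_x * p_y))
    return pmi


def accessor_variety(corpus, max_len=2):
    """Return {substring: min(distinct left neighbours, distinct right neighbours)}."""
    left, right = defaultdict(set), defaultdict(set)
    for sentence in corpus:
        for n in range(1, max_len + 1):
            for i in range(len(sentence) - n + 1):
                s = sentence[i:i + n]
                # Assumption: all sentence boundaries count as one shared symbol.
                left[s].add(sentence[i - 1] if i > 0 else "<s>")
                right[s].add(sentence[i + n] if i + n < len(sentence) else "</s>")
    return {s: min(len(left[s]), len(right[s])) for s in left}
```

Both functions take a list of unsegmented strings, so they can be run on raw text without any segmentation labels. How the resulting numbers are fed into the LSTM is not described in this record.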

Pages: 60-70

Page count: 11
