Unsupervised Learning of Word Segmentation Rules with Genetic Algorithms and Inductive Logic Programming

Cited by: 0

Authors:

Dimitar Kazakov

Suresh Manandhar

Affiliation:

[1] University of York

Source:

Machine Learning | 2001 / Volume 43

Keywords:

unsupervised machine learning; inductive logic programming; natural language; word segmentation

DOI:

Not available

Abstract:

This article presents a combination of unsupervised and supervised learning techniques for the generation of word segmentation rules from a raw list of words. First, a language bias for word segmentation is introduced and a simple genetic algorithm is used in the search for a segmentation that corresponds to the best bias value. In the second phase, the words segmented by the genetic algorithm are used as input for the first-order decision list learner CLOG. The result is a set of first-order rules which can be used for segmentation of unseen words. When applied to either the training data or unseen data, these rules produce segmentations which are linguistically meaningful and, to a large degree, conform to the annotation provided.
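The first phase described in the abstract can be illustrated with a minimal sketch. The abstract does not define the bias, so the cost used here is an assumption made only for illustration: each word is split once into a prefix and a suffix, and a segmentation is scored by the total number of characters in the set of distinct prefixes plus the set of distinct suffixes (lower is better). The names `lexicon_cost` and `evolve_segmentation`, the parameter values, and the sample word list are hypothetical, and the second phase (rule induction with CLOG) is not covered.

```python
import random


def lexicon_cost(words, splits):
    """Assumed bias: characters in distinct prefixes plus distinct suffixes."""
    prefixes = {w[:k] for w, k in zip(words, splits)}
    suffixes = {w[k:] for w, k in zip(words, splits)}
    return sum(len(p) for p in prefixes) + sum(len(s) for s in suffixes)


def evolve_segmentation(words, pop_size=40, generations=200,
                        mutation_rate=0.1, seed=0):
    """Simple genetic algorithm over one split position per word."""
    rng = random.Random(seed)

    def cost(splits):
        return lexicon_cost(words, splits)

    # Each chromosome holds one split index per word, from 0 to len(word).
    population = [[rng.randint(0, len(w)) for w in words]
                  for _ in range(pop_size)]

    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=cost)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(0, len(words))  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally redraw a word's split position.
            child = [rng.randint(0, len(w)) if rng.random() < mutation_rate else k
                     for w, k in zip(words, child)]
            children.append(child)
        population = survivors + children

    best = min(population, key=cost)
    return best, cost(best)


if __name__ == "__main__":
    sample = ["walked", "walking", "talked", "talking", "jumped", "jumping"]
    best_splits, best_cost = evolve_segmentation(sample)
    for word, k in zip(sample, best_splits):
        print(word[:k] + "|" + word[k:])
    print("cost:", best_cost)
```

Under this assumed cost, segmentations that reuse the same stems and endings across many words score better than those that store each word whole, which is the intuition behind searching for the best bias value. The search is stochastic, so a different `seed` may give a different segmentation.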

Citation

Pages: 121-162

Page count: 41
