Improving Basic Natural Language Processing Tools for the Ainu Language

Cited by: 3
Authors
Nowakowski, Karol [1]
Ptaszynski, Michal [1]
Masui, Fumito [1]
Momouchi, Yoshio [2]
Affiliations
[1] Kitami Inst Technol, Dept Comp Sci, 165 Koen Cho, Kitami, Hokkaido 0908507, Japan
[2] Hokkai Gakuen Univ, Fac Engn, Dept Elect & Informat Engn, Chuo Ku, 1-1,Nishi 11 Chome,Minami 26 Jo, Sapporo, Hokkaido 0640926, Japan
Keywords
Ainu language; endangered languages; normalization; word segmentation; part-of-speech tagging
DOI
10.3390/info10110329
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Ainu is a critically endangered language spoken by the native inhabitants of northern Japan. This paper describes our research aimed at the development of technology for automatic processing of text in Ainu. In particular, we improved the existing tools for normalizing old transcriptions, word segmentation, and part-of-speech tagging. In the experiments we applied two Ainu language dictionaries from different domains (literary and colloquial) and created a new data set by combining them. The experiments revealed that expanding the lexicon had a positive impact on the overall performance of our tools, especially with test data unrelated to any of the training sets used.
Pages: 21