Phonetic String Matching for Languages with Cyrillic Alphabet

Cited by: 0
Authors
Paramonov, Viacheslav [1, 2]
Shigarov, Alexey [1, 2]
Ruzhnikov, Gennady [1]
Cherkashin, Evgeny [1, 2, 3]
Affiliations
[1] RAS, Matrosov Inst Syst Dynam & Control Theory SB, Irkutsk, Russia
[2] Irkutsk State Univ, Inst Math Econ & Informat, Irkutsk, Russia
[3] Natl Res Irkutsk State Tech Univ, Irkutsk, Russia
Source
INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, ISAT 2018, PT I | 2019, Vol. 852
Keywords
Natural language processing; Phonetic algorithms; String comparison; Cyrillic letters
DOI
10.1007/978-3-319-99981-4_28
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Using phonetic similarity to compare text strings and eliminate misprints is an important problem in philology, and it is widely used in automatic text checking. Most current phonetic algorithms are designed for processing English words. Comparison quality may drop for other languages, especially those that have rich morphology and use non-Latin alphabets, e.g. East Slavic languages written in Cyrillic. We propose an approach to the phonetic comparison of Russian words. It is based on detecting letters and letter sequences that are pronounced similarly according to the rules of the language. The resulting phonetic representations of the words are coded by prime numbers. The paper evaluates the efficiency of the proposed algorithm. The algorithm has also been adapted to phonetic processing of the Mongolian language.
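The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the substitution rules, the letter-sequence rules, and the names `phonetic_key` and `prime_code` are hypothetical and chosen only for illustration. The abstract also does not say how the prime numbers are combined, so the sketch assumes one prime per letter and keeps the primes as an ordered tuple.

```python
import re

# Hypothetical rules for illustration only; NOT the rule set from the paper.
# Single letters that are assumed here to sound alike are merged.
LETTERS = str.maketrans({
    "Б": "П", "В": "Ф", "Г": "К", "Д": "Т", "Ж": "Ш", "З": "С",  # voiced -> voiceless
    "О": "А", "Я": "А", "Е": "И", "Ё": "И", "Э": "И", "Ы": "И", "Ю": "У",  # vowels
    "Ъ": None, "Ь": None,  # hard and soft signs are dropped
})
# Letter sequences that are assumed here to sound like a single letter.
SEQUENCES = [("ТС", "Ц"), ("СЧ", "Щ")]

ALPHABET = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"


def first_primes(n):
    """Return the first n prime numbers by trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Assumption: each letter of the alphabet gets its own prime.
PRIME_OF = dict(zip(ALPHABET, first_primes(len(ALPHABET))))


def phonetic_key(word):
    """Reduce a Russian word to a simplified phonetic form."""
    s = re.sub(r"[^А-ЯЁ]", "", word.upper())  # keep Cyrillic letters only
    s = s.translate(LETTERS)                   # merge similar-sounding letters
    for sequence, replacement in SEQUENCES:    # merge similar-sounding sequences
        s = s.replace(sequence, replacement)
    return re.sub(r"(.)\1+", r"\1", s)         # collapse repeated letters


def prime_code(word):
    """Code the phonetic form as a tuple of primes, one per letter."""
    return tuple(PRIME_OF[ch] for ch in phonetic_key(word))


if __name__ == "__main__":
    # A correct spelling and a phonetic misspelling get the same code.
    print(prime_code("молоко") == prime_code("малако"))
    # Words that sound different get different codes.
    print(prime_code("дом") == prime_code("кот"))
```

Under these assumed rules, two strings count as a phonetic match when their codes are equal, so a misprint that preserves pronunciation maps to the same code as the correct spelling.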
Pages: 301-311
Number of pages: 11