A Hybrid Approach For Word Segmentation

Authors
Mohammed, Ammar [1, 2]
Karam, Mohamed [3]
Hefny, Hesham [3]
Affiliations
[1] Arab East Coll, Dept Comp Sci, Riyadh, Saudi Arabia
[2] Cairo Univ, Dept Comp Sci, ISSR, Giza, Egypt
[3] Cairo Univ, Inst Stat Studies & Res, Dept Comp Sci, Giza, Egypt
Keywords
Word segmentation; Word statistics; Maximum matching; Hybrid methods; Recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Automatic word segmentation is the process of finding the most likely sequence of words in a sequence of characters written without spaces. The central issues in word segmentation are complexity and accuracy. This paper proposes a hybrid method for automatic word segmentation that combines a dictionary-based approach, word statistics, and word length. Compared with word segmentation using the Maximum Length Descending Frequency and Entropy Rate method, the paper shows that the proposed method achieves better accuracy.
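The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a toy dictionary of word counts (`counts`, hypothetical data), scores each candidate word by its log relative frequency, treats an out-of-dictionary character as a heavily penalized one-character word, and uses word length only as a cap (`max_len`) on candidate size. The best segmentation is found by dynamic programming.

```python
import math


def segment(text, counts, max_len=None):
    """Split `text` into the highest-scoring sequence of words.

    counts  -- dict mapping dictionary words to corpus counts (assumed, non-empty)
    max_len -- longest candidate word to try; defaults to the longest dictionary word
    """
    total = sum(counts.values())
    if max_len is None:
        max_len = max(map(len, counts))
    # Assumed penalty for a single character that is not in the dictionary.
    unknown = math.log(1.0 / (10 * total))

    # best[i] is the best score for text[:i]; back[i] is where its last word starts.
    best = [0.0] + [-math.inf] * len(text)
    back = [0] * (len(text) + 1)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in counts:
                score = math.log(counts[piece] / total)
            elif end - start == 1:
                score = unknown
            else:
                continue  # multi-character pieces must be dictionary words
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    end = len(text)
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]


# Hypothetical toy counts, for illustration only.
toy_counts = {"the": 50, "cat": 10, "sat": 8, "on": 30, "mat": 5, "them": 4, "at": 20}
print(segment("thecatsatonthemat", toy_counts))
# ['the', 'cat', 'sat', 'on', 'the', 'mat']
```

In this toy run a greedy longest-match pass would pick "them" and then "at", whereas the frequency scores favor "the" followed by "mat", which is the kind of ambiguity that combining a dictionary with word statistics is meant to resolve.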
Pages: 232-238
Number of pages: 7
Related papers
50 in total
  • [21] AntSeg: An ant approach to disambiguation of Chinese word segmentation
    Lv, Qiang
    Wang, Hongling
    Qian, Peide
    Luo, Xiaohu
    IRI 2006: PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2006 : 420 - +
  • [22] Trimming approach for word segmentation with focus on overlapping characters
    Gomathi, S.
    Devi, Rs Uma
    Mohanavel, S.
    2013 International Conference on Computer Communication and Informatics, ICCCI 2013, 2013,
  • [23] Is there a bilingual disadvantage for word segmentation? A computational modeling approach
    Fibla, Laia
    Sebastian-Galles, Nuria
    Cristia, Alejandrina
    JOURNAL OF CHILD LANGUAGE, 2022, 49 (06) : 1119 - 1146
  • [24] Trimming Approach for Word Segmentation with focus on Overlapping Characters
    Rohini, S. Gomathi
    Devi, R. S. Uma
    Mohanavel, S.
    2013 INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND INFORMATICS, 2013,
  • [25] AN UNSUPERVISED NON-ITERATIVE APPROACH TO WORD SEGMENTATION
    Wang, Hanshi
    Zhu, Jian
    Liu, Lizhen
    Wang, Xuren
    4TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER THEORY AND ENGINEERING (ICACTE 2011), 2011 : 135 - 137
  • [26] A Hybrid Approach to Analyze the Morphology of an Assamese Word
    Rahman, Mirzanur
    Sarma, Shikhar Kumar
    RECENT DEVELOPMENTS IN MACHINE LEARNING AND DATA ANALYTICS, 2019, 740 : 201 - 209
  • [27] LED English Patents Analysis Using Term Segmentation and Word Segmentation System Approach
    Lin, Zong-Ching
    Lin, Wu-Hsien
    Du, Chia-Hong
    JOURNAL OF THE CHINESE SOCIETY OF MECHANICAL ENGINEERS, 2012, 33 (01) : 1 - 10
  • [28] A Hybrid Method for Word Segmentation with English-Vietnamese Bilingual Text
    Quoc Hung Ngo
    Dinh Dien
    Winiwarter, Werner
    2013 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2013,
  • [29] A practical approach to resolving combination ambiguity in Chinese word segmentation
    Qin, Ying
    Zhang, Suxiang
    Wang, Xiaojie
    2006 8TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-4, 2006 : 1859 - +
  • [30] A BiLSTM-CRF Based Approach to Word Segmentation in Chinese
    Jin, Yuanyuan
    Tao, Shiyu
    Liu, Qi
    Liu, Xiaodong
    2022 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2022 : 568 - 571