Automatic Assamese Text Categorization Using WordNet

Cited by: 0
Authors
Sarmah, Jumi [1]
Barman, Anup Kumar [1]
Sarma, Shikhar Kr. [1]
Affiliation
[1] Gauhati Univ, Dept Informat Technol, Gauhati, India
Source
2013 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI) | 2013
Keywords
Text Categorization; Assamese WordNet; Word Sense Disambiguation;
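DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract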
The growing volume of Assamese text content in digital format motivates us to build a system that categorizes it automatically. This paper discusses a system that categorizes texts automatically using knowledge from the Assamese WordNet. In WordNet, a synset groups the words that express the same concept, and in this approach words that have more than one sense in a given text are disambiguated. The approach extracts the words that occur in a document and uses them to create a synset vector, formed as the union of those words with their corresponding synsets from WordNet. To improve performance, we present a process that increases the weight not only of the terms but also of the synsets corresponding to those terms. We then count the occurrences of the senses, which helps the disambiguation task by propagating the relationships between synsets. The proposed method achieves reasonable, state-of-the-art accuracy as measured by Precision and Recall.
Pages: 85-89
Page count: 5
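
The pipeline outlined in the abstract (extract terms, expand them with synsets, weight both, then disambiguate by counting sense occurrences across related synsets) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English lexicon `WORD_TO_SYNSETS`, the relation table `RELATED`, the synset IDs, and the scoring rule are all hypothetical stand-ins for the Assamese WordNet and for details the abstract does not specify.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for Assamese WordNet:
# each word maps to its candidate synset IDs (more than one = ambiguous).
WORD_TO_SYNSETS = {
    "bank": ["bank.finance", "bank.river"],
    "loan": ["loan.finance"],
    "money": ["money.finance"],
    "river": ["river.water"],
}

# Hypothetical relations between synsets (for example, hypernym-like links).
RELATED = {
    "bank.finance": ["institution.finance"],
    "loan.finance": ["institution.finance"],
    "money.finance": ["institution.finance"],
    "bank.river": ["river.water"],
    "river.water": [],
}


def synset_vector(tokens):
    """Return (vector, chosen): term and synset weights, plus the sense picked per term."""
    # Keep only the tokens the lexicon knows about and count them.
    term_counts = Counter(t for t in tokens if t in WORD_TO_SYNSETS)

    # Count sense occurrences: every candidate synset of a term receives the
    # term's frequency, and the same weight is propagated to related synsets.
    support = Counter()
    for term, count in term_counts.items():
        for syn in WORD_TO_SYNSETS[term]:
            support[syn] += count
            for rel in RELATED.get(syn, []):
                support[rel] += count

    def score(syn):
        # A sense scores higher when it, or the synsets it is related to,
        # is supported by other words in the same document.
        return support[syn] + sum(support[r] for r in RELATED.get(syn, []))

    # Build the final vector: term weights plus the weight of the chosen synset.
    vector = Counter(term_counts)
    chosen = {}
    for term, count in term_counts.items():
        best = max(WORD_TO_SYNSETS[term], key=score)
        chosen[term] = best
        vector[best] += count
    return vector, chosen


if __name__ == "__main__":
    tokens = "the bank approved the loan and released money".split()
    vector, chosen = synset_vector(tokens)
    print(chosen["bank"])
    print(dict(vector))
```

In this sketch the ambiguous word "bank" resolves to the finance sense because "loan" and "money" add support to a synset related to it. The resulting vector could then be passed to any classifier; the abstract does not say which one the paper uses.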