A Support Vector Machines Approach to Vietnamese Key Phrase Extraction

Times cited: 0
Authors
Nguyen, Chau Q. [1 ]
Hong, Luan T. [2 ]
Phan, Tuoi T. [2 ]
Affiliations
[1] Ho Chi Minh Univ Ind, 12 Nguyen Van Bao St, Go Vap Dist, Hcmc, Vietnam
[2] HCMC Univ Technol, Go Vap Dist, Hcmc, Vietnam
Source
2009 IEEE-RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES: RESEARCH, INNOVATION AND VISION FOR THE FUTURE | 2009
Keywords
Key phrase; Vietnamese key phrase extraction; natural language processing; part-of-speech; word segmentation; support vector machines;
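DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract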
Automatic key phrase extraction is the task of automatically selecting a set of phrases that describe the content of a simple sentence. That a key phrase is extracted means that it is present verbatim in the sentence to which it is assigned. Accurate key phrase extraction is fundamental to the success of many recent digital library applications, clustering, and semantic information retrieval techniques. The present research discusses a support vector machines (SVMs) approach for Vietnamese key phrase extraction and presents a number of experiments in which performance is incrementally improved. In general, the Vietnamese key phrase extraction process consists of three steps: word segmentation for identifying lexical units in an input sentence, part-of-speech tagging for words, and key phrase extraction for phrases. The performance of Vietnamese key phrase extraction systems is generally measured by the precision rate attained. This depends strongly on the nature and the size of the training set of key phrases. Most results exceed 70.30% with a training set of 9,006 Vietnamese key phrases from 2,000 sentences selected from the corpus of the Vietnamese Lexicography Center (www.vietlex.com.vn).
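The abstract reports performance as a precision rate. As a minimal sketch of how that measure is commonly computed for extracted key phrases, the Python below divides the number of extracted phrases that also appear in a reference set by the total number extracted. The function name `precision` and the example phrases are illustrative assumptions, not taken from the paper, and the paper's exact matching criteria may differ.

```python
def precision(extracted, gold):
    """Fraction of extracted key phrases that also appear in the gold set.

    Assumption: phrases match by exact string equality, and duplicates
    are counted once. The paper's matching rule is not stated in the abstract.
    """
    extracted_set = set(extracted)
    if not extracted_set:
        return 0.0  # nothing extracted, so no correct extractions
    return len(extracted_set & set(gold)) / len(extracted_set)


# Hypothetical example phrases, not data from the paper.
extracted = ["máy tính", "xử lý ngôn ngữ", "học máy", "từ khóa"]
gold = ["máy tính", "xử lý ngôn ngữ", "từ khóa", "trích xuất"]
score = precision(extracted, gold)
print(f"precision = {score:.2%}")
```

Here three of the four extracted phrases appear in the reference set, so the sketch reports a precision of 75%.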
Pages: 131 / +
Number of pages: 2