Text Summarization of Hindi Documents using Rule Based Approach

Cited by: 9
Authors
Gupta, Manisha [1]
Garg, Naresh Kumar [1]
Affiliation
[1] GZSCCET, Dept Comp Sci, Bathinda, Punjab, India
Source
2016 INTERNATIONAL CONFERENCE ON MICRO-ELECTRONICS AND TELECOMMUNICATION ENGINEERING (ICMETE) | 2016
Keywords
Text Summarization; NLP; Rule based approach; Dead Wood Removal; Dead Phrase Removal;
DOI
10.1109/ICMETE.2016.104
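Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes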
0808 ; 0809 ;
Abstract
Automatic summarization plays an important role in document processing and information retrieval systems. Generating a summary of a text document is an important task in NLP, and there are a number of scenarios where automatic construction of such summaries is useful. Text summarization is the process of converting a longer text into a shorter form while preserving its information. A summary saves reading time because it contains fewer lines yet retains all the important information of the original document. In this paper we present a novel approach to summarizing Hindi text documents based on linguistic rules. Dead wood words and phrases are also removed from the original document to reduce the number of words. The proposed system is tested on various Hindi inputs, and its accuracy is reported as the number of lines extracted from the original text that contain the document's important information.
Pages: 366-370
Number of pages: 5