Text Summarization Technique for Punjabi Language Using Neural Networks

Cited by: 5
Authors
Jain, Arti [1]
Arora, Anuja [1]
Yadav, Divakar [2]
Morato, Jorge [3]
Kaur, Amanpreet [1]
Affiliations
[1] Jaypee Inst Informat Technol, Dept Comp Sci & Engn, Noida, India
[2] Natl Inst Informat Technol, Dept Comp Sci & Engn, Mathura, India
[3] Univ Carlos III Madrid, Dept Comp Sci & Engn, Madrid, Spain
Keywords
Extractive method; Indian languages corpora initiative; natural language processing; neural networks; Punjabi language; text summarization; system
DOI
10.34028/iajit/18/6/8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the contemporary world, the use of digital content has risen exponentially. Newspaper and web articles, status updates, advertisements, etc. have become an integral part of our daily routine. Thus, there is a need for an automated system that summarizes such large text documents in order to save time and effort. Summarizers for languages such as English have reached a mature stage, since work on them started in the 1950s, but several languages, such as Punjabi, still need special attention. The Punjabi language is morphologically rich compared to English and other foreign languages. In this work, we present a three-phase extractive summarization methodology using neural networks that produces a concise summary of a single Punjabi text document. The methodology comprises a pre-processing phase that cleans the text; a processing phase that extracts statistical and linguistic features; and a classification phase. The classification-based neural network applies a sigmoid activation function and gradient descent optimization for weighted error reduction to generate the output summary. The proposed summarization system is applied to the monolingual Punjabi text corpus from the Indian Languages Corpora Initiative Phase-II. The precision, recall and F-measure achieved are 90.0%, 89.28% and 89.65% respectively, which is reasonably good in comparison to the performance of other existing Indian-language summarizers.
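The classification phase described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single sigmoid unit trained by batch gradient descent on a cross-entropy loss, and the two sentence features (a position score and a keyword score), the toy data, the learning rate, and all function names are hypothetical. The record does not give the paper's actual feature set, network layout, or error function.

```python
import math

def sigmoid(z):
    """Logistic activation, mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train(features, labels, lr=0.5, epochs=2000):
    """Fit one sigmoid unit by batch gradient descent (assumed setup)."""
    n = len(features[0])
    weights, bias = [0.0] * n, 0.0
    m = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of cross-entropy w.r.t. the pre-activation
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Step against the averaged gradient.
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def predict(weights, bias, x, threshold=0.5):
    """Return 1 if the sentence is selected for the summary, else 0."""
    score = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if score >= threshold else 0

def precision_recall_f(predicted, gold):
    """Precision, recall and F-measure over binary sentence selections."""
    tp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(predicted, gold) if p == 0 and g == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Hypothetical toy data: [position score, keyword score] per sentence;
# label 1 means the sentence belongs in the reference summary.
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]

weights, bias = train(X, y)
selected = [predict(weights, bias, x) for x in X]
print(selected, precision_recall_f(selected, y))
```

In an extractive setting like the one described, the sentences predicted as 1 would be emitted in their original order to form the summary, and the three metrics would be computed against a human-selected reference.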
Pages: 807-818
Page count: 12