A Novel Hybrid Text Summarization System for Punjabi Text

Cited by: 18
Authors
Gupta, Vishal [1 ]
Kaur, Narvinder [1 ]
Affiliation
[1] Panjab Univ, Univ Inst Engn & Technol, Chandigarh 160014, India
Keywords
Summarization system; Natural language processing; Punjabi summarizer; Hybrid method; Text mining; Information retrieval; Maximum coverage
DOI
10.1007/s12559-015-9359-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text summarization is the task of shortening text documents while retaining their overall meaning and information. A good summary should highlight the main concepts of a text document. Many statistical, location-based and linguistic techniques are available for text summarization. This paper describes a novel hybrid technique for automatic summarization of Punjabi text. Punjabi is an official language of Punjab State in India, and very few linguistic resources are available for it. The proposed summarization system is a hybrid of conceptual, statistical, location-based and linguistic features for Punjabi text. The system uses four new location-based features and two new statistical features (entropy measure and Z score), and the results are very encouraging. A support vector machine-based classifier is also used to classify Punjabi sentences into summary and non-summary sentences and to handle imbalanced data; the synthetic minority over-sampling technique is applied to over-sample the minority class. The results of the proposed system are compared with different baseline systems, and its F score, precision, recall and ROUGE-2 score are found to compare reasonably well with those of the baseline systems. Moreover, the summary quality of the proposed system is comparable to that of the gold summary.
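
The abstract names an entropy measure, a Z score and synthetic minority over-sampling but does not give their formulas. The sketch below is only an illustration using the textbook definitions: Shannon entropy of a sentence's token distribution, the standard score of a term's frequency, and the usual SMOTE interpolation step between a minority sample and one of its neighbours. The function names, the choice of tokens as the unit, and the placeholder tokens are all assumptions made here; they are not taken from the paper and may differ from the authors' feature definitions.

```python
import math
from collections import Counter


def term_z_scores(sentences):
    """Standard score of each term's document frequency count.

    Assumption: z = (f - mean) / std over all distinct terms, using the
    population standard deviation. The paper may define its Z score differently.
    """
    counts = Counter(tok for sent in sentences for tok in sent)
    freqs = list(counts.values())
    mean = sum(freqs) / len(freqs)
    std = math.sqrt(sum((f - mean) ** 2 for f in freqs) / len(freqs))
    if std == 0:
        # All terms are equally frequent, so no term stands out.
        return {tok: 0.0 for tok in counts}
    return {tok: (f - mean) / std for tok, f in counts.items()}


def sentence_entropy(sentence):
    """Shannon entropy, in bits, of the token distribution in one sentence."""
    counts = Counter(sentence)
    n = len(sentence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def smote_sample(x, neighbor, gap):
    """One SMOTE-style synthetic point on the segment from x to neighbor.

    gap is expected to lie in [0, 1]; SMOTE normally draws it at random.
    """
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


if __name__ == "__main__":
    # Placeholder tokens standing in for tokenized Punjabi sentences.
    docs = [["w1", "w1", "w1"], ["w2"]]
    print(term_z_scores(docs))
    print(sentence_entropy(["w1", "w2", "w3", "w4"]))
    print(smote_sample([0.0, 0.0], [2.0, 4.0], 0.5))
```

In a full pipeline, per-sentence feature vectors built from values like these would be over-sampled for the minority (summary) class and then passed to a support vector machine classifier; that step is omitted here to keep the sketch dependency-free.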
Pages: 261-277
Number of pages: 17