A Novel Hybrid Text Summarization System for Punjabi Text

Cited by: 19
Authors
Gupta, Vishal [1]
Kaur, Narvinder [1]
Affiliation
[1] Panjab Univ, Univ Inst Engn & Technol, Chandigarh 160014, India
Keywords
Summarization system; Natural language processing; Punjabi summarizer; Hybrid method; Text mining; Information retrieval; Maximum coverage
DOI
10.1007/s12559-015-9359-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text summarization is the task of shortening text documents while retaining their overall meaning and information. A good summary should highlight the main concepts of a document. Many statistical, location-based and linguistic techniques are available for text summarization. This paper describes a novel hybrid technique for automatic summarization of Punjabi text. Punjabi is an official language of Punjab State in India, and very few linguistic resources are available for it. The proposed summarization system is a hybrid of conceptual, statistical, location-based and linguistic features for Punjabi text. The system uses four new location-based features and two new statistical features (entropy measure and Z score), and the results are encouraging. A support vector machine-based classifier is also used to classify Punjabi sentences into summary and non-summary sentences and to handle imbalanced data. The synthetic minority over-sampling technique is applied to over-sample the minority class. The results of the proposed system are compared with different baseline systems, and its F score, precision, recall and ROUGE-2 score are found to compare reasonably well with those of the baselines. Moreover, the summary quality of the proposed system is comparable to the gold summary.
Pages: 261-277
Page count: 17
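
The abstract mentions that the synthetic minority over-sampling technique (SMOTE) is applied to the minority class, presumably the summary sentences. The record gives no implementation details, so the following is only a minimal sketch of the general SMOTE idea: each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. The function name `smote`, the parameter `k`, the seed, and the 2-D feature vectors are all assumptions made for illustration, not the authors' code or data.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch only; the paper's actual settings are not given here.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D feature vectors for summary sentences (minority class).
summary_sentences = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9)]
new_points = smote(summary_sentences, n_new=4)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region those samples already occupy. In a full pipeline the enlarged minority set would be combined with the majority class before training the classifier.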