A Novel Hybrid Text Summarization System for Punjabi Text

Cited by: 18
Authors
Gupta, Vishal [1]
Kaur, Narvinder [1]
Affiliation
[1] Panjab Univ, Univ Inst Engn & Technol, Chandigarh 160014, India
Keywords
Summarization system; Natural language processing; Punjabi summarizer; Hybrid method; Text mining; Information retrieval; MAXIMUM COVERAGE;
DOI
10.1007/s12559-015-9359-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text summarization is the task of shortening text documents while retaining their overall meaning and information. A good summary should highlight the main concepts of a text document. Many statistical, location-based and linguistic techniques are available for text summarization. This paper describes a novel hybrid technique for automatic summarization of Punjabi text. Punjabi is an official language of Punjab State in India, and very few linguistic resources are available for it. The proposed summarization system is a hybrid of conceptual, statistical, location-based and linguistic features for Punjabi text. The system uses four new location-based features and two new statistical features (an entropy measure and the Z score), and the results are encouraging. A support vector machine-based classifier is also used to classify Punjabi sentences into summary and non-summary sentences and to handle imbalanced data. The synthetic minority over-sampling technique is applied to over-sample the minority class. The results of the proposed system are compared with different baseline systems, and its F score, precision, recall and ROUGE-2 score are found to compare reasonably well with those of the baseline systems. Moreover, the summary quality of the proposed system is comparable to that of the gold summary.
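The over-sampling step named in the abstract can be illustrated with a minimal sketch. It is a simplified variant of the synthetic minority over-sampling technique, not the authors' implementation: the function name `smote`, the parameters `k` and `seed`, and the two-dimensional feature vectors are all assumptions made for illustration. Each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Simplified illustration: base samples are drawn at random rather than
    visited in turn, and the feature vectors are plain lists of floats.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        base = rng.choice(minority)
        # Find its k nearest minority neighbours, excluding the base itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors for "summary" sentences (the minority class).
summary_sentences = [[0.9, 0.8], [0.8, 0.7], [0.85, 0.9]]
new_points = smote(summary_sentences, n_synthetic=4)
```

Because every synthetic point lies between two real minority samples, the added data stays inside the region the minority class already occupies, which is what lets a classifier such as a support vector machine see a more balanced training set without exact duplicates.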
Pages: 261-277
Page count: 17
Related papers
  • [1] A Novel Hybrid Text Summarization System for Punjabi Text
    Vishal Gupta
    Narvinder Kaur
    Cognitive Computation, 2016, 8 : 261 - 277
  • [2] Text Summarization Technique for Punjabi Language Using Neural Networks
    Jain, Arti
    Arora, Anuja
    Yadav, Divakar
    Morato, Jorge
    Kaur, Amanpreet
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2021, 18 (06) : 807 - 818
  • [3] A Turkish automatic text summarization system
    Altan, Z
    PROCEEDINGS OF THE IASTED INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND APPLICATIONS, VOLS 1 AND 2, 2004, : 311 - 316
  • [4] On text summarization
    Wang, YC
    Liu, GS
    Shen, Z
    WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XVII, PROCEEDINGS: CYBERNETICS AND INFORMATICS: CONCEPTS AND APPLICATIONS (PT II), 2001, : 266 - 270
  • [5] A Novel Approach for Semantic Extractive Text Summarization
    Waseemullah
    Fatima, Zainab
    Zardari, Shehnila
    Fahim, Muhammad
    Andleeb Siddiqui, Maria
    Ibrahim, Ag. Asri Ag.
    Nisar, Kashif
    Naz, Laviza Falak
    APPLIED SCIENCES-BASEL, 2022, 12 (09):
  • [6] Recent automatic text summarization techniques: a survey
    Mahak Gambhir
    Vishal Gupta
    Artificial Intelligence Review, 2017, 47 : 1 - 66
  • [7] Recent automatic text summarization techniques: a survey
    Gambhir, Mahak
    Gupta, Vishal
    ARTIFICIAL INTELLIGENCE REVIEW, 2017, 47 (01) : 1 - 66
  • [8] HNTSumm: Hybrid text summarization of transliterated news articles
    Muniraj P.
    Sabarmathi K.R.
    Leelavathi R.
    Balaji B S.
    International Journal of Intelligent Networks, 2023, 4 : 53 - 61
  • [9] A Novel Technique for Efficient Text Document Summarization as a Service
    Bagalkotkar, Anusha
    Khandelwal, Ashesh
    Pandey, Shivam
    Kamath, Sowmya S.
    2013 THIRD INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING AND COMMUNICATIONS (ICACC 2013), 2013, : 50 - 53
  • [10] Enhancing text summarization and audio generation using hybrid model
    Koreddi, Venkatesh
    Chandini, Shaik
    Challa, B. V. T. Kalyan
    Teja, M. Sai Ram
    ENGINEERING RESEARCH EXPRESS, 2025, 7 (01):