Graph-based extractive text summarization method for Hausa text

Cited by: 2
Authors
Bichi, Abdulkadir Abubakar [1]
Samsudin, Ruhaidah [1]
Hassan, Rohayanti [1]
Hasan, Layla Rasheed Abdallah [1]
Rogo, Abubakar Ado [2]
Affiliations
[1] Univ Teknol Malaysia, Sch Comp, Skudai, Johor, Malaysia
[2] Yusuf Maitama Sule Univ, Dept Comp Sci, Kano, Nigeria
Source
PLOS ONE | 2023 / Vol. 18 / Issue 05
Keywords
RANKING;
DOI
10.1371/journal.pone.0285376
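Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes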
07 ; 0710 ; 09 ;
Abstract
Automatic text summarization is one of the most promising solutions to the ever-growing volume of textual data, as it produces a shorter version of the original document that uses fewer bytes while retaining the information of the original. Despite advances in automatic text summarization research, the development of summarization methods for documents written in Hausa, a Chadic language widely spoken in West Africa by approximately 150,000,000 people as a first or second language, is still in its early stages. This study proposes a novel graph-based extractive single-document summarization method for Hausa text that modifies the existing PageRank algorithm by using the normalized count of common bigrams between adjacent sentences as the initial vertex score. The proposed method is evaluated with the ROUGE evaluation toolkit on a Hausa summarization evaluation dataset collected first-hand, comprising 113 Hausa news articles. On this dataset the proposed approach outperformed the standard methods: TextRank by 2.1%, LexRank by 12.3%, the centroid-based method by 19.5%, and the BM25 method by 17.4%.
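The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the tokenizer, the edge weights, the normalization, or the stopping rule, so everything below is an assumption. Here tokens are lowercase word characters, edge weights are Jaccard word overlap, the initial scores are shared-bigram counts with the previous and next sentence divided by their total, and the ranking runs for a fixed, small number of iterations (a damped PageRank iterated to convergence would reach the same scores from any starting point, so the initial score only matters when iteration stops early). All function names are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokenize(sentence: str) -> List[str]:
    # Assumption: simple lowercase word tokenizer, no Hausa-specific processing.
    return re.findall(r"\w+", sentence.lower())


def bigrams(sentence: str) -> Set[Tuple[str, str]]:
    tokens = tokenize(sentence)
    return set(zip(tokens, tokens[1:]))


def initial_scores(sentences: List[str]) -> List[float]:
    # Count bigrams each sentence shares with its previous and next sentence,
    # then normalize so the scores sum to 1 (assumed normalization).
    n = len(sentences)
    if n == 0:
        return []
    grams = [bigrams(s) for s in sentences]
    raw = []
    for i in range(n):
        shared = 0
        if i > 0:
            shared += len(grams[i] & grams[i - 1])
        if i < n - 1:
            shared += len(grams[i] & grams[i + 1])
        raw.append(shared)
    total = sum(raw)
    if total == 0:
        return [1.0 / n] * n  # fall back to uniform scores
    return [r / total for r in raw]


def similarity(a: str, b: str) -> float:
    # Assumption: Jaccard word overlap as the edge weight.
    wa, wb = set(tokenize(a)), set(tokenize(b))
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def rank_sentences(sentences: List[str], damping: float = 0.85,
                   iterations: int = 5) -> List[float]:
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = initial_scores(sentences)
    for _ in range(iterations):  # fixed iteration count, not run to convergence
        updated = []
        for i in range(n):
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize(sentences: List[str], k: int = 2) -> List[str]:
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # keep original document order
```

Calling `summarize(sentences, k)` on a list of sentence strings returns the `k` highest-scoring sentences in their original order. The English placeholder sentences used to check the sketch are illustrative only and are not taken from the paper's dataset.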
Pages: 15