Extractive text summarization of Arabic multi-document using fuzzy C-means and Latent Dirichlet Allocation

Cited by: 5
Authors
Al-Taani, Ahmad T. [1]
Al-Sayadi, Sami H. [1]
Affiliations
[1] Yarmouk Univ, Dept Comp Sci, Irbid, Jordan
Keywords
Multi-document text summarization; Arabic Language; Extractive-based summarization; Singular value decomposition (SVD); Fuzzy C-Means algorithm; Latent Dirichlet allocation (LDA) algorithm; RANKING;
DOI
10.1007/s13198-022-01783-2
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
In this research, we investigated the performance of combining the fuzzy c-means and latent Dirichlet allocation algorithms for Arabic multi-document summarization. The summary should include the most essential sentences from multiple documents on the same topic. The TAC-2011 corpus is used for the experiments. First, the documents in the corpus are clustered using the fuzzy c-means algorithm; the aim of this clustering step is to group the documents according to their topics, e.g., economics, politics, and sport. The results are compared against recent Arabic summarization approaches that used ant colony and discriminant analysis algorithms. The proposed approach obtained competitive results compared to those approaches.
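The clustering step mentioned in the abstract can be illustrated with a minimal sketch of the standard fuzzy c-means update rules. This is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the stopping tolerance, and the toy two-dimensional "document vectors" are all assumptions made for illustration, and the abstract does not say how documents are vectorized.

```python
import math
import random

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Start from random memberships; each row is normalised to sum to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])

    centers = []
    for _ in range(n_iter):
        # Each center is the mean of all points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [U[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        # New memberships are proportional to distance ** (-2 / (m - 1)).
        new_U = []
        for p in points:
            inv = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centers]
            total = sum(inv)
            new_U.append([v / total for v in inv])
        # Stop once no membership value changes by more than tol.
        shift = max(abs(a - b)
                    for ra, rb in zip(new_U, U)
                    for a, b in zip(ra, rb))
        U = new_U
        if shift < tol:
            break
    return centers, U

# Hypothetical toy vectors standing in for two topics (assumed data).
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centers, U = fuzzy_c_means(docs, n_clusters=2)
# Hard topic label per document: the cluster with the highest membership.
labels = [row.index(max(row)) for row in U]
```

Unlike hard clustering, each row of `U` gives a degree of membership in every cluster, so a document that mixes topics is not forced into a single group until the final `max` step.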
Pages: 713-726
Number of pages: 14