Using Unsupervised Deep Learning for Automatic Summarization of Arabic Documents

Cited by: 17
Authors
Alami, Nabil [1]
En-nahnahi, Noureddine [1]
Ouatik, Said Alaoui [1]
Meknassi, Mohammed [1]
Affiliations
[1] Sidi Mohamed Ben Abdellah Univ, Fac Sci Dhar EL Mahraz, LIM, Fes, Morocco
Keywords
Arabic text summarization; Deep learning; Unsupervised feature learning; Variational auto-encoder; Graph-based summarization; Query-based summarization; RECOGNITION; CLASSIFICATION; ALGORITHM
DOI
10.1007/s13369-018-3198-y
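Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract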
Traditional Arabic text summarization (ATS) systems are based on a bag-of-words representation, which yields sparse, high-dimensional input data. Dimensionality reduction is therefore needed to increase the discriminative power of the features. In this paper, we present a new method for ATS that uses a variational auto-encoder (VAE) model to learn a feature space from high-dimensional input data. We explore several input representations, such as term frequency (tf) and tf-idf, with both local and global vocabularies. All sentences are ranked according to the latent representation produced by the VAE. We investigate the impact of using the VAE with two summarization approaches: graph-based and query-based. Experiments on two benchmark datasets specifically designed for ATS show that the VAE using the tf-idf representation of global vocabularies clearly provides a more discriminative feature space and improves the recall of other models. Experimental results confirm that the proposed method performs better than most state-of-the-art extractive summarization approaches for both graph-based and query-based summarization.
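
The graph-based ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the VAE encoding step is omitted entirely, so sentences are compared directly on their tf-idf vectors, tokenization is plain whitespace splitting, and the function names (`tfidf_vectors`, `cosine`, `rank_sentences`) and the damping value are assumptions made for illustration. In the paper's setting, the latent vectors produced by the VAE would presumably take the place of the raw tf-idf vectors.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Build one sparse tf-idf vector (a dict) per sentence.

    Assumption: whitespace tokenization; idf = log(n / df) + 1.
    """
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_sentences(vectors, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration on the similarity graph.

    Nodes are sentences; edge weights are cosine similarities.
    """
    n = len(vectors)
    if n == 0:
        return []
    sim = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Weighted votes from every sentence j that is similar to i.
            incoming = sum(
                sim[j][i] * scores[j] / out_weight[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            updated.append((1.0 - damping) / n + damping * incoming)
        scores = updated
    return scores
```

A summary would then be formed by keeping the highest-scoring sentences, for example `rank_sentences(tfidf_vectors(sentences))` followed by a sort on the returned scores.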
Pages: 7803-7815
Number of pages: 13