Using Unsupervised Deep Learning for Automatic Summarization of Arabic Documents

Cited by: 17
Authors
Alami, Nabil [1]
En-nahnahi, Noureddine [1]
Ouatik, Said Alaoui [1]
Meknassi, Mohammed [1]
Affiliation
[1] Sidi Mohamed Ben Abdellah Univ, Fac Sci Dhar EL Mahraz, LIM, Fes, Morocco
Keywords
Arabic text summarization; Deep learning; Unsupervised feature learning; Variational auto-encoder; Graph-based summarization; Query-based summarization; RECOGNITION; CLASSIFICATION; ALGORITHM
DOI
10.1007/s13369-018-3198-y
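Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09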
Abstract
Traditional Arabic text summarization (ATS) systems are based on the bag-of-words representation, which produces sparse, high-dimensional input data. Dimensionality reduction is therefore needed to make the features more discriminative. In this paper, we present a new method for ATS that uses a variational auto-encoder (VAE) to learn a feature space from high-dimensional input data. We explore several input representations, such as term frequency (tf) and tf-idf, with both local and global vocabularies. All sentences are ranked according to the latent representation produced by the VAE. We investigate the impact of the VAE on two summarization approaches, graph-based and query-based. Experiments on two benchmark datasets specifically designed for ATS show that the VAE using the tf-idf representation of the global vocabulary provides a more discriminative feature space and improves the recall of other models. The experimental results confirm that the proposed method outperforms most state-of-the-art extractive summarization approaches in both the graph-based and query-based settings.
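The graph-based ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the VAE is omitted entirely, so sentences are compared on raw tf-idf vectors rather than on a learned latent representation, and the ranking uses a generic PageRank-style power iteration. The function names (`tfidf_vectors`, `cosine`, `rank_sentences`), the damping value, and the English placeholder tokens are all assumptions made for illustration; real Arabic input would first need tokenization and normalization, which is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Build one sparse tf-idf vector (a dict) per tokenized sentence."""
    n = len(sentences)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for sent in sentences for term in set(sent))
    vectors = []
    for sent in sentences:
        tf = Counter(sent)
        # Smoothed idf keeps every weight strictly positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences by PageRank-style iteration on a similarity graph.

    Assumption: raw tf-idf vectors stand in for the VAE latent vectors
    described in the abstract.
    """
    n = len(sentences)
    if n == 0:
        return []
    vectors = tfidf_vectors(sentences)
    # Weighted, undirected similarity graph without self-loops.
    sim = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor passes on its score in proportion to edge weight.
            incoming = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


# Hypothetical, already-tokenized input (placeholder English tokens).
example = [
    ["summarization", "selects", "important", "sentences"],
    ["graph", "ranking", "scores", "sentences", "by", "similarity"],
    ["important", "sentences", "form", "the", "summary"],
    ["weather", "was", "sunny", "today"],
]
example_scores = rank_sentences(example)
# Sentence indices ordered from highest to lowest score.
print(sorted(range(len(example)), key=lambda i: example_scores[i], reverse=True))
```

In this sketch a sentence that shares no vocabulary with the rest receives only the baseline score, so it falls to the bottom of the ranking. Replacing `tfidf_vectors` with an encoder that returns latent vectors would bring the sketch closer to the pipeline the abstract describes.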
Pages: 7803-7815
Number of pages: 13
Published in: Arabian Journal for Science and Engineering, 2018, 43: 7803-7815