Automatic Summarization of the Arabic Documents using NMF: A Preliminary Study

Cited by: 0
Authors
Mohamed, A. A. [1]
Affiliations
[1] Prince Sattam bin Abdulaziz Univ, Al Kharj, Saudi Arabia
Source
PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS (ICCES) | 2016
Keywords
Arabic Text Summarization; Text Mining; Information Retrieving; Natural Language Processing (NLP); Non-negative Matrix Factorization (NMF); TEXT
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The exponential growth of the Internet produces a huge number of documents online. Finding the desired documents among these resources is a difficult task. This problem is known as "Information Overloading". Automatic Text Summarization (ATS) techniques try to solve this problem by extracting the essential sentences that cover most of the main issues in a document, so that the user spends less time and effort identifying its main ideas. Research in this field for the Arabic language is relatively new compared with the available research for English. This paper presents a preliminary study that investigates the effectiveness of using the Non-negative Matrix Factorization (NMF) algorithm to summarize Arabic documents. The researcher built an Arabic corpus of 150 documents manually and conducted extensive experiments using different sentence scoring algorithms and term weighting schemes. The performance of the proposed algorithm has been measured, and the experiments have shown that the NMF algorithm yields promising results.
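The abstract does not say which factorization procedure, term weighting scheme, or sentence scoring rule the study used, so the following is only a minimal sketch of how an NMF-based extractive summarizer can be put together, not the paper's method. It assumes raw term frequency as the weighting, multiplicative updates for the factorization, and a score that weights each sentence's entries in H by the overall mass of the corresponding latent feature. All function names (`tokenize`, `term_sentence_matrix`, `nmf`, `summarize`) and the toy sentences are hypothetical; Arabic-specific preprocessing such as normalization, stop-word removal, and stemming is omitted.

```python
import random

def tokenize(sentence):
    # Assumption: plain whitespace tokenization; real Arabic text would
    # need normalization, stop-word removal, and stemming.
    return sentence.lower().split()

def term_sentence_matrix(sentences):
    # Rows are terms, columns are sentences; entries are raw term counts.
    vocab = sorted({t for s in sentences for t in tokenize(s)})
    index = {t: i for i, t in enumerate(vocab)}
    A = [[0.0] * len(sentences) for _ in vocab]
    for j, s in enumerate(sentences):
        for t in tokenize(s):
            A[index[t]][j] += 1.0
    return A

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def nmf(A, rank, iterations=200, seed=0, eps=1e-9):
    # Factorize A (m x n) as W (m x rank) times H (rank x n) with
    # multiplicative updates; W and H stay non-negative throughout.
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(m)]
    return W, H

def summarize(sentences, rank=2, top_k=2):
    A = term_sentence_matrix(sentences)
    _, H = nmf(A, rank)
    total = sum(sum(row) for row in H) or 1.0
    # Weight of each latent feature: its share of the total mass in H.
    feature_weight = [sum(row) / total for row in H]
    scores = [sum(H[i][j] * feature_weight[i] for i in range(rank))
              for j in range(len(sentences))]
    top = sorted(range(len(sentences)), key=lambda j: scores[j], reverse=True)[:top_k]
    return [sentences[j] for j in sorted(top)]  # keep original sentence order

sentences = [
    "automatic summarization extracts the essential sentences of a document",
    "the weather was pleasant on the day of the conference",
    "matrix factorization groups terms and sentences into latent topics",
    "sentences with high topic weights are extracted into the summary",
]
summary = summarize(sentences, rank=2, top_k=2)
```

Swapping the term weighting (for example, to a TF-IDF variant) only changes `term_sentence_matrix`, and swapping the scoring rule only changes the `scores` line, which mirrors the kind of comparison across weighting schemes and scoring algorithms that the abstract describes.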
Pages: 235-240
Number of pages: 6