Data summarization: a survey

Cited by: 44
Authors
Ahmed, Mohiuddin [1]
Affiliations
[1] Canberra Inst Technol, Dept ICT & Lib Studies, Reid, Australia
Keywords
Summarization; Structured data; Unstructured data; Machine learning; Statistics; Semantics; Natural language processing; Cyber security; OUTLIERS; SUPPORT;
DOI
10.1007/s10115-018-1183-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Summarization has been proven to be a useful and effective technique for supporting the analysis of large amounts of data. Knowledge discovery from data (KDD) is time-consuming, and summarization is an important step that expedites KDD tasks by intelligently reducing the size of the processed data. In this paper, different summarization techniques for structured and unstructured data are discussed. The key finding of this survey is that not all summarization techniques create a summary suitable for further analysis. It is highlighted that sampling techniques are a viable way of creating a summary for further knowledge discovery, such as anomaly detection on the summary. Different summary evaluation metrics are also discussed.
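As a rough illustration of the sampling idea mentioned in the abstract, the sketch below draws a uniform random sample as the summary and then uses statistics estimated from that summary to flag unusual values. It is not taken from the survey: the function names `sample_summary` and `flag_anomalies`, the synthetic data, the sample size, and the z-score rule with a threshold of 3 are all assumptions chosen for illustration.

```python
import random
import statistics


def sample_summary(records, k, seed=0):
    """Return a uniform random sample (without replacement) of up to k records."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def flag_anomalies(values, reference, threshold=3.0):
    """Return the values whose z-score, computed against `reference`, exceeds threshold."""
    mean = statistics.fmean(reference)
    stdev = statistics.pstdev(reference)
    if stdev == 0:
        # A constant reference gives no scale to measure deviation against.
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical data: 10,000 roughly normal readings plus two injected outliers.
rng = random.Random(42)
data = [rng.gauss(100.0, 5.0) for _ in range(10_000)] + [250.0, -40.0]

# The summary is 500 sampled records, about 5% of the original data.
summary = sample_summary(data, k=500)

# Mean and spread are estimated from the summary only, then applied to the data.
suspects = flag_anomalies(data, reference=summary)
print(len(summary), len(suspects))
```

A uniform sample keeps the bulk distribution but can miss rare records entirely, which is why this sketch scores the full data against the summary's statistics instead of searching for outliers inside the sample itself.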
Pages: 249 - 273
Page count: 25
Related papers
50 in total
  • [31] Fast Machine Learning in Data Science with a Comprehensive Data Summarization
    Al-Amin, Sikder Tahsin
    Ordonez, Carlos
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 2941 - 2948
  • [32] Data based segmentation and summarization for sensor data in semiconductor manufacturing
    Park, Eunjeong L.
    Park, Jooseoung
    Yang, Jiwon
    Cho, Sungzoon
    Lee, Young-Hak
    Park, Hae-Sang
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (06) : 2619 - 2629
  • [33] From Information Overload to Lucidity: A Survey on Leveraging GPTs for Systematic Summarization of Medical and Biomedical Artifacts
    Palanisamy, Balamurugan
    Chakrabarti, Arjab
    Singh, Anushka
    Hassija, Vikas
    Chalapathi, G. S. S.
    Singh, Amit
    IEEE ACCESS, 2025, 13 : 7902 - 7922
  • [34] Algorithms and estimators for summarization of unaggregated data streams
    Cohen, Edith
    Duffield, Nick
    Kaplan, Haim
    Lund, Carsten
    Thorup, Mikkel
    JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2014, 80 (07) : 1214 - 1244
  • [35] Towards a Flexible Experience of Data Provenance Summarization
    Pei, Jisheng
    Ye, Xiaojun
    JOURNAL OF INTERNET TECHNOLOGY, 2018, 19 (05) : 1555 - 1565
  • [36] Data summarization method for chronic disease tracking
    Aleksic, Dejan
    Rajkovic, Petar
    Vukovic, Dusan
    Jankovic, Dragan
    Milenkovic, Aleksandar
    JOURNAL OF BIOMEDICAL INFORMATICS, 2017, 69 : 188 - 202
  • [37] Data-driven enabled approaches for criteria-based video summarization: a comprehensive survey, taxonomy, and future directions
    Sabha, Ambreen
    Selwal, Arvind
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (21) : 32635 - 32709
  • [39] Neurosymbolic Learning on Activity Summarization of Video Data
    Kommrusch, Steve
    Bhave, Sanket
    Banik, Mridul
    Minsky, Henry
    INTERNATIONAL WORKSHOP ON SELF-SUPERVISED LEARNING, VOL 192, 2022, 192 : 108 - 119
  • [40] COMPREHENSIVE REVIEW OF AUTOMATIC TEXT SUMMARIZATION TECHNIQUES
    Cajueiro, Daniel O.
    Nery, Arthur G.
    Tavares, Igor
    De Melo, Maisa K.
    Dos Reis, Silvia A.
    Weigang, Li
    Celestino, Victor R. R.
    COMPUTING AND INFORMATICS, 2024, 43 (05) : 1185 - 1218