Clustering to Improve Microblog Stream Summarization

Cited by: 2
Authors
Olariu, Andrei [1]
Affiliation
[1] Univ Bucharest, Fac Math & Comp Sci, Bucharest, Romania
Source
14TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2012) | 2012
Keywords
microblog; summarization; text clustering
DOI
10.1109/SYNASC.2012.10
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Microblogging has shown a massive increase in use over the past couple of years. According to recent statistics, Twitter (the most popular microblogging platform) has over 340 million posts per day coming from its 140 million active users. In order to help users manage this information overload or to assess the full information potential of such microblogging streams (sequences of posts), a few summarization algorithms have been proposed. However, they are designed to work on a stream of posts filtered on a particular keyword, whereas most streams suffer from noise or have posts referring to more than one topic. Because of this, the generated summary is incomplete and even meaningless. We approach the problem of summarizing a stream and propose adding a layer of text clustering as a preprocessing step. We show how, by clustering posts into related groups and then applying a summarization algorithm, the quality of the summary improves.
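The cluster-then-summarize idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the greedy Jaccard-based clustering, the frequency-based choice of a representative post, the stopword list, the `threshold` value, and all function names below are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "is", "in", "of", "to", "and", "for", "on", "at"}


def tokens(post):
    """Return the set of lowercase words in a post, minus stopwords."""
    words = re.findall(r"[a-z0-9#@']+", post.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity between two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_posts(posts, threshold=0.2):
    """Greedy single-pass clustering (hypothetical stand-in for the
    clustering layer): each post joins the most similar existing cluster
    if the similarity reaches `threshold`, otherwise it starts a new one."""
    clusters = []  # each entry: {"vocab": set of words, "posts": list of str}
    for post in posts:
        words = tokens(post)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(words, cluster["vocab"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"vocab": set(words), "posts": [post]})
        else:
            best["vocab"] |= words
            best["posts"].append(post)
    return [c["posts"] for c in clusters]


def summarize(cluster):
    """Pick the post whose words are, on average, most frequent in the
    cluster (a simple extractive stand-in for a real summarizer)."""
    freq = Counter(w for post in cluster for w in tokens(post))

    def score(post):
        words = tokens(post)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return max(cluster, key=score)


def summarize_stream(posts, threshold=0.2):
    """Cluster the stream first, then produce one summary per cluster."""
    return [summarize(c) for c in cluster_posts(posts, threshold)]


if __name__ == "__main__":
    stream = [
        "Earthquake hits Tokyo this morning",
        "New phone launch event today",
        "Strong earthquake in Tokyo this morning",
        "Phone launch event starts today",
    ]
    for line in summarize_stream(stream):
        print(line)
```

On a mixed stream like the one in the usage example, summarizing each cluster separately yields one representative post per topic rather than a single summary that blends unrelated posts, which is the effect the abstract attributes to the preprocessing step.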
Pages: 220-226
Page count: 7