Extracting news events from microblogs

Cited by: 7
Authors
Repp, Oystein [1]
Ramampiaro, Heri [1]
Affiliations
[1] Norwegian Univ Sci & Technol, Dept Comp Sci, Trondheim, Norway
Keywords
Text Mining; Deep Learning; Word Embedding; Information Extraction; Event Detection; Social Media
DOI
10.1080/09720510.2018.1486273
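Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714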
Abstract
The Twitter stream has become a large source of information, but the volume of tweets posted and the noisy nature of their content have long made harvesting knowledge from Twitter a challenge for researchers. Aiming to overcome some of the main challenges of extracting hidden information from tweet streams, this work proposes a new approach for real-time detection of news events from the Twitter stream. We divide our approach into three steps. The first step is to use a deep neural network to detect news-relevant tweets in the stream. The second step is to apply a novel streaming data clustering algorithm to the detected news tweets to form news events. The third and final step is to rank the detected events based on the size of the event clusters and the growth speed of the tweet frequencies. We evaluate the proposed system on a large, publicly available corpus of annotated news events from Twitter. As part of the evaluation, we compare our approach with a related state-of-the-art solution. Overall, our experiments and user-based evaluation show that our approach to detecting current (real) news events delivers state-of-the-art performance.
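The three steps named in the abstract (filter, cluster, rank) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword filter stands in for the neural classifier, the Jaccard-threshold grouping stands in for the paper's streaming clustering algorithm, and the score is only one plausible way to combine cluster size with growth. All names (`Tweet`, `EventCluster`, `NEWS_HINTS`, `detect_events`) and all parameter values are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical keyword list; a placeholder for the paper's neural classifier.
NEWS_HINTS = {"breaking", "reports", "earthquake", "election"}


@dataclass
class Tweet:
    text: str
    timestamp: float  # seconds since an arbitrary reference point


@dataclass
class EventCluster:
    tokens: set = field(default_factory=set)
    tweets: list = field(default_factory=list)

    def add(self, tweet, tweet_tokens):
        self.tweets.append(tweet)
        self.tokens |= tweet_tokens


def tokenize(text):
    # Lowercase, strip surrounding punctuation, drop very short words.
    words = (w.strip(".,!?:;#@\"'").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def is_news_relevant(tweet):
    # Step 1 (stand-in): keep tweets that share a token with NEWS_HINTS.
    return bool(tokenize(tweet.text) & NEWS_HINTS)


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_stream(tweets, threshold=0.2):
    # Step 2 (stand-in): single-pass grouping in timestamp order; a tweet
    # joins the most similar cluster or starts a new one.
    clusters = []
    for tweet in sorted(tweets, key=lambda t: t.timestamp):
        tokens = tokenize(tweet.text)
        best = max(clusters, key=lambda c: jaccard(tokens, c.tokens), default=None)
        if best is not None and jaccard(tokens, best.tokens) >= threshold:
            best.add(tweet, tokens)
        else:
            cluster = EventCluster()
            cluster.add(tweet, tokens)
            clusters.append(cluster)
    return clusters


def growth(cluster, now, window=60.0):
    # Tweets in the latest window minus tweets in the window before it.
    recent = sum(1 for t in cluster.tweets if now - window <= t.timestamp <= now)
    previous = sum(
        1 for t in cluster.tweets if now - 2 * window <= t.timestamp < now - window
    )
    return recent - previous


def rank_events(clusters, now, window=60.0, growth_weight=1.0):
    # Step 3 (stand-in): score = cluster size + weighted growth.
    def score(c):
        return len(c.tweets) + growth_weight * growth(c, now, window)

    return sorted(clusters, key=score, reverse=True)


def detect_events(tweets, now):
    news = [t for t in tweets if is_news_relevant(t)]
    return rank_events(cluster_stream(news), now)
```

Calling `detect_events(tweets, now)` on a list of `Tweet` objects returns the clusters ordered from highest to lowest score. The threshold, window length, and growth weight are arbitrary illustration values, not figures from the paper.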
Pages: 695-723
Number of pages: 29