Extracting news events from microblogs

Cited by: 7
Authors
Repp, Oystein [1]
Ramampiaro, Heri [1]
Affiliations
[1] Norwegian Univ Sci & Technol, Dept Comp Sci, Trondheim, Norway
Keywords
Text mining; Deep Learning; Word Embedding; Information Extraction; Event Detection; Social Media;
DOI
10.1080/09720510.2018.1486273
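Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes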
020208 ; 070103 ; 0714 ;
Abstract
The Twitter stream has become a large source of information, but the sheer volume of tweets posted and the noisy nature of their content make harvesting knowledge from Twitter a long-standing challenge for researchers. Aiming to overcome some of the main challenges of extracting hidden information from tweet streams, this work proposes a new approach for real-time detection of news events from the Twitter stream. We divide our approach into three steps. The first step uses a neural network (deep learning) to detect news-relevant tweets in the stream. The second step applies a novel streaming data clustering algorithm to the detected news tweets to form news events. The third and final step ranks the detected events based on the size of the event clusters and the growth speed of the tweet frequencies. We evaluate the proposed system on a large, publicly available corpus of annotated news events from Twitter. As part of the evaluation, we compare our approach with a related state-of-the-art solution. Overall, our experiments and user-based evaluation show that our approach to detecting current (real) news events delivers state-of-the-art performance.
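The three steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword filter `is_news` is a toy stand-in for the neural classifier, the single-pass Jaccard clustering is a generic substitute for the paper's novel streaming algorithm, and the scoring formula, the `NEWS_WORDS` list, and all thresholds and parameter names are assumptions made only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical keyword list; the paper uses a trained neural network instead.
NEWS_WORDS = {"breaking", "earthquake", "election", "explosion", "flood"}


def tokens(text):
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,!?#@:;").lower() for w in text.split()} - {""}


def is_news(text):
    """Step 1 (stand-in): flag a tweet as news-relevant via keyword match."""
    return bool(tokens(text) & NEWS_WORDS)


@dataclass
class Cluster:
    vocab: set = field(default_factory=set)   # all tokens seen in the cluster
    times: list = field(default_factory=list)  # timestamps of member tweets


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_stream(stream, threshold=0.2):
    """Step 2 (stand-in): single-pass clustering of (timestamp, text) pairs.

    Each news tweet joins the most similar existing cluster if the Jaccard
    similarity reaches `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for ts, text in stream:
        if not is_news(text):
            continue
        toks = tokens(text)
        best = max(clusters, key=lambda c: jaccard(toks, c.vocab), default=None)
        if best is not None and jaccard(toks, best.vocab) >= threshold:
            best.vocab |= toks
            best.times.append(ts)
        else:
            clusters.append(Cluster(set(toks), [ts]))
    return clusters


def score(cluster, now, window=60.0, weight=10.0):
    """Step 3 (assumed formula): cluster size plus weighted recent tweet rate."""
    recent = sum(1 for t in cluster.times if now - t <= window)
    return len(cluster.times) + weight * (recent / window)


def rank_events(clusters, now, window=60.0, weight=10.0):
    """Order clusters from highest to lowest score."""
    return sorted(clusters, key=lambda c: score(c, now, window, weight),
                  reverse=True)
```

Calling `rank_events(cluster_stream(stream), now=...)` on an iterable of `(timestamp, text)` pairs returns the event clusters ordered by this assumed score; how size and growth speed are actually weighted in the paper is not stated in the abstract.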
Pages: 695-723
Page count: 29
Related papers
50 in total
  • [31] Deriving market intelligence from microblogs
    Li, Yung-Ming
    Li, Tsung-Ying
    DECISION SUPPORT SYSTEMS, 2013, 55 (01) : 206 - 217
  • [32] E-ware: a big data system for the incremental discovery of spatio-temporal events from microblogs
    Afyouni I.
    Khan A.
    Al Aghbari Z.
    Journal of Ambient Intelligence and Humanized Computing, 2023, 14 (10) : 13949 - 13968
  • [33] Extracting Causal Relations Among Complex Events in Natural Science Literature
    Barik, Biswanath
    Marsi, Erwin
    Ozturk, Pinar
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, NLDB 2017, 2017, 10260 : 131 - 137
  • [34] Discovering mutilingual news events and term association from the web
    Huang, RZ
    Lam, W
    Law, YY
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL VI, PROCEEDINGS: INFORMATION SYSTEMS, TECHNOLOGIES AND APPLICATIONS: I, 2003, : 226 - 230
  • [35] Event Registry - Learning About World Events From News
    Leban, Gregor
    Fortuna, Blaz
    Brank, Janez
    Grobelnik, Marko
    WWW'14 COMPANION: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2014, : 107 - 110
  • [36] Detecting and Summarizing Emergent Events in Microblogs and Social Media Streams by Dynamic Centralities
    Avudaiappan, Neela
    Herzog, Alexander
    Kadam, Sneha
    Du, Yuheng
    Thatcher, Jason
    Safro, Ilya
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 1627 - 1634
  • [37] Extracting Entities and Events from Cyber-Physical Security Incident Reports
    Ramrakhiyani, Nitin
    Patil, Sangameshwar
    Jella, Manideep
    Kumar, Alok
    Palshikar, Girish K.
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW, 2022, : 602 - 609
  • [38] Extraction of Temporal Events' Frequency from Online News Channels
    Fatima, Ain-Ul-Noor
    Ahmad, Haseeb
    Ahmad, Mudassar
    Ahmad, Waqar
    Faisal, Nadeem
    30TH INTERNATIONAL CONFERENCE ON COMPUTER THEORY AND APPLICATIONS (ICCTA 2020), 2020, : 109 - 116
  • [39] Extracting News Content with Visual Unit of Web Pages
    Zhu, Wenhao
    Dai, Song
    Song, Yang
    Lu, Zhiguo
    2015 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2015, : 211 - 215
  • [40] Tracking Terrorism News Threads by Extracting Event Signatures
    Ahmed, Syed Toufeeq
    Bhindwale, Ruchi
    Davulcu, Hasan
    ISI: 2009 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS, 2009, : 182 - 184