A Performance Comparison of Unsupervised Techniques for Event Detection from Oscar Tweets

Cited by: 1
Authors
Malik, Muzamil [1]
Aslam, Waqar [1]
Aslam, Zahid [1]
Alharbi, Abdullah [2]
Alouffi, Bader [3]
Rauf, Hafiz Tayyab [4]
Affiliations
[1] Islamia Univ Bahawalpur, Dept Comp Sci & Informat Technol, Bahawalpur, Pakistan
[2] Taif Univ, Coll Comp & Informat Technol, Dept Informat Technol, POB 11099, Taif 21944, Saudi Arabia
[3] Taif Univ, Coll Comp & Informat Technol, Dept Comp Sci, POB 11099, Taif 21944, Saudi Arabia
[4] Staffordshire Univ, Ctr Smart Syst AI & Cybersecur, Stoke On Trent, England
Keywords
TOPIC DETECTION; TWITTER;
DOI
10.1155/2022/5980043
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Social media influences people's lives. It is an essential channel for sharing news, raising awareness, and detecting events and people's interests, and it covers a wide range of topics and events. Extensive work has been published on capturing interesting events and insights from datasets, and many techniques have been presented for detecting events from social media networks such as Twitter. In text mining, most work is done on a specific dataset, so new datasets are needed to analyse the performance and generic nature of Topic Detection and Tracking methods. This paper therefore publishes a dataset of a real-life event, the Oscars 2018, gathered from Twitter, and compares soft frequent pattern mining (SFPM), singular value decomposition and k-means (K-SVD), feature-pivot (Feat-p), document-pivot (Doc-p), and latent Dirichlet allocation (LDA). The dataset contains 2,160,738 tweets collected using a set of seed words; only English tweets are considered. All of the methods applied in this paper are unsupervised, and this area still needs to be explored on different datasets. The methods are evaluated on the Oscars 2018 dataset using keyword precision (K-Prec), keyword recall (K-Rec), and topic recall (T-Rec) for detecting events of greater interest. SFPM achieved the highest K-Prec, K-Rec, and T-Rec, but these scores started to decrease as the number of clusters increased. Feat-p achieved the lowest performance on all three metrics. Experiments on the Oscars 2018 dataset demonstrated that all the methods are generic in nature and produce meaningful clusters.
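The abstract names the three evaluation metrics but does not define them. The following is a minimal sketch of one common overlap-based reading, not the paper's own procedure: a ground-truth topic is assumed to count as detected when some detected topic shares at least one keyword with it, and keyword precision and recall are computed over the matched pairs. The function name `evaluate_topics`, the matching rule, and the sample keywords are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def evaluate_topics(
    detected: List[Set[str]],
    ground_truth: List[Set[str]],
) -> Tuple[float, float, float]:
    """Return (keyword precision, keyword recall, topic recall).

    Assumed matching rule: each ground-truth topic is paired with the
    detected topic that shares the most keywords with it, and it counts
    as detected only if that overlap is non-empty.
    """
    matched = 0       # ground-truth topics with a non-empty match
    correct = 0       # keywords shared by the matched pairs
    detected_kw = 0   # keywords in the matched detected topics
    truth_kw = 0      # keywords in the matched ground-truth topics

    for truth in ground_truth:
        best = max(detected, key=lambda d: len(d & truth), default=set())
        overlap = len(best & truth)
        if overlap == 0:
            continue
        matched += 1
        correct += overlap
        detected_kw += len(best)
        truth_kw += len(truth)

    k_prec = correct / detected_kw if detected_kw else 0.0
    k_rec = correct / truth_kw if truth_kw else 0.0
    t_rec = matched / len(ground_truth) if ground_truth else 0.0
    return k_prec, k_rec, t_rec


# Hypothetical toy data, not taken from the Oscars 2018 dataset.
detected_topics = [{"best", "picture", "shape", "water"}, {"red", "carpet"}]
truth_topics = [{"best", "picture", "shape"}, {"oscars", "host", "kimmel"}]
k_prec, k_rec, t_rec = evaluate_topics(detected_topics, truth_topics)
```

Under these assumptions, a method that emits more clusters can match more ground-truth topics while also adding off-topic keywords, which is one way the three scores can move in different directions.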
Pages: 14