Semi-supervised internet water army detection based on graph embedding

Cited by: 2
Authors
He, Ying [1]
Yang, Pin [1]
Cheng, Pengsen [1]
Affiliations
[1] Sichuan Univ, Sch Cyber Sci & Engn, Chengdu, Peoples R China
Keywords
Internet water army detection; Graph embedding; Semi-supervised learning; Online social network; SPAMMER DETECTION; ACCOUNTS
DOI
10.1007/s11042-022-13633-1
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sina Weibo is one of the most popular online social networks and gives users a fast, convenient way to communicate. However, its openness and large user base are often exploited by water armies to spread disinformation, which seriously undermines the security and reliability of social networks. Existing approaches to water army detection mainly classify users by their account information, behavior, and other features. However, the growing anti-detection capability of water armies has made these features less discriminative. In addition, labeled data for model training is scarce. In this paper, we propose a semi-supervised approach that combines network structure features and user attribute features to identify the Internet water army. The approach uses a graph embedding algorithm to obtain the network structure features of users, which together with the defined user attribute features constitute the detection feature set. Users are then classified by Tri-Training, a semi-supervised learning algorithm that leverages the combined strengths of multiple classifiers and reduces the need for large amounts of labeled data. Experiments on real-world data show that the approach identifies the Internet water army effectively and is better suited to real scenarios with little labeled data, with an accuracy of up to 95.15%.
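The pipeline described in the abstract (structure features from graph embedding, concatenated with attribute features, then classified by Tri-Training) can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not give. It assumes toy, hand-written feature vectors (two hypothetical embedding dimensions followed by one hypothetical attribute), uses a nearest-centroid classifier as a stand-in base learner, and omits the error-rate checks of the full Tri-Training algorithm: each classifier is simply retrained on the labeled set plus the unlabeled samples on which the other two classifiers agree.

```python
import random
from collections import Counter

def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def centroid_predict(model, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(model[c], x)))

def tri_train(X_lab, y_lab, X_unlab, rounds=3, seed=0):
    """Simplified Tri-Training: no error-rate checks, fixed number of rounds."""
    rng = random.Random(seed)
    n = len(X_lab)
    # Start with three classifiers, each fit on a bootstrap sample of the labeled set.
    models = []
    for _ in range(3):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(centroid_fit([X_lab[i] for i in idx], [y_lab[i] for i in idx]))
    for _ in range(rounds):
        new_models = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            X_i, y_i = list(X_lab), list(y_lab)
            for x in X_unlab:
                pj = centroid_predict(models[j], x)
                pk = centroid_predict(models[k], x)
                if pj == pk:
                    # The other two classifiers agree: use their label as a pseudo-label.
                    X_i.append(x)
                    y_i.append(pj)
            new_models.append(centroid_fit(X_i, y_i))
        models = new_models
    return models

def predict(models, x):
    """Majority vote of the three classifiers."""
    votes = [centroid_predict(m, x) for m in models]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical feature vectors: [embedding_dim_1, embedding_dim_2, attribute].
# Label 0 = normal user, label 1 = water army account.
X_lab = [[0.1, 0.2, 0.0], [0.0, 0.1, 0.1], [0.2, 0.0, 0.1],
         [0.9, 1.0, 1.0], [1.0, 0.8, 0.9], [0.8, 0.9, 1.0]]
y_lab = [0, 0, 0, 1, 1, 1]
X_unlab = [[0.1, 0.1, 0.1], [0.2, 0.1, 0.0], [0.9, 0.9, 0.9], [1.0, 0.9, 0.8]]

models = tri_train(X_lab, y_lab, X_unlab)
print(predict(models, [0.05, 0.1, 0.0]), predict(models, [0.95, 0.9, 1.0]))
```

In practice the embedding dimensions would come from a graph embedding algorithm run on the follower or interaction network, and the base learners would typically be stronger classifiers; both choices are outside what this record specifies.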
Pages: 9891 - 9912
Page count: 22
Related papers
49 in total
  • [1] Twitter spam account detection based on clustering and classification methods
    Adewole, Kayode Sakariyah
    Han, Tao
    Wu, Wanqing
    Song, Houbing
    Sangaiah, Arun Kumar
    [J]. JOURNAL OF SUPERCOMPUTING, 2020, 76 (07) : 4802 - 4837
  • [2] Aggarwal A, 2012, ECRIM RES SUM
  • [3] A generic statistical approach for spam detection in Online Social Networks
    Ahmed, Faraz
    Abulaish, Muhammad
    [J]. COMPUTER COMMUNICATIONS, 2013, 36 (10-11) : 1120 - 1129
  • [4] Al-Thelaya Khaled A., 2020, 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT), P206, DOI 10.1109/ICIoT48696.2020.9089509
  • [5] Detect Me If You Can: Spam Bot Detection Using Inductive Representation Learning
    Alhosseini, Seyed Ali
    Bin Tareaf, Raad
    Najafi, Pejman
    Meinel, Christoph
    [J]. COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019: 148 - 153
  • [6] If it looks like a spammer and behaves like a spammer, it must be a spammer: analysis and detection of microblogging spam accounts
    Almaatouq, Abdullah
    Shmueli, Erez
    Nouh, Mariam
    Alabdulkareem, Ahmad
    Singh, Vivek K.
    Alsaleh, Mansour
    Alarifi, Abdulrahman
    Alfaris, Anas
    Pentland, Alex 'Sandy'
    [J]. INTERNATIONAL JOURNAL OF INFORMATION SECURITY, 2016, 15 (05) : 475 - 491
  • [7] Detecting Spammers and Content Promoters in Online Video Social Networks
    Benevenuto, Fabricio
    Rodrigues, Tiago
    Almeida, Virgilio
    Almeida, Jussara
    Goncalves, Marcos
    [J]. PROCEEDINGS 32ND ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2009: 620 - 627
  • [8] Bhat SY, 2013, 2013 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM), P106
  • [9] Discovering spammer communities in twitter
    Bindu, P. V.
    Mishra, Rahul
    Thilagam, P. Santhi
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2018, 51 (03) : 503 - 527
  • [10] A Comprehensive Survey of Graph Embedding: Problems, Techniques, and Applications
    Cai, HongYun
    Zheng, Vincent W.
    Chang, Kevin Chen-Chuan
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2018, 30 (09) : 1616 - 1637