Analysis and classification of privacy-sensitive content in social media posts

Authors
Livio Bioglio
Ruggero G. Pensa
Affiliation
[1] University of Turin
Source
EPJ Data Science, vol. 11
Keywords
Privacy; Text classification; Content analysis
Abstract
User-generated content often contains private information, even when it is shared publicly on social media and on the web in general. Although many filtering and natural language approaches for automatically detecting obscenities or hate speech have been proposed, determining whether a shared post contains sensitive information is still an open issue. The problem has been addressed by assuming, for instance, that sensitive content is published anonymously, on anonymous social media platforms, or with more restrictive privacy settings. These assumptions are far from realistic, since the authors of posts often underestimate or overlook their actual exposure to privacy risks. Hence, in this paper, we address the problem of content sensitivity analysis directly by presenting and characterizing a new annotated corpus of around ten thousand posts, each annotated as sensitive or non-sensitive by a pool of experts. We characterize our data with respect to the closely related problem of self-disclosure, pointing out the main differences between the two tasks. We also present the results of several deep neural network models that outperform previous naive attempts at classifying social media posts according to their sensitivity, and show that state-of-the-art approaches based on anonymity and lexical analysis do not work in realistic application scenarios.