Privacy-Preserving Text Labelling Through Crowdsourcing

Cited by: 1
Authors
Haralabopoulos, Giannis [1]
Torres, Mercedes Torres [1]
Anagnostopoulos, Ioannis [2]
McAuley, Derek [1]
Affiliations
[1] Univ Nottingham, Nottingham, England
[2] Univ Thessaly, Lamia, Greece
Source
ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS. AIAI 2021 IFIP WG 12.5 INTERNATIONAL WORKSHOPS | 2021 / Vol. 628
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Privacy; Crowdsourcing; Labelling; Natural language processing;
DOI
10.1007/978-3-030-79157-5_35
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The extensive use of online social media has highlighted the importance of privacy in the digital space. As more scientists analyse the data created on these platforms, privacy concerns have extended to data usage within academia. Although text analysis is a well-documented topic in the academic literature, with a multitude of applications, ensuring the privacy of user-generated content has been overlooked. In an effort to reduce the exposure of online users' information, we propose a privacy-preserving text labelling method for varying applications, based on crowdsourcing. We transform text with different levels of privacy and analyse the effectiveness of the transformation with regard to label correlation. To demonstrate the adaptive nature of our approach, we also employ a TF/IDF filtering transformation. Our results suggest that total privacy can be implemented in labelling while retaining the annotational diversity and subjectivity of traditional labelling. The privacy-preserving labelling, using the NRC lexicon, achieves a mean Spearman's rho correlation of 0.11, rising to 0.124 with TF/IDF filtering.
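The two quantities named in the abstract, TF/IDF filtering and a mean Spearman's rho between label sets, can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which this record does not describe in detail. The function names, the tokenised-document input, the `keep` parameter, and the choice to average one rho per item are all assumptions made for illustration.

```python
import math
from collections import Counter


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant vector (assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def mean_spearman(reference, candidate):
    """Average rho over paired label-score vectors (one pair per item)."""
    rhos = [spearman_rho(r, c) for r, c in zip(reference, candidate)]
    return sum(rhos) / len(rhos)


def tfidf_filter(docs, keep):
    """Keep only the `keep` highest-scoring TF-IDF tokens of each document.

    `docs` is a list of token lists. Ties are broken alphabetically,
    which is an arbitrary choice made for this sketch.
    """
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    filtered = []
    for doc in docs:
        tf = Counter(doc)
        score = {t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf}
        top = set(sorted(score, key=lambda t: (-score[t], t))[:keep])
        filtered.append([t for t in doc if t in top])
    return filtered
```

Under these assumptions, `mean_spearman` would be called with one score vector per item from the traditional labelling and one from the privacy-preserving labelling, and `tfidf_filter` would be applied to the tokenised texts before they are shown to annotators.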
Pages: 431-445
Page count: 15