Identity Linkage Across Diverse Social Networks

Cited by: 3
Authors
Benkhedda, Youcef [1]
Azouaou, Faical [1]
Abbar, Sofiane [2]
Affiliations
[1] Ecole Natl Super Informat, BP 68M, Algiers 16309, Algeria
[2] HBKU, Qatar Comp Res Inst, Doha 5825, Qatar
Source
2020 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM) | 2020
DOI
10.1109/ASONAM49781.2020.9381445
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
User identity linkage across online social networks has attracted significant interest in the last few years, with applications such as data fusion, de-duplication, personalized advertisement, user profiling, and expert recommendation. Existing techniques rely on discrete personal attributes such as user name, gender, location, and email, which are not always available. Other techniques exploit network relations. In our proposal, we attempt to design a generic framework for user identity linkage across diverse social networks based exclusively on the widely available textual user-generated content. We intentionally selected two social networks, Twitter and Quora, which have different contribution models and serve different purposes. We explore supervised and unsupervised techniques for matching profiles, as well as language models ranging from simple tf*idf vectorization to more sophisticated BERT embeddings. We discuss the limits of the different choices and present some encouraging preliminary results. For example, we find that prolific users can be identified with 84% accuracy. We also present a framework we designed to create the largest publicly available annotated dataset for profile linkage in social networks.
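To make the unsupervised, text-only matching idea concrete, here is a minimal sketch of one plausible baseline: each profile's text is turned into a tf*idf vector, and each profile on one network is linked to its nearest neighbour on the other by cosine similarity. This is not the authors' implementation. The function names (`tfidf_vectors`, `link_profiles`), the whitespace tokenizer, the smoothed idf formula, and the toy profiles are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf*idf vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed idf (an assumed variant) so no term gets a zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_profiles(source, target):
    """Map each source profile id to the most similar target profile id."""
    source_ids, target_ids = list(source), list(target)
    # Compute idf over both corpora together so the weights are comparable.
    vectors = tfidf_vectors(
        [source[i] for i in source_ids] + [target[i] for i in target_ids]
    )
    src_vecs = vectors[: len(source_ids)]
    tgt_vecs = vectors[len(source_ids):]
    links = {}
    for sid, sv in zip(source_ids, src_vecs):
        best = max(range(len(target_ids)), key=lambda j: cosine(sv, tgt_vecs[j]))
        links[sid] = target_ids[best]
    return links


# Toy, invented profiles: concatenated posts per user on each network.
twitter_profiles = {
    "tw_a": "training neural networks with gradient descent and python",
    "tw_b": "sourdough bread recipes and baking tips",
}
quora_profiles = {
    "qu_1": "how do I keep a sourdough starter alive for baking bread",
    "qu_2": "why does gradient descent converge when training neural networks",
}

links = link_profiles(twitter_profiles, quora_profiles)
print(links)
```

Swapping the tf*idf vectors for dense sentence embeddings would leave the nearest-neighbour step unchanged, which is one reason to keep vectorization and matching in separate functions.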
Pages: 468-472
Page count: 5