An Empirical Study for Detecting Fake Facebook Profiles Using Supervised Mining Techniques

Cited by: 4
Authors
Albayati, Mohammed Basil [1]
Altamimi, Ahmad Mousa [1]
Affiliations
[1] Appl Sci Private Univ, Amman, Jordan
Source
INFORMATICA-AN INTERNATIONAL JOURNAL OF COMPUTING AND INFORMATICS | 2019 / Vol. 43 / No. 01
Keywords
data mining; online social networks; Facebook; fake profiles; data science; social media
DOI
10.31449/inf.v43i1.2319
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Social media technologies have greatly affected our social life and the way people communicate. The variety of stand-alone and built-in social media services, such as Facebook, Twitter, LinkedIn, and the like, lets users create highly interactive platforms. However, these overwhelming technologies have left us sunk in an enormous amount of information. Recently, Facebook exposed data on 50 million unaware Facebook users for analytical purposes. Fake profiles are also used by scammers to infiltrate networks of friends and wreak all sorts of havoc, such as stealing valuable information, committing financial fraud, or entering other users' social graphs. In this paper, we turn our focus to fake Facebook profiles and propose a smart system (FBChecker) that enables users to check whether a Facebook profile is fake. To achieve that, FBChecker uses a data mining approach to analyze and classify a set of behavioral and informational attributes provided in personal profiles. Specifically, we empirically examine these attributes using four supervised data mining algorithms (k-NN, decision tree, SVM, and naive Bayes) to determine how successfully we can recognize fake profiles. To demonstrate the validity of our conceptual work, the selected classifiers were implemented on the RapidMiner data science platform with a dataset of 200 profiles collected from the authors' profile and a honeypot page. Two experiments were conducted: in the first, the k-NN scheme is applied as an estimator model to impute missing data with substituted values, whereas in the second a filtering operator is applied to exclude profiles with missing values. Results showed a high accuracy rate for all classifiers; however, SVM outperformed the other classifiers with an accuracy rate of 98.0%, followed by naive Bayes.
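The two missing-data strategies described in the abstract can be illustrated with a minimal sketch. This is not the authors' RapidMiner workflow: the feature layout (friend count, posts per week, has photo), the toy values, the function names, and the choice to average the k nearest complete rows are all assumptions made for illustration only.

```python
from math import sqrt

# Hypothetical toy profiles: [friend_count, posts_per_week, has_photo].
# None marks a missing value. These numbers are invented for illustration.
PROFILES = [
    [120.0, 5.0, 1.0],
    [130.0, None, 1.0],
    [4.0, 40.0, 0.0],
    [6.0, 38.0, None],
    [125.0, 6.0, 1.0],
]


def drop_incomplete(rows):
    """Filtering strategy: keep only rows that have no missing values."""
    return [list(r) for r in rows if None not in r]


def _distance(a, b):
    """Euclidean distance over the features that are present in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return sqrt(sum((x - y) ** 2 for x, y in shared))


def knn_impute(rows, k=2):
    """Imputation strategy: fill each gap with the mean of that feature
    over the k nearest complete rows (mean aggregation is an assumption)."""
    complete = drop_incomplete(rows)
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        neighbours = sorted(complete, key=lambda c: _distance(row, c))[:k]
        filled.append([
            v if v is not None else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ])
    return filled


if __name__ == "__main__":
    print("filtered:", drop_incomplete(PROFILES))
    print("imputed: ", knn_impute(PROFILES, k=2))
```

Filtering shrinks the dataset but leaves every remaining value observed, while imputation keeps all rows at the cost of introducing estimated values. Either output could then be passed to a classifier of choice.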
Pages: 77-86
Number of pages: 10