Hoax News Detection on Twitter using Term Frequency Inverse Document Frequency and Support Vector Machine Method

Cited by: 4
Authors
Fauzi, A. [1]
Setiawan, E. B. [1]
Baiza, Z. K. A. [1]
Affiliation
[1] Telkom Univ, Sch Comp, Telekomunikasi St 01, Bandung 40257, Indonesia
Source
2ND INTERNATIONAL CONFERENCE ON DATA AND INFORMATION SCIENCE | 2019 / Vol. 1192
Keywords
social media; Twitter; hoax; TF-IDF; SVM
DOI
10.1088/1742-6596/1192/1/012025
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Twitter is one of the most widely used social media platforms in the world, but it has problems that adversely affect its users. Hoaxes, news items whose truth or factual basis is still in doubt, are one of the negative phenomena that frequently occur on social media. In this final project, the authors built a system to detect hoax news on Twitter, with the aim of minimizing its spread. The system uses Term Frequency Inverse Document Frequency (TF-IDF) weighting to assign a weighted value to each tweet, derived from occurrences of hoax news posted by users on Twitter. Classification uses the Support Vector Machine (SVM) method to predict how likely a Twitter account is to spread hoax news, based on the user's behavior. Testing is based on the content of tweets. The dataset is built from attributes such as the number of retweets, URLs, the number of hashtags, provocation, feuds, anxiety, and unbalanced news. The processed data is divided into training data and testing data. Using all features, the system reaches its highest accuracy, 78.33%. The contribution of this research is a system that can detect news with a tendency towards hoax and classify it as hoax or not hoax.
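To make the TF-IDF and SVM steps named in the abstract concrete, the following is a minimal sketch in pure Python, not the authors' implementation. It assumes the common weighting tf * log(N / df) and a linear SVM trained by sub-gradient descent on the hinge loss, without a bias term for brevity. The example tweets, the labels, and the function names (`fit_idf`, `tfidf`, `train_linear_svm`, `predict`) are all invented for illustration, and the behavioral attributes listed in the abstract (retweets, URLs, hashtags and so on) are left out.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Return (vocab, idf) with idf(t) = log(N / df(t)) over the training docs."""
    token_sets = [set(doc.lower().split()) for doc in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in token_sets for term in tokens)
    vocab = sorted(df)
    idf = {term: math.log(len(docs) / df[term]) for term in vocab}
    return vocab, idf

def tfidf(doc, vocab, idf):
    """Weight each vocabulary term by tf * idf; terms outside vocab are ignored."""
    tokens = doc.lower().split()
    if not tokens:
        return [0.0] * len(vocab)
    counts = Counter(tokens)
    return [counts[t] / len(tokens) * idf[t] for t in vocab]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Sub-gradient descent on the L2-regularised hinge loss; labels are +1 / -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Regularisation always shrinks w; a margin violation
            # additionally pulls w towards yi * xi.
            push = yi if margin < 1 else 0
            w = [wj - lr * (lam * wj - push * xj) for wj, xj in zip(w, xi)]
    return w

def predict(w, x):
    """Return +1 (hoax) for a positive score, otherwise -1 (not hoax)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

# Invented toy data: +1 = hoax, -1 = not hoax.
tweets = [
    "shocking secret cure share now",
    "share this shocking warning now",
    "city council approves new budget",
    "council publishes budget report today",
]
labels = [1, 1, -1, -1]

vocab, idf = fit_idf(tweets)
X = [tfidf(t, vocab, idf) for t in tweets]
w = train_linear_svm(X, labels)

new_tweet = "shocking cure share now"
print(predict(w, tfidf(new_tweet, vocab, idf)))
```

In practice a library implementation with a tuned kernel and regularisation strength would replace the hand-written trainer; the sketch only shows how term weights become the feature vector that the classifier separates.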
Pages: 6