Evaluating Machine Learning Algorithms for Fake News Detection

Cited by: 0
Authors
Gilda, Shlok [1]
Affiliations
[1] Pune Inst Comp Technol, Dept Comp Engn, Pune, Maharashtra, India
Source
PROCEEDINGS OF THE 2017 IEEE 15TH STUDENT CONFERENCE ON RESEARCH AND DEVELOPMENT (SCORED) | 2017
Keywords
Natural language processing; Machine learning; Classification algorithms; Fake-news detection
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper explores the application of natural language processing techniques to the detection of 'fake news', that is, misleading news stories that come from non-reputable sources. Using a dataset obtained from Signal Media and a list of sources from OpenSources.co, we apply term frequency-inverse document frequency (TF-IDF) of bi-grams and probabilistic context-free grammar (PCFG) detection to a corpus of about 11,000 articles. We test our dataset on multiple classification algorithms: Support Vector Machines, Stochastic Gradient Descent, Gradient Boosting, Bounded Decision Trees, and Random Forests. We find that TF-IDF of bi-grams fed into a Stochastic Gradient Descent model identifies non-credible sources with an accuracy of 77.2%, with PCFGs having slight effects on recall.
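As a rough illustration of the bi-gram TF-IDF weighting named in the abstract, the following is a minimal sketch in plain Python. The toy corpus, the function names `bigrams` and `tfidf_bigrams`, the whitespace tokenization, and the unsmoothed idf = ln(N / df) variant are all assumptions made for illustration; the abstract does not state which tokenizer or TF-IDF variant the paper used.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercase, split on whitespace, and return adjacent word pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def tfidf_bigrams(docs):
    """Return one {bigram: weight} dict per document.

    Term frequency is the raw count; idf = ln(N / df). This is one
    common variant (an assumption here); libraries often add smoothing.
    """
    counts = [Counter(bigrams(d)) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each bigram.
    df = Counter(bg for c in counts for bg in c)
    return [
        {bg: tf * math.log(n_docs / df[bg]) for bg, tf in c.items()}
        for c in counts
    ]


# Hypothetical toy corpus, for illustration only.
docs = [
    "the senate passed the bill",
    "the senate rejected the bill",
    "aliens passed the bill",
]
weights = tfidf_bigrams(docs)
```

Under this variant a bi-gram that occurs in every document, such as ("the", "bill") in the toy corpus, gets weight zero, while one confined to a single document gets the largest idf. The resulting sparse weight vectors are the kind of feature a linear classifier trained by stochastic gradient descent would consume.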
Pages: 110-115
Page count: 6