Detecting clickbaits using two-phase hybrid CNN-LSTM biterm model

Cited by: 21

Authors:

Kaur, Sawinder [1]

Kumar, Parteek [1]

Kumaraguru, Ponnurangam [2]

Affiliations:

[1] Thapar Inst Engn & Technol, Patiala, Punjab, India

[2] Indraprastha Inst Informat Technol, Delhi, India

Source:

Expert Systems with Applications | 2020 / Vol. 151

Keywords:

Clickbait; News; Classifier; Features; Social media;

DOI:

10.1016/j.eswa.2020.113350

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Clickbait refers to content intended to attract the attention of readers. It has become a nuisance to social media users. The purpose of clickbait is to put an appealing link in front of users. Clickbait headlines attract people and make them curious to read the linked content. The text on clickbait posts is too short to readily identify its features as clickbait. This paper proposes a novel approach, a two-phase hybrid CNN-LSTM Biterm model, for modeling short topic content. The hybrid CNN-LSTM model, when implemented with pre-trained GloVe embeddings, yields the best results on the accuracy, recall, precision, and F1-score metrics. The proposed model achieves precision values of 91.24%, 95.64%, and 95.87% on Dataset 1, Dataset 2, and Dataset 3, respectively. Eight types of clickbait (Reasoning, Number, Reaction, Revealing, Shocking/Unbelievable, Hypothesis/Guess, Questionable, and Forward referencing) are classified in this work using the Biterm Topic Model (BTM). The Shocking/Unbelievable, Hypothesis/Guess, and Reaction types are shown to be the most numerous among the clickbait headlines published online. A ground dataset of non-textual (image-based) data from multiple social media platforms has also been created in this paper. The textual information is extracted from the images using an OCR tool. A comparative study demonstrates the effectiveness of the proposed model, which helps identify the various categories of clickbait headlines spread on social media platforms. © 2020 Elsevier Ltd. All rights reserved.
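
The abstract names the Biterm Topic Model as its tool for short headline text. As a rough illustration only, and not the authors' implementation, the sketch below shows the basic unit such a model counts: a biterm, an unordered pair of words co-occurring in the same short text. The function name `extract_biterms`, the whitespace tokenization, and the two sample headlines are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text.

    Hypothetical helper: each pair is sorted so that ("a", "b") and
    ("b", "a") count as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Made-up sample headlines, split on whitespace (an assumed tokenization).
headlines = [
    "you will not believe this trick",
    "this trick will shock you",
]

# Corpus-level biterm counts; a biterm topic model is fitted on such
# corpus-wide co-occurrence patterns rather than per-document word counts.
biterm_counts = Counter()
for headline in headlines:
    biterm_counts.update(extract_biterms(headline.split()))
```

Pooling pairs across the whole corpus is what lets this family of models cope with very short texts, where a single headline contains too few words for reliable per-document statistics.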


Pages: 13
