Data Analysis of the Web News Headlines based on Natural Language Processing

Cited by: 0
Authors
Karna, Hrvoje [1, 2]
Braovic, Maja [3]
Vickovic, Linda [3]
Krstinic, Damir [3]
Affiliations
[1] Minist Def Republ Croatia, Zagreb, Croatia
[2] Univ Split, Split, Croatia
[3] Univ Split, Fac Elect Engn Mech Engn & Naval Architecture, Dept Elect & Comp, Split, Croatia
Keywords
data mining; information extraction; natural language processing; news portals; text analysis
DOI
10.24138/jcomss-2023-0047
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper explores the problem of media content data analysis, focusing on the phenomenon of vaccination, which is closely related to the COVID-19 pandemic. The presented research extends previous work but differs from it in two main respects. First, the text corpus submitted to analysis has been considerably enlarged. Second, the previous analysis was performed on the body of the posts, whereas this one focuses on the most prominent part of news posts, their headlines. The change from body to headline analysis was motivated by significant differences in their characteristics and by the fact that most people read only headlines. Data acquisition uses an advanced content collection approach and is followed by a modeling process in which a set of natural language processing algorithms is applied. To enable comparison, the modeling phase uses the same set of algorithms as the previous work. The main contributions of the work are: i) approaching the problem from a new perspective, ii) applying a more efficient method of data collection, and, crucially, iii) enabling the comparison of analysis results for individual parts of the content, which provides comprehensive insight into the characteristics of news posts.
Pages: 158-167
Number of pages: 10