Threshold-based Naive Bayes classifier

Cited by: 11

Authors:

Romano, Maurizio [1]

Contu, Giulia [1]

Mola, Francesco [1]

Conversano, Claudio [1]

Affiliations:

[1] Univ Cagliari, Dept Econ & Business Sci, Cagliari, Italy

Source:

ADVANCES IN DATA ANALYSIS AND CLASSIFICATION | 2024 / Vol. 18 / Issue 02

Keywords:

Naive Bayes; Booking.com; Customer satisfaction; Sentiment analysis; Natural language processing; Word of mouth; Online reviews; Quality

DOI:

10.1007/s11634-023-00536-8

Chinese Library Classification (CLC) number:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]

Subject classification codes:

020208; 070103; 0714

Abstract:

The Threshold-based Naive Bayes (Tb-NB) classifier is introduced as a simple, improved version of the original Naive Bayes classifier. Tb-NB extracts the sentiment from a natural language text corpus and allows the user not only to predict whether a sentence is positive or negative but also to quantify its sentiment with a numeric value. It is based on the estimation of a single threshold value that helps define a decision rule classifying a text as a positive or negative opinion based on its content. One of the main advantages of Tb-NB is the possibility of using its results as the input of post-hoc analyses aimed at observing how the quality associated with the different dimensions of a product or service (or, mirrored, the different dimensions of customer satisfaction) evolves over time or changes across locations. The effectiveness of Tb-NB is evaluated by analyzing data from the tourism industry, specifically hotel guests' reviews of all hotels located in the Sardinian region and available on Booking.com. Moreover, Tb-NB is compared with other popular classifiers used in sentiment analysis in terms of model accuracy, resistance to noise, and computational efficiency.
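The abstract does not give the formulas behind Tb-NB, so the following is only a minimal sketch of the general idea it describes: a Naive Bayes style log-odds score per text, compared against a single estimated threshold. It is not the authors' implementation. The function names (`train_log_odds`, `score`, `fit_threshold`, `predict`), the word-presence model, the Laplace smoothing parameter `alpha`, the toy reviews, and the choice of picking the threshold that maximizes training accuracy are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, word presence only.
    return set(text.lower().split())


def train_log_odds(docs, labels, alpha=1.0):
    """Per-word log-odds log P(w|pos) - log P(w|neg), Laplace-smoothed."""
    pos_docs = [tokenize(d) for d, y in zip(docs, labels) if y == 1]
    neg_docs = [tokenize(d) for d, y in zip(docs, labels) if y == 0]
    pos_counts = Counter(w for d in pos_docs for w in d)
    neg_counts = Counter(w for d in neg_docs for w in d)
    vocab = set(pos_counts) | set(neg_counts)
    weights = {}
    for w in vocab:
        p_pos = (pos_counts[w] + alpha) / (len(pos_docs) + 2 * alpha)
        p_neg = (neg_counts[w] + alpha) / (len(neg_docs) + 2 * alpha)
        weights[w] = math.log(p_pos) - math.log(p_neg)
    return weights


def score(text, weights):
    """Numeric sentiment value: sum of log-odds of the words present."""
    return sum(weights.get(w, 0.0) for w in tokenize(text))


def fit_threshold(docs, labels, weights):
    """Pick the single cut-off that maximizes accuracy on the given data."""
    scores = [score(d, weights) for d in docs]
    uniq = sorted(set(scores))
    # Candidate cut-offs: midpoints between consecutive scores, plus extremes.
    cuts = [uniq[0] - 1.0]
    cuts += [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    cuts += [uniq[-1] + 1.0]

    def accuracy(tau):
        hits = sum((s > tau) == (y == 1) for s, y in zip(scores, labels))
        return hits / len(labels)

    return max(cuts, key=accuracy)


def predict(text, weights, tau):
    """Decision rule: positive (1) if the score exceeds the threshold."""
    return 1 if score(text, weights) > tau else 0


# Hypothetical toy reviews (1 = positive, 0 = negative).
docs = [
    "great room friendly staff",
    "clean room great view",
    "dirty room rude staff",
    "noisy dirty bathroom",
]
labels = [1, 1, 0, 0]

weights = train_log_odds(docs, labels)
tau = fit_threshold(docs, labels, weights)
print(predict("great view", weights, tau), score("great view", weights))
print(predict("dirty bathroom", weights, tau), score("dirty bathroom", weights))
```

In this sketch the score doubles as the numeric sentiment value mentioned in the abstract, while the threshold `tau` turns it into a positive or negative label. In practice the threshold would more plausibly be estimated on held-out data or by cross-validation rather than on the training set.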


Pages: 325-361

Number of pages: 37
