Gender Classification Using Sentiment Analysis and Deep Learning in a Health Web Forum

Cited by: 23

Authors:

Park, Sunghee [1]

Woo, Jiyoung [1]

Affiliations:

[1] Soonchunhyang Univ, Dept Future Convergence Technol, Asan 31538, South Korea

Source:

APPLIED SCIENCES-BASEL | 2019 / Vol. 9 / Issue 6

Funding:

National Research Foundation of Singapore;

Keywords:

sentiment analysis; gender classification; machine learning; deep learning; medical web forum; INFORMATION;

DOI:

10.3390/app9061249

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Sentiment analysis is the most common text classification tool that analyzes incoming messages and tells whether the underlying sentiment is positive, negative, or neutral. We can use this technique to understand people by gender, especially people who are suffering from a sensitive disease. People use health-related web forums to easily access health information written by and for non-experts and also to get comfort from people who are in a similar situation. The government operates medical web forums to provide medical information, manage patients' needs and feelings, and boost information-sharing among patients. If we can classify people's emotional or information needs by gender, age, or location, it is possible to establish a detailed health policy tailored to patient segments. However, people with a sensitive illness such as AIDS tend to hide their information. In the case of sexually transmitted AIDS in particular, we can detect problems and needs according to gender. In this work, we present a gender detection model using sentiment analysis and machine learning, including deep learning. In our experiments, we found that sentiment features yield low accuracy, whereas senti-words give better results with SVM. Overall, traditional machine learning algorithms have a high misclassification rate for the female category. The deep learning algorithm overcomes this drawback with over 90% accuracy.


Pages: 12
