A feasibility study on identifying drinking-related contents in Facebook through mining heterogeneous data

Cited by: 5
Authors
ElTayeby, Omar [1]
Eaglin, Todd [1]
Abdullah, Malak [1]
Burlinson, David [1]
Dou, Wenwen [1]
Yao, Lixia [2]
Affiliations
[1] Univ N Carolina, Charlotte, NC USA
[2] Mayo Clin, Rochester, MN USA
Keywords
binge drinking; image classification; machine learning; social media; text mining; video classification; SOCIAL NETWORKING SITES; COLLEGE-STUDENTS; ALCOHOL REFERENCES; BINGE DRINKING; ADOLESCENTS; EXPOSURE; TRENDS; IMAGE;
DOI
10.1177/1460458218798084
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)];
Abstract
Binge drinking is a severe health problem faced by many US colleges and universities. College students often post drinking-related text and images on social media, portraying their alcohol use as socially desirable. In this project, we investigated the feasibility of mining heterogeneous data (e.g. text, images, and videos) on Facebook to identify drinking-related content. We manually annotated 4266 posts made between 21 October 2011 and 3 November 2014 in the "I'm Shmacked" group on Facebook, of which 511 were drinking-related. Our machine learning models show that, by combining heterogeneous data types, we were able to identify drinking-related posts with an F1-score of 0.81. Prediction models built on text data were more reliable than those built on image and video data for predicting drinking-related content. As the first step of our efforts in this direction, this feasibility study showed promise for mining social media to identify students who binge drink.
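The abstract reports an F1-score rather than accuracy, and the annotation counts suggest why: only 511 of 4266 posts are drinking-related. The following is a minimal sketch, not the authors' evaluation code, that uses only those two counts from the abstract. The "always negative" baseline and the helper names `f1_score` and `accuracy` are assumptions introduced for illustration.

```python
# Counts taken from the abstract; everything else below is illustrative.
TOTAL_POSTS = 4266
DRINKING_POSTS = 511


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all posts that were labeled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical baseline that labels every post "not drinking-related".
baseline_tp, baseline_fp = 0, 0
baseline_fn = DRINKING_POSTS
baseline_tn = TOTAL_POSTS - DRINKING_POSTS

prevalence = DRINKING_POSTS / TOTAL_POSTS
baseline_accuracy = accuracy(baseline_tp, baseline_fp, baseline_fn, baseline_tn)
baseline_f1 = f1_score(baseline_tp, baseline_fp, baseline_fn)

print(f"prevalence of drinking-related posts: {prevalence:.3f}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"baseline F1: {baseline_f1:.3f}")
```

Under these assumptions the trivial baseline labels most posts correctly yet has an F1 of zero for the drinking-related class, which is why F1 is the more informative measure on a dataset this imbalanced.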
Pages: 1756-1767
Page count: 12