Gaining Insights From Social Media Language: Methodologies and Challenges

Cited by: 133
Authors
Kern, Margaret L. [1]
Park, Gregory [2]
Eichstaedt, Johannes C. [2]
Schwartz, H. Andrew [3, 4]
Sap, Maarten [2]
Smith, Laura K. [2]
Ungar, Lyle H. [3]
Affiliations
[1] Univ Melbourne, Melbourne Grad Sch Educ, 100 Leicester St,Level 2, Parkville, Vic 3010, Australia
[2] Univ Penn, Dept Psychol, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[4] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Keywords
social media; linguistic analysis; interdisciplinary collaboration; online behavior; computational social science; LATENT SEMANTIC ANALYSIS; REGRESSION; PERSONALITY; ALGORITHM; SELECTION; NETWORK; SCIENCE; HEALTH; USERS; TEXT
DOI
10.1037/met0000091
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Language data available through social media provide opportunities to study people at an unprecedented scale. However, little guidance is available to psychologists who want to enter this area of research. Drawing on tools and techniques developed in natural language processing, we first introduce psychologists to social media language research, identifying descriptive and predictive analyses that language data allow. Second, we describe how raw language data can be accessed and quantified for inclusion in subsequent analyses, exploring personality as expressed on Facebook to illustrate. Third, we highlight challenges and issues to be considered, including accessing and processing the data, interpreting effects, and ethical issues. Social media has become a valuable part of social life, and there is much we can learn by bringing together the tools of computer science with the theories and insights of psychology.
Pages: 507-525
Page count: 19