Analysis of Web Browsing Data: A Guide

Cited by: 5
Authors
von Hohenberg, Bernhard Clemm [1, 10]
Stier, Sebastian [2, 8]
Cardenal, Ana S. [3]
Guess, Andrew M. [4, 5]
Menchen-Trevino, Ericka [6]
Wojcieszak, Magdalena [7, 9]
Affiliations
[1] GESIS Leibniz Inst Social Sci, Cologne, Germany
[2] GESIS Leibniz Inst Social Sci, Computat Social Sci Dept, Cologne, Germany
[3] Univ Oberta Catalunya, Barcelona, Spain
[4] Princeton Univ, Polit & Publ Affairs, Princeton, NJ USA
[5] Princeton Univ, Ctr Informat Technol Policy, Princeton, NJ USA
[6] Amer Univ, Washington, DC USA
[7] Univ Calif Davis, Davis, CA USA
[8] Univ Mannheim, Sch Social Sci, Mannheim, Germany
[9] Univ Amsterdam, Amsterdam Sch Commun Res, Amsterdam, Netherlands
[10] GESIS Leibniz Inst Social Sciences, Dept Computat Social Sci, D-50667 Cologne, Germany
Funding
European Research Council
Keywords
web browsing data; digital trace data; web tracking data; computational social science; ONLINE; NEWS;
DOI
10.1177/08944393241227868
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The use of individual-level browsing data, that is, the records of a person's visits to online content through a desktop or mobile browser, is of increasing importance for social scientists. Browsing data have characteristics that raise many questions for statistical analysis, yet to date, little hands-on guidance on how to handle them exists. Reviewing extant research, and exploring data sets collected by our four research teams spanning seven countries and several years, with over 14,000 participants and 360 million web visits, we derive recommendations along four steps: preprocessing the raw data; filtering out observations; classifying web visits; and modelling browsing behavior. The recommendations we formulate aim to foster best practices in the field, which so far has paid little attention to justifying the many decisions researchers need to take when analyzing web browsing data.
Pages: 1479-1504
Page count: 26
Related papers
  • [1] Quantifying the Systematic Bias in the Accessibility and Inaccessibility of Web Scraping Content From URL-Logged Web-Browsing Digital Trace Data
    Dahlke, Ross
    Kumar, Deepak
    Durumeric, Zakir
    Hancock, Jeffrey T.
    SOCIAL SCIENCE COMPUTER REVIEW, 2023,
  • [2] Clustering Web Users Based on Browsing Behavior
    Zhu, Tingshao
    ACTIVE MEDIA TECHNOLOGY, 2010, 6335 : 530 - 537
  • [3] Practical Guide to Chemometric Analysis of Optical Spectroscopic Data
    Lackey, Hope E.
    Sell, Rachel L.
    Nelson, Gilbert L.
    Bryan, Thomas A.
    Lines, Amanda M.
    Bryan, Samuel A.
    JOURNAL OF CHEMICAL EDUCATION, 2023, 100 (07) : 2608 - 2626
  • [4] New Tab Page Recommendations Strongly Concentrate Web Browsing to Familiar Sources
    Bharadhwaj, Homanga
    Srivastava, Nisheeth
    PROCEEDINGS OF THE 11TH ACM CONFERENCE ON WEB SCIENCE (WEBSCI'19), 2019, : 7 - 16
  • [5] Apparel product attributes, web browsing, and e-impulse buying on shopping websites
    Park, Eun Joo
    Kim, Eun Young
    Funches, Venessa Martin
    Foxx, William
    JOURNAL OF BUSINESS RESEARCH, 2012, 65 (11) : 1583 - 1589
  • [6] Browsing the Web for School: Social Inequality in Adolescents' School-Related Use of the Internet
    Weber, Maximilian
    Becker, Birgit
    SAGE OPEN, 2019, 9 (02):
  • [7] Communicative Approaches to Big Data: A Systematic Analysis of Web of Science Publications
    Kiyan, Zafer
    Torenli, Nurcan
    CONNECTIST-ISTANBUL UNIVERSITY JOURNAL OF COMMUNICATION SCIENCES, 2020, (58) : 241 - 272
  • [8] Dynamics of hotel website browsing activity: the power of informatics and data analytics
    Chan, Irene Cheng Chu
    Ma, Jing
    Law, Rob
    Buhalis, Dimitrios
    Hatter, Richard
    INDUSTRIAL MANAGEMENT & DATA SYSTEMS, 2021, 121 (06) : 1398 - 1416
  • [9] A Technical Guide to Designing and Implementing Effective Web Surveys
    Baatard, Greg
    PROCEEDINGS OF THE 11TH EUROPEAN CONFERENCE ON RESEARCH METHODS, 2012, : 48 - 54
  • [10] Effects of Web Interactivity: A Meta-Analysis
    Yang, Fan
    Shen, Fuyuan
    COMMUNICATION RESEARCH, 2018, 45 (05) : 635 - 658