Analysis of Web Browsing Data: A Guide

Cited by: 5
Authors
von Hohenberg, Bernhard Clemm [1, 10]
Stier, Sebastian [2, 8]
Cardenal, Ana S. [3]
Guess, Andrew M. [4, 5]
Menchen-Trevino, Ericka [6]
Wojcieszak, Magdalena [7, 9]
Affiliations
[1] GESIS Leibniz Inst Social Sci, Cologne, Germany
[2] GESIS Leibniz Inst Social Sci, Computat Social Sci Dept, Cologne, Germany
[3] Univ Oberta Catalunya, Barcelona, Spain
[4] Princeton Univ, Polit & Publ Affairs, Princeton, NJ USA
[5] Princeton Univ, Ctr Informat Technol Policy, Princeton, NJ USA
[6] Amer Univ, Washington, DC USA
[7] Univ Calif Davis, Davis, CA USA
[8] Univ Mannheim, Sch Social Sci, Mannheim, Germany
[9] Univ Amsterdam, Amsterdam Sch Commun Res, Amsterdam, Netherlands
[10] GESIS Leibniz Inst Social Sciences, Dept Computat Social Sci, D-50667 Cologne, Germany
Funding
European Research Council
Keywords
web browsing data; digital trace data; web tracking data; computational social science; ONLINE; NEWS
DOI
10.1177/08944393241227868
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The use of individual-level browsing data, that is, the records of a person's visits to online content through a desktop or mobile browser, is of increasing importance for social scientists. Browsing data have characteristics that raise many questions for statistical analysis, yet to date, little hands-on guidance on how to handle them exists. Reviewing extant research, and exploring data sets collected by our four research teams spanning seven countries and several years, with over 14,000 participants and 360 million web visits, we derive recommendations along four steps: preprocessing the raw data; filtering out observations; classifying web visits; and modelling browsing behavior. The recommendations we formulate aim to foster best practices in the field, which so far has paid little attention to justifying the many decisions researchers need to take when analyzing web browsing data.
Pages: 1479-1504
Page count: 26