Efficient on-the-fly Web bot detection

Cited by: 14
Authors
Suchacka, Grażyna [1]
Cabri, Alberto [2, 3, 4]
Rovetta, Stefano [2, 3, 4]
Masulli, Francesco [2, 3, 4]
Affiliations
[1] Univ Opole, Inst Informat, Opole, Poland
[2] Univ Genoa, Dept Informat Bioengn Robot & Syst Engn DIBRIS, Genoa, Italy
[3] Vega Res Labs Srl, Genoa, Italy
[4] Italian Ist Nazl Alta Matemat Francesco Severi, Natl Grp Sci Comp GNCS INdAM, Rome, Italy
Keywords
Web bot; Internet robot; Real-time bot detection; Machine learning; Sequential analysis; Neural network; Early decision; ROBOT DETECTION; NEURAL-NETWORK
DOI
10.1016/j.knosys.2021.107074
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A large fraction of traffic on present-day Web servers is generated by bots - intelligent agents able to traverse the Web and execute various advanced tasks. Since bots' activity may raise concerns about server security and performance, many studies have investigated traffic features that discriminate bots from human visitors and have developed methods for automated traffic classification. Very few previous works, however, aim at identifying bots on-the-fly, that is, at classifying active sessions as early as possible. This paper proposes a novel method for binary classification of streams of Web server requests in order to label each active session as "bot" or "human". A machine learning approach has been developed to discover traffic patterns from historical usage data. The model, built on a neural network, is used to classify each incoming HTTP request. A sequential probabilistic analysis approach is then applied to capture relationships between subsequent HTTP requests in an ongoing session and to assess, as soon as possible, the likelihood of the session being generated by a bot or a human. A performance evaluation study with real server traffic data confirmed the effectiveness of the proposed classifier in discriminating bots from humans at early stages of their visits, leaving very few sessions undecided, with a very low number of false positives. (C) 2021 The Author(s). Published by Elsevier B.V.
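The two-stage idea in the abstract (a per-request classifier followed by a sequential decision over the session) can be illustrated with a minimal sketch. The abstract does not specify the sequential procedure, so the code below swaps in Wald's sequential probability ratio test (SPRT) as one plausible instance; it is not presented as the authors' method. The function name `classify_session`, the parameters `alpha`, `beta`, and `eps`, and the treatment of each per-request score as an independent bot probability with equal priors are all assumptions made for illustration.

```python
import math


def classify_session(bot_probs, alpha=0.01, beta=0.01, eps=1e-6):
    """Decide "bot" / "human" / "undecided" from per-request bot probabilities.

    Hypothetical sketch using Wald's SPRT (an assumption, not the paper's
    confirmed procedure). bot_probs holds the per-request scores that a
    request-level classifier (e.g. a neural network) would produce.
    alpha: target rate of labelling a human session as "bot".
    beta:  target rate of labelling a bot session as "human".
    Returns (label, number_of_requests_consumed).
    """
    upper = math.log((1 - beta) / alpha)  # cross this: decide "bot"
    lower = math.log(beta / (1 - alpha))  # cross this: decide "human"
    llr = 0.0  # accumulated log-likelihood ratio, bot vs. human
    n = 0
    for p in bot_probs:
        n += 1
        # Clamp to avoid log(0) when the classifier is fully confident.
        p = min(max(p, eps), 1 - eps)
        # Assumption: requests are independent and priors are equal, so the
        # log-odds of each score acts as that request's evidence increment.
        llr += math.log(p / (1 - p))
        if llr >= upper:
            return "bot", n
        if llr <= lower:
            return "human", n
    # The session ended (or is still running) without enough evidence.
    return "undecided", n


if __name__ == "__main__":
    print(classify_session([0.9, 0.9, 0.9, 0.9]))  # consistent bot-like scores
    print(classify_session([0.6, 0.4, 0.55, 0.5]))  # ambiguous scores
```

Tightening `alpha` moves the "bot" threshold further away, which trades later decisions for fewer false positives; sessions that never cross either threshold remain undecided, matching the three outcomes the abstract mentions.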
Pages: 16
Related papers
50 in total
  • [1] Efficient on-the-fly Web bot detection
    Suchacka, Grażyna
    Cabri, Alberto
    Rovetta, Stefano
    Masulli, Francesco
    Knowledge-Based Systems, 2021, 223
  • [2] On-the-fly intrusion detection for Web Portals
    Sion, R
    Atallah, M
    Prabhakar, S
    ITCC 2003: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: COMPUTERS AND COMMUNICATIONS, PROCEEDINGS, 2003, : 325 - 330
  • [3] On-the-fly tool detection
    DeGaspari, J
    MECHANICAL ENGINEERING, 2000, 122 (10) : 36 - 36
  • [4] An efficient on-the-fly cycle collection
    Paz, H
    Petrank, E
    Bacon, DF
    Kolodner, EK
    Rajan, VT
    COMPILER CONSTRUCTION, PROCEEDINGS, 2005, 3443 : 156 - 171
  • [5] An efficient on-the-fly cycle collection
    Paz, Harel
    Bacon, David F.
    Kolodner, Elliot K.
    Petrank, Erez
    Rajan, V. T.
    ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 2007, 29 (04):
  • [6] An efficient on-the-fly cycle collection
    Technion, Israel Institute of Technology, Haifa, Israel
    Unknown
    Unknown
    Unknown
    Unknown
    Unknown
    1600, (August 1, 2007):
  • [7] An Approach to Testing Web Applications On-The-Fly
    Li, Liping
    Qian, Zhongsheng
    He, Tao
    ICMECG: 2009 INTERNATIONAL CONFERENCE ON MANAGEMENT OF E-COMMERCE AND E-GOVERNMENT, PROCEEDINGS, 2009, : 428 - +
  • [8] ON-THE-FLY DETECTION OF ACCESS ANOMALIES
    SCHONBERG, E
    SIGPLAN NOTICES, 1989, 24 (07): : 285 - 297
  • [9] On-the-fly detection of access anomalies
    Schonberg, E
    ACM SIGPLAN NOTICES, 2004, 39 (04) : 315 - 327
  • [10] Efficient on-the-fly data race detection in multithreaded C++ programs
    Pozniansky, E
    Schuster, A
    ACM SIGPLAN NOTICES, 2003, 38 (10) : 178 - 189