Using Bandits for Effective Database Activity Monitoring

Cited by: 4

Authors:

Grushka-Cohen, Hagit [1]

Biller, Ofer [2]

Sofer, Oded [2]

Rokach, Lior [1]

Shapira, Bracha [1]

Affiliations:

[1] Ben-Gurion University of the Negev, Beer-Sheva, Israel

[2] IBM Security Division, New York, NY, USA

Source:

Advances in Knowledge Discovery and Data Mining, PAKDD 2020, Part II | 2020 / Vol. 12085

Keywords:

Multi-armed bandit; Database activity monitoring; Filter bubble; Sampling

DOI:

10.1007/978-3-030-47436-2_53

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Database activity monitoring systems aim to protect organizational data by logging users' activity to identify and document malicious activity. High-velocity streams and operating costs restrict these systems to examining only a sample of the activity. Current solutions use manual policies to decide which transactions to monitor. This limits the diversity of the data collected, creating a "filter bubble" that over-represents specific subsets of the data, such as high-risk users, and under-represents the rest of the population, which may never be sampled. In recommendation systems, bandit algorithms have recently been used to address this problem. We propose treating the sampling problem in database activity monitoring as a recommender system problem. In this work, we redefine the data sampling problem as a special case of the multi-armed bandit problem and present a novel algorithm, C-epsilon-Greedy, which combines expert knowledge with random exploration. We analyze the effect of diversity on coverage and downstream event detection using simulated data. We find that adding diversity to the sampling using the bandit-based approach works well for this task: it maximizes population coverage without decreasing the quality of the alerts issued about events, and it outperforms policies manually crafted by experts as well as other sampling methods.
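The abstract does not spell out how C-epsilon-Greedy works, so the following is only a minimal sketch of the general idea it names: plain epsilon-greedy sampling in which the "exploit" step follows an expert-assigned risk score and the "explore" step picks a user uniformly at random. It is not the authors' algorithm. The function names, the `risk_scores` dictionary, the per-step `budget`, and the synthetic population are all hypothetical and chosen for illustration.

```python
import random


def epsilon_greedy_sample(risk_scores, budget, epsilon, rng):
    """Pick `budget` distinct users to monitor in one time step.

    Each slot is filled either by exploring (a uniformly random user not yet
    chosen, with probability `epsilon`) or by exploiting (the not-yet-chosen
    user with the highest expert-assigned risk score).
    This is generic epsilon-greedy, not the paper's C-epsilon-Greedy.
    """
    remaining = set(risk_scores)
    chosen = []
    for _ in range(min(budget, len(remaining))):
        candidates = sorted(remaining)  # sorted for reproducibility
        if rng.random() < epsilon:
            user = rng.choice(candidates)  # explore
        else:
            user = max(candidates, key=lambda u: risk_scores[u])  # exploit
        chosen.append(user)
        remaining.remove(user)
    return chosen


def coverage(risk_scores, budget, epsilon, steps, seed=0):
    """Fraction of the population monitored at least once over `steps`."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(steps):
        seen.update(epsilon_greedy_sample(risk_scores, budget, epsilon, rng))
    return len(seen) / len(risk_scores)


# Hypothetical population: 100 users with distinct, static risk scores.
risk = {f"user{i:03d}": i / 100 for i in range(100)}

greedy_cov = coverage(risk, budget=5, epsilon=0.0, steps=50)
mixed_cov = coverage(risk, budget=5, epsilon=0.2, steps=50)
print(f"risk-only policy coverage:  {greedy_cov:.2f}")
print(f"with random exploration:    {mixed_cov:.2f}")
```

With `epsilon=0.0` the sampler monitors the same five highest-risk users at every step, which is the "filter bubble" the abstract describes; any positive `epsilon` lets lower-risk users enter the sample over time. Static risk scores are a simplifying assumption made here; a real monitoring system would presumably update them as activity is observed.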


Pages: 701-713

Number of pages: 13
