Extraction and Prediction of User Communication Behaviors From DNS Query Logs Based on Nonnegative Tensor Factorization

Cited by: 1
Authors
Hatanaka, Kotaro [1, 2]
Kimura, Tatsuaki [1 ]
Komai, Yuka [3, 4]
Ishibashi, Keisuke [5 ]
Kobayashi, Masahiro [6 ]
Harada, Shigeaki [6 ]
Affiliations
[1] Osaka Univ, Dept Informat & Commun Technol, Osaka 5650871, Japan
[2] NTT DOCOMO, Customer Success Dept 1, Tokyo 1006150, Japan
[3] NTT Corp, NTT Network Technol Labs, Tokyo 1818585, Japan
[4] Nippon Telegraph & Tel West Corp, Business & Mkt Div, Osaka 5340024, Japan
[5] Int Christian Univ, Coll Liberal Arts, Div Arts & Sci, Tokyo 1818585, Japan
[6] NTT Corp, NTT Network Serv Syst Labs, Tokyo 1818585, Japan
Source
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT | 2023, Vol. 20, No. 3
Keywords
DNS; user communication behavior; nonnegative tensor factorization; autoregressive model;
DOI
10.1109/TNSM.2023.3238858
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Owing to the critical role of the domain name system (DNS), its query log data are utilized for various network monitoring purposes. With the diversification of network services, these data have become increasingly complex, making mining useful information challenging. DNS query log data can be considered as the superposition of two types of communication patterns: groups of domains accessed simultaneously (e.g., ad servers and content delivery network (CDN) servers) and time-series access patterns based on user behavior characteristics (e.g., access trends during the night). However, previous studies have not focused on extracting both access patterns hidden in the data. This study proposes a method that extracts both patterns of accessed domains and temporal access patterns as user communication behaviors from DNS query log data and predicts future accesses based on these patterns. The proposed method first aggregates similar fully qualified domain names (FQDNs) associated with the same service. We then present temporal regularized nonnegative tensor factorization (TR-NTF) that extracts both access patterns from a third-order tensor expressing DNS query log data and enables prediction. We evaluate the proposed method using synthetic and actual data and demonstrate that it successfully extracts hidden communication patterns and achieves sufficient prediction accuracy.
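The factorization step described in the abstract can be illustrated with a minimal sketch. This is not the paper's TR-NTF: it is a plain nonnegative CP factorization with multiplicative updates, followed by a separate AR(1) fit on the temporal factors, whereas the abstract indicates that TR-NTF builds temporal regularization into the factorization itself. The tensor layout (user x domain group x time) and all function names (ntf_cp, ar1_forecast, forecast_next_slice) are assumptions made for illustration only.

```python
import numpy as np

def ntf_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain nonnegative CP factorization via multiplicative updates.

    X is assumed to be a nonnegative tensor of shape
    (users, domain groups, time steps); this layout is hypothetical.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank)) + 0.1  # user loadings
    B = rng.random((J, rank)) + 0.1  # domain-group loadings
    C = rng.random((K, rank)) + 0.1  # temporal loadings
    for _ in range(n_iter):
        # Each update keeps the factor nonnegative and does not
        # increase the squared reconstruction error.
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C

def reconstruct(A, B, C):
    """Rebuild the tensor from its CP factors."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def rel_error(X, A, B, C):
    """Relative Frobenius reconstruction error."""
    return np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)

def ar1_forecast(C):
    """Fit c[t] ~ phi * c[t-1] per component by least squares
    and return the one-step-ahead temporal loadings."""
    prev, nxt = C[:-1], C[1:]
    phi = (prev * nxt).sum(axis=0) / ((prev ** 2).sum(axis=0) + 1e-12)
    return phi * C[-1]

def forecast_next_slice(A, B, C):
    """Predicted (users x domain groups) access counts for the next time step."""
    return np.einsum('ir,jr,r->ij', A, B, ar1_forecast(C))

if __name__ == "__main__":
    # Synthetic rank-2 tensor: 6 users, 5 domain groups, 24 time steps.
    rng = np.random.default_rng(1)
    X = reconstruct(rng.random((6, 2)), rng.random((5, 2)), rng.random((24, 2)))
    A, B, C = ntf_cp(X, rank=2)
    print("relative error:", rel_error(X, A, B, C))
    print("forecast shape:", forecast_next_slice(A, B, C).shape)
```

In this sketch the columns of B play the role of groups of domains accessed together, and the columns of C play the role of time-series access patterns. Fitting the autoregressive model after the factorization, rather than jointly, is a simplification chosen to keep the example short.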
Pages: 2611-2624
Page count: 14