Intercepting Hail Hydra: Real-time detection of Algorithmically Generated Domains

Cited by: 20
Authors
Casino, Fran [1, 2]
Lykousas, Nikolaos [1 ]
Homoliak, Ivan [3 ]
Patsakis, Constantinos [1, 2]
Hernandez-Castro, Julio [4 ]
Affiliations
[1] Univ Piraeus, Dept Informat, 80 Karaoli & Dimitriou Str, Piraeus 18534, Greece
[2] Athena Res Ctr, Informat Management Syst Inst, Artemidos 6, Maroussi 15125, Greece
[3] Brno Univ Technol, Fac Informat Technol, Brno, Czech Republic
[4] Univ Kent, Sch Comp, Canterbury, Kent, England
Funding
EU Horizon 2020;
Keywords
Malware; Domain Generation Algorithms; Botnets; DNS; Algorithmically Generated Domain; Ensembles;
DOI
10.1016/j.jnca.2021.103135
CLC number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
A crucial technical challenge for cybercriminals is to keep control over the potentially millions of infected devices that make up their botnets, without compromising the robustness of their attacks. A single, fixed C&C server, for example, can be trivially detected by either binary or traffic analysis and immediately sinkholed or taken down by security researchers or law enforcement. Botnets often use Domain Generation Algorithms (DGAs), primarily to evade take-down attempts. DGAs can extend the lifespan of a malware campaign, thus potentially enhancing its profitability. They can also hinder attack accountability. In this work, we introduce HYDRAS, the most comprehensive and representative dataset of Algorithmically Generated Domains (AGDs) available to date. The dataset contains more than 100 DGA families, including both real-world and adversarially designed ones. We analyse the dataset and discuss the possibility of differentiating between benign requests (to real domains) and malicious ones (to AGDs) in real time. The simultaneous study of so many families and variants introduces several challenges; nonetheless, it alleviates biases found in previous literature, which employs small datasets that are frequently overfitted by exploiting characteristic features of particular families that do not generalise well. We thoroughly compare our approach with the current state of the art and highlight some methodological shortcomings in the current state of practice. The results show that our proposed approach significantly outperforms the current state of the art in terms of both classification performance and efficiency.
Pages: 17