A progressive learning method on unknown protocol behaviors

Cited by: 3
Authors
Sun, Fanghui [1]
Wang, Shen [1]
Zhang, Hongli [1]
Affiliations
[1] Harbin Inst Technol, Sch Cyberspace Sci, Fac Comp, Harbin 150001, Peoples R China
Keywords
Protocol reverse engineering; State machine learning; Finite state transducer; State explosion problem; Message format; Inference
DOI
10.1016/j.jnca.2021.103249
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Reverse analysis of unknown protocol behaviors remains a hard problem in Protocol Reverse Engineering (PRE), which infers the specifications of unknown protocols from observable information, especially when only transmitted messages are available. This paper proposes a novel protocol state machine model, the Stochastic Protocol finite-state Transducer (SPT), which describes the message interaction rules between communicating terminals probabilistically in order to simulate the behavior rules of an unknown protocol in a given implementation. Together with a method for recognizing and compensating state-related fields, a progressive SPT learning algorithm for unknown protocols, named Sptia-PL, is designed and implemented to reconstruct the SPT of a target protocol with the ability to predict succeeding behaviors. By updating the SPT progressively, the proposed method learns continuously in linear time and keeps the established model in optimal condition throughout the learning process. This strategy avoids the state explosion problem found in most state machine learning methods in PRE. Experiments on two public and three locally collected datasets of FTP, SMTP and POP3 demonstrate the soundness of the SPT model and the effectiveness of the Sptia-PL algorithm, with an average Accuracy above 0.94 and a Coverage close to 0.99. With its low O(N) computing cost and the high confidence of its results, the method significantly outperforms all known state-of-the-art methods.
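The idea of a probabilistic transducer that is updated one exchange at a time can be illustrated with a small sketch. This is not the authors' SPT definition or the Sptia-PL algorithm, whose details the abstract does not give. The class name `StochasticTransducer`, its methods, and the rule that the last response label stands in for the protocol state are all assumptions made for illustration. Each observed request/response pair costs one counter update, so a pass over N pairs takes O(N) time.

```python
from collections import Counter, defaultdict


class StochasticTransducer:
    """Toy stochastic finite-state transducer learned one exchange at a time."""

    START = "<start>"

    def __init__(self):
        # counts[state][request] -> Counter of (response, next_state) pairs
        self.counts = defaultdict(lambda: defaultdict(Counter))

    def update(self, session):
        """Fold one session, a list of (request, response) pairs, into the model."""
        state = self.START
        for request, response in session:
            # Assumption: the last response label stands in for the protocol state.
            next_state = response
            self.counts[state][request][(response, next_state)] += 1
            state = next_state

    def predict(self, state, request):
        """Return (response, next_state, probability) for the likeliest transition."""
        outcomes = self.counts.get(state, {}).get(request)
        if not outcomes:
            return None  # this (state, request) pair has never been observed
        (response, next_state), n = outcomes.most_common(1)[0]
        return response, next_state, n / sum(outcomes.values())


# Hypothetical FTP-like sessions, written as (command, reply code) pairs.
model = StochasticTransducer()
model.update([("USER", "331"), ("PASS", "230"), ("QUIT", "221")])
model.update([("USER", "331"), ("PASS", "530")])
model.update([("USER", "331"), ("PASS", "230")])
print(model.predict("331", "PASS"))
```

In this toy run, `predict("331", "PASS")` picks reply `230` because it follows `PASS` in two of the three sessions. New sessions can be folded in at any time with `update`, without relearning from scratch.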
Pages: 14