A progressive learning method on unknown protocol behaviors

Cited by: 3

Authors:

Sun, Fanghui [1]

Wang, Shen [1]

Zhang, Hongli [1]

Affiliations:

[1] Harbin Inst Technol, Sch Cyberspace Sci, Fac Comp, Harbin 150001, Peoples R China

Source:

JOURNAL OF NETWORK AND COMPUTER APPLICATIONS | 2022 / Volume 197

Keywords:

Protocol reverse engineering; State machine learning; Finite state transducer; State explosion problem; MESSAGE FORMAT; INFERENCE;

DOI:

10.1016/j.jnca.2021.103249

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Reverse analysis of unknown protocol behaviors remains a hard problem in Protocol Reverse Engineering (PRE), which infers the specifications of unknown protocols from observable information, especially when only the transmitted messages are available. This paper proposes a novel protocol state machine model, the Stochastic Protocol finite-state Transducer (SPT), which describes the message interaction rules between communicating terminals probabilistically in order to simulate the behavior rules of an unknown protocol in a particular implementation. Together with a method for recognizing and compensating state-related fields, a progressive SPT learning algorithm for unknown protocols, named Sptia-PL, is designed and implemented to reconstruct the SPT of a target protocol and to predict its subsequent behaviors. By updating the SPT progressively, the proposed method learns continuously in linear time and keeps the established model in optimal condition throughout the learning process. This strategy entirely avoids the state explosion problem that affects most state machine learning methods in PRE. Experiments on two public and three locally collected datasets of FTP, SMTP and POP3 support the soundness of the SPT model and the effectiveness of the Sptia-PL algorithm, with an average Accuracy above 0.94 and a Coverage close to 0.99. The low O(N) computing cost of the method and the high confidence of its results significantly outperform all known state-of-the-art methods.
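The idea of a probabilistic state machine that is updated one observation at a time can be illustrated with a toy sketch. This is not the SPT model or the Sptia-PL algorithm from the paper, whose details the abstract does not give. The class name `ToyStochasticTransducer`, its methods, and the choice to treat the last response code as the current state are all hypothetical simplifications made for illustration only.

```python
from collections import defaultdict


class ToyStochasticTransducer:
    """Toy probabilistic transducer (hypothetical, not the paper's SPT)."""

    def __init__(self, start="START"):
        self.start = start
        # counts[state][(request, response)] -> number of times observed
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, session):
        """Fold one session, a list of (request, response) pairs, into the counts.

        Each pair is visited once, so the cost is linear in the session length.
        """
        state = self.start
        for request, response in session:
            self.counts[state][(request, response)] += 1
            # Assumption: the response code alone identifies the next state.
            state = response

    def probability(self, state, request, response):
        """Relative frequency of (request, response) among transitions leaving state."""
        total = sum(self.counts[state].values())
        if total == 0:
            return 0.0
        return self.counts[state][(request, response)] / total

    def predict(self, state, request):
        """Most frequently observed response to request in state, or None if unseen."""
        candidates = {
            resp: n
            for (req, resp), n in self.counts[state].items()
            if req == request
        }
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


# Made-up FTP-like sessions; the model is refined after each one.
model = ToyStochasticTransducer()
model.update([("USER", "331"), ("PASS", "230"), ("QUIT", "221")])
model.update([("USER", "331"), ("PASS", "530")])
model.update([("USER", "331"), ("PASS", "230")])

print(model.predict("331", "PASS"))
print(model.probability("331", "PASS", "230"))
```

Because each call to `update` only increments counters along one session, the model can keep learning as new traffic arrives without rebuilding anything, which is the general flavor of progressive learning the abstract describes. How the paper actually identifies states and avoids state explosion is not recoverable from the abstract alone.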


Pages: 14

