Capturing Uncertainty Information and Categorical Characteristics for Network Payload Grouping in Protocol Reverse Engineering

Cited by: 13
Authors
Luo, Jian-Zhen [1, 2]
Yu, Shun-Zheng [1 ]
Cai, Jun [2 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510275, Guangdong, Peoples R China
[2] Guangdong Polytech Normal Univ, Sch Elect & Informat, Guangzhou 510665, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Internet protocols - Extraction - Clustering algorithms - Rough set theory;
DOI
10.1155/2015/962974
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
As a promising tool for recovering the specifications of unknown protocols, protocol reverse engineering has drawn increasing research attention over the last decade. A critical task in protocol reverse engineering is extracting protocol keywords from network traces. Since messages of different types have different sets of protocol keywords, an effective way to improve the accuracy of protocol keyword extraction is to cluster the network payloads of unknown traffic and analyze each cluster separately to extract its protocol keywords. Although classic algorithms such as K-means and EM can be used for network payload clustering, the quality of the resulting traffic clusters is far from satisfactory when these algorithms are applied to application layer traffic with categorical attributes. In this paper, we propose a novel method that improves the accuracy of protocol reverse engineering by applying a rough set-based technique to cluster application layer traffic. This technique analyzes multidimensional uncertain information in multiple categorical attributes based on rough set theory to cluster network payloads, and applies the Minimum Description Length criterion to determine the optimal number of clusters. The experiments show that our method outperforms the existing algorithms and improves the results of protocol keyword extraction.
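As an illustration of the rough set notions the abstract relies on, the following minimal sketch computes the lower and upper approximations of a set of messages under the indiscernibility relation induced by categorical attributes, along with the resulting roughness. It is not the authors' clustering algorithm. The attribute names (`first_token`, `length`), the toy payloads, and all function names are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def equivalence_classes(rows, attrs):
    """Group row indices that are indiscernible on the given attributes."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())


def approximations(rows, attrs, target):
    """Return the (lower, upper) approximations of `target`, a set of row indices."""
    lower, upper = set(), set()
    for cls in equivalence_classes(rows, attrs):
        if cls <= target:   # class lies entirely inside the target set
            lower |= cls
        if cls & target:    # class overlaps the target set
            upper |= cls
    return lower, upper


def roughness(rows, attrs, target):
    """Roughness = 1 - |lower| / |upper|; 0.0 means the set is exactly definable."""
    lower, upper = approximations(rows, attrs, target)
    return 1.0 - len(lower) / len(upper) if upper else 0.0


# Hypothetical toy payloads described by two categorical attributes.
payloads = [
    {"first_token": "GET", "length": "short"},
    {"first_token": "GET", "length": "short"},
    {"first_token": "POST", "length": "long"},
    {"first_token": "GET", "length": "long"},
    {"first_token": "POST", "length": "long"},
]
target = {0, 1, 3}  # indices of the GET messages

lower, upper = approximations(payloads, ["length"], target)
```

In this toy data, `length` alone cannot separate the GET messages from the rest, so the lower and upper approximations differ and the roughness is positive, whereas `first_token` defines the target set exactly and gives a roughness of zero. A rough set-based clustering method can use measures of this kind to decide which categorical attribute to split on.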
Pages: 9