A Clustering Algorithm for Binary Protocol Data Frames Based on Principal Component Analysis and Density Peaks Clustering

Cited by: 0
Authors
Yan, Xiaoyong [1]
Li, Qing [1]
Tao, Siyu [1]
Affiliations
[1] Natl Digital Switching Syst Engn & Technol Res Ct, Zhengzhou, Henan, Peoples R China
Source
2017 17TH IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT 2017) | 2017
Funding
National Natural Science Foundation of China;
Keywords
protocol identification; binary protocol; principal component analysis; density peaks clustering; frames clustering;
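DOI
Not available
Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808 ; 0809 ;
Abstract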
Binary protocols lack session flow characteristics, and extracting their frequent patterns is difficult. To identify binary protocol data frames, an unsupervised clustering algorithm based on improved principal component analysis (PCA) and improved density peaks clustering (DPC) is proposed. We improve PCA by determining its dimensionality based on information gain, so that it removes redundant information while retaining the characteristics of the original data. We improve DPC with distance index weighting, so that it selects cluster centers automatically and sharpens the distinction between cluster centers and other data frames. Experimental results show that the proposed algorithm clusters binary protocol data frames effectively.
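The PCA-then-DPC pipeline named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not define the information-gain criterion or the distance index weighting, so the sketch substitutes a cumulative explained-variance threshold for the dimensionality choice and uses the standard density peaks procedure with a caller-supplied number of clusters. The function names `pca_reduce` and `density_peaks`, the parameters `var_kept` and `dc_percentile`, and the synthetic 16-byte frames are all assumptions made for illustration.

```python
import numpy as np

def pca_reduce(X, var_kept=0.95):
    """Project the rows of X onto the leading principal components.

    Assumption: keep the smallest number of components whose cumulative
    explained variance reaches var_kept. This stands in for the paper's
    information-gain criterion, which the abstract does not define.
    """
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; squared singular values are
    # proportional to the variance along each component.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
    return Xc @ Vt[:k].T

def density_peaks(X, n_clusters, dc_percentile=2.0):
    """Standard density peaks clustering (no distance index weighting).

    Assumption: n_clusters is supplied by the caller; the paper's
    automatic center selection is not reproduced here.
    """
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Cutoff distance: a low percentile of all pairwise distances.
    dc = max(np.percentile(d[np.triu_indices(n, k=1)], dc_percentile),
             np.finfo(float).eps)
    # Gaussian-kernel local density, excluding each point's self term.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # indices sorted by decreasing density
    delta = np.zeros(n)
    nearest_denser = np.full(n, -1)
    # The densest point has no denser neighbor; use its largest distance.
    delta[order[0]] = d[order[0]].max()
    for pos in range(1, n):
        i = order[pos]
        denser = order[:pos]
        j = denser[np.argmin(d[i, denser])]
        delta[i] = d[i, j]
        nearest_denser[i] = j
    # Centers are the points with the largest density-distance product.
    centers = np.argsort(-(rho * delta))[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Each remaining point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels, centers

# Synthetic stand-in data: two groups of 16-byte frames that share a
# fixed 4-byte header within each group and have small random payloads.
rng = np.random.default_rng(0)
a = np.hstack([np.tile([0x10, 0x20, 0x30, 0x40], (30, 1)),
               rng.integers(0, 8, (30, 12))])
b = np.hstack([np.tile([0xA0, 0xB0, 0xC0, 0xD0], (30, 1)),
               rng.integers(0, 8, (30, 12))])
frames = np.vstack([a, b]).astype(float)

reduced = pca_reduce(frames)
labels, centers = density_peaks(reduced, n_clusters=2)
```

Treating each byte position as one feature is itself an assumption about how frames are vectorized. The full pairwise distance matrix costs O(n^2) memory, so this form suits only small frame sets.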
Pages: 1260-1266
Number of pages: 7