An SVM-based machine learning method for accurate internet traffic classification

Cited by: 178
Authors
Yuan, Ruixi [3]
Li, Zhu [3]
Guan, Xiaohong [1, 2, 3]
Xu, Li [4, 5]
Affiliations
[1] Xi An Jiao Tong Univ, MOE KLINNS Lab, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, SKLMS Lab, Xian 710049, Peoples R China
[3] Tsinghua Univ, Ctr Intelligent & Networked Syst, TNLIST Lab, Beijing 100084, Peoples R China
[4] Beijing Jiaotong Univ, Coll Econ & Management, Beijing 100044, Peoples R China
[5] Old Dominion Univ, Dept Informat Technol & Decis Sci, Norfolk, VA 23529 USA
Keywords
Internet traffic; Network traffic classification; Machine learning; Feature selection; SVM; SUPPORT VECTOR MACHINES; FEATURE-SELECTION; SPECIAL-ISSUE; SYSTEM; CHINA;
DOI
10.1007/s10796-008-9131-2
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Accurate and timely traffic classification is critical in network security monitoring and traffic engineering. Traditional methods based on port numbers and protocols have proven ineffective in the face of dynamic port allocation and packet encapsulation. Signature matching methods, on the other hand, require a known signature set and processing of the packet payload, and can only handle the signatures of a limited number of IP packets in real time. This paper proposes a machine learning method based on SVM (support vector machine) for accurate Internet traffic classification. The method classifies Internet traffic into broad application categories according to network flow parameters obtained from the packet headers. An optimized feature set is obtained via multiple classifier selection methods. Experimental results using traffic from a campus backbone show that an accuracy of 99.42% is achieved with the regular biased training and testing samples. An accuracy of 97.17% is achieved when un-biased training and testing samples are used with the same feature set. Furthermore, because all the feature parameters are computable from the packet headers, the proposed method is also applicable to encrypted network traffic.
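The abstract states that the classifier works on network flow parameters computed from packet headers alone. The following is a minimal sketch of that idea, not the paper's feature set: the record layout, the sample packets, and the four feature names (`packet_count`, `total_bytes`, `mean_packet_size`, `duration_s`) are assumptions chosen for illustration. It groups header records by 5-tuple and derives per-flow statistics without touching any payload.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical header records (no payload):
# (timestamp_s, src_ip, dst_ip, src_port, dst_port, protocol, ip_length_bytes)
packets = [
    (0.00, "10.0.0.1", "10.0.0.2", 40000, 80, "TCP", 60),
    (0.10, "10.0.0.1", "10.0.0.2", 40000, 80, "TCP", 1500),
    (0.30, "10.0.0.1", "10.0.0.2", 40000, 80, "TCP", 1500),
    (0.05, "10.0.0.3", "10.0.0.4", 50000, 53, "UDP", 80),
    (0.07, "10.0.0.3", "10.0.0.4", 50000, 53, "UDP", 120),
]


def flow_features(packets):
    """Group packets by 5-tuple and compute simple per-flow statistics."""
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, length in packets:
        flows[(src, dst, sport, dport, proto)].append((ts, length))

    features = {}
    for key, records in flows.items():
        times = [t for t, _ in records]
        sizes = [s for _, s in records]
        features[key] = {
            "packet_count": len(records),
            "total_bytes": sum(sizes),
            "mean_packet_size": mean(sizes),
            "duration_s": max(times) - min(times),
        }
    return features


features = flow_features(packets)
for key, vector in features.items():
    print(key, vector)
```

Feature vectors of this kind could then be passed to an SVM implementation for training; because only header fields are read, the same extraction would still work when the payload is encrypted.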
Pages: 149-156
Page count: 8