Malicious web content detection by machine learning

Cited by: 62

Authors:

Hou, Yung-Tsung [1]

Chang, Yimeng [2]

Chen, Tsuhan [2]

Laih, Chi-Sung [3]

Chen, Chia-Mei [1]

Affiliations:

[1] Natl Sun Yat Sen Univ, Kaohsiung 80424, Taiwan

[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[3] Natl Cheng Kung Univ, Tainan 70101, Taiwan

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2010 / Vol. 37 / Issue 01

Keywords:

Dynamic HTML; Malicious webpage; Machine learning

DOI:

10.1016/j.eswa.2009.05.023

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The recent development of dynamic HTML (DHTML) gives attackers a new and powerful technique for compromising computer systems. Malicious DHTML code is usually embedded in an otherwise normal webpage, which infects the victim when a user browses it. Furthermore, such DHTML code can easily disguise itself through obfuscation or transformation, which makes detection even harder. Anti-virus software packages commonly use signature-based approaches, which might not be able to identify camouflaged malicious HTML code efficiently. Therefore, our paper proposes a machine-learning approach to malicious webpage detection. Our study systematically analyzes the characteristics of malicious webpages and presents important features for machine learning. Experimental results demonstrate that our method is resilient to code obfuscation and can correctly determine whether a webpage is malicious or not. (C) 2009 Elsevier Ltd. All rights reserved.


Pages: 55-60

Page count: 6
