OBA2: An Onion approach to Binary code Authorship Attribution

Cited by: 48
Authors
Alrabaee, Saed [1]
Saleem, Noman [1]
Preda, Stere [1]
Wang, Lingyu [1]
Debbabi, Mourad [1]
Affiliations
[1] Concordia Univ, Comp Secur Lab, Natl Cyber Forens & Training Alliance Canada, Montreal, PQ, Canada
Keywords
Authorship attribution; Reverse engineering; Binary program analysis; Malware forensics; Digital forensics; PROGRAM
DOI
10.1016/j.diin.2014.03.012
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A critical aspect of malware forensics is authorship analysis. The outcome of such analysis usually depends on the reverse engineer's skills and on the volume and complexity of the code under analysis. To assist reverse engineers in this tedious and error-prone task, it is desirable to develop reliable, automated tools that support malware authorship attribution. In a recent work, machine learning was used to rank and select syntax-based features such as n-grams and flow graphs. The experimental results showed that the top-ranked features were unique to each author, which was regarded as evidence that those features capture the authors' programming styles. In this paper, however, we show that the uniqueness of features does not necessarily correspond to authorship. Specifically, our analysis demonstrates that many "unique" features selected using this method are clearly unrelated to the authors' programming styles, for example, unique IDs or random but unique function names generated by the compiler; furthermore, the overall accuracy is generally unsatisfactory. Motivated by this discovery, we propose a layered Onion Approach for Binary Authorship Attribution called OBA2. The novelty of our approach lies in its three complementary layers: preprocessing, syntax-based attribution, and semantic-based attribution. Experiments show that our method produces results that are not only more accurate but also have a meaningful connection to the authors' styles. © 2014 The Author. Published by Elsevier Ltd on behalf of DFRWS.
Pages: S94-S103
Page count: 10