Achieving Accuracy and Scalability Simultaneously in Detecting Application Clones on Android Markets

Cited by: 175
Authors
Chen, Kai [1, 2]
Liu, Peng [1]
Zhang, Yingjun [3]
Affiliations
[1] Penn State Univ, Coll IST, University Pk, PA 16802 USA
[2] Chinese Acad Sci, State Key Lab Informat Secur, Inst Informat Engn, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Software, Beijing, Peoples R China
Source
36TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2014) | 2014
Keywords
Software analysis; Android; clone detection; centroid
DOI
10.1145/2568225.2568286
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Besides traditional problems such as potential bugs, (smartphone) application clones on Android markets bring new threats: attackers clone the code of legitimate Android applications, combine it with malicious code or advertisements, and publish these "purpose-added" app clones on the same or other markets for their own benefit. Three inherent and unique characteristics make app clones difficult to detect with existing techniques: a billion-opcode problem caused by cross-market publishing, the gap between code clones and app clones, and the prevalence of Type 2 and Type 3 clones. Existing techniques achieve either accuracy or scalability, but not both. To achieve both goals, we use a geometric characteristic of dependency graphs, called the centroid, to measure the similarity between methods (code fragments) in two apps. We then synthesize the method-level similarities and draw a yes/no conclusion on app (core functionality) cloning. The observed "centroid effect" and the inherent "monotonicity" property enable our approach to achieve both high accuracy and scalability. We implemented the app clone detection system and evaluated it on five whole Android markets (150,145 apps, 203 million methods, and 26 billion opcodes in total). Once the centroids have been generated, which is done only once, cross-market app clone detection on the five markets takes less than one hour.
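The two-stage idea in the abstract (compare methods by the centroid of their dependency graphs, then aggregate into an app-level yes/no answer) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the three node coordinates, the statement-count weights, the edge-weighted averaging, the normalized difference measure, and both thresholds are hypothetical choices, not necessarily the authors' definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One node of a method's dependency graph (all fields hypothetical)."""
    x: float       # assumed coordinate, e.g. position of the node in the method
    y: float       # assumed coordinate, e.g. number of outgoing edges
    z: float       # assumed coordinate, e.g. loop nesting depth
    weight: float  # assumed weight, e.g. number of statements in the node


def centroid(nodes, edges):
    """Weighted centroid of a graph, averaged over the endpoints of each edge."""
    total = sum(nodes[p].weight + nodes[q].weight for p, q in edges)
    if total == 0:
        return (0.0, 0.0, 0.0)

    def axis(name):
        return sum(
            nodes[p].weight * getattr(nodes[p], name)
            + nodes[q].weight * getattr(nodes[q], name)
            for p, q in edges
        ) / total

    return (axis("x"), axis("y"), axis("z"))


def centroid_difference(c1, c2):
    """Largest per-axis normalized difference between two centroids."""
    diffs = []
    for a, b in zip(c1, c2):
        denom = a + b
        diffs.append(abs(a - b) / denom if denom else 0.0)
    return max(diffs)


def is_clone(centroids_a, centroids_b, method_threshold=0.05, app_threshold=0.7):
    """Yes/no decision: does a large enough share of A's methods match B?"""
    if not centroids_a:
        return False
    matched = sum(
        1
        for ca in centroids_a
        if any(centroid_difference(ca, cb) <= method_threshold for cb in centroids_b)
    )
    return matched / len(centroids_a) >= app_threshold
```

The pairwise loop in is_clone is quadratic in the number of methods and is only meant to show the aggregation step. The abstract attributes the system's scalability to the "centroid effect" and the "monotonicity" property, which this sketch does not model.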
Pages: 175-186
Number of pages: 12