Mixed Signals: Analyzing Software Attribution Challenges in the Android Ecosystem

Cited by: 2
Authors
Hageman, Kaspar [1]
Feal, Alvaro [2]
Gamba, Julien [2]
Girish, Aniketh [2]
Bleier, Jakob [3]
Lindorfer, Martina [3]
Tapiador, Juan [4]
Vallina-Rodriguez, Narseo [2]
Affiliations
[1] Aarhus Univ, Dept Elect & Comp Engn, DK-8000 Aarhus, Denmark
[2] IMDEA Networks Inst, Madrid 28918, Spain
[3] TU Wien, Secur & Privacy Res Unit, A-1040 Vienna, Austria
[4] Univ Carlos III Madrid, Dept Comp Sci, Madrid 28911, Spain
Keywords
Operating systems; Software; Internet; Metadata; Ecosystems; Companies; Web and internet services; Android; attribution; attribution graph; mobile apps
DOI
10.1109/TSE.2023.3236582
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The ability to identify the author responsible for a given software object is critical for many research studies and for enhancing software transparency and accountability. However, as opposed to other application markets like Apple's iOS App Store, attribution in the Android ecosystem is known to be hard. Prior research has leveraged market metadata and signing certificates to identify software authors without questioning the validity and accuracy of these attribution signals. However, Android application (app) authors can, either intentionally or by mistake, hide their true identity due to: (1) the lack of policy enforcement by markets to ensure the accuracy and correctness of the information disclosed by developers in their market profiles during the app release process, and (2) the use of self-signed certificates for signing apps instead of certificates issued by trusted CAs. In this paper, we perform the first empirical analysis of the availability, volatility and overall aptness of publicly available market and app metadata for author attribution in Android markets. To that end, we analyze a dataset of over 2.5 million market entries and apps extracted from five Android markets for over two years. Our results show that widely used attribution signals are often missing from market profiles and that they change over time. We also invalidate the general belief about the validity of signing certificates for author attribution. For instance, we find that apps from different authors share signing certificates due to the proliferation of app building frameworks and software factories. Finally, we introduce the concept of an attribution graph and we apply it to evaluate the validity of existing attribution signals on the Google Play Store. Our results confirm that the lack of control over publicly available signals can confuse automatic attribution processes.
Pages: 2964-2979
Page count: 16