On the Security of Python Virtual Machines: An Empirical Study

Cited by: 0
Authors
Lin, Xinrong [1]
Hua, Baojian [1]
Fan, Qiliang [1]
Affiliations
[1] Univ Sci & Technol China, Sch Software Engn, Hefei, Peoples R China
Source
2022 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME 2022) | 2022
Keywords
Empirical; Python virtual machines; Security; PROGRAMS; CHECKING; JAVA; ERRORS; TOOL
DOI
10.1109/ICSME55016.2022.00028
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Python continues to be one of the most popular programming languages and has been used in many safety-critical fields such as medical treatment, autonomous driving systems, and data science. These fields impose higher security requirements on the Python ecosystem. However, existing studies on machine learning systems in Python concentrate on data security, model security, and model privacy, and simply assume that the underlying Python virtual machines (PVMs) are secure and trustworthy. Unfortunately, whether this assumption actually holds is still unknown. This paper presents, to the best of our knowledge, the first and most comprehensive empirical study on the security of CPython, the official and most widely deployed Python virtual machine. To this end, we first designed and implemented a software prototype dubbed PVMSCAN, then used it to scan the source code of the latest CPython (version 3.10) and 10 other versions (3.0 to 3.9), which consist of 3,838,606 lines of source code. The empirical results yield findings and insights into the security of Python virtual machines, such as: 1) CPython virtual machines are still vulnerable: for example, PVMSCAN detected 239 vulnerabilities in version 3.10, including 55 null dereferences, 86 uninitialized variables, and 98 dead stores; 2) Python/C API-related vulnerabilities are very common and have become one of the most severe threats to the security of PVMs: for example, 70 Python/C API-related vulnerabilities were identified in CPython 3.10; 3) the overall quality of the code remained stable during the evolution of Python VMs, with vulnerabilities per thousand lines (VPTL) at 0.50; and 4) automatic vulnerability rectification is effective: 166 out of 239 (69.46%) vulnerabilities can be rectified by simple yet effective syntax-directed heuristics.
We have reported our empirical results to the developers of CPython, who have acknowledged our report and have already confirmed and fixed 2 bugs (as of this writing), while the others are still being analyzed. This study not only demonstrates the effectiveness of our approach, but also highlights the need to improve the reliability of infrastructure such as Python virtual machines by leveraging state-of-the-art security techniques and tools.
Pages: 223 - 234
Page count: 12
Related papers
50 results in total
  • [1] Empirical Analysis of Security Vulnerabilities in Python Packages
    Alfadel, Mahmoud
    Costa, Diego Elias
    Shihab, Emad
    2021 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2021), 2021, : 446 - 457
  • [2] Empirical analysis of security vulnerabilities in Python packages
    Alfadel, Mahmoud
    Costa, Diego Elias
    Shihab, Emad
    EMPIRICAL SOFTWARE ENGINEERING, 2023, 28 (03)
  • [3] PyGuard: Finding and Understanding Vulnerabilities in Python Virtual Machines
    Jiang, Chengman
    Hua, Baojian
    Ouyang, Wanrong
    Fan, Qiliang
    Pan, Zhizhong
    2021 IEEE 32ND INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2021), 2021, : 468 - 475
  • [4] Empirical Study of Python Call Graph
    Li, Yu
    34TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2019), 2019, : 1274 - 1276
  • [5] An Empirical Study of Flaky Tests in Python
    Gruber, Martin
    Lukasczyk, Stephan
    Krois, Florian
    Fraser, Gordon
    2021 14TH IEEE CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION (ICST 2021), 2021, : 148 - 158
  • [6] An Empirical Study on Bugs in Python Interpreters
    Wang, Ziyuan
    Bu, Dexin
    Sun, Aiyue
    Gou, Shanyi
    Wang, Yong
    Chen, Lin
    IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (02) : 716 - 734
  • [7] Quantifying the Transition from Python 2 to 3: An Empirical Study of Python Applications
    Malloy, Brian A.
    Power, James F.
    11TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON EMPIRICAL SOFTWARE ENGINEERING AND MEASUREMENT (ESEM 2017), 2017, : 314 - 323
  • [8] An empirical study of fault localization in Python programs
    Rezaalipour, Mohammad
    Furia, Carlo A.
    EMPIRICAL SOFTWARE ENGINEERING, 2024, 29 (04)
  • [9] The Evolution of Type Annotations in Python: An Empirical Study
    Di Grazia, Luca
    Pradel, Michael
    PROCEEDINGS OF THE 30TH ACM JOINT MEETING EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, ESEC/FSE 2022, 2022, : 209 - 220
  • [10] An Empirical Study of Dynamic Types for Python Projects
    Xia, Xinmeng
    He, Xincheng
    Yan, Yanyan
    Xu, Lei
    Xu, Baowen
    SOFTWARE ANALYSIS, TESTING, AND EVOLUTION, SATE 2018, 2018, 11293 : 85 - 100