Machine learning steered symbolic execution framework for complex software code

被引：5

作者：

Bu, Lei ^{[1
]}

Liang, Yongjuan ^{[1
]}

Xie, Zhunyi ^{[1
]}

Qian, Hong ^{[1
]}

Hu, Yi-Qi ^{[1
]}

Yu, Yang ^{[1
]}

Chen, Xin ^{[1
]}

Li, Xuandong ^{[1
]}

机构：

[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Peoples R China

来源：

FORMAL ASPECTS OF COMPUTING | 2021年 / 33卷 / 03期

基金：

中国国家自然科学基金;

关键词：

Symbolic execution; Machine learning; Nonlinear path condition; Constraint solving; OPTIMIZATION; SEARCH;

D O I：

10.1007/s00165-021-00538-3

中图分类号：

TP31 [计算机软件];

学科分类号：

081202 ; 0835 ;

摘要：

During program traversing, symbolic execution collects path conditions and feeds them to a constraint solver to obtain feasible solutions. However, complex path conditions, like nonlinear constraints, which widely appear in programs, are hard to be handled efficiently by the existing solvers. In this paper, we adapt the classical symbolic execution framework with a machine learning approach for constraint satisfaction. The approach samples and learns from different solutions to identify potentially feasible area. This sampling-learning style solving can be applied in different class of complex problems easily. Therefore, incorporating this approach, our framework, MLBSE, supports the symbolic execution of not only simple linear path conditions, but also nonlinear arithmetic operations, and even black-box function calls of library methods. Meanwhile, thanks to the theoretical foundation of the machine learning based approach, when the solver fails to solve a path condition, we can have an estimation of the confidence in the satisfiability (ECS) of the problem to give users insights about how the problem is analyzed and whether they could ultimately find a solution. We implement MLBSE on the basis of Symbolic Path Finder (SPF) into a fully automatic Java symbolic execution engine. Users can feed their code to MLBSE directly, which is very convenient to use. To evaluate its performance, 22 real case programs are used as the benchmarks for MLBSE to generate test cases, which involve a total number of 1042 methods that are full of nonlinear operations, floating-point arithmetic as well as native method calls. Experiment results show that the coverage achieved by MLBSE is much higher than the state-of-the-art tools.

引用

页码：301 / 323

页数：23

共 44 条

[1] An orchestrated survey of methodologies for automated software test case generation [J].

Anand, Saswat ;

Burke, Edmund K. ;

Chen, Tsong Yueh ;

Clark, John ;

Cohen, Myra B. ;

Grieskamp, Wolfgang ;

Harman, Mark ;

Harrold, Mary Jean ;

McMinn, Phil ;

Bertolino, Antonia ;

Li, J. Jenny ;

Zhu, Hong .

JOURNAL OF SYSTEMS AND SOFTWARE, 2013, 86 (08) :1978-2001

[2]

[Anonymous], 2018, CYCLOMATIC COMPLEXIT

[3]

[Anonymous], 2018, Scientific Computation

[4]

[Anonymous], 2018, APACHE COMMONS MATH

[5]

[Anonymous], 2018, JACOCO

[6] Automatic Detection of Floating-Point Exceptions [J].

Barr, Earl T. ;

Thanh Vo ;

Vu Le ;

Su, Zhendong .

ACM SIGPLAN NOTICES, 2013, 48 (01) :549-560

[7]

Borges M., 2012, 2012 IEEE Fifth International Conference on Software Testing, Verification and Validation (ICST 2012), P111, DOI 10.1109/ICST.2012.91

[8]

Boyer R. S., 1975, SIGPLAN Notices, V10, P234, DOI 10.1145/390016.808445

[9]

Cadar C., 2008, P 8 USENIX S OPERATI, P209

[10]

Cadar C, 2011, 2011 33RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), P1066, DOI 10.1145/1985793.1985995

← 1 2 3 4 5 →