An efficient string solver for string constraints with regex-counting and string-length

被引:0
作者
Hu, Denghang [1 ]
Wu, Zhilin
机构
[1] Univ Chinese Acad Sci, Chinese Acad Sci, Inst Software, Key Lab Syst Software, Beijing, Peoples R China
关键词
String constraints; String solver; Regular expression; String length; Counting operator;
D O I
10.1016/j.sysarc.2025.103340
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Regular expressions (regex for short) and string-length function are widely used in string-manipulating programs. Counting is a frequently used feature in regexes that counts the number of matchings of sub- patterns. The state-of-the-art string solvers are incapable of solving string constraints with regex-counting and string-length efficiently, especially when the counting and length bounds are large. In this work, we propose an automata-theoretic approach for solving such class of string constraints. The main idea is to symbolically model the counting operators by registers in automata instead of unfolding them explicitly, thus alleviating the state explosion problem. Moreover, the string-length function is modeled by a register as well. As a result, the satisfiability of string constraints with regex-counting and string-length is reduced to the satisfiability of linear integer arithmetic, which the off-the-shelf SMT solvers can then solve. To improve the performance further, we also propose various optimization techniques. We implement the algorithms and validate our approach on 49,843 benchmark instances. The experimental results show that our approach can solve more instances than the state-of-the-art solvers, at a comparable or faster speed, especially when the counting and length bounds are large or when the counting operators are nested with some other counting operators or complement operators.
引用
收藏
页数:12
相关论文
共 42 条
[1]   Efficient Handling of String-Number Conversion [J].
Abdulla, Parosh Aziz ;
Atig, Mohamed Faouzi ;
Chen, Yu-Fang ;
Diep, Bui Phi ;
Dolby, Julian ;
Jank, Petr ;
Lin, Hsin-Hung ;
Holik, Lukas ;
Wu, Wei-Cheng .
PROCEEDINGS OF THE 41ST ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION (PLDI '20), 2020, :943-957
[2]   cvc5: A Versatile and Industrial-Strength SMT Solver [J].
Barbosa, Haniel ;
Barrett, Clark ;
Brain, Martin ;
Kremer, Gereon ;
Lachnitt, Hanna ;
Mann, Makai ;
Mohamed, Abdalrhman ;
Mohamed, Mudathir ;
Niemetz, Aina ;
Notzli, Andres ;
Ozdemir, Alex ;
Preiner, Mathias ;
Reynolds, Andrew ;
Sheng, Ying ;
Tinelli, Cesare ;
Zohar, Yoni .
TOOLS AND ALGORITHMS FOR THE CONSTRUCTION AND ANALYSIS OF SYSTEMS, TACAS 2022, PT I, 2022, 13243 :415-442
[3]  
Berzish M., 2021, Ph.D. thesis
[4]   Towards more efficient methods for solving regular-expression heavy string constraints [J].
Berzish, Murphy ;
Day, Joel D. ;
Ganesh, Vijay ;
Kulczynski, Mitja ;
Manea, Florin ;
Mora, Federico ;
Nowotka, Dirk .
THEORETICAL COMPUTER SCIENCE, 2023, 943 :50-72
[5]   An SMT Solver for Regular Expressions and Linear Arithmetic over String Length [J].
Berzish, Murphy ;
Kulczynski, Mitja ;
Mora, Federico ;
Manea, Florin ;
Day, Joel D. ;
Nowotka, Dirk ;
Ganesh, Vijay .
COMPUTER AIDED VERIFICATION, PT II, CAV 2021, 2021, 12760 :289-312
[6]  
Berzish M, 2017, PROCEEDINGS OF THE 17TH CONFERENCE ON FORMAL METHODS IN COMPUTER AIDED DESIGN (FMCAD 2017), P55, DOI 10.23919/FMCAD.2017.8102241
[7]   StringFuzz: A Fuzzer for String Solvers [J].
Blotsky, Dmitry ;
Mora, Federico ;
Berzish, Murphy ;
Zheng, Yunhui ;
Kabir, Ifaz ;
Ganesh, Vijay .
COMPUTER AIDED VERIFICATION, CAV 2018, PT II, 2018, 10982 :45-51
[8]  
Cavada R, 2014, LECT NOTES COMPUT SC, V8559, P334, DOI 10.1007/978-3-319-08867-9_22
[9]  
Chapman C., 2016, P 25 INT S SOFTW TES, P282, DOI [DOI 10.1145/2931037.2931073, 10.1145/2931037.2931073]
[10]   Checking determinism of regular expressions with counting [J].
Chen, Haiming ;
Lu, Ping .
INFORMATION AND COMPUTATION, 2015, 241 :302-320