Cleaning Antipatterns in an SQL Query Log

Cited by: 12
Authors
Arzamasova, Natalia [1]
Schaeler, Martin [2]
Bohm, Klemens [3]
Affiliations
[1] Karlsruhe Inst Technol, D-76131 Karlsruhe, Germany
[2] Karlsruhe Inst Technol, Databases & Informat Syst Grp, D-76131 Karlsruhe, Germany
[3] Karlsruhe Inst Technol, Databases & Informat Syst, D-76131 Karlsruhe, Germany
Keywords
SQL log analysis; patterns and antipatterns; data preprocessing; e-science
DOI
10.1109/TKDE.2017.2772252
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Today, many scientific data sets are open to the public. For their operators, it is important to know what the users are interested in. In this paper, we study the problem of extracting and analyzing patterns from the query log of a database. We focus on design errors (antipatterns), which typically lead to unnecessary SQL statements. Such antipatterns do not only have a negative effect on performance. They also introduce bias on any subsequent analysis of the SQL log. We propose a framework designed to discover patterns and antipatterns in arbitrary SQL query logs and to clean antipatterns. To study the usefulness of our approach and to reveal insights regarding the existence of antipatterns in real-world systems, we examine the SQL log of the SkyServer project, containing more than 40 million queries. Among the top 15 patterns, we have found six antipatterns. This result as well as other ones gives way to the conclusion that antipatterns might falsify refactoring and any other downstream analyses.
Pages: 421-434
Page count: 14