Aardvark: Comparative Visualization of Data Analysis Scripts

Cited by: 1
Authors
Faust, Rebecca [1]
Scheidegger, Carlos [2]
North, Chris [1]
Affiliations
[1] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24061 USA
[2] Univ Arizona, Dept Comp Sci, HDC Lab, Tucson, AZ USA
Source
2023 IEEE VISUALIZATION IN DATA SCIENCE, VDS | 2023
Funding
U.S. National Science Foundation;
Keywords
Interactive Visualization; Program Traces; Jupyter; Debugging; Comparison;
DOI
10.1109/VDS60365.2023.00009
Chinese Library Classification number
TP39 [Computer Applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Debugging programs is one of the most challenging and time-consuming parts of programming. Data science scripts present additional challenges as debugging often centers around more exploratory tasks, such as understanding the differences between results under different parameter settings. In fact, a common exploratory debugging practice is to run, modify, and re-run a script to observe the effects of the modification. Analysts perform this process frequently as they explore different settings and algorithms in their analysis. However, traditional debugging methods are not well suited to comparing across multiple executions of a script. They often require maintaining two instances of the debugging method and making manual, serial comparisons of program values. To address this gap, we present Aardvark, a comparative trace-based debugging method for identifying and visualizing the differences between two executions of data analysis scripts. Aardvark traces two consecutive instances of an analysis script, identifies the differences between them, and presents them through comparative visualizations. We present a prototype implementation in Python as well as an extension to support scripts in Jupyter notebooks. Finally, to demonstrate Aardvark, we provide two usage scenarios on real-world analysis scripts.
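
The trace-and-compare idea described in the abstract can be illustrated with a minimal Python sketch. This is not Aardvark's implementation; it only assumes the standard library's `sys.settrace` hook, and the names `trace_values`, `diff_runs`, and `scaled_total` are hypothetical helpers introduced here for illustration. It records the history of each local variable during two runs of the same function under different parameter settings and reports the variables whose histories differ. It also assumes that traced values are plain Python objects that support `==` and `copy.deepcopy`.

```python
import copy
import sys


def trace_values(func, *args, **kwargs):
    """Run func under sys.settrace and record each local variable's value history."""
    history = {}                 # variable name -> list of successive values
    target = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None          # do not trace functions called by func
        if event in ("line", "return"):
            for name, value in frame.f_locals.items():
                seen = history.setdefault(name, [])
                # Record a value only when it differs from the last recorded one.
                if not seen or seen[-1] != value:
                    seen.append(copy.deepcopy(value))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)   # always restore the earlier trace hook
    return result, history


def diff_runs(first, second):
    """Return {name: (history_in_first, history_in_second)} for differing variables."""
    differences = {}
    for name in sorted(set(first) | set(second)):
        a, b = first.get(name, []), second.get(name, [])
        if a != b:
            differences[name] = (a, b)
    return differences


def scaled_total(values, scale):
    """Hypothetical stand-in for an analysis script with one tunable parameter."""
    total = 0
    for v in values:
        total += v * scale
    return total


# Two "executions" of the same script under different parameter settings.
result_a, history_a = trace_values(scaled_total, [1, 2, 3], 1)
result_b, history_b = trace_values(scaled_total, [1, 2, 3], 2)
for name, (a, b) in diff_runs(history_a, history_b).items():
    print(name, a, b)
```

Comparing per-variable histories rather than aligning the two traces line by line keeps the sketch short, and it leaves out the comparative visualizations that the abstract describes.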
Pages: 30-38
Number of pages: 9