Discovering Repetitive Code Changes in ML Systems

Cited by: 6
Authors
Dilhara, Malinda [1]
Affiliations
[1] Univ Colorado, Boulder, CO 80309 USA
Source
PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21) | 2021
Keywords
Machine learning; Empirical analysis; Code change patterns
DOI
10.1145/3468264.3473493
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Similar to software evolution in other software systems, ML software systems evolve with many repetitive changes. Despite some research and tooling for repetitive code changes that exist in Java and other languages, there is a lack of such tools for Python. Given the significant rise of ML software development, and that many ML developers are not professionally trained developers, the lack of software evolution tools for ML code is even more critical. To bring the ML developers' toolset into the 21st century, we implemented an approach to adapt and reuse the vast ecosystem of Java static analysis tools for Python. Using this approach, we adapted two software evolution tools, RefactoringMiner and CPATMiner, to Python. With the tools, we conducted the first and most fine-grained study on code change patterns in 59 ML systems and surveyed 253 developers. We recommend empirically-justified, actionable opportunities for tool builders and release the tools for researchers.
Pages: 1683-1685
Page count: 3