REFINYM: Using Names to Refine Types

Cited by: 27
Authors
Dash, Santanu Kumar [1]
Allamanis, Miltiadis [2]
Barr, Earl T. [1]
Affiliations
[1] UCL, London, England
[2] Microsoft Res, Cambridge, England
Source
ESEC/FSE'18: PROCEEDINGS OF THE 2018 26TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING | 2018
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Type Refinement; Information-theoretic Clustering;
DOI
10.1145/3236024.3236042
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Source code is bimodal: it combines a formal, algorithmic channel and a natural language channel of identifiers and comments. In this work, we model the bimodality of code with name flows, an assignment flow graph augmented to track identifier names. Conceptual types are logically distinct types that do not always coincide with program types. Passwords and URLs are example conceptual types that can share the program type string. Our tool, REFINYM, is an unsupervised method that mines a lattice of conceptual types from name flows and reifies them into distinct nominal types. For string, REFINYM finds and splits conceptual types originally merged into a single type, reducing the number of same-type variables per scope from 8.7 to 2.2 while eliminating 21.9% of scopes that have more than one same-type variable in scope. This makes the code more self-documenting and frees the type system to prevent a developer from inadvertently assigning data across conceptual types.
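To make the distinction between program types and conceptual types concrete, the following is a minimal sketch, not REFINYM's implementation or output. It assumes two hypothetical wrapper classes, `Password` and `Url`, and a hypothetical function `fetch`; none of these names come from the paper. Both wrappers hold a plain string, but because they are distinct nominal types, passing one where the other is expected is rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Password:
    """Hypothetical nominal type for the conceptual type 'password'."""
    value: str


@dataclass(frozen=True)
class Url:
    """Hypothetical nominal type for the conceptual type 'URL'."""
    value: str


def fetch(url: Url) -> str:
    """Hypothetical consumer that accepts only the Url conceptual type."""
    # Both wrappers carry a str underneath, so a plain-string signature
    # could not tell them apart; the nominal check below can.
    if not isinstance(url, Url):
        raise TypeError(f"expected Url, got {type(url).__name__}")
    return f"GET {url.value}"


if __name__ == "__main__":
    print(fetch(Url("https://example.com")))
    try:
        fetch(Password("hunter2"))  # cross-concept use is rejected
    except TypeError as err:
        print("rejected:", err)
```

In a statically typed language the same separation would be enforced at compile time; the runtime check here only stands in for that behavior.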
Pages: 107-117
Page count: 11