Towards Automating Precision Studies of Clone Detectors

Cited by: 9
Authors
Saini, Vaibhav [1]
Farmahinifarahani, Farima [1]
Lu, Yadong [1]
Yang, Di [1]
Martins, Pedro [1]
Sajnani, Hitesh [2]
Baldi, Pierre [1]
Lopes, Cristina V. [1]
Affiliations
[1] Univ Calif Irvine, Irvine, CA 92717 USA
[2] Microsoft, Redmond, WA USA
Source
2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2019) | 2019
Keywords
Precision Evaluation; Clone Detection; Machine learning; Open source labeled datasets;
DOI
10.1109/ICSE.2019.00023
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Current research in clone detection suffers from poor ecosystems for evaluating precision of clone detection tools. Corpora of labeled clones are scarce and incomplete, making evaluation labor intensive and idiosyncratic, and limiting inter-tool comparison. Precision-assessment tools are simply lacking. We present a semiautomated approach to facilitate precision studies of clone detection tools. The approach merges automatic mechanisms of clone classification with manual validation of clone pairs. We demonstrate that the proposed automatic approach has a very high precision and it significantly reduces the number of clone pairs that need human validation during precision experiments. Moreover, we aggregate the individual effort of multiple teams into a single evolving dataset of labeled clone pairs, creating an important asset for software clone research.
Pages: 49-59
Page count: 11