The Qualitas Corpus: A Curated Collection of Java Code for Empirical Studies

Cited by: 196
Authors
Tempero, Ewan [1]
Anslow, Craig [4]
Dietrich, Jens [2]
Han, Ted [1]
Li, Jing [1]
Lumpe, Markus [3]
Melton, Hayden [1]
Noble, James [4]
Affiliations
[1] Univ Auckland, Dept Comp Sci, Auckland, New Zealand
[2] Massey Univ, Sch Engn & Adv Technol, Palmerston North, New Zealand
[3] Swinburne Univ Technol, Fac Informat & Commun Technol, Hawthorn, Vic, Australia
[4] Victoria Univ Wellington, Sch Engn & Comp Sci, Wellington, New Zealand
Source
17TH ASIA PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2010) | 2010
Keywords
Empirical studies; curated code corpus; experimental infrastructure; SOFTWARE; METRICS
DOI
10.1109/APSEC.2010.46
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
In order to increase our ability to use measurement to support software development practice, we need to do more analysis of code. However, empirical studies of code are expensive and their results are difficult to compare. We describe the Qualitas Corpus, a large curated collection of open source Java systems. The corpus reduces the cost of performing large empirical studies of code and supports comparison of measurements of the same artifacts. We discuss its design, organisation, and issues associated with its development.
Pages: 336-345
Page count: 10
Related papers
38 in total
  • [31] Are unit and integration test definitions still valid for modern Java projects? An empirical study on open-source projects
    Trautsch, Fabian
    Herbold, Steffen
    Grabowski, Jens
    JOURNAL OF SYSTEMS AND SOFTWARE, 2020, 159
  • [32] ViewDEX: A Java-based software for presentation and evaluation of medical images in observer performance studies
    Hakansson, Markus
    Svensson, Sune
    Bath, Magnus
    Mansson, Lars Gunnar
    MEDICAL IMAGING 2007: VISUALIZATION AND IMAGE-GUIDED PROCEDURES, PTS 1 AND 2, 2007, 6509
  • [33] On the Untriviality of Trivial Packages: An Empirical Study of npm JavaScript Packages
    Chowdhury, Md Atique Reza
    Abdalkareem, Rabe
    Shihab, Emad
    Adams, Bram
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (08) : 2695 - 2708
  • [34] Extracting and studying the Logging-Code-Issue-Introducing changes in Java-based large-scale open source software systems
    Chen, Boyuan
    Jiang, Zhen Ming
    EMPIRICAL SOFTWARE ENGINEERING, 2019, 24 (04) : 2285 - 2322
  • [35] Empirical Studies on the NLP Techniques for Source Code Data Preprocessing
    Sun, Xiaobing
    Liu, Xiangyue
    Hu, Jiajun
    Zhu, Junwu
    2014 3RD INTERNATIONAL WORKSHOP ON EVIDENTIAL ASSESSMENT OF SOFTWARE TECHNOLOGIES (EAST), 2014, : 32 - 39
  • [36] Empirical studies concerning the maintenance of UML diagrams and their use in the maintenance of code: A systematic mapping study
    Fernandez-Saez, Ana M.
    Genero, Marcela
    Chaudron, Michel R. V.
    INFORMATION AND SOFTWARE TECHNOLOGY, 2013, 55 (07) : 1119 - 1142
  • [37] On the Influence of UML Class Diagrams Refactoring on Code Debt: A Family of Replicated Empirical Studies
    Freire, Savio
    Passos, Amanda
    Mendonca, Manoel
    Sant'Anna, Claudio
    Spinola, Rodrigo O.
    2020 46TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS (SEAA 2020), 2020, : 346 - 353