Boa: Ultra-Large-Scale Software Repository and Source-Code Mining

Cited by: 58

Authors:

Dyer, Robert [1]

Hoan Anh Nguyen [2]

Rajan, Hridesh [2]

Nguyen, Tien N. [2]

Affiliations:

[1] Bowling Green State Univ, Bowling Green, OH 43403 USA

[2] Iowa State Univ, Ames, IA 50011 USA

Source:

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY | 2015 / Vol. 25 / Issue 01

Funding:

U.S. National Science Foundation;

Keywords:

Boa; mining software repositories; domain-specific language; scalable; ease of use; lower barrier to entry;

DOI:

10.1145/2803171

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202; 0835;

Abstract:

In today's software-centric world, ultra-large-scale software repositories, such as SourceForge, GitHub, and Google Code, are the new library of Alexandria. They contain an enormous corpus of software and related information. Scientists and engineers alike are interested in analyzing this wealth of information. However, systematic extraction and analysis of relevant data from these repositories for testing hypotheses is hard, and best left for mining software repository (MSR) experts! Specifically, mining source code yields significant insights into software development artifacts and processes. Unfortunately, mining source code at a large scale remains a difficult task. Previous approaches had to either limit the scope of the projects studied, limit the scope of the mining task to be more coarse grained, or sacrifice studying the history of the code. In this article we address mining source code: (a) at a very large scale; (b) at a fine-grained level of detail; and (c) with full history information. To address these challenges, we present domain-specific language features for source-code mining in our language and infrastructure called Boa. The goal of Boa is to ease testing MSR-related hypotheses. Our evaluation demonstrates that Boa substantially reduces programming efforts, thus lowering the barrier to entry. We also show drastic improvements in scalability.


Pages: 34


