A segmentation-based approach for temporal analysis of software version repositories

Cited by: 1
Authors
Siy, Harvey [1]
Chundi, Parvathi [1]
Rosenkrantz, Daniel J. [2]
Subramaniam, Mahadevan [1]
Affiliations
[1] Univ Nebraska, Dept Comp Sci, Omaha, NE 68182 USA
[2] SUNY Albany, Dept Comp Sci, Albany, NY 12222 USA
Source
JOURNAL OF SOFTWARE MAINTENANCE AND EVOLUTION-RESEARCH AND PRACTICE | 2008 / Vol. 20 / Issue 3
Keywords
mining software repositories; time series segmentation; temporal analysis; software evolution; change analysis; version control systems; open-source development
DOI
10.1002/smr.368
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Time series segmentation is a promising approach to discover temporal patterns from time-stamped numeric data. A novel approach to apply time series segmentation to discern temporal information from software version repositories is proposed. Data from such repositories, both numeric and non-numeric, are represented as item-set time series data. A dynamic programming algorithm for optimal segmentation is presented. The algorithm automatically produces a compacted item-set time series that can be analyzed to identify temporal patterns. The effectiveness of the approach is illustrated by analyzing version control repositories of several open-source projects to identify time-varying patterns of developer activity. The experimental results show that the segmentation algorithm produces segments that capture meaningful information and is superior to the information content obtained by arbitrarily segmenting software history into regular time intervals. Copyright (c) 2008 John Wiley & Sons, Ltd.
Pages: 199-222
Page count: 24