A segmentation-based approach for temporal analysis of software version repositories

Cited by: 1
Authors
Siy, Harvey [1]
Chundi, Parvathi [1]
Rosenkrantz, Daniel J. [2]
Subramaniam, Mahadevan [1]
Affiliations
[1] Univ Nebraska, Dept Comp Sci, Omaha, NE 68182 USA
[2] SUNY Albany, Dept Comp Sci, Albany, NY 12222 USA
Source
JOURNAL OF SOFTWARE MAINTENANCE AND EVOLUTION-RESEARCH AND PRACTICE | 2008, Vol. 20, No. 3
Keywords
mining software repositories; time series segmentation; temporal analysis; software evolution; change analysis; version control systems; open-source development
DOI
10.1002/smr.368
Chinese Library Classification number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
Time series segmentation is a promising approach to discovering temporal patterns in time-stamped numeric data. A novel approach is proposed that applies time series segmentation to discern temporal information from software version repositories. Data from such repositories, both numeric and non-numeric, are represented as item-set time series data. A dynamic programming algorithm for optimal segmentation is presented. The algorithm automatically produces a compacted item-set time series that can be analyzed to identify temporal patterns. The effectiveness of the approach is illustrated by analyzing version control repositories of several open-source projects to identify time-varying patterns of developer activity. The experimental results show that the segments produced by the algorithm capture meaningful information, and that their information content is superior to that obtained by arbitrarily segmenting software history into regular time intervals. Copyright (c) 2008 John Wiley & Sons, Ltd.
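The dynamic programming idea named in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's cost measure or recurrence, so everything below is an assumption made for illustration: the function names `segment_cost` and `optimal_segmentation`, the use of the union of item sets as a segment's representative, and a cost that counts how many items of that union are missing from each time point. It shows a standard k-segmentation recurrence, not necessarily the authors' algorithm.

```python
def segment_cost(series, start, end):
    """Hypothetical cost of merging series[start:end] into one segment.

    Assumption: the segment is represented by the union of its item sets,
    and the cost counts items of that union absent from each time point.
    """
    union = set().union(*series[start:end])
    return sum(len(union - items) for items in series[start:end])


def optimal_segmentation(series, k):
    """Split an item-set time series into k contiguous segments of minimum total cost."""
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(series)")
    inf = float("inf")
    # best[s][j]: minimum cost of splitting series[:j] into s segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[s][j]: start index of the last segment in that optimal split
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                cost = best[s - 1][i] + segment_cost(series, i, j)
                if cost < best[s][j]:
                    best[s][j] = cost
                    back[s][j] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], bounds


# Toy input: each set could hold, for example, the developers active in one time unit.
series = [{"a"}, {"a"}, {"b"}, {"b"}]
print(optimal_segmentation(series, 2))  # (0, [(0, 2), (2, 4)])
```

In this toy input the two-segment split falls exactly where the item sets change, so the total cost is zero. Recomputing the cost for every candidate segment keeps the sketch short; precomputing segment costs would be the natural optimization for longer histories.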
Pages: 199-222
Page count: 24