Detecting asynchrony and dephase change patterns by mining software repositories

Authors

  • Fehmi Jaafar
  • Yann-Gaël Guéhéneuc
  • Sylvie Hamel
  • Giuliano Antoniol
Abstract

Software maintenance accounts for the largest part of the costs of any program. During maintenance activities, developers implement changes (sometimes simultaneously) on artefacts in order to fix bugs and to implement new requirements. To reduce this part of the costs, previous work proposed approaches to identify the artefacts of programs that change together. These approaches analyse historical data, mined from version control systems, and report change patterns, which point to the causes, consequences, and actors of the changes to source code files. They also introduce so-called change patterns that describe some typical change dependencies among files. In this paper, we introduce two novel change patterns: the Asynchrony change pattern, corresponding to macro co-changes (MC), i.e., files that co-change within a large time interval (change periods), and the Dephase change pattern, corresponding to dephase macro co-changes (DC), i.e., macro co-changes that always happen with the same shifts in time. We present our approach, which we named Macocha, to identify these two change patterns in large programs. We use the k-nearest neighbor algorithm to group changes into change periods. We also use the Hamming distance to detect approximate occurrences of macro co-changes and dephase macro co-changes. We apply Macocha and compare its performance in terms of precision and recall with UMLDiff (file stability) and Association Rules (co-changing files) on seven systems: ArgoUML, FreeBSD, JFreeChart, Openser, SIP, XalanC, and XercesC, developed with three different languages (C, C++, and Java). These systems range in size from 532 to 1,693 files, and during the study period they underwent 1,555 to 23,944 change commits. We use external information and static analysis to validate the (approximate) macro co-changes and dephase macro co-changes found by Macocha.
Through our case study, we show the existence and usefulness of these novel change patterns to ease software maintenance and, potentially, reduce related costs.
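
The role of the Hamming distance in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that each file's history is encoded as a bit vector with one bit per change period (1 if the file changed in that period). The function names, the `max_distance` threshold, and the sample profiles are hypothetical and are not taken from Macocha; the grouping of commits into change periods with k-nearest neighbor is not shown.

```python
from typing import List


def hamming(a: List[int], b: List[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_macro_co_change(a: List[int], b: List[int], max_distance: int = 0) -> bool:
    """True if the two files changed in (almost) the same change periods.

    max_distance = 0 demands an exact match; a larger value accepts
    approximate occurrences (assumed threshold, for illustration).
    """
    return hamming(a, b) <= max_distance


def is_dephase_co_change(a: List[int], b: List[int], shift: int,
                         max_distance: int = 0) -> bool:
    """True if b repeats a's changes `shift` change periods later."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    if not 0 < shift < len(a):
        raise ValueError("shift must be between 1 and len(a) - 1")
    # Align a's periods 0..n-shift-1 with b's periods shift..n-1.
    return hamming(a[:-shift], b[shift:]) <= max_distance


# Hypothetical change profiles over eight change periods.
parser = [1, 0, 1, 1, 0, 0, 1, 0]
lexer = [1, 0, 1, 1, 0, 0, 1, 0]   # identical to parser
tests = [1, 0, 1, 0, 0, 0, 1, 0]   # differs from parser in one period
docs = [0, 1, 0, 1, 1, 0, 0, 1]    # parser shifted by one period

print(is_macro_co_change(parser, lexer))
print(is_macro_co_change(parser, tests, max_distance=1))
print(is_dephase_co_change(parser, docs, shift=1))
```

In this toy encoding, `parser` and `lexer` form an exact macro co-change, `parser` and `tests` only an approximate one, and `docs` follows `parser` with a constant shift of one period, which is the situation the Dephase pattern describes.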

Similar resources

Analysing Artefacts Dependencies to Evolving Software Systems

Program maintenance accounts for the largest part of the costs of any program. During maintenance activities, developers implement changes (sometimes simultaneously) on artefacts to fix bugs and to implement new requirements. Thus, developers need knowledge to identify hidden dependencies among program artefacts and to detect correlated artefacts. As programs evolve, their designs become more co...

Mining evolutionary dependencies from web-localization repositories

An approach to mining repositories of web-based user documentation for patterns of evolutionary change in the context of internationalization and localization is presented. Localized web documents that are frequently co-changed (i.e., an evolutionary dependency) during the natural language translation process are uncovered to support the future evolution of the system. A sequential-pattern mini...

Mining Software Repositories for Software Change Impact Analysis: A Case Study

Data mining algorithms have recently been applied to software repositories to help with the maintenance of evolving software systems. In the past, information about which classes changed together, obtained by mining software repositories, was used to guide future changes. We use this information to measure the possible impacts of a proposed change. In this paper we propose and compare two approac...

Mining API Usage Patterns by Applying Method Categorization to Improve Code Completion

Developers often face difficulties while using APIs. API usage patterns, which are extracted from source code stored in software repositories, can help them use APIs efficiently. Previous approaches have mined repositories to extract API usage patterns by simply applying data mining techniques to the collection of method invocations of API objects. In these approaches, respective functional ...

A Proposed Data Mining Methodology and its Application to Industrial Procedures

Data mining is the process of discovering correlations, patterns, trends or relationships by searching through a large amount of data stored in repositories, corporate databases, and data warehouses. Industrial procedures with the help of engineers, managers, and other specialists, comprise a broad field and have many tools and techniques in their problem-solving arsenal. The purpose of this st...

Journal:
  • Journal of Software: Evolution and Process

Volume 26

Publication date: 2014