Locating Software Errors à la Google

A large part of software development is fixing errors in order to improve software quality and customer satisfaction.

We developed an approach (and a proof-of-concept tool for Java) that suggests the 10 files most likely to need changing to fix an error. This could help to:

  • reduce time and cost to correct errors, by automatically adding the suggestions to the error report;
  • build confidence and improve productivity of new or outsourced staff not familiar with the codebase;
  • assign the work to a developer, based on who worked most or last on the suggested files.
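The last point above can be illustrated with a minimal sketch. It assumes that a change history is available from version control as a list of (author, path, timestamp) records; the names `Change` and `suggest_assignee` and the tie-breaking rule are hypothetical and not part of our tool. The sketch picks the developer who changed the suggested files most often and, on a tie, the one who touched them most recently.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Optional


class Change(NamedTuple):
    """One hypothetical version-control record: who changed which file, and when."""
    author: str
    path: str
    timestamp: int  # e.g. seconds since the epoch


def suggest_assignee(changes: Iterable[Change],
                     suggested_files: Iterable[str]) -> Optional[str]:
    """Return the author with the most changes to the suggested files.

    Ties are broken in favour of the author with the most recent change.
    Returns None if nobody has changed any of the suggested files.
    """
    suggested = set(suggested_files)
    relevant = [c for c in changes if c.path in suggested]
    if not relevant:
        return None

    counts = Counter(c.author for c in relevant)
    last_touch: Dict[str, int] = {}
    for c in relevant:
        last_touch[c.author] = max(last_touch.get(c.author, c.timestamp), c.timestamp)

    return max(counts, key=lambda author: (counts[author], last_touch[author]))


# Made-up history, for illustration only.
history = [
    Change("alice", "A.java", 100),
    Change("bob", "A.java", 200),
    Change("bob", "B.java", 300),
    Change("carol", "Z.java", 400),
]
print(suggest_assignee(history, ["A.java", "B.java"]))  # bob
```

Other policies (for example, weighting recent changes more heavily) would fit the same interface.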

Our approach is fast, lightweight and simple. Unlike other approaches, it doesn’t require project history, and yet it outperforms them: for 8 open source projects, we locate on average 77% of their errors, i.e. we list at least one faulty file among the 10 suggested files.

We are looking for academic and industrial collaborations to:

  • Try out the tool on industrial code.
  • Improve the accuracy of the results.
  • Extend the tool to programming languages other than Java.
  • Integrate with professional workflows and tools, e.g. bug tracking systems, continuous integration tools, IDEs.

Locating Bugs without Looking Back, MSR 2016. Paper: http://oro.open.ac.uk/45654.

On average, for 77% of defects, one or more affected files are ranked in the top 10

Our tool scores each file against a given defect description and then ranks the files in descending order of score. The aim is for at least one of the files affected by the defect to be among the top-ranked ones, so that it can serve as an entry point to navigate the code and find the other affected files. In particular, it succeeds in placing an affected file among the top-1, top-5 and top-10 files for 48%, 70% and 77% of bug reports, respectively, on average.
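To make the top-k measure concrete, here is a minimal sketch of how ranking and hit counting could be computed once each file has a score. It does not show how our tool computes the scores; the function names, the alphabetical tie-breaking and the example data are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def rank_files(scores: Dict[str, float]) -> List[str]:
    """Return file names in descending order of score (ties broken by name)."""
    return sorted(scores, key=lambda name: (-scores[name], name))


def top_k_hit(ranked: Sequence[str], affected: Set[str], k: int) -> bool:
    """True if at least one affected file is among the first k ranked files."""
    return any(name in affected for name in ranked[:k])


def top_k_rate(reports: Sequence[Tuple[Dict[str, float], Set[str]]], k: int) -> float:
    """Fraction of defect reports with at least one affected file in the top k."""
    if not reports:
        raise ValueError("at least one report is required")
    hits = sum(top_k_hit(rank_files(scores), affected, k)
               for scores, affected in reports)
    return hits / len(reports)


# Two made-up defect reports: (score per file, set of files actually affected).
reports = [
    ({"A.java": 0.9, "B.java": 0.4, "C.java": 0.1}, {"B.java"}),
    ({"A.java": 0.2, "B.java": 0.7, "C.java": 0.5}, {"B.java", "C.java"}),
]
print(top_k_rate(reports, 1))  # 0.5
print(top_k_rate(reports, 2))  # 1.0
```

In the first report the affected file is ranked second, so it counts as a hit for top-2 but not for top-1.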

We compared our approach to five other state-of-the-art tools, using their own five metrics on their own six case studies. Out of the 30 performance indicators, we match 2 and improve on 28. On average, we place one or more affected files among the top 10 ranked files for 77% of the bug reports.

We also improved, in most cases substantially, the mean reciprocal rank value for all six applications evaluated, thereby reducing the number of files to inspect before finding a relevant file.
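For readers unfamiliar with the metric, the following sketch shows the standard definition of mean reciprocal rank: for each report, take 1 divided by the position of the first affected file in the ranking, then average over all reports. The function names and example data are hypothetical, and scoring a report with no affected file in the ranking as 0 is an assumption of this sketch.

```python
from typing import Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], affected: Set[str]) -> float:
    """1 / position of the first affected file, or 0.0 if none is ranked."""
    for position, name in enumerate(ranked, start=1):
        if name in affected:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(results: Sequence[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over all (ranked files, affected files) pairs."""
    if not results:
        raise ValueError("at least one result is required")
    return sum(reciprocal_rank(ranked, affected)
               for ranked, affected in results) / len(results)


# Two made-up reports: first affected file at position 2 and at position 1.
results = [
    (["A.java", "B.java", "C.java"], {"B.java"}),
    (["B.java", "C.java", "A.java"], {"B.java", "C.java"}),
]
print(mean_reciprocal_rank(results))  # 0.75
```

A higher value means that, on average, fewer files have to be inspected before the first relevant one is reached.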

The complete results of our study are available for download.