Locating Software Errors à la WebSearchEngine

A large part of software development is fixing errors to improve software quality and customer satisfaction. Developers spend 60-80% of their time searching for the relevant code files to fix.

In the context of doctoral research at the Open University in London, we invented a completely new search algorithm. It efficiently suggests the files most likely to need fixing for a given error, which helps to:

  • reduce the time and cost of correcting errors, by automatically adding the suggestions to the error report;
  • build confidence and improve the productivity of new or outsourced staff who are not familiar with the codebase.

Our approach is fast, lightweight and simple, yet it outperforms other approaches at listing at least one faulty file among the top 10 suggestions.
Average hit rates:

  • 1 out of 1: 44%
  • 1 out of 5: 69%
  • 1 out of 10: 76%

We are looking for industrial collaboration partners.

  • Our tool is the only one commercially available in the industry.
  • Extendable to other programming languages, besides Java.
  • Enabled for professional workflows and tools, e.g. bug tracking systems, continuous integration tools, IDEs.

Locating Bugs without Looking Back, MSR 2016. Paper: http://oro.open.ac.uk/45654.

Given a software error report, which source code files need to be changed?

Our tool scores each file against a given software error description and then ranks the files in descending order of score. The aim is for at least one of the files affected by the defect to be among the top-ranked ones, so that it can serve as an entry point to navigate the code and find the other affected files. In particular, it succeeds in placing an affected file among the top-1, top-5 and top-10 files for 48%, 70% and 77% of CRs, on average.
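The score-then-rank idea can be illustrated with a minimal sketch. The scoring function below is an assumption made purely for illustration (a count of how often the report's terms occur in a file); it is not the scoring used by our tool, and the names `tokenize`, `score_file` and `rank_files`, as well as the sample files, are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_file(report_terms, file_text):
    """Illustrative score (an assumption, not the tool's real scoring):
    how often the report's distinct terms occur in the file."""
    counts = Counter(tokenize(file_text))
    return sum(counts[term] for term in set(report_terms))


def rank_files(report, files):
    """Return (path, score) pairs in descending order of score.
    Ties are broken alphabetically by path to keep the order stable."""
    report_terms = tokenize(report)
    scored = [(path, score_file(report_terms, text)) for path, text in files.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical miniature codebase: path -> file content.
files = {
    "Parser.java": "class Parser { void parse(String input) { /* parse tokens */ } }",
    "Lexer.java": "class Lexer { Token next() { return null; } }",
    "Main.java": "class Main { public static void main(String[] args) {} }",
}

ranking = rank_files("Parser crashes when parse is called with empty input", files)
for path, score in ranking:
    print(path, score)
```

In this toy example only `Parser.java` shares terms with the report, so it is ranked first and would serve as the entry point for navigating the code.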

We compared our approach to five other state-of-the-art tools, using their own five metrics on their own six case studies. Out of the 30 performance indicators, we match 2 and improve 28. On average, for 77% of the bug reports we place one or more affected files in the top-10 ranked files.

We also improved, in most cases substantially, the mean reciprocal rank value for all six applications evaluated, thereby reducing the number of files to inspect before finding a relevant file.
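For readers unfamiliar with the two measures mentioned above, the following sketch shows how a top-N hit rate and the mean reciprocal rank are commonly computed. It assumes the usual definitions (the reciprocal rank of a report is 1 divided by the rank of its first affected file, or 0 if none is ranked); the function names and the sample data are hypothetical and do not come from our study.

```python
def first_hit_rank(ranked_files, affected_files):
    """Return the 1-based rank of the first affected file, or None if none is ranked."""
    for rank, path in enumerate(ranked_files, start=1):
        if path in affected_files:
            return rank
    return None


def top_n_hit_rate(results, n):
    """Fraction of reports with at least one affected file among the top n."""
    hits = 0
    for ranked_files, affected_files in results:
        rank = first_hit_rank(ranked_files, affected_files)
        if rank is not None and rank <= n:
            hits += 1
    return hits / len(results)


def mean_reciprocal_rank(results):
    """Average of 1/rank of the first affected file; a report with no hit contributes 0."""
    total = 0.0
    for ranked_files, affected_files in results:
        rank = first_hit_rank(ranked_files, affected_files)
        if rank is not None:
            total += 1.0 / rank
    return total / len(results)


# Hypothetical data: one (ranked file list, set of affected files) pair per report.
ranked = ["A.java", "B.java", "C.java", "D.java"]
results = [
    (ranked, {"A.java"}),            # first affected file at rank 1
    (ranked, {"B.java", "D.java"}),  # first affected file at rank 2
    (ranked, {"D.java"}),            # first affected file at rank 4
    (ranked, {"E.java"}),            # affected file not ranked at all
]

print(top_n_hit_rate(results, 1))
print(top_n_hit_rate(results, 10))
print(mean_reciprocal_rank(results))
```

A higher mean reciprocal rank means that the first relevant file tends to appear nearer the top of the list, so fewer files have to be inspected before one is found.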
