Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, where is it located in the source code files? Information retrieval (IR) approaches see a bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code.
However, most state-of-the-art IR approaches rely on project history, in particular previously fixed bugs and previous versions of the source code. We present a novel approach that directly scores each current file against the given report, and therefore requires neither past code nor past reports. The scoring is based on heuristics identified through manual inspection of a small set of bug reports.
We compared our approach to five others, using their own five metrics on their own six open-source projects. Out of 30 performance indicators, we improve on 28. For example, on average we find one or more affected files in the top 10 ranked files for 77% of the bug reports. These results show the applicability of our approach to software projects without history.
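To make the "top 10" indicator concrete, the following is a minimal sketch of how such a top-N measure could be computed. It is not taken from the evaluation itself: the function name `top_n_hit_rate`, the bug identifiers, and the file names are all hypothetical, and the sketch assumes each report comes with a ranked list of files and a known set of affected files.

```python
from typing import Dict, List, Set


def top_n_hit_rate(
    rankings: Dict[str, List[str]],
    affected: Dict[str, Set[str]],
    n: int = 10,
) -> float:
    """Fraction of bug reports with at least one affected file
    among the first n files of their ranked list."""
    if not rankings:
        return 0.0
    hits = sum(
        1
        for report_id, ranked_files in rankings.items()
        # A report counts as a hit if any affected file is in the top n.
        if affected.get(report_id, set()) & set(ranked_files[:n])
    )
    return hits / len(rankings)


# Hypothetical data: two bug reports, each with a ranked list of files.
rankings = {
    "BUG-1": ["Parser.java", "Lexer.java", "Ast.java"],
    "BUG-2": ["Cache.java", "Store.java", "Index.java"],
}
# Hypothetical ground truth: the files actually changed to fix each bug.
affected = {
    "BUG-1": {"Lexer.java"},
    "BUG-2": {"Writer.java"},
}

print(top_n_hit_rate(rankings, affected, n=2))  # 0.5
```

In this toy data only BUG-1 has an affected file within its first two ranked files, so the hit rate at N=2 is one report out of two.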
Improving Information Retrieval Bug Localisation Using Contextual Heuristics. Dilshener, Tezcan (2017). PhD thesis, The Open University
Improving information retrieval-based concept location using contextual relationships. T. Dilshener (2012), In 2012 34th International Conference on Software Engineering (ICSE), pp. 1499–1502.