Improving Information Retrieval Bug Localisation Using Contextual Relations

The project artefacts, i.e. the domain concepts, user guide and dataset, for each application used in evaluating our tool can be obtained from the links listed here. Each dataset includes the source code files and closed bug reports, which contain references to the code files changed to resolve the reported bug.

Domain: Aspect Oriented Programming (AOP) – Concepts
Application used: AspectJ – User Guide, Dataset

Domain: Integrated Development Environment (IDE) – Concepts
Application used: Eclipse – User Guide, Dataset

Domain: Graphical User Interface (GUI) – Concepts
Application used: SWT – User Guide, Dataset

Domain: Barcode imaging/scanning – Concepts
Application used: ZXing – User Guide, Dataset

Domain: Unified Modeling Language (UML) – Concepts
Application used: ArgoUML – User Guide, Dataset

Domain: Servlet container – Concepts
Application used: Tomcat – User Guide, Dataset (Contact author)

Domain: Basel II credit and risk management – Concepts
Application used: Pillar1 – User Guide, Dataset


Disclaimer: The provided links lead to content outside our control.

Locating Software Errors without Looking Back

Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, where is it located in the source code files? Information retrieval (IR) approaches see a bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code.

However, most state-of-the-art IR approaches rely on project history, in particular previously fixed bugs and previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus requiring no past code or reports. The scoring is based on heuristics identified through manual inspection of a small set of bug reports.
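The query/document framing above can be illustrated with a small example. The sketch below is not our tool's actual heuristics — it is a generic TF-IDF/cosine baseline showing how a bug report (the query) might be scored against source files (the documents); all names and the sample data are illustrative.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens, splitting camelCase identifiers (e.g. FileParser -> file, parser)."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        tokens.extend(p.lower() for p in parts)
    return tokens

def tfidf_scores(query, documents):
    """Rank documents (a dict of name -> text) by cosine similarity to the query."""
    doc_tokens = {name: tokenize(text) for name, text in documents.items()}
    n = len(documents)
    # Document frequency of each term, for the IDF weight.
    df = Counter()
    for toks in doc_tokens.values():
        for t in set(toks):
            df[t] += 1

    def vec(tokens):
        # Log-scaled term frequency times smoothed inverse document frequency.
        tf = Counter(tokens)
        return {t: (1 + math.log(c)) * math.log((n + 1) / (df[t] + 1))
                for t, c in tf.items()}

    qv = vec(tokenize(query))

    def cosine(dv):
        dot = sum(qv.get(t, 0.0) * w for t, w in dv.items())
        nq = math.sqrt(sum(w * w for w in qv.values()))
        nd = math.sqrt(sum(w * w for w in dv.values()))
        return dot / (nq * nd) if nq and nd else 0.0

    scores = {name: cosine(vec(toks)) for name, toks in doc_tokens.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The camelCase splitting step matters for source code: a report mentioning "null pointer" can then match an identifier such as `NullPointerException` even though the raw strings differ.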

We compared our approach to five others, using their own five metrics on their own six open-source projects. Out of 30 performance indicators, we improve on 28. For example, on average we find one or more affected files among the top 10 ranked files for 77% of the bug reports. These results show our approach is applicable to software projects without a history.

Improving Information Retrieval Bug Localisation Using Contextual Heuristics. T. Dilshener (2017). PhD thesis, The Open University.

Locating bugs without looking back (journal version). T. Dilshener, M. Wermelinger and Y. Yu (2017), Automated Software Engineering. online pdf

Locating bugs without looking back. T. Dilshener, M. Wermelinger and Y. Yu (2016), In Proceedings of the 13th International Conference on Mining Software Repositories (MSR ’16), Austin, Texas, pp. 286–290. ACM, New York, NY, USA. presentation poster pdf

Improving Bug Localisation Using Lexical Information and Call Relations. T. Dilshener, M. Wermelinger and Y. Yu (2014). presentation poster pdf

Leveraging Domain Vocabulary across Artefacts: a Comparison of Conceptually Related Applications. T. Dilshener, M. Wermelinger and Y. Yu (2013). presentation poster pdf

Improving information retrieval-based concept location using contextual relationships. T. Dilshener (2012), In Proceedings of the 34th International Conference on Software Engineering (ICSE 2012), pp. 1499–1502. presentation poster pdf

Relating developers’ concepts and artefact vocabulary in a financial software module. T. Dilshener and M. Wermelinger (2011), In Proceedings of the 27th IEEE International Conference on Software Maintenance (ICSM 2011), pp. 412–417. presentation pdf

Google Scholar profile: click here