Locating Software Errors à la WebSearchEngine

A large part of software development is fixing errors to improve software quality and customer satisfaction. Developers spend 60-80% of their time searching for the relevant code files to fix.

In the context of doctoral research at the Open University in London, we invented a completely new search algorithm. It efficiently suggests the files most likely to need fixing for a given error, which helps to:

  • reduce the time and cost of correcting errors, by automatically adding the suggestions to the error report;
  • build confidence and improve the productivity of new or outsourced staff who are not familiar with the codebase.

Our approach is fast, lightweight and simple, yet it outperforms other approaches at listing at least one faulty file among the top 10 suggestions.
Average hit rates (at least one faulty file among the top N suggestions):

  • top 1: 44%
  • top 5: 69%
  • top 10: 76%

We are looking for industrial collaboration partners.

  • Our tool is ready for use with industrial code.
  • It is extendable to programming languages other than Java.
  • It is enabled for professional workflows and tools, e.g. bug tracking systems, continuous integration tools and IDEs.

Locating Bugs without Looking Back, MSR 2016. Paper: http://oro.open.ac.uk/45654.

Given a software error report, which source code files need to be changed?

Our tool scores each file against a given defect description and then ranks the files in descending order of score. The aim is for at least one of the files affected by the defect to be among the top-ranked ones, so that it can serve as an entry point for navigating the code and finding the other affected files. In particular, it places an affected file among the top-1, top-5 and top-10 files for 48%, 70% and 77% of change requests (CRs), on average.
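The top-N figures can be computed as in the following minimal sketch. It is not the tool's implementation: the function name `top_n_hit_rate`, the report ids and the file names are assumptions made up for illustration.

```python
def top_n_hit_rate(rankings, affected, n):
    """Fraction of reports with at least one affected file in the top n.

    rankings: dict mapping a report id to its files, best-ranked first.
    affected: dict mapping a report id to the set of files changed to fix it.
    """
    hits = sum(
        1
        for report, ranked in rankings.items()
        if any(f in affected[report] for f in ranked[:n])
    )
    return hits / len(rankings)


# Hypothetical data: two reports, three candidate files each.
rankings = {
    "BR-1": ["A.java", "B.java", "C.java"],
    "BR-2": ["C.java", "A.java", "B.java"],
}
affected = {"BR-1": {"A.java"}, "BR-2": {"B.java"}}

print(top_n_hit_rate(rankings, affected, 1))  # 0.5
print(top_n_hit_rate(rankings, affected, 3))  # 1.0
```

A report counts as a hit as soon as one affected file appears in the top N, which matches the goal of giving the developer a single entry point into the code.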

We compared our approach to five other state-of-the-art tools, using their own five metrics on their own six case studies. Out of the 30 performance indicators, we match 2 and improve 28. On average, for 77% of the bug reports we place one or more affected files in the top-10 ranked files.

We also improved, in most cases substantially, the mean reciprocal rank value for all six applications evaluated, thereby reducing the number of files to inspect before finding a relevant file.
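For readers unfamiliar with the metric, the mean reciprocal rank (MRR) averages, over all bug reports, the reciprocal of the rank of the first affected file. The sketch below uses the standard definition; the function name and the toy data are assumptions, not taken from the study.

```python
def mean_reciprocal_rank(rankings, affected):
    """Average of 1/rank of the first affected file, over all reports.

    A report whose ranking contains no affected file contributes 0.
    """
    total = 0.0
    for report, ranked in rankings.items():
        for position, file in enumerate(ranked, start=1):
            if file in affected[report]:
                total += 1.0 / position
                break
    return total / len(rankings)


# Hypothetical data: the first affected file is at rank 1 for BR-1
# and at rank 2 for BR-2.
rankings = {
    "BR-1": ["A.java", "B.java", "C.java"],
    "BR-2": ["C.java", "B.java", "A.java"],
}
affected = {"BR-1": {"A.java"}, "BR-2": {"B.java"}}

print(mean_reciprocal_rank(rankings, affected))  # 0.75
```

The closer the value is to 1, the fewer files a developer has to inspect on average before reaching a relevant one.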

The results of our study can be obtained here: results

Software error localisation tool

On average, ConCodeSe ranks one or more affected files in the top 10 for 77% of bug reports.

Given a bug report (BR) and a source file, our approach computes two kinds of scores for the file: a probabilistic score, given by the vector space model (VSM), and a lexical similarity score. Each kind of score is obtained with four search types, each using a different set of terms indexed from the BR and the file.

For each of the 8 combinations of scoring, all files are ranked in descending order. Then, for each file we take the best of its 8 ranks.
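The "best of its ranks" step can be sketched as follows. This is an illustration under stated assumptions, not ConCodeSe's code: the function names, the scores and the use of only two score tables instead of eight are made up, and ties are broken by input order because the text does not specify a tie-breaking rule.

```python
def rank_files(scores):
    """Map each file to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {file: rank for rank, file in enumerate(ordered, start=1)}


def best_ranks(score_tables):
    """For each file, keep the best (smallest) rank over all score tables."""
    rankings = [rank_files(scores) for scores in score_tables]
    files = rankings[0].keys()
    return {file: min(r[file] for r in rankings) for file in files}


# Hypothetical scores from two scoring combinations (the text uses eight).
score_tables = [
    {"A.java": 0.9, "B.java": 0.4, "C.java": 0.1},
    {"A.java": 0.2, "B.java": 0.3, "C.java": 0.8},
]

print(best_ranks(score_tables))
# {'A.java': 1, 'B.java': 2, 'C.java': 1}
```

Taking the minimum rank means a file only needs to do well under one combination to surface near the top, so a file can still be found when, for example, only its name or only its comments match the report.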

Whilst other localisation algorithms take a “one size fits all” approach, we treat each BR and file individually, using the summary, stack trace, stemming, comments and file names only when available and relevant, i.e. when they improve the ranking.

Download: Contextual Code Search Engine

User guide: pdf

Quick start guide: pdf

Improving Information Retrieval Bug Localisation Using Contextual Relations

The project artefacts, i.e. domain concepts, user guide and dataset, for each application used to evaluate our tool can be obtained from the links listed here. The datasets include source code files and closed bug reports, which contain references to the code files changed to resolve the reported bug.

Domain: Aspect Oriented Programming (AOP) – Concepts
Application used: AspectJ – User Guide, Dataset

Domain: Integrated Development Environment (IDE) – Concepts
Application used: Eclipse – User Guide, Dataset

Domain: Graphical User Interface (GUI) – Concepts
Application used: SWT – User Guide, Dataset

Domain: Barcode imaging/scanning – Concepts
Application used: ZXing – User Guide, Dataset

Domain: Unified Modeling Language (UML) – Concepts
Application used: ArgoUML – User Guide, Dataset

Domain: Servlet container – Concepts
Application used: Tomcat – User Guide, Dataset (Contact author)

Domain: Basel2 credit and risk management – Concepts
Application used: Pillar1 – User Guide, Dataset

 

Disclaimer: The provided links contain information outside our control.

Locating Software Errors without Looking Back

Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, where is it located in the source code files? Information retrieval (IR) approaches see a bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code.

However, most state-of-the-art IR approaches rely on project history, in particular previously fixed bugs and previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus not requiring past code and reports. The scoring is based on heuristics identified through manual inspection of a small set of bug reports.

We compared our approach to five others, using their own five metrics on their own six open source projects. Out of 30 performance indicators, we improve 28. For example, on average we find one or more affected files in the top 10 ranked files for 77% of the bug reports. These results show the applicability of our approach to software projects without history.

Improving Information Retrieval Bug Localisation Using Contextual Heuristics. Dilshener, Tezcan (2017). PhD thesis, The Open University

Locating bugs without looking back (journal version). Dilshener, T., Wermelinger, M. & Yu, Y. Autom Softw Eng (2017). https://doi.org/10.1007/s10515-017-0226-1, online pdf

Locating bugs without looking back. T. Dilshener; M. Wermelinger; and Y. Yu (2016), In Proceedings of the 13th International Conference on Mining Software Repositories, Austin, Texas, MSR ’16, pp. 286–290. ACM, New York, NY, USA. presentation poster pdf

Improving Bug Localisation Using Lexical Information and Call Relations. T. Dilshener; M. Wermelinger; and Y. Yu (2014) presentation poster pdf

Leveraging Domain Vocabulary across Artefacts: a Comparison of Conceptually Related Applications. T. Dilshener; M. Wermelinger; and Y. Yu (2013) presentation poster pdf

Improving information retrieval-based concept location using contextual relationships. T. Dilshener (2012),  In 2012 34th International Conference on Software Engineering (ICSE), pp. 1499–1502. presentation poster pdf

Relating developers’ concepts and artefact vocabulary in a financial software module. T. Dilshener and M. Wermelinger (2011), In Software Maintenance (ICSM), 2011 27th IEEE International Conference on, pp. 412–417. presentation pdf


GDPR – Data privacy policy

Privacy Policy

We are very delighted that you have shown interest in our enterprise. Data protection is of a particularly high priority for the management of the Dilshener Consulting. The use of the Internet pages of the Dilshener Consulting is possible without any indication of personal data; however, if a data subject wants to use special enterprise services via our website, processing of personal data could become necessary. If the processing of personal data is necessary and there is no statutory basis for such processing, we generally obtain consent from the data subject.

The processing of personal data, such as the name, address, e-mail address, or telephone number of a data subject shall always be in line with the General Data Protection Regulation (GDPR), and in accordance with the country-specific data protection regulations applicable to the Dilshener Consulting. By means of this data protection declaration, our enterprise would like to inform the general public of the nature, scope, and purpose of the personal data we collect, use and process. Furthermore, data subjects are informed, by means of this data protection declaration, of the rights to which they are entitled.

As the controller, the Dilshener Consulting has implemented numerous technical and organizational measures to ensure the most complete protection of personal data processed through this website. However, Internet-based data transmissions may in principle have security gaps, so absolute protection may not be guaranteed. For this reason, every data subject is free to transfer personal data to us via alternative means, e.g. by telephone.

1. Definitions

The data protection declaration of the Dilshener Consulting is based on the terms used by the European legislator for the adoption of the General Data Protection Regulation (GDPR). Our data protection declaration should be legible and understandable for the general public, as well as our customers and business partners. To ensure this, we would like to first explain the terminology used.

In this data protection declaration, we use, inter alia, the following terms:

  • a) Personal data

    Personal data means any information relating to an identified or identifiable natural person (“data subject”). An identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

  • b) Data subject

    Data subject is any identified or identifiable natural person, whose personal data is processed by the controller responsible for the processing.

  • c) Processing

    Processing is any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction.

  • d) Restriction of processing

    Restriction of processing is the marking of stored personal data with the aim of limiting their processing in the future.

  • e) Profiling

    Profiling means any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person’s performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements.

  • f) Pseudonymisation

    Pseudonymisation is the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.

  • g) Controller or controller responsible for the processing

    Controller or controller responsible for the processing is the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data; where the purposes and means of such processing are determined by Union or Member State law, the controller or the specific criteria for its nomination may be provided for by Union or Member State law.

  • h) Processor

    Processor is a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller.

  • i) Recipient

    Recipient is a natural or legal person, public authority, agency or another body, to which the personal data are disclosed, whether a third party or not. However, public authorities which may receive personal data in the framework of a particular inquiry in accordance with Union or Member State law shall not be regarded as recipients; the processing of those data by those public authorities shall be in compliance with the applicable data protection rules according to the purposes of the processing.

  • j) Third party

    Third party is a natural or legal person, public authority, agency or body other than the data subject, controller, processor and persons who, under the direct authority of the controller or processor, are authorised to process personal data.

  • k) Consent

    Consent of the data subject is any freely given, specific, informed and unambiguous indication of the data subject’s wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her.

2. Name and Address of the controller

Controller for the purposes of the General Data Protection Regulation (GDPR), other data protection laws applicable in Member states of the European Union and other provisions related to data protection is:

Dilshener Consulting

Leostr. 6

81375 München

Germany

Phone: 089-71039186

Email: info@concodese.com

Website: www.concodese.com

3. Collection of general data and information

The website of the Dilshener Consulting collects a series of general data and information when a data subject or automated system calls up the website. This general data and information are stored in the server log files. The following may be collected: (1) the browser types and versions used, (2) the operating system used by the accessing system, (3) the website from which an accessing system reaches our website (so-called referrers), (4) the sub-websites, (5) the date and time of access to the Internet site, (6) an Internet protocol address (IP address), (7) the Internet service provider of the accessing system, and (8) any other similar data and information that may be used in the event of attacks on our information technology systems.

When using these general data and information, the Dilshener Consulting does not draw any conclusions about the data subject. Rather, this information is needed to (1) deliver the content of our website correctly, (2) optimize the content of our website as well as its advertisement, (3) ensure the long-term viability of our information technology systems and website technology, and (4) provide law enforcement authorities with the information necessary for criminal prosecution in case of a cyber-attack. Therefore, the Dilshener Consulting analyzes anonymously collected data and information statistically, with the aim of increasing the data protection and data security of our enterprise, and to ensure an optimal level of protection for the personal data we process. The anonymous data of the server log files are stored separately from all personal data provided by a data subject.

4. Comments function in the blog on the website

The Dilshener Consulting offers users the possibility to leave individual comments on individual blog contributions on a blog, which is on the website of the controller. A blog is a web-based, publicly-accessible portal, through which one or more people called bloggers or web-bloggers may post articles or write down thoughts in so-called blogposts. Blogposts may usually be commented on by third parties.

If a data subject leaves a comment on the blog published on this website, the comments made by the data subject are also stored and published, as well as information on the date of the commentary and on the user name (pseudonym) chosen by the data subject. In addition, the IP address assigned by the Internet service provider (ISP) to the data subject is also logged. This storage of the IP address takes place for security reasons, and in case the data subject violates the rights of third parties, or posts illegal content through a given comment. The storage of these personal data is, therefore, in the data controller’s own interest, so that the controller can exculpate itself in the event of an infringement. These collected personal data will not be passed to third parties, unless such a transfer is required by law or serves the aim of the defense of the data controller.

5. Routine erasure and blocking of personal data

The data controller shall process and store the personal data of the data subject only for the period necessary to achieve the purpose of storage, or as far as this is granted by the European legislator or other legislators in laws or regulations to which the controller is subject.

If the storage purpose is not applicable, or if a storage period prescribed by the European legislator or another competent legislator expires, the personal data are routinely blocked or erased in accordance with legal requirements.

6. Rights of the data subject

  • a) Right of confirmation

    Each data subject shall have the right granted by the European legislator to obtain from the controller the confirmation as to whether or not personal data concerning him or her are being processed. If a data subject wishes to avail himself or herself of this right of confirmation, he or she may, at any time, contact any employee of the controller.

  • b) Right of access

    Each data subject shall have the right granted by the European legislator to obtain from the controller free information about his or her personal data stored at any time and a copy of this information. Furthermore, the European directives and regulations grant the data subject access to the following information:

    • the purposes of the processing;
    • the categories of personal data concerned;
    • the recipients or categories of recipients to whom the personal data have been or will be disclosed, in particular recipients in third countries or international organisations;
    • where possible, the envisaged period for which the personal data will be stored, or, if not possible, the criteria used to determine that period;
    • the existence of the right to request from the controller rectification or erasure of personal data, or restriction of processing of personal data concerning the data subject, or to object to such processing;
    • the existence of the right to lodge a complaint with a supervisory authority;
    • where the personal data are not collected from the data subject, any available information as to their source;
    • the existence of automated decision-making, including profiling, referred to in Article 22(1) and (4) of the GDPR and, at least in those cases, meaningful information about the logic involved, as well as the significance and envisaged consequences of such processing for the data subject.

    Furthermore, the data subject shall have a right to obtain information as to whether personal data are transferred to a third country or to an international organisation. Where this is the case, the data subject shall have the right to be informed of the appropriate safeguards relating to the transfer.

    If a data subject wishes to avail himself or herself of this right of access, he or she may, at any time, contact any employee of the controller.

  • c) Right to rectification

    Each data subject shall have the right granted by the European legislator to obtain from the controller without undue delay the rectification of inaccurate personal data concerning him or her. Taking into account the purposes of the processing, the data subject shall have the right to have incomplete personal data completed, including by means of providing a supplementary statement.

    If a data subject wishes to exercise this right to rectification, he or she may, at any time, contact any employee of the controller.

  • d) Right to erasure (Right to be forgotten)

    Each data subject shall have the right granted by the European legislator to obtain from the controller the erasure of personal data concerning him or her without undue delay, and the controller shall have the obligation to erase personal data without undue delay where one of the following grounds applies, as long as the processing is not necessary:

    • The personal data are no longer necessary in relation to the purposes for which they were collected or otherwise processed.
    • The data subject withdraws consent on which the processing is based according to point (a) of Article 6(1) of the GDPR, or point (a) of Article 9(2) of the GDPR, and where there is no other legal ground for the processing.
    • The data subject objects to the processing pursuant to Article 21(1) of the GDPR and there are no overriding legitimate grounds for the processing, or the data subject objects to the processing pursuant to Article 21(2) of the GDPR.
    • The personal data have been unlawfully processed.
    • The personal data must be erased for compliance with a legal obligation in Union or Member State law to which the controller is subject.
    • The personal data have been collected in relation to the offer of information society services referred to in Article 8(1) of the GDPR.

    If one of the aforementioned reasons applies, and a data subject wishes to request the erasure of personal data stored by the Dilshener Consulting, he or she may, at any time, contact any employee of the controller. An employee of Dilshener Consulting shall ensure that the erasure request is complied with immediately.

    Where the controller has made personal data public and is obliged pursuant to Article 17(1) to erase the personal data, the controller, taking account of available technology and the cost of implementation, shall take reasonable steps, including technical measures, to inform other controllers processing the personal data that the data subject has requested erasure by such controllers of any links to, or copy or replication of, those personal data, as far as processing is not required. An employee of the Dilshener Consulting will arrange the necessary measures in individual cases.

  • e) Right of restriction of processing

    Each data subject shall have the right granted by the European legislator to obtain from the controller restriction of processing where one of the following applies:

    • The accuracy of the personal data is contested by the data subject, for a period enabling the controller to verify the accuracy of the personal data.
    • The processing is unlawful and the data subject opposes the erasure of the personal data and requests the restriction of their use instead.
    • The controller no longer needs the personal data for the purposes of the processing, but they are required by the data subject for the establishment, exercise or defence of legal claims.
    • The data subject has objected to processing pursuant to Article 21(1) of the GDPR pending the verification whether the legitimate grounds of the controller override those of the data subject.

    If one of the aforementioned conditions is met, and a data subject wishes to request the restriction of the processing of personal data stored by the Dilshener Consulting, he or she may at any time contact any employee of the controller. The employee of the Dilshener Consulting will arrange the restriction of the processing.

  • f) Right to data portability

    Each data subject shall have the right granted by the European legislator, to receive the personal data concerning him or her, which was provided to a controller, in a structured, commonly used and machine-readable format. He or she shall have the right to transmit those data to another controller without hindrance from the controller to which the personal data have been provided, as long as the processing is based on consent pursuant to point (a) of Article 6(1) of the GDPR or point (a) of Article 9(2) of the GDPR, or on a contract pursuant to point (b) of Article 6(1) of the GDPR, and the processing is carried out by automated means, as long as the processing is not necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.

    Furthermore, in exercising his or her right to data portability pursuant to Article 20(1) of the GDPR, the data subject shall have the right to have personal data transmitted directly from one controller to another, where technically feasible and when doing so does not adversely affect the rights and freedoms of others.

    In order to assert the right to data portability, the data subject may at any time contact any employee of the Dilshener Consulting.

  • g) Right to object

    Each data subject shall have the right granted by the European legislator to object, on grounds relating to his or her particular situation, at any time, to processing of personal data concerning him or her, which is based on point (e) or (f) of Article 6(1) of the GDPR. This also applies to profiling based on these provisions.

    The Dilshener Consulting shall no longer process the personal data in the event of the objection, unless we can demonstrate compelling legitimate grounds for the processing which override the interests, rights and freedoms of the data subject, or for the establishment, exercise or defence of legal claims.

    If the Dilshener Consulting processes personal data for direct marketing purposes, the data subject shall have the right to object at any time to processing of personal data concerning him or her for such marketing. This applies to profiling to the extent that it is related to such direct marketing. If the data subject objects to the Dilshener Consulting to the processing for direct marketing purposes, the Dilshener Consulting will no longer process the personal data for these purposes.

    In addition, the data subject has the right, on grounds relating to his or her particular situation, to object to processing of personal data concerning him or her by the Dilshener Consulting for scientific or historical research purposes, or for statistical purposes pursuant to Article 89(1) of the GDPR, unless the processing is necessary for the performance of a task carried out for reasons of public interest.

    In order to exercise the right to object, the data subject may contact any employee of the Dilshener Consulting. In addition, the data subject is free in the context of the use of information society services, and notwithstanding Directive 2002/58/EC, to use his or her right to object by automated means using technical specifications.

  • h) Automated individual decision-making, including profiling

    Each data subject shall have the right granted by the European legislator not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her, or similarly significantly affects him or her, as long as the decision (1) is not necessary for entering into, or the performance of, a contract between the data subject and a data controller, or (2) is not authorised by Union or Member State law to which the controller is subject and which also lays down suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests, or (3) is not based on the data subject’s explicit consent.

    If the decision (1) is necessary for entering into, or the performance of, a contract between the data subject and a data controller, or (2) it is based on the data subject’s explicit consent, the Dilshener Consulting shall implement suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests, at least the right to obtain human intervention on the part of the controller, to express his or her point of view and contest the decision.

    If the data subject wishes to exercise the rights concerning automated individual decision-making, he or she may, at any time, contact any employee of the Dilshener Consulting.

  • i) Right to withdraw data protection consent

    Each data subject shall have the right granted by the European legislator to withdraw his or her consent to processing of his or her personal data at any time.

    If the data subject wishes to exercise the right to withdraw the consent, he or she may, at any time, contact any employee of the Dilshener Consulting.

7. Data protection for applications and the application procedures

The data controller shall collect and process the personal data of applicants for the purpose of the processing of the application procedure. The processing may also be carried out electronically. This is the case, in particular, if an applicant submits corresponding application documents by e-mail or by means of a web form on the website to the controller. If the data controller concludes an employment contract with an applicant, the submitted data will be stored for the purpose of processing the employment relationship in compliance with legal requirements. If no employment contract is concluded with the applicant by the controller, the application documents shall be automatically erased two months after notification of the refusal decision, provided that no other legitimate interests of the controller are opposed to the erasure. Another legitimate interest in this context is, for example, a burden of proof in a procedure under the General Equal Treatment Act (AGG).

8. Legal basis for the processing

Art. 6(1) lit. a GDPR serves as the legal basis for processing operations for which we obtain consent for a specific processing purpose. If the processing of personal data is necessary for the performance of a contract to which the data subject is party, as is the case, for example, when processing operations are necessary for the supply of goods or to provide any other service, the processing is based on Article 6(1) lit. b GDPR. The same applies to such processing operations which are necessary for carrying out pre-contractual measures, for example in the case of inquiries concerning our products or services. If our company is subject to a legal obligation by which processing of personal data is required, such as for the fulfillment of tax obligations, the processing is based on Art. 6(1) lit. c GDPR.
In rare cases, the processing of personal data may be necessary to protect the vital interests of the data subject or of another natural person. This would be the case, for example, if a visitor were injured in our company and his name, age, health insurance data or other vital information would have to be passed on to a doctor, hospital or other third party. Then the processing would be based on Art. 6(1) lit. d GDPR.
Finally, processing operations could be based on Article 6(1) lit. f GDPR. This legal basis is used for processing operations which are not covered by any of the abovementioned legal grounds, if processing is necessary for the purposes of the legitimate interests pursued by our company or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data. Such processing operations are particularly permissible because they have been specifically mentioned by the European legislator. The legislator considered that a legitimate interest could be assumed if the data subject is a client of the controller (Recital 47 Sentence 2 GDPR).

9. The legitimate interests pursued by the controller or by a third party

Where the processing of personal data is based on Article 6(1) lit. f GDPR our legitimate interest is to carry out our business in favor of the well-being of all our employees and the shareholders.

10. Period for which the personal data will be stored

The criterion used to determine the period of storage of personal data is the respective statutory retention period. After expiration of that period, the corresponding data is routinely deleted, as long as it is no longer necessary for the fulfillment of the contract or the initiation of a contract.

11. Provision of personal data as statutory or contractual requirement; Requirement necessary to enter into a contract; Obligation of the data subject to provide the personal data; possible consequences of failure to provide such data

We clarify that the provision of personal data is partly required by law (e.g. tax regulations) or can also result from contractual provisions (e.g. information on the contractual partner).

Sometimes, in order to conclude a contract, it may be necessary for the data subject to provide us with personal data, which must subsequently be processed by us. The data subject is, for example, obliged to provide us with personal data when our company signs a contract with him or her. The non-provision of the personal data would have the consequence that the contract with the data subject could not be concluded.

Before personal data is provided by the data subject, the data subject must contact any employee. The employee clarifies to the data subject whether the provision of the personal data is required by law or contract or is necessary for the conclusion of the contract, whether there is an obligation to provide the personal data and the consequences of non-provision of the personal data.

12. Existence of automated decision-making

As a responsible company, we do not use automatic decision-making or profiling.

This Privacy Policy has been generated by the Privacy Policy Generator of the DGD – Your External DPO that was developed in cooperation with German Lawyers from WILDE BEUGER SOLMECKE, Cologne.