Multiclass Classification and Prioritisation of Static Analysis Warnings Using Developer-Labelled Industrial Data

Abstract

Automatic static code analysis tools are used to identify code quality issues such as vulnerabilities or performance problems. In practice, the high number of irrelevant warnings produced by such tools is problematic, but it can be addressed by pre-filtering and ranking the warnings before they are shown to the developer. Since ground-truth labelled data is rarely available, existing research tends to construct training data heuristically from unlabelled open-source data by assigning clearly separated binary categories, such as fixed and irrelevant, to the warnings. However, this labelling approach cannot capture subtler distinctions, such as that between warnings that are relevant but not yet fixed and warnings that remain unfixed because they are irrelevant. The Teamscale software developed by CQSE provides a unique opportunity to investigate this concern, since its developers have adopted the practice of meticulously labelling every static analysis warning as either accepted, tolerated, or false-positive. Using this dataset, we adapt previously proposed models to a new multiclass classification task and evaluate both their ability to classify warnings in this setting and their ability to prioritise important warnings. Our experiments show that, in particular, the aforementioned subtle distinction between categories of unresolved warnings is more challenging for the models than binary prediction. Nevertheless, training the models on the multiclass task rather than the binary one yields a statistically significant improvement in the prioritisation of the warnings.
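To illustrate the setup the abstract describes, the following is a minimal sketch, not the paper's actual models: a toy nearest-centroid multiclass classifier over hypothetical two-dimensional warning features, with the three developer labels (accepted, tolerated, false-positive), whose pseudo-probability of the "accepted" class is then used to rank warnings. All feature names and values are invented for illustration.

```python
import math
from collections import defaultdict

LABELS = ["accepted", "tolerated", "false-positive"]

def centroids(samples):
    """samples: list of (feature_vector, label) -> {label: centroid}."""
    sums, counts = defaultdict(list), defaultdict(int)
    for x, y in samples:
        if not sums[y]:
            sums[y] = list(x)
        else:
            sums[y] = [a + b for a, b in zip(sums[y], x)]
        counts[y] += 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def class_scores(cents, x):
    """Softmax over negative centroid distances -> pseudo-probability per label."""
    neg_dist = {y: -math.dist(c, x) for y, c in cents.items()}
    m = max(neg_dist.values())
    exps = {y: math.exp(v - m) for y, v in neg_dist.items()}
    z = sum(exps.values())
    return {y: v / z for y, v in exps.items()}

# Invented toy training data: (severity, normalised warning age) per warning.
train = [
    ((0.9, 0.1), "accepted"),
    ((0.8, 0.2), "accepted"),
    ((0.5, 0.6), "tolerated"),
    ((0.4, 0.7), "tolerated"),
    ((0.1, 0.9), "false-positive"),
    ((0.2, 0.8), "false-positive"),
]
cents = centroids(train)

# Prioritise unseen warnings: highest P(accepted) first.
queue = [(0.85, 0.15), (0.15, 0.85), (0.45, 0.65)]
ranked = sorted(queue, key=lambda x: class_scores(cents, x)["accepted"],
                reverse=True)
```

The prioritisation step is the point of the multiclass formulation: even when the classifier confuses the two unresolved categories (tolerated vs. false-positive), the per-class scores still induce a ranking that can place likely-accepted warnings at the top of the developer's queue.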