

Menzies, T.; Greenwald, J. & Frank, A. Data Mining Static Code Attributes to Learn Defect Predictors. IEEE Transactions on Software Engineering, 33(1), 2-13, 2007

Abstract

The value of using static code attributes to learn defect predictors has been widely debated. Prior work has explored issues like the merits of “McCabes versus Halstead versus lines of code counts” for generating defect predictors. We show here that such debates are irrelevant since how the attributes are used to build predictors is much more important than which particular attributes are used. Also, contrary to prior pessimism, we show that such defect predictors are demonstrably useful and, on the data studied here, yield predictors with a mean probability of detection of 71 percent and mean false alarm rates of 25 percent. These predictors would be useful for prioritizing a resource-bound exploration of code that has yet to be inspected.

Comments

Yann-Gaël Guéhéneuc, 2014/02/12

The authors make the case that, when it comes to defect prediction, “how the attributes are used to build predictors is much more important than which particular attributes are used”. To come to this conclusion, they used 38 attributes and three different learners: OneR, J48, and naïve Bayes. As dataset they used the NASA MDP (Metrics Data Program).
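The paper evaluates its predictors with the probability of detection (pd) and probability of false alarm (pf) cited in the abstract. A minimal sketch of how these two measures fall out of a binary confusion matrix (the counts below are illustrative, chosen to reproduce the paper's mean figures of 71% pd and 25% pf, not taken from its data):

```python
def pd_pf(tp, fn, fp, tn):
    """Compute the two measures used in the paper's evaluation:
    pd = probability of detection (recall on defective modules),
    pf = probability of false alarm (rate of clean modules flagged)."""
    pd = tp / (tp + fn)   # defective modules correctly flagged as defective
    pf = fp / (fp + tn)   # defect-free modules incorrectly flagged
    return pd, pf

# Illustrative counts: 71 of 100 defective modules detected,
# 25 of 100 clean modules falsely flagged.
pd, pf = pd_pf(tp=71, fn=29, fp=25, tn=75)
print(f"pd = {pd:.2f}, pf = {pf:.2f}")  # pd = 0.71, pf = 0.25
```

A good predictor pushes pd up while keeping pf down; the paper's point is that the choice of learner (e.g., naïve Bayes with a log-transform) moves these numbers far more than the choice between McCabe, Halstead, or lines-of-code attributes.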

data_mining_static_code_attributes_to_learn_defect_predictors.1392466433.txt.gz · Last modified: 2019/10/06 20:37 (external edit)