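Joint Academic Seminar of the State Key Laboratory for Novel Software Technology and the State Key Laboratory of Computer Science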

Speaker: Yuming Zhou (周毓明)

Title: Re-examining the potentially confounding effect of class size in the validation of object-oriented metrics on fault-proneness

Abstract: Previous research shows that class size can falsely accentuate the associations between object-oriented (OO) metrics and fault-proneness. Therefore, it is suggested that class size should be controlled as a confounding variable when validating OO metrics on fault-proneness. However, there is a need to re-examine this subject for three reasons. First, the method for identifying the potentially confounding effect of class size in previous research depends on an essentially arbitrarily chosen threshold, which may influence the analysis results. Second, previous research analyzes only a small number of OO metrics and, therefore, it is not clear whether the confounding effect of class size on fault-proneness exists in general. Third, it is not clear how to remove the confounding effect of class size and whether this removal could improve the performance of fault-proneness prediction models. In this paper, we employ a statistical test method and three class size metrics to re-examine the potentially confounding effect of class size on the associations between OO metrics and fault-proneness. The investigated OO metrics include cohesion, coupling, and inheritance metrics. Furthermore, we propose a simple but effective method to remove the confounding effect of class size from OO metrics. Our experimental results, based on four open-source systems, indicate that (1) the confounding effect of class size on the associations between OO metrics and fault-proneness exists in general, regardless of which size metric is used; that (2) the confounding effect of class size generally leads to an overestimate of the associations between OO metrics and fault-proneness; and that (3) after removing the confounding effect of class size, the ability of prediction models to predict the rank order of modules according to their fault-proneness could be significantly improved.
These results strongly suggest that the confounding effect of class size should be removed when validating OO metrics on fault-proneness.
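
The abstract does not spell out how the size effect is removed. As a rough illustration only, one generic way to take a confounder out of a metric is residualization: regress the metric on class size and keep the residuals, which are by construction linearly uncorrelated with size. The sketch below assumes this technique; it is not presented as the speaker's method. The function names and the sample values for size and coupling are hypothetical.

```python
from statistics import mean


def residualize(metric, size):
    """Return the residuals of `metric` after a least-squares fit on `size`.

    Assumption: a simple linear regression metric ~ intercept + slope * size.
    The residuals are the part of the metric that size does not explain.
    """
    if len(metric) != len(size):
        raise ValueError("metric and size must have the same length")
    m_bar, s_bar = mean(metric), mean(size)
    sxx = sum((s - s_bar) ** 2 for s in size)
    if sxx == 0:
        raise ValueError("size has no variance; nothing to regress on")
    sxy = sum((s - s_bar) * (m - m_bar) for s, m in zip(size, metric))
    slope = sxy / sxx
    intercept = m_bar - slope * s_bar
    return [m - (intercept + slope * s) for s, m in zip(size, metric)]


def centered_cross_product(x, y):
    """Sum of products of deviations from the mean (unnormalized covariance)."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))


if __name__ == "__main__":
    # Hypothetical data: lines of code and a coupling metric for five classes.
    size = [10, 20, 30, 40, 50]
    coupling = [3, 5, 9, 8, 12]
    adjusted = residualize(coupling, size)
    print("size-adjusted coupling:", adjusted)
    print("cross product with size:", centered_cross_product(adjusted, size))
```

The size-adjusted values could then replace the raw metric as a predictor in a fault-proneness model, so that any remaining association with faults is not simply a reflection of class size.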