Institutional Digital Repository
Shreenivas Deshpande Library, IIT (BHU), Varanasi

Class Imbalance Issue in Software Defect Prediction Models by various Machine Learning Techniques: An Empirical Study

dc.contributor.author: Pandey S.K.; Tripathi A.K.
dc.date.accessioned: 2025-05-23T11:27:43Z
dc.description.abstract: Software practitioners continue to build advanced software defect prediction (SDP) models to help testers find fault-prone modules. However, the class imbalance (CI) problem, in which defective instances are far fewer than non-defective ones, causes inconsistent performance. We conducted 880 experiments to analyze the variation in the performance of 10 SDP models with respect to the class imbalance problem. In our experiments, we used 22 public datasets consisting of 41 software metrics, 10 baseline SDP methods, and 4 sampling techniques. We used the Matthews Correlation Coefficient (MCC), which is more informative when a dataset is highly imbalanced. We also compared the predictive performance of various ML models after applying the 4 sampling techniques. To examine the performance of the different SDP models, we used the F-measure. We found that the performance of the learning models is unsatisfactory, an issue that needs to be mitigated. We also found a few surprising results and some logical patterns between classifier and sampling technique. The study provides a connection between sampling technique, software metrics, and classifier. © 2021 IEEE.
dc.identifier.doi: https://doi.org/10.1109/ICSCC51209.2021.9528170
dc.identifier.uri: http://172.23.0.11:4000/handle/123456789/11727
dc.relation.ispartofseries: 2021 8th International Conference on Smart Computing and Communications: Artificial Intelligence, AI Driven Applications for a Smart World, ICSCC 2021
dc.title: Class Imbalance Issue in Software Defect Prediction Models by various Machine Learning Techniques: An Empirical Study
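
The abstract names two evaluation measures, MCC and the F-measure, and notes that MCC is better suited to highly imbalanced data. The sketch below is an illustration only, not the authors' code: it computes both measures from confusion-matrix counts using their standard definitions and compares them with plain accuracy. The two confusion matrices are hypothetical numbers chosen for the example (100 modules, 10 of them defective), and returning 0.0 when a measure is undefined is a convention assumed here. The final lines check that the reported 880 experiments match 22 datasets x 10 methods x 4 sampling techniques, which is presumably how the count arises.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all modules that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall.

    Returns 0.0 when the score is undefined (assumed convention).
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient, in the range [-1, 1].

    Returns 0.0 when any marginal is empty (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical data: 100 modules, 10 defective (the positive class).
# "trivial" labels every module as non-defective.
trivial = dict(tp=0, fp=0, fn=10, tn=90)
# "useful" finds 6 of the 10 defects at the cost of 9 false alarms.
useful = dict(tp=6, fp=9, fn=4, tn=81)

for name, cm in (("trivial", trivial), ("useful", useful)):
    print(
        f"{name}: accuracy={accuracy(**cm):.2f} "
        f"F={f_measure(cm['tp'], cm['fp'], cm['fn']):.2f} "
        f"MCC={mcc(**cm):.2f}"
    )

# Experiment count implied by the abstract: datasets x methods x samplers.
N_DATASETS, N_METHODS, N_SAMPLERS = 22, 10, 4
n_experiments = N_DATASETS * N_METHODS * N_SAMPLERS
```

In this example the trivial classifier has the higher accuracy even though it finds no defects, while both MCC and the F-measure rank the second classifier higher. This is the kind of distortion that class imbalance introduces into accuracy-based evaluation.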
