The Effect of Handling Class Imbalance on Software Defect Prediction Using Oversampling Techniques
Abstract
Software defect prediction supports the testing phase of the software development life cycle (SDLC) by identifying modules that are prone to defects. However, imbalanced class distributions, in which defective (minority) samples are outnumbered by non-defective ones, can degrade model performance. This study investigates the impact of two oversampling techniques, SMOTE and ADASYN, on Naïve Bayes classification for defect prediction. The baseline Naïve Bayes model achieved good overall accuracy (83%) but poor recall on the defect class (30%). Applying SMOTE and ADASYN raised defect-class recall to 40% and 38%, respectively, at the cost of lower accuracy (77% and 80%). Future work will explore feature selection and deep learning approaches that may yield better performance.
Keywords: Software Defect Prediction, Classification, Naïve Bayes, Oversampling, SMOTE, ADASYN
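
To illustrate the oversampling idea referred to in the abstract, the following is a minimal sketch of SMOTE-style interpolation: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. This is not the implementation used in the study; the function name `smote_sketch`, the parameter `k`, and the toy feature vectors are assumptions made purely for illustration. ADASYN follows the same interpolation principle but generates more synthetic samples around minority instances that have more majority-class neighbours, which this sketch does not model.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration only, not the study's code.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up feature vectors standing in for defective modules
defective = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_sketch(defective, n_new=6)
```

Because every synthetic point lies between two existing minority samples, oversampling of this kind enlarges the defect class without leaving the region it already occupies, which is consistent with the reported trade-off of higher defect-class recall and lower overall accuracy.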