Software Defect Prediction For Quality Evaluation Using Learning Techniques Ensemble Stacking


  • Muhammad Romadhona Kusuma, Universitas Nusa Mandiri
  • Windu Gata, Nusa Mandiri University
  • Sigit Kurniawan, Muhammadiyah University of Technology Jakarta
  • Dedi Dwi Saputra, Universitas Siber Indonesia
  • Supriadi Panggabean, Darunnajah University



Software defects, Prediction, Feature Selection, SMOTE, Hyperparameter Tuning


This research aims to improve software quality and the effectiveness of zakat management at the National Amil Zakat Agency (BAZNAS) by developing a software defect prediction model (SDPM). We applied machine learning techniques and an ensemble stacking approach to the "Masjid Tower" dataset, which contains 228 records and 34 attributes. Preprocessing involved label encoding, feature selection with Pearson correlation, standard normalization, and SMOTE to handle data imbalance. We tuned hyperparameters with grid search cross-validation for machine learning algorithms such as AdaBoost and Gradient Boosting. The stacking ensemble, combining Gradient Boosting, AdaBoost, Decision Tree, Bayesian Ridge, and a LightGBM meta-learner, achieved an R² score of 0.97, an MAE of 0.037, and an MSE of 0.006. These results indicate that the ensemble stacking approach can predict software defects accurately, can offer useful guidance for managing zakat and other software applications, and has the potential to improve software quality and the effectiveness of BAZNAS in managing zakat.
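As a rough illustration of two steps named in the abstract, the sketch below shows Pearson-correlation feature selection and the three reported evaluation metrics (R², MAE, MSE) in plain Python. It is not the authors' implementation: the function names, the toy data, and the correlation threshold of 0.5 are all assumptions made for the example, and the abstract does not state which threshold or library was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_features(columns, target, threshold):
    """Return names of columns whose |r| with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data (hypothetical; NOT the "Masjid Tower" dataset).
columns = {
    "loc": [10, 20, 30, 40, 50],   # rises in step with the target
    "noise": [3, 1, 4, 1, 5],      # only weakly related to the target
    "constant": [7, 7, 7, 7, 7],   # zero variance
}
target = [1.0, 2.0, 3.0, 4.0, 5.0]
selected = select_features(columns, target, threshold=0.5)  # assumed threshold

# Toy predictions with one error of size 1 on the last record.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(selected, mae(y_true, y_pred), mse(y_true, y_pred), r2_score(y_true, y_pred))
```

On this toy data only the `loc` column passes the threshold. MAE and MSE are reported in the units of the target (and its square), while R² is unitless, which is why the three metrics are usually read together.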








How to Cite

Kusuma, M. R., Gata, W., Kurniawan, S., Saputra, D. D., & Panggabean, S. (2023). Software Defect Prediction For Quality Evaluation Using Learning Techniques Ensemble Stacking. Inspiration: Jurnal Teknologi Informasi Dan Komunikasi, 13(2), 1–13.