Comparative Analysis of the Accuracy of the Naïve Bayes and C4.5 Algorithms for Diabetes Classification

Authors

  • Mursyid Ardiansyah, Informatics Engineering Study Program, Universitas Amikom Yogyakarta, http://orcid.org/0000-0002-5121-4450
  • Andi Sunyoto, Informatics Engineering Study Program, Universitas Amikom Yogyakarta
  • Emha Taufiq Luthfi, Informatics Engineering Study Program, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.29408/edumatic.v5i2.3424

Keywords:

C4.5, Classification, Decision Tree, Diabetes, Naïve Bayes

Abstract

Diabetes is a metabolic disease characterized by elevated blood sugar. If blood sugar is not properly controlled, it can lead to a variety of serious diseases. This study compares the performance of the Naïve Bayes and C4.5 algorithms for diabetes classification across 7 different scenarios, evaluated by accuracy, precision, and recall. The study uses a descriptive method with secondary data: a diabetes patient dataset in .csv format, published on Kaggle by Ishan Dutta, containing 520 records and 17 fields. RapidMiner was used for classification and for performance testing of both algorithms. The results show that the C4.5 algorithm (scenario 4) classified diabetes better than the Naïve Bayes algorithm (scenario 2), with C4.5 achieving an accuracy of 99.03%, a precision of 100%, and a recall of 98.18%.
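
The three performance measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch in Python rather than RapidMiner, the tool used in the study. The function name `classification_metrics` and the counts passed to it are hypothetical: the counts are simply one combination that happens to reproduce the reported percentages, and they are not taken from the study's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) for a binary confusion matrix.

    tp/fp/fn/tn are the counts of true positives, false positives,
    false negatives, and true negatives. Empty denominators yield 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts (an assumption, not data from the paper), chosen
# only because they are consistent with the reported percentages.
accuracy, precision, recall = classification_metrics(tp=54, fp=0, fn=1, tn=48)
print(f"accuracy={accuracy:.2%} precision={precision:.2%} recall={recall:.2%}")
# prints: accuracy=99.03% precision=100.00% recall=98.18%
```

A precision of 100% means no negative case was labelled as diabetic, while a recall below 100% means at least one diabetic case was missed, which is why the two measures are reported separately from accuracy.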

Published

2021-12-20