Optimalisasi Akurasi Algoritma Naïve Bayes Dengan Metode Syntetic Minority Oversampling Technique (Smote) Pada Data Numerik

Authors

  • Hizbul Izzi Universitas Amikom Yogyakarta Indonesia
  • Arief Setyanto Universitas Amikom Yogyakarta Indonesia
  • Anggit Dwi Hartanto Universitas Amikom Yogyakarta Indonesia

DOI:

https://doi.org/10.29408/jit.v8i1.28340

Keywords:

Classification, K-Fold Cross Validation, Naive Bayes, SMOTE

Abstract

This study classifies numerical loan data obtained from Kaggle. The dataset contains 9,578 records: 8,045 for borrowers who were able to complete their credit and 1,533 for borrowers who were not. Because the classes are imbalanced, the data must be balanced to obtain more accurate classification results. The purpose of this research is to improve the accuracy of the Naïve Bayes algorithm in classifying numerical data. Fraud in financial transactions is a typical example of imbalanced data, where legitimate transactions far outnumber fraudulent ones, and accurate classification of the minority (fraud) class is essential to avoid losses. The method used to improve accuracy is the Synthetic Minority Oversampling Technique (SMOTE), which oversamples the minority class of the dataset. K-Fold Cross Validation is used to evaluate the performance of the algorithm. Data preprocessing removes missing and invalid values and normalizes the data so that all features are on the same scale and suitable for classification analysis. According to the analysis, the model's ability to recognize the minority class was 16.1% before SMOTE was applied and 48.8% afterwards. In addition, the model correctly predicted the minority class in 10 cases before SMOTE and in 102 cases after it. It can therefore be concluded that SMOTE improves the model's ability to recognize the minority class.
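
The oversampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, its parameters, and the toy data are assumptions made for illustration, and the sketch only shows the core SMOTE idea of creating a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; assumes at least two
    minority samples, each a list of numeric features.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding base itself
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Random position on the line segment from base to neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data invented for this example: 8 majority points, 3 minority points.
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

When oversampling is combined with K-Fold Cross Validation, a common precaution is to apply it to the training folds only, so that synthetic samples do not leak into the fold used for evaluation. Whether the study follows this order is not stated in the abstract.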

Published

20-01-2025

How to Cite

Hizbul Izzi, Arief Setyanto, & Anggit Dwi Hartanto. (2025). Optimalisasi Akurasi Algoritma Naïve Bayes Dengan Metode Syntetic Minority Oversampling Technique (Smote) Pada Data Numerik. Infotek: Jurnal Informatika Dan Teknologi, 8(1), 217–227. https://doi.org/10.29408/jit.v8i1.28340
