Performa Logistic Regression dan Naive Bayes dalam Klasifikasi Berita Hoax di Indonesia (Performance of Logistic Regression and Naive Bayes in Classifying Hoax News in Indonesia)
DOI:
https://doi.org/10.29408/edumatic.v9i1.28987
Keywords:
fake news, grid search, logistic regression, naive bayes
Abstract
The spread of false information has become a major challenge in Indonesian society, with 2,484 cases recorded in 2022, which underscores the need for a system that can effectively identify and filter out fake news. This research aims to develop a more accurate fake news detection model by applying logistic regression optimized with grid search, together with oversampling to address class imbalance. The main focus is improving detection performance on imbalanced datasets. The dataset used is the Indonesian Fake News dataset, which consists of 4,231 entries in two categories: valid (3,465 entries) and hoax (766 entries). Preprocessing steps include stemming, stopword removal, and text normalization, followed by TF-IDF weighting. Random oversampling was applied to balance the hoax and valid classes, and model hyperparameters were tuned using grid search. The results show that the optimized logistic regression achieved the highest accuracy, 93%, surpassing Naive Bayes at 86%. These findings suggest that the developed model can be used to improve social media news monitoring systems and to increase digital literacy among Indonesians.
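The random oversampling step described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function name `random_oversample`, the fixed seed, and the placeholder documents (`doc_0`, `doc_1`, ...) are assumptions made for illustration. Only the class counts (3,465 valid, 766 hoax) come from the abstract.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many entries as the largest class."""
    rng = random.Random(seed)  # fixed seed (assumed) so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Sample with replacement; k is 0 for the majority class.
        extra = rng.choices(pool, k=target - n)
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Placeholder documents standing in for the 4,231 dataset entries.
labels = ["valid"] * 3465 + ["hoax"] * 766
samples = [f"doc_{i}" for i in range(len(labels))]

balanced_samples, balanced_labels = random_oversample(samples, labels)
balanced_counts = Counter(balanced_labels)
print(balanced_counts)
```

With these counts, the hoax class is extended by 2,699 duplicated entries, so both classes end at 3,465 entries (6,930 in total). The abstract does not say at which stage the oversampling was applied; a common precaution is to oversample only the training split so that duplicated entries cannot appear in both training and test data.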
License
Copyright (c) 2025 Okta Nur Cahyani, Fikri Budiman

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All articles in this journal are the sole responsibility of their authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.
