Performa Logistic Regression dan Naive Bayes dalam Klasifikasi Berita Hoax di Indonesia

Authors

O. N. Cahyani, F. Budiman

DOI:

https://doi.org/10.29408/edumatic.v9i1.28987

Keywords:

fake news, grid search, logistic regression, naive bayes

Abstract

The spread of false information has become a major challenge in Indonesian society, with 2,484 cases recorded in 2022, which underscores the need for a system that can effectively identify and filter fake news. This research aims to develop a more accurate fake news detection model by applying logistic regression, optimized with grid search and combined with oversampling to address class imbalance. The main focus is improving detection performance on imbalanced datasets. The dataset used is the Indonesian Fake News dataset, which consists of 4,231 entries in two categories: valid (3,465 entries) and hoax (766 entries). Preprocessing includes stemming, stopword removal, and text normalization, with TF-IDF used to weight terms. Random oversampling was applied to balance the hoax and valid classes, and grid search was used to tune the model's hyperparameters. The optimized logistic regression achieved the highest accuracy, 93%, surpassing Naive Bayes at 86%. These findings suggest that the developed model can be used to improve social media news monitoring systems and to help increase digital literacy among Indonesians.
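With the class sizes reported above, hoax entries make up roughly 18% of the dataset (766 of 4,231). The sketch below illustrates the general idea of random oversampling on placeholder data with those same class sizes. It is a minimal standard-library illustration, not the authors' implementation; the function name `random_oversample`, the `doc_i` placeholder documents, and the fixed seed are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class is as large as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing items with replacement from the class itself.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Placeholder documents (hypothetical) with the class sizes from the abstract.
labels = ["valid"] * 3465 + ["hoax"] * 766
samples = [f"doc_{i}" for i in range(len(labels))]

balanced_samples, balanced_labels = random_oversample(samples, labels)
counts = Counter(balanced_labels)
print(counts)
```

After oversampling, both classes contain 3,465 entries, and the added hoax entries are exact duplicates of existing ones. A common precaution is to oversample only the training split so that duplicated items cannot leak into the evaluation data; the abstract does not state at which stage the authors applied it. Libraries such as imbalanced-learn offer a `RandomOverSampler` for the same purpose.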

Published

2025-04-10

How to Cite

Cahyani, O. N., & Budiman, F. (2025). Performa Logistic Regression dan Naive Bayes dalam Klasifikasi Berita Hoax di Indonesia. Edumatic: Jurnal Pendidikan Informatika, 9(1), 60–68. https://doi.org/10.29408/edumatic.v9i1.28987