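Perbandingan Performansi Model pada Algoritma K-NN terhadap Klasifikasi Berita Fakta Hoaks Tentang Covid-19 (Comparison of Model Performance of the K-NN Algorithm in Classifying Fact and Hoax News About Covid-19)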
DOI:
https://doi.org/10.29408/edumatic.v5i2.3664
Keywords:
Euclidean, Jaccard, K-Nearest Neighbor, Manhattan, Minkowski
Abstract
During the Covid-19 pandemic, a large amount of hoax news about Covid-19 circulated. Fact-checking platforms such as Jala Hoax and Saber Hoax clarify such news and categorize it as misinformation or disinformation. Supervised classification is applied to learn from these fact labels. The dataset consists of 559 items taken from Jala Hoax and Saber Hoax, grouped into Class 1 (Misleading Content, Satire/Parody, False Connection), Class 2 (False Context, Imposter Content), and Class 3 (Fabricated and Manipulated Content). K-Nearest Neighbor (K-NN) is used to classify the categories of misinformation and disinformation. The Jaccard distance is compared with the Euclidean, Manhattan, and Minkowski distances as the dissimilarity measure, and the value of k in K-NN is varied to compare the performance of each configuration. The Jaccard distance at k = 4 outperforms the other models, with an accuracy of 0.696, precision of 0.710, recall of 0.572, and F1-Score. The best results tend to occur for the class with the most data, Class 1 (Misleading Content, Satire or Parody, False Connection), with 58 of 61 test items classified correctly.
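The comparison described in the abstract can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the function names, the toy binary term vectors, the labels, and the Minkowski order p = 3 are all assumptions made for illustration, since the abstract does not state the feature representation or the value of p.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Manhattan distance: Minkowski with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: Minkowski with p = 2."""
    return minkowski(a, b, 2)


def jaccard(a, b):
    """Jaccard distance between two binary (term present/absent) vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def knn_predict(train_x, train_y, query, k, distance):
    """Majority vote among the k training points closest to the query.

    Vote ties go to the label that appears first among the nearest neighbors.
    """
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: each row marks which of five terms occur in a news item.
train_x = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]
train_y = [1, 1, 3, 3]  # hypothetical class labels
query = [1, 1, 1, 0, 0]

measures = {
    "jaccard": jaccard,
    "euclidean": euclidean,
    "manhattan": manhattan,
    "minkowski_p3": lambda a, b: minkowski(a, b, 3),  # p = 3 is an assumption
}
predictions = {
    name: knn_predict(train_x, train_y, query, k=3, distance=fn)
    for name, fn in measures.items()
}
print(predictions)
```

In a fuller experiment, the same loop would be repeated over several values of k and scored on held-out test data to obtain accuracy, precision, recall, and F1-Score for each distance measure.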
License
All articles in this journal are the sole responsibility of their authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.