Comparative Analysis of Naive Bayes and SVM for Improved Emotion Classification on Social Media

Authors

R. F. P. Pratama, W. Maharani

DOI:

https://doi.org/10.29408/edumatic.v9i1.29087

Keywords:

emotion classification, feature extraction, naïve bayes, preprocessing, svm

Abstract

Emotion classification on social media involves identifying emotions such as happy, angry, sad, and fear. However, processing Indonesian text is challenging because of the language's complexity and widespread slang. This research compares Naive Bayes and SVM models, evaluating the impact of preprocessing, feature extraction, and parameter optimization on emotion classification. The dataset was collected from the X API using crawling techniques and manually annotated by six annotators. The models were trained on fully and partially preprocessed versions of the dataset with TF-IDF, BoW, and Word2Vec feature extraction, and were evaluated using accuracy, precision, recall, and F1 score. Our results show that full preprocessing improves accuracy: with TF-IDF + BoW features, SVM achieved 78.01%, outperforming Naive Bayes at 75.53%. The models classify emotions into four classes: happy, sad, angry, and fear. This study demonstrates the value of preprocessing and feature selection for handling slang and complexity in Indonesian text. These results provide insights for developing better emotion classification models and have applications in sentiment analysis, social media monitoring, and mental health detection.
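The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes macro-averaging over the four emotion classes, which the abstract does not specify, and the function name `evaluate` and the toy labels are hypothetical, made up for illustration rather than taken from the study.

```python
from collections import Counter

# The four classes named in the abstract.
LABELS = ["happy", "sad", "angry", "fear"]


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption; the abstract does not state
    which averaging scheme the authors used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    def safe_div(num, den):
        # Score a class as 0.0 when its denominator is empty.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels for illustration only (not the paper's data).
y_true = ["happy", "happy", "sad", "sad", "angry", "angry", "fear", "fear"]
y_pred = ["happy", "sad", "sad", "sad", "angry", "fear", "fear", "fear"]
scores = evaluate(y_true, y_pred)
```

Running the same function on the predictions of each model over a shared test set would give directly comparable scores for Naive Bayes and SVM.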

Published

2025-04-09

How to Cite

Pratama, R. F. P., & Maharani, W. (2025). Comparative Analysis of Naive Bayes and SVM for Improved Emotion Classification on Social Media. Edumatic: Jurnal Pendidikan Informatika, 9(1), 11–20. https://doi.org/10.29408/edumatic.v9i1.29087