Mitigating Class Imbalance in Indonesian Sarcasm Detection: A Cross-Platform Transformer Study
DOI:
https://doi.org/10.29408/edumatic.v10i1.33724
Keywords:
class imbalance, cross-domain robustness, generative data augmentation, sarcasm detection, transformer models
Abstract
Sarcasm detection in Indonesian social media remains challenging due to implicit pragmatic expressions, severe class imbalance, and strong domain variation across platforms. Unlike prior Indonesian sarcasm studies, which predominantly focus on in-domain accuracy using conventional balancing methods, this study provides the first systematic cross-platform analysis of generative data balancing under domain shift. We empirically examine whether GPT-4o-based generative balancing improves robustness under domain shift, rather than only in-domain accuracy, in Transformer-based sarcasm detection. Models trained on Twitter data are evaluated on Twitter, Reddit, and TikTok, the last serving as an unseen domain. The results show that generative balancing yields limited gains in in-domain evaluation but consistently improves cross-domain robustness by increasing sarcasm recall, particularly for Base models. Notably, XLM-R Base achieves an absolute F1-score improvement of +10.8 points on TikTok, while IndoBERT-Large attains the highest in-domain F1-score of 0.7444. These findings indicate that generative augmentation partially mitigates class imbalance by enhancing robustness under domain shift. They reposition sarcasm detection as a robustness-oriented problem and highlight generative balancing as a complementary strategy, rather than a substitute for larger Transformer models, in cross-platform NLP settings.
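The metrics named in the abstract (sarcasm recall, F1-score, and an absolute gain in points) can be illustrated with a minimal Python sketch. The confusion counts below are hypothetical toy values chosen for easy arithmetic; they are not results from the study, and the helper names (`Counts`, `absolute_gain_points`) are assumptions introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for the positive (sarcasm) class."""
    tp: int  # sarcastic posts correctly flagged
    fp: int  # non-sarcastic posts wrongly flagged
    fn: int  # sarcastic posts that were missed


def precision(c: Counts) -> float:
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Counts) -> float:
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1(c: Counts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def absolute_gain_points(f1_before: float, f1_after: float) -> float:
    """Absolute F1 difference expressed in points (F1 scaled to 0-100)."""
    return (f1_after - f1_before) * 100


# Hypothetical counts on an unseen platform, NOT taken from the paper.
baseline = Counts(tp=30, fp=20, fn=70)   # imbalanced training data
balanced = Counts(tp=50, fp=50, fn=50)   # after generative balancing

for name, c in [("baseline", baseline), ("balanced", balanced)]:
    print(f"{name}: precision={precision(c):.2f} "
          f"recall={recall(c):.2f} f1={f1(c):.2f}")

gain = absolute_gain_points(f1(baseline), f1(balanced))
print(f"absolute F1 gain: {gain:+.1f} points")
```

In this toy case recall rises while precision falls, yet F1 still improves. That is the kind of trade-off the abstract points to when it attributes cross-domain gains to higher sarcasm recall.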
License
Copyright (c) 2026 A. Salky Maulana, I Made Artha Agastya

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All articles in this journal are the sole responsibility of the authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.



