Revisiting Resampling Strategies under Extreme Class Imbalance: Evidence from Large-Scale Online Payment Fraud Detection
DOI:
https://doi.org/10.29408/edumatic.v10i1.33272
Keywords:
adasyn, fraud detection, random oversampling, random undersampling, smote
Abstract
Extreme class imbalance in online payment fraud detection creates an accuracy paradox and an operational risk: improving fraud capture can generate costly false alarms. This study uses a quantitative, experiment-based design to evaluate the operational impact of common resampling strategies under extreme skew using interpretable linear decision rules. The Online Payments Fraud dataset (6.36 million transactions) from Kaggle is analysed using six predictors, namely five monetary balance/amount variables (amount, oldbalanceOrg, newbalanceOrig, oldbalanceDest, newbalanceDest) plus the rule-based isFlaggedFraud indicator, to predict the isFraud label. Five training variants (no resampling, random oversampling (ROS), random undersampling (RUS), SMOTE, ADASYN) are compared with two linear decision rules: an ordinary least squares linear scoring model (LinReg, thresholded at 0.5) and a linear SVM. The protocol is leakage-free: resampling is applied only to the 80% training split, and performance is assessed on an untouched, highly imbalanced 20% test set. The findings indicate that LinReg–RUS achieves the most balanced operating point (Precision 65.938%, Recall 47.718%, F1 55.367%, ROC-AUC 98.720%), whereas ADASYN increases recall but collapses precision (≈2.1%), yielding F1 ≈ 4.17%. These results contribute controlled, large-scale evidence that under extreme imbalance, simpler resampling–model combinations can provide more deployable precision–recall trade-offs than aggressive synthetic sampling, supporting interpretable baselines for capacity-constrained payment screening.
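The leakage-free protocol and the metrics quoted in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the toy labels (10,000 rows, 1% fraud), the seeds, and the helper names `split_indices`, `random_undersample`, and `precision_recall_f1` are all hypothetical, and no model is fitted. It shows only the order of operations (split first, resample the training part only) and how precision, recall, and F1 relate.

```python
import random

def split_indices(n, test_frac=0.2, seed=0):
    """Shuffle row indices and split them into (train, test) lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(round(n * test_frac))
    return idx[n_test:], idx[:n_test]

def random_undersample(indices, labels, seed=0):
    """Keep every fraud row (label 1) plus an equal-sized random sample of
    legitimate rows (label 0). Assumes fraud is the minority class."""
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    return pos + rng.sample(neg, min(len(pos), len(neg)))

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical toy labels: 1 fraud per 100 transactions (not the Kaggle data).
labels = [1 if i % 100 == 0 else 0 for i in range(10_000)]

# Step 1: split BEFORE any resampling, so the test rows are never touched.
train_idx, test_idx = split_indices(len(labels), test_frac=0.2, seed=42)

# Step 2: resample the training split only (RUS shown here).
balanced_idx = random_undersample(train_idx, labels, seed=42)

# Accuracy paradox on the untouched test split: flagging nothing as fraud
# is highly "accurate" yet catches no fraud at all.
y_test = [labels[i] for i in test_idx]
all_legit = [0] * len(y_test)
accuracy = sum(t == p for t, p in zip(y_test, all_legit)) / len(y_test)
paradox_metrics = precision_recall_f1(y_test, all_legit)

# F1 is the harmonic mean of precision and recall; plugging in the reported
# LinReg-RUS precision and recall reproduces the reported F1 up to rounding.
p, r = 0.65938, 0.47718
f1_reported_inputs = 2 * p * r / (p + r)
```

A model would be trained on the rows in `balanced_idx` and scored on the rows in `test_idx`, whose class ratio stays at its natural, highly skewed level.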
License
Copyright (c) 2026 Mursyid Ardiansyah, Ali Asgar Zainal Abidin

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All articles in this journal are the sole responsibility of the authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.
