Revisiting Resampling Strategies under Extreme Class Imbalance: Evidence from Large-Scale Online Payment Fraud Detection

DOI:

https://doi.org/10.29408/edumatic.v10i1.33272

Keywords:

adasyn, fraud detection, random oversampling, random undersampling, smote

Abstract

Extreme class imbalance in online payment fraud detection creates an accuracy paradox and an operational risk: improving fraud capture can generate costly false alarms. This study uses a quantitative, experiment-based design to evaluate the operational impact of common resampling strategies under extreme skew using interpretable linear decision rules. The Online Payments Fraud dataset (6.36 million transactions) from Kaggle is analysed using six predictors to predict the isFraud label: five monetary balance/amount variables (amount, oldbalanceOrg, newbalanceOrig, oldbalanceDest, newbalanceDest) and the rule-based isFlaggedFraud indicator. Five training variants (no resampling, ROS, RUS, SMOTE, ADASYN) are compared with two linear decision rules: an ordinary least squares linear scoring model (thresholded at 0.5) and a linear SVM. The protocol is leakage-free: resampling is applied only to the 80% training split, and performance is assessed on an untouched, highly imbalanced 20% test set. The findings indicate that LinReg–RUS achieves the most balanced operating point (Precision 65.938%, Recall 47.718%, F1 55.367%, ROC-AUC 98.720%), whereas ADASYN increases recall but collapses precision (~2.1%), yielding F1 ≈4.17%. These results contribute controlled, large-scale evidence that under extreme imbalance, simpler resampling–model combinations can provide more deployable precision–recall trade-offs than aggressive synthetic sampling, supporting interpretable baselines for capacity-constrained payment screening.
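
The leakage-free protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes synthetic data with a single feature in place of the six predictors, a 1% fraud rate, and hypothetical helper names (stratified_split, random_undersample, fit_ols, precision_recall_f1, demo). What it shows is the ordering that matters: split first, apply random undersampling (RUS) to the training indices only, fit a least squares score, threshold it at 0.5, and evaluate on the untouched imbalanced test split.

```python
import random


def stratified_split(y, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        cut = int(round(len(idx) * test_frac))
        test.extend(idx[:cut])
        train.extend(idx[cut:])
    return train, test


def random_undersample(idx, y, seed=0):
    """RUS: keep every minority row, sample an equal number of majority rows."""
    rng = random.Random(seed)
    pos = [i for i in idx if y[i] == 1]
    neg = [i for i in idx if y[i] == 0]
    return pos + rng.sample(neg, len(pos))


def fit_ols(xs, ys):
    """One-feature least squares fit of score = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def demo(n=20000, fraud_rate=0.01, seed=42):
    """Synthetic illustration only; the data here are assumptions, not the paper's."""
    rng = random.Random(seed)
    y = [1 if rng.random() < fraud_rate else 0 for _ in range(n)]
    # Assumed toy feature: fraud rows are shifted upward by three units.
    x = [rng.gauss(3.0 if label else 0.0, 1.0) for label in y]

    # 1. Split BEFORE any resampling so the test split keeps its natural skew.
    train_idx, test_idx = stratified_split(y, test_frac=0.2, seed=seed)
    # 2. Resample the training indices only.
    balanced_idx = random_undersample(train_idx, y, seed=seed)
    # 3. Fit the linear score on the balanced training rows.
    a, b = fit_ols([x[i] for i in balanced_idx], [y[i] for i in balanced_idx])
    # 4. Threshold at 0.5 and evaluate on the untouched test split.
    y_pred = [1 if a + b * x[i] >= 0.5 else 0 for i in test_idx]
    return precision_recall_f1([y[i] for i in test_idx], y_pred)


if __name__ == "__main__":
    precision, recall, f1 = demo()
    print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Resampling before the split would let copies or neighbours of test rows reach the training data and would change the class ratio of the evaluation set, so the reported precision would no longer reflect the skew seen in deployment. On this toy data the balanced training set tends to yield high recall with much lower precision on the imbalanced test split, which is the kind of trade-off the abstract quantifies.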

Published

2026-03-03

How to Cite

Ardiansyah, M., & Abidin, A. A. Z. (2026). Revisiting Resampling Strategies under Extreme Class Imbalance: Evidence from Large-Scale Online Payment Fraud Detection. Edumatic: Jurnal Pendidikan Informatika, 10(1), 21–29. https://doi.org/10.29408/edumatic.v10i1.33272