Improving the Accuracy of Early Parkinson's Disease Detection through an Ensemble Learning Approach and Optimal Feature Selection

Kang Andini Wulandari; Adhitya Nugraha; Ardytha Luthfiarta; Laila Rahmatin Nisa

doi:10.29408/edumatic.v8i2.27788

Authors

Kang Andini Wulandari, Informatics Engineering Study Program, Universitas Dian Nuswantoro
Adhitya Nugraha, Informatics Engineering Study Program, Universitas Dian Nuswantoro
Ardytha Luthfiarta, Informatics Engineering Study Program, Universitas Dian Nuswantoro
Laila Rahmatin Nisa, Informatics Engineering Study Program, Universitas Dian Nuswantoro

Keywords:

ensemble learning, filter-based, hyperparameter tuning, parkinson classification

Abstract

Early detection of Parkinson's disease (PD) is essential for improving patients' quality of life through timely intervention. This research aims to develop a predictive model using an ensemble learning approach and optimal feature selection. This experimental study employs three machine learning algorithms (random forest, XGBoost, and extra trees), optimized through hyperparameter tuning and feature selection techniques, with Kernel Principal Component Analysis (KPCA) applied for dimensionality reduction. The study uses the UCI Machine Learning Parkinson Dataset, which consists of 80 samples and 44 acoustic features extracted from patients' voices as they sustain the vowel sound "/a/" for five seconds. Results show that XGBoost achieved the highest accuracy, 88.93%, after tuning and KPCA, followed by extra trees at 86.15% and random forest at 85.47%. KPCA reduced the dimensionality of the data without sacrificing accuracy, thereby improving modeling efficiency. These findings suggest that voice data holds significant potential for early PD detection and that selecting appropriate algorithms and dimensionality reduction techniques is crucial for optimizing data-driven diagnostic models.
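
As a rough illustration of the kind of workflow the abstract describes, the sketch below chains filter-based feature selection, KPCA, and a tree ensemble in a scikit-learn pipeline and tunes them with a grid search. It is not the authors' code. The data are synthetic (only the 80 x 44 shape echoes the abstract), the ANOVA F-score filter, the RBF kernel, the parameter grid, and the 5-fold cross-validation are all assumptions, and scikit-learn's GradientBoostingClassifier stands in for XGBoost so that the example needs no extra dependency.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the same shape as the abstract's data
# (80 samples, 44 features). This is NOT the real voice dataset.
X, y = make_classification(
    n_samples=80, n_features=44, n_informative=10, random_state=0
)

# Three tree ensembles; gradient boosting is a stand-in for XGBoost.
models = {
    "random_forest": RandomForestClassifier(random_state=0),
    "extra_trees": ExtraTreesClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Hypothetical search space: features kept by the filter, KPCA
# components, and ensemble size.
param_grid = {
    "select__k": [10, 20],
    "kpca__n_components": [5, 8],
    "clf__n_estimators": [50, 100],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, clf in models.items():
    pipe = Pipeline([
        ("scale", StandardScaler()),                    # standardize features
        ("select", SelectKBest(score_func=f_classif)),  # filter-based selection
        ("kpca", KernelPCA(kernel="rbf")),              # nonlinear reduction
        ("clf", clf),
    ])
    # Every step is refit inside each fold, so selection and KPCA
    # never see the held-out samples.
    search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

for name, (score, params) in results.items():
    print(f"{name}: mean CV accuracy = {score:.3f}, best params = {params}")
```

Keeping the selection and KPCA steps inside the pipeline means they are tuned jointly with the classifier and evaluated without leakage. The accuracies printed for synthetic data have no bearing on the figures reported in the abstract.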
