Algorithm-Sensitive Feature Weighting using Mutual Information and ReliefF for Heart Disease Classification
DOI:
https://doi.org/10.29408/edumatic.v10i1.34262
Keywords:
classification, feature weighting, heart disease, mutual information, relieff
Abstract
Heart disease remains a critical global health issue that requires reliable early prediction to support clinical decision-making. This study evaluates the effect of feature selection and feature weighting on the performance of machine learning models with different learning mechanisms, namely K-Nearest Neighbors (KNN), Naïve Bayes (NB), and Logistic Regression (LR), for heart disease prediction. An algorithm-sensitive comparative framework was applied using global feature relevance estimation (Mutual Information), local feature relevance estimation (ReliefF), feature selection, and feature weighting, with raw features as the baseline. Experiments were conducted on the Cleveland Heart Disease dataset using stratified 5-fold cross-validation. The results show that ReliefF-based feature weighting achieves the best performance for Naïve Bayes (accuracy = 0.8085; F1-score = 0.7774), while Logistic Regression attains the highest overall performance under the baseline (accuracy = 0.8547; F1-score = 0.8324). Feature selection improves KNN performance by reducing its sensitivity to irrelevant features. These findings indicate that the effectiveness of feature importance strategies depends on model-specific learning behavior: feature weighting benefits Naïve Bayes, while feature selection is more effective for distance-based models such as KNN. This study contributes an algorithm-sensitive evaluation perspective for aligning feature importance strategies with machine learning model characteristics to improve heart disease prediction performance.
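The difference between feature selection and feature weighting described in the abstract can be illustrated with a minimal sketch. It assumes discrete features and a plain empirical Mutual Information estimate. The toy data and the helper names (`mutual_information`, `select_top_k`, `weight_features`) are hypothetical and are not taken from the study, whose implementation details, including how weights enter each classifier, are not given on this page. ReliefF scores could be substituted for the Mutual Information scores in the same two helpers.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def relevance_scores(rows, labels):
    """One relevance score per feature column."""
    return [mutual_information([r[j] for r in rows], labels)
            for j in range(len(rows[0]))]


def select_top_k(rows, scores, k):
    """Feature selection: keep only the k highest-scoring columns."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return [[r[j] for j in keep] for r in rows], keep


def weight_features(rows, scores):
    """Feature weighting: keep every column, scaled by its normalized score."""
    total = sum(scores)
    if total == 0:
        # No feature is informative: fall back to uniform weights.
        weights = [1.0 / len(scores)] * len(scores)
    else:
        weights = [s / total for s in scores]
    return [[w * v for w, v in zip(weights, r)] for r in rows], weights


# Hypothetical toy data (not the Cleveland dataset): three binary features.
# Feature 0 copies the label, feature 1 is independent of it,
# feature 2 agrees with the label on 6 of 8 rows.
rows = [
    [1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 1, 1],
    [1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 0],
]
labels = [1, 1, 0, 0, 1, 0, 1, 0]

scores = relevance_scores(rows, labels)
selected_rows, kept = select_top_k(rows, scores, k=2)
weighted_rows, weights = weight_features(rows, scores)
print("scores:", scores)
print("kept columns:", kept)
print("weights:", weights)
```

Selection discards the low-scoring column entirely, whereas weighting keeps all columns and only rescales them, which is why the two strategies can affect a distance-based classifier and a probabilistic classifier differently.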
License
Copyright (c) 2026 Dwi Prihantono, Umar Faqih, Indra Indra

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All articles in this journal are the sole responsibility of the authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.



