Comparing Support Vector Machine and Naïve Bayes Methods with A Selection of Fast Correlation Based Filter Features in Detecting Parkinson's Disease

  • Yuniar Farida UIN Sunan Ampel Surabaya
  • Nurissaidah Ulinnuha UIN Sunan Ampel Surabaya
  • Silvia Kartika Sari UIN Sunan Ampel Surabaya
  • Latifatun Nadya Desinaini UIN Sunan Ampel Surabaya

Abstract

Parkinson's symptoms arise when dopamine levels fall because nerve cells in the brain are destroyed. People with this illness experience central nervous system damage, which lowers their quality of life. The disease itself is not deadly, but as quality of life declines, patients can no longer perform daily activities as healthy people do. In some cases, the disease can also cause death indirectly. By comparing support vector machine (SVM) and Naïve Bayes approaches with and without fast correlation-based filter (FCBF) feature selection, this study attempts to determine the best model for classifying Parkinson's disease. The study uses data from the UCI Machine Learning Repository. The results show that SVM with FCBF achieved the highest accuracy among all the models tested, with an accuracy of 86.1538%, a sensitivity of 93.8775%, and a specificity of 62.5000%. Both methods, SVM and Naïve Bayes, improved in performance with FCBF, and SVM showed the larger increase in accuracy. This research contributes to helping paramedics determine whether a patient has Parkinson's disease using characteristics obtained from data, such as movement, sound, or other pertinent factors.
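The three figures reported above are all derived from a confusion matrix. The following is a minimal sketch of how they are computed, assuming that "Parkinson's" is treated as the positive class. The counts used here (46, 3, 10, 6) are hypothetical: they were chosen only because they reproduce the reported percentages arithmetically, and they are not taken from the paper. The function name `classification_metrics` is likewise an illustrative choice.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp / fn: positive cases (assumed here to be Parkinson's) predicted
             correctly / incorrectly.
    tn / fp: negative cases (assumed healthy) predicted correctly / incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of positive cases detected
        "specificity": tn / (tn + fp),      # share of negative cases recognised
    }


# Hypothetical counts, NOT reported in the source: picked because they are
# arithmetically consistent with the percentages quoted in the abstract.
metrics = classification_metrics(tp=46, fn=3, tn=10, fp=6)

for name, value in metrics.items():
    print(f"{name}: {value:.4%}")
```

With these assumed counts, the gap between sensitivity and specificity becomes concrete: few positive cases are missed, while a much larger share of negative cases is flagged as positive. Note that 46/49 rounds to 93.8776%, so the reported sensitivity of 93.8775% would correspond to a truncated rather than rounded value under this assumption.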


 


Published
2023-11-04
How to Cite
FARIDA, Yuniar et al. Comparing Support Vector Machine and Naïve Bayes Methods with A Selection of Fast Correlation Based Filter Features in Detecting Parkinson's Disease. Lontar Komputer : Jurnal Ilmiah Teknologi Informasi, [S.l.], v. 14, n. 2, p. 80-90, nov. 2023. ISSN 2541-5832. Available at: <https://ojs.unud.ac.id/index.php/lontar/article/view/98322>. Date accessed: 24 nov. 2024. doi: https://doi.org/10.24843/LKJITI.2023.v14.i02.p02.