Comparing Support Vector Machine and Naïve Bayes Methods with A Selection of Fast Correlation Based Filter Features in Detecting Parkinson's Disease

Parkinson's disease symptoms arise when dopamine levels fall as nerve cells in the brain are destroyed. People with this illness experience damage to the central nervous system, which lowers their quality of life. The disease is not directly fatal, but as quality of life declines, patients can no longer perform daily activities as healthy people do, and in some cases the disease can lead to death indirectly. This study compares support vector machine (SVM) and Naïve Bayes (NB) classifiers, with and without fast correlation-based filter (FCBF) feature selection, to determine the best model for detecting Parkinson's disease. The dataset comes from the UCI Machine Learning Repository. The results show that SVM with FCBF achieved the highest accuracy among all the models tested, with an accuracy of 86.1538%, a sensitivity of 93.8775%, and a specificity of 62.5000%. Both SVM and Naïve Bayes improved in performance with FCBF, with SVM showing the larger increase in accuracy. This research can help medical practitioners determine whether a patient has Parkinson's disease using characteristics obtained from data, such as movement, sound, or other pertinent factors.



Introduction
Parkinson's disease still affects people, and its prevalence must not be understated. According to data from the World Health Organization (WHO), 25% of individuals globally were affected by Parkinson's disease in 2017 [1]. Parkinson's is a cell-based degenerative disease in which the tissue mechanisms involved grow abnormally. This growth is described as aggressive and excessive, and it causes damage to cell tissue [2]. Parkinson's disease is the second most common neurodegenerative disorder and affects 2-3% of adults over 65. The disorder affects deep structures of the brain and results in the loss of nerve cells in the substantia nigra, which transmits impulses that control movement. Reduced sense of smell, constipation, sleeplessness, limb tremors, and trouble moving are common symptoms among Parkinson's patients [3].
Parkinson's can be detected using voice recordings. Patients were asked to pronounce vowels into a class 1 sound level measuring microphone placed 8 cm in front of the lips. The amplitude of the resulting signal is digitally normalized before analysis to determine differences in the patient's voice. The Multi-Dimensional Voice Program (MDVP) was used to assess many aspects of sound variation, including the harmonic-to-noise ratio (HNR), the noise-to-harmonic ratio (NHR), amplitude variation (shimmer), and period variation (jitter). Other parameters are also calculated to describe the complexity of the recorded signals and their fractal dimensions [4].
Previous studies have identified several risk factors for this illness. The risk is higher in males aged 40 to 70 [5]. Parkinson's patients had anxiety and hopelessness rates of 25.81 percent and 11.17 percent, respectively. Cui et al. [6] found several risk variables that affect the quality of life of individuals with Parkinson's disease. Anxiety, dyskinesias, poor sleep, and impaired motor function are risk factors for depression in Parkinson's disease patients. More severe autonomic dysfunction is a risk factor for PD patients with depression, but rapid eye movement sleep behavior disorder (RBD) is not.
Further studies by Mazon et al. [7], concentrating on Alzheimer's and Parkinson's, demonstrated the connection between metabolic alterations and neurodegenerative illnesses. They discussed the relationship between obesity and the onset of neurodegenerative diseases such as Alzheimer's and Parkinson's. Obesity has been shown to significantly affect how Parkinson's and Alzheimer's disease progress. Obesity-induced metabolic alterations in the central nervous system (CNS) may cause apoptosis and cell necrosis, which can ultimately result in neuronal death. These alterations also affect the synaptic plasticity of neurons.
Monocyte chemoattractant protein-1 and APQ3-IR have been linked to Parkinson's disease in Indonesia, as have age, minimum pitch, maximum pitch, average pitch, jitter level, adiponectin, glucose, and shimmer. Information gathered about Parkinson's disease at the University Hospital of Coimbra supports this. That data considers age, minimum, maximum, and average vocal fundamental frequency, jitter, shimmer (dB), glucose, shimmer: APQ3-IR, and monocyte chemoattractant protein-1. Early detection efforts are being made to reduce the impact of Parkinson's, and classification methods can support these efforts.
Support Vector Machines (SVM) and Naïve Bayes (NB) are widely used classification methods in previous research. The SVM approach copes with the curse of dimensionality, is efficient to apply, and can generalize to patterns that were not part of the training data. The SVM model has distinct advantages for small-sample, nonlinear, and high-dimensional problems, and it is less prone than the ANN model to becoming caught in locally optimal solutions. As a result, data classification, pattern recognition, regression analysis, and other tasks frequently use the SVM model [8]. The NB technique, in comparison, offers quick calculation times, high accuracy, and a straightforward algorithm. It benefits from simple logic and consistent performance because it operates under the assumption that all features are independent. Although the independence assumption is often hard to satisfy, the NB classifier performs well in practical applications [9].
SVM and NB are frequently employed in studies. Roy et al. [10] used NB classifiers, SVM, boosted trees, and random forests to separate early Parkinson's disease patients from the general population and to perform early (or preclinical) diagnosis of Parkinson's disease. The results show that the SVM classifier performs best (accuracy 96.40%, sensitivity 97.03%, specificity 95.01%, area under ROC 98.88%). Combining non-motor, CSF, and imaging indices can support the preclinical diagnosis of Parkinson's disease.
Emon et al. [11] carried out a different investigation with Naive Bayes. They used various data mining techniques, including Naive Bayes, K-nearest neighbors, and Random Forest classifiers, to diagnose hepatitis. With 10-fold cross-validation, K-nearest neighbors reached 95.8 percent accuracy, Random Forest 98.6 percent, and Naive Bayes 93.2 percent. Bashar et al. [12] coupled Simulated Annealing (SA) and SVM to determine the elements influencing water consumption. The findings indicated that the standard error of SVM-SA was 0.578 and that the hybrid SVM-SA model performed better than the other models.
Feature selection removes irrelevant features so that the performance of classification algorithms can be improved. One such method is the Fast Correlation-Based Filter (FCBF), which can improve both the running time and the accuracy of classification algorithms. Li et al. [13] introduced WT-FCBF-LSTM, a combined model built from a wavelet transform, a fast correlation-based filter, and long short-term memory. Their findings indicate that the WT-FCBF-LSTM model has greater prediction accuracy than the LSTM, WT-LSTM, and FCBF-LSTM models. The MRE and RMSE of the single express and split services forecasts are lower than those of the combined express and split services forecasts. This suggests that WT-FCBF-LSTM can effectively capture various geographic and temporal aspects of express and side-split services to improve prediction accuracy. Djellali [14] classified the WDBC (Wisconsin Diagnostic Breast Cancer), colon, hepatitis, diffuse large B-cell lymphoma (DLBCL), and lung cancer datasets using feature selection with FCBF and particle swarm optimization (PSO). The study's findings show that the FCBF-PSO approach is more accurate than the FCBF-GA technique; the reported accuracy is 86.11% for FCBF-PSO, against 83.33% for PSO without FCBF.
Parkinson's disease stands out for the distinct suite of biomedical sound measurements used to diagnose it and differentiate it from other conditions. These measurements include parameters such as average vocal fundamental frequency, fundamental frequency variation, and amplitude variation. FCBF is used to identify the most relevant and discriminatory of these features for classification, which makes it easier to find the characteristics that best separate people with Parkinson's disease from those without it.
Our research explores the early detection of Parkinson's disease by comparing the Support Vector Machine (SVM) and Naïve Bayes (NB) methods. Although SVM and NB are robust, the many complex features of Parkinson's disease call for a practical approach, so we integrate a Fast Correlation-Based Filter (FCBF) for feature selection. Our study focuses on implementing the FCBF method in the context of Parkinson's disease detection. By comparing the performance of SVM and NB with and without FCBF, we aim to explain the effectiveness of these methods. This dataset was previously studied by Avci, who diagnosed Parkinson's disease using the Kernel Extreme Learning Machine Genetic-Wavelet Algorithm [15].

Research Methods
This study compares two methods, SVM and Naïve Bayes, each with and without FCBF feature selection, to find the best model for classifying Parkinson's disease.

Fast Correlation-Based Filter (FCBF)
FCBF was developed by Yu and Liu [16] to perform feature selection. The algorithm rests on the principle that good features are strongly related to the class but not redundant with other features. The correlation between two random variables is measured using symmetrical uncertainty (SU), which lies between 0 and 1. Equation 1 shows the SU equation [14], [17].
$$SU(X, Y) = 2\left[\frac{IG(X \mid Y)}{H(X) + H(Y)}\right] \tag{1}$$
Here $H(X)$ is the entropy of $X$, $H(X \mid Y)$ is the entropy of $X$ when the variable $Y$ is known, and $IG(X \mid Y) = H(X) - H(X \mid Y)$ is the information gain.
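As an illustration of the symmetrical uncertainty measure described above, the following is a minimal sketch for discrete variables. The function names and the toy data are assumptions made for this example and do not come from the study; continuous features such as the voice measurements would first need to be discretized.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(X), in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(x, y):
    """H(X|Y): entropy of x inside each group of equal y, weighted by group size."""
    n = len(x)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(yi, []).append(xi)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X|Y) / (H(X) + H(Y)), a value between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant; convention chosen for this sketch
    info_gain = hx - conditional_entropy(x, y)
    return 2.0 * info_gain / (hx + hy)


# Hypothetical toy data: a discretized feature and a class label.
feature = [0, 0, 1, 1, 1, 0]
label = [0, 0, 1, 1, 1, 0]
print(symmetrical_uncertainty(feature, label))
```

A feature identical to the class label gives the maximum SU, while a feature that is statistically independent of the label gives an SU of zero.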

Support Vector Machine (SVM)
SVM is a supervised machine learning technique for predicting classes [18]. Training uses labeled input data to create the model. The result of training is a hyperplane, a decision boundary that separates two classes labeled +1 and -1. The optimal hyperplane is found by measuring the distance between the hyperplane and the closest pattern in each class. The method can be expressed as Equation 2 [19], [20].
$$f(x) = w^{T} x + b \tag{2}$$
Here $w$ is the weight vector, $x$ is the input variable (data), and $b$ is the bias. The margin should be maximized to obtain the best hyperplane. Figure 1 shows an ideal hyperplane.
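To make the roles of the weight w, the input x, and the bias b concrete, here is a small sketch of a linear decision function. The weights and inputs are hypothetical values chosen for illustration, not parameters learned in this study, and the margin formula 2/||w|| assumes the usual canonical scaling in which the closest patterns satisfy |f(x)| = 1.

```python
import math


def decision_value(w, x, b):
    """Signed score f(x) = w . x + b of a linear SVM."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, x, b):
    """Return +1 on the non-negative side of the hyperplane, otherwise -1."""
    return 1 if decision_value(w, x, b) >= 0 else -1


def margin_width(w):
    """Distance between the boundaries f(x) = +1 and f(x) = -1, equal to 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical parameters, not taken from the paper.
w, b = [3.0, 4.0], -5.0
print(predict(w, [3.0, 1.0], b))  # point on the positive side
print(predict(w, [0.0, 0.0], b))  # point on the negative side
print(margin_width(w))
```

Maximizing the margin corresponds to minimizing the norm of w, which is why smaller weights give a wider separation between the two classes.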

Naive Bayes (NB)
The NB algorithm uses an independent feature model, in which the features of a data point are treated as standing alone and unrelated to one another [22]. NB classifiers are based on Bayes' theorem together with the assumption that each feature contributes to the target class independently and equally. Bayes' theorem, the basic rule of the Naive Bayes classifier, is given in Equation 3 [23], [24], [25].
$$P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)} \tag{3}$$
$P(H)$ is the prior probability that hypothesis $H$ occurs before any evidence is seen, whereas $P(H \mid E)$ is the probability that hypothesis $H$ occurs given that evidence $E$ occurs. $P(E \mid H)$ is the probability of evidence $E$ given that hypothesis $H$ holds, and $P(E)$ is the prior probability of evidence $E$, regardless of any hypothesis or other evidence [26], [25]. As per Bayes' rule [25], the prior probability $P(H)$ represents the probability of a hypothesis before evidence is observed, and the posterior probability $P(H \mid E)$ represents its probability after the evidence is observed.
The approach has benefits and drawbacks. Its advantages are that it needs only a small amount of training data, can be used with quantitative and discrete data, functions well on various types of data, and can handle more than one piece of evidence. For instance, given the pieces of evidence $E_1$, $E_2$, and $E_3$, the posterior probability can be written as Equation 4 [10], [25]:
$$P(H \mid E_1, E_2, E_3) = \frac{P(E_1, E_2, E_3 \mid H)\, P(H)}{P(E_1, E_2, E_3)} \tag{4}$$
Its disadvantages are that it can only be used with statistical data, cannot be applied if a conditional probability is zero, and requires prior learning before decisions can be made [10].
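The way Bayes' theorem combines several pieces of evidence under the independence assumption can be sketched as follows. The hypothesis names, priors, and likelihood values are hypothetical numbers chosen for illustration; the function multiplies the prior by each per-evidence likelihood and then normalizes over all hypotheses.

```python
def naive_bayes_posterior(priors, likelihoods):
    """Posterior P(H | E1..En) under the naive independence assumption.

    priors:      {hypothesis: P(H)}
    likelihoods: {hypothesis: [P(E1|H), P(E2|H), ...]}
    """
    unnormalized = {}
    for hypothesis, prior in priors.items():
        score = prior
        for p in likelihoods[hypothesis]:
            score *= p  # independence: P(E1..En|H) = product of P(Ei|H)
        unnormalized[hypothesis] = score
    evidence = sum(unnormalized.values())  # P(E1..En); zero would make this undefined
    return {h: s / evidence for h, s in unnormalized.items()}


# Hypothetical values, not estimated from the Parkinson's dataset.
priors = {"parkinson": 0.5, "healthy": 0.5}
likelihoods = {"parkinson": [0.8, 0.5], "healthy": [0.2, 0.5]}
posterior = naive_bayes_posterior(priors, likelihoods)
print(posterior)
```

If any single likelihood is zero, the whole product for that hypothesis becomes zero, which is the zero conditional probability problem noted among the drawbacks.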

SVM Kernel
In the SVM learning procedure, the kernel trick transforms the data from the input feature space into a higher-dimensional feature space (the kernel space), where the data can be separated linearly [27]. Several kernels are used in SVM: linear, polynomial, radial basis function (RBF), and sigmoid.
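The four kernel types named above can be sketched with their commonly used textbook definitions. The parameter names `gamma`, `coef0`, and `degree` and their default values are assumptions for this example; the study does not state which parameter values it used.

```python
import math


def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)


# Hypothetical two-dimensional points.
x, z = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, z))
print(polynomial_kernel(x, z, degree=2))
print(rbf_kernel(x, x))
print(sigmoid_kernel([0.0, 0.0], z))
```

Each function returns a similarity score between two points; the SVM only ever needs these scores, never the coordinates in the higher-dimensional space.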

Hessian matrix
Each element of the Hessian matrix is a second partial derivative of a function. For a function $f(x)$ of $n$ variables whose second partial derivatives exist and are continuous, the Hessian matrix can be written as Equation 5 [31].
$$H_f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \tag{5}$$
The Hessian matrix collects the second-order derivatives of a function of more than one variable. More precisely, it is used to identify the stationary points of a function of two or more variables. For example, $f(x) = f(x_1, \ldots, x_n)$ is a real-valued function whose second partial derivatives are all continuous [32].
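The definition of the Hessian as a matrix of second partial derivatives can be illustrated numerically. The sketch below approximates each entry with central finite differences; the test function and the step size are assumptions for this example. It illustrates the general definition only and is not the specific Hessian computation used in the SVM training of this study, which the text does not spell out.

```python
def numerical_hessian(f, x, h=1e-3):
    """Approximate the Hessian of f at point x with central finite differences."""
    n = len(x)
    hessian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                point = list(x)
                point[i] += di
                point[j] += dj  # when i == j the two shifts add up on one coordinate
                return f(point)
            hessian[i][j] = (
                shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)
            ) / (4.0 * h * h)
    return hessian


def quadratic(p):
    """Hypothetical test function f(x0, x1) = x0^2 + 3*x0*x1 + 2*x1^2."""
    return p[0] ** 2 + 3.0 * p[0] * p[1] + 2.0 * p[1] ** 2


hessian = numerical_hessian(quadratic, [1.0, 2.0])
print(hessian)  # analytically [[2, 3], [3, 4]] for this quadratic
```

For this quadratic the analytic Hessian is constant, so the numerical result should match it at any point up to floating-point error.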

Confusion matrix
A confusion matrix is a table that sets the actual classes against the classes predicted by a classifier. A classifier is expected to categorize data precisely and produce good results with few mistakes, and the confusion matrix helps determine how effective the classification is [33]. It yields the following primary measures: accuracy, precision, specificity, and sensitivity. Table 1 displays the confusion matrix. Each of the parameters in Table 1 must be identified before these measures can be computed [19], [33], [34].
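The measures derived from a confusion matrix can be computed directly from its four counts. In the sketch below, the counts (138 true positives, 9 false negatives, 30 true negatives, 18 false positives) are an assumption: they are not reported in the text but were back-computed so that they reproduce the accuracy, sensitivity, and specificity reported for SVM with FCBF on 147 Parkinson's and 48 healthy samples.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard measures derived from the four cells of a binary confusion matrix."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "sensitivity": tp / (tp + fn),   # share of actual positives detected
        "specificity": tn / (tn + fp),   # share of actual negatives detected
        "precision": tp / (tp + fp),     # share of predicted positives that are correct
    }


# Assumed counts, inferred from the reported percentages rather than given in the paper.
metrics = classification_metrics(tp=138, fn=9, tn=30, fp=18)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.4f}%")
```

With these assumed counts, the relatively low specificity comes from the 18 healthy samples classified as Parkinson's, while only 9 Parkinson's samples are missed.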

Data collection
The UCI Machine Learning Repository provides access to the numeric Parkinson's Disease dataset created by Max Little at the University of Oxford in collaboration with the National Centre for Voice and Speech in Colorado. The dataset contains 22 parameters derived from speech signal recordings, with 147 samples from people with Parkinson's disease and 48 from healthy subjects [35]. The data are split into training and testing sets with an 80:20 ratio, giving 156 training samples and 39 test samples, the latter comprising 29 Parkinson's samples and ten healthy samples. After the data are split, the feature selection and classification procedures are applied.

Training and testing data
To assess the effectiveness of a model or algorithm, the initial step in this study was to divide the data into training and testing sections using k-fold cross-validation, a statistical approach. In k-fold cross-validation, the data is split into k equal (or nearly equal) folds. In each round of training and testing, one fold is used for testing and the remaining k-1 folds are used for training [36], [37]. We employed k-fold cross-validation with k = 5 in this study. The next stage is feature selection, which removes features that are not required in order to simplify classification. The feature selection method used is FCBF, and the accuracy obtained depends on the threshold parameter supplied to it. The FCBF output is divided into training data and test data, which are then passed to the classification phase using the SVM and NB methods.
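The k-fold procedure described above can be sketched in a few lines. The function name is hypothetical, and the folds here are contiguous and unshuffled for simplicity, which is an assumption; the text does not say whether the data were shuffled or stratified by class before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds of nearly equal size."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread any remainder over the first folds
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# 195 samples and k = 5, matching the dataset size and k value stated in the text.
folds = list(k_fold_indices(195, 5))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

With 195 samples and five folds, every round trains on 156 samples and tests on 39, which matches the 80:20 ratio of the data split.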
The first step of the SVM method is to input the Parkinson's disease data. Next, the SVM kernel is calculated. The Hessian matrix is then computed, followed by the sequential SVM training and SVM testing processes. The final step is to evaluate the classification.
The NB method first computes the probability of each label and the conditional probability of each case given each label. It then determines the label probabilities for the test data and compares them to assign a label. In addition, this study also classified the Parkinson's disease data without feature selection so that the accuracy levels could be compared. Based on these steps, the comparison is expected to yield the best model for diagnosing Parkinson's disease.

Result and Discussion
This study's output is the classifier's accuracy, which gauges how well the recognition process worked. The higher the accuracy, the more likely the recognition is to succeed. The identification process has two stages: feature selection and classification.

Feature selection
Feature selection aims to remove redundant features and retain relevant ones. This study uses FCBF with a threshold of 0.7, and its results are used at the classification stage. Table 2 compares the parameters with and without FCBF. The table shows 22 parameters before feature selection and ten parameters after it.
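How a threshold drives FCBF selection can be sketched as follows, based on the general two-step idea of the algorithm: keep features whose symmetrical uncertainty (SU) with the class reaches the threshold, then drop any feature that is at least as correlated with an already selected feature as it is with the class. The feature names and toy data are hypothetical, the features are assumed to be already discretized, and this is a simplified sketch rather than the exact implementation used in the study.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete (hashable) values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU = 2 * IG / (H(X) + H(Y)), with IG = H(X) + H(Y) - H(X, Y)."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    info_gain = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * info_gain / (hx + hy)


def fcbf(features, labels, threshold):
    """Return the names of the selected features, most relevant first."""
    relevance = {name: symmetrical_uncertainty(col, labels) for name, col in features.items()}
    # Step 1: keep features relevant to the class, ranked by decreasing SU.
    candidates = sorted(
        (name for name in relevance if relevance[name] >= threshold),
        key=lambda name: -relevance[name],
    )
    # Step 2: remove features that are redundant with a better-ranked selected feature.
    selected = []
    while candidates:
        best = candidates.pop(0)
        selected.append(best)
        candidates = [
            name for name in candidates
            if symmetrical_uncertainty(features[best], features[name]) < relevance[name]
        ]
    return selected


# Hypothetical toy data: "a" predicts the label, "b" duplicates "a", "noise" is unrelated.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
features = {"a": list(labels), "b": list(labels), "noise": [0, 1, 0, 1, 0, 1, 0, 1]}
selected = fcbf(features, labels, threshold=0.7)
print(selected)
```

In this toy case the duplicate feature is discarded as redundant and the unrelated feature fails the threshold, so a single feature remains.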

Classification
The data after feature selection fall into two classes. Class 0 indicates no Parkinson's disease, and class 1 indicates that symptoms are present. Classification is performed using the SVM and NB methods, and the SVM initially uses a linear kernel. Table 3 presents the accuracy obtained with 2 to 11 FCBF-selected features for the SVM with a linear kernel and the Naive Bayes (NB) classifier. The accuracy of the SVM classifier varies with the number of features, ranging from 84.6154% to 86.1538%. Likewise, the accuracy of the Naive Bayes classifier ranges from 74.8718% to 80.5128% as the number of selected features changes. Both models reach their best accuracy when two features are selected, with an SVM accuracy of 86.1538% and an NB accuracy of 80.5128%. The two features used in the classification model are spread1 and PPE.
The next step is to test the SVM kernel types to find the kernel with the best performance. Table 4 shows the SVM-FCBF performance test results based on the kernel type. Among these kernels, the linear kernel achieved the highest accuracy of 86.1538%, followed by the polynomial and RBF kernels, which each achieved an accuracy of 85.1282%. The sigmoid kernel has the lowest accuracy, 50.7692%.
The running times with and without FCBF feature selection are presented in Table 6. The running time of the SVM classifier decreases significantly when FCBF feature selection is used, particularly when the number of selected features is limited to two: it falls from 3.4942 seconds without FCBF to 0.0529 seconds with FCBF. This reduction can be attributed to the simpler computation that results from the reduced feature dimensions achieved through FCBF.
However, the reduction in running time did not occur for the NB classifier. When FCBF feature selection is applied, the running time of the NB classifier increases slightly, from 0.0255 seconds without FCBF to 0.0383 seconds with FCBF. This may be because the computational efficiency of the NB algorithm is inherently less affected by reduced feature dimensions than that of SVM. The NB model also gives the lowest results. Although FCBF aims to retain relevant features, the retained features may not satisfy the feature independence assumption of Naive Bayes. The lower performance of Naive Bayes can therefore be attributed to the model's inability to capture complex inter-feature relationships. The results can be illustrated with the Receiver Operating Characteristic (ROC) graph. Figure 2 displays the ROC graph for the SVM-FCBF technique.

Figure 2. ROC curve of the SVM-FCBF method
The ROC chart summarizes the results of the confusion matrix, with the horizontal axis showing the false positive rate and the vertical axis the true positive rate. The Area Under the Curve (AUC) obtained from this chart is 0.9, which indicates reasonably good classification performance. This dataset was previously studied by Avci, who diagnosed Parkinson's disease using the Kernel Extreme Learning Machine Genetic-Wavelet Algorithm [15]. That study achieved an accuracy of 96.81%, much higher than the 86.1538% reached by the model in this research (SVM-FCBF). This comparison may help clarify possible avenues for improving model performance in future work.
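How an ROC curve and its AUC are obtained from classifier scores can be sketched as follows. The labels and scores are hypothetical toy values, not outputs of the SVM-FCBF model, and the AUC is computed with the trapezoidal rule over the (false positive rate, true positive rate) points.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR) found by lowering the decision threshold over the scores.

    Assumes labels contain at least one positive (1) and one negative (0) sample.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all samples sharing the same score together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels (1 = Parkinson's, 0 = healthy) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
print(points)
print(auc(points))
```

An AUC of 1.0 would mean every positive sample is scored above every negative one, while 0.5 corresponds to random ranking.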

Conclusion
A system that implements the SVM and NB methods with feature selection by a fast correlation-based filter can classify Parkinson's disease accurately. The feature selection process successfully identifies relevant features, which are then used in the identification process. The next step is classification using the SVM and NB methods. Based on the results obtained, SVM-FCBF performs best, with an accuracy of 86.1538%, a sensitivity of 93.8775%, and a specificity of 62.5000%. These results could serve as a reference for future studies that apply SVM and Naive Bayes to detect Parkinson's disease. Future work could also test a widely used alternative, the deep learning approach, which may obtain better results.

Table 1. Confusion matrix

Table 2. Comparison of parameters with and without FCBF

Table 3. Accuracy test results by number of features with FCBF

Table 4. SVM-FCBF performance test results by kernel type

Table 5. Results of accuracy, sensitivity, and specificity testing with and without FCBF

Table 5 compares the accuracy, sensitivity, and specificity results with and without FCBF for the SVM and NB methods. SVM with FCBF achieves the highest accuracy among all models. This suggests that, when combined with FCBF feature selection, SVM balances correctly identifying positive cases with overall accuracy. Table 4 also shows that SVM-FCBF increased the specificity by 6.25%, while the sensitivity decreased by 1.4%. Despite this, the sensitivity remains high, above 90%. High sensitivity indicates that the model effectively reduces false negatives, so that actual disease cases are not missed at diagnosis. If sensitivity were low, the model could misclassify individuals with Parkinson's as healthy, producing false negatives that delay or prevent early diagnosis and intervention. Meanwhile, NB without FCBF had the highest specificity, indicating that it is better at correctly identifying negative cases, but its sensitivity is lower. The results highlight the importance of feature selection (FCBF) in improving model performance. Both SVM and NB benefit from FCBF, and SVM shows the more significant accuracy improvement.

Table 6. Running time with and without FCBF

Table 4 displays the confusion matrix of the SVM-FCBF method.