Comparison of Naive Bayes Method and Certainty Factor for Diagnosis of Preeclampsia

Preeclampsia is a disease often suffered by pregnant women and is associated with several factors such as hereditary history, blood pressure, urine protein, and diabetes. The data sample used in this study consists of records of pregnant women from 2020 collected at health services in the former Cilacap Regency. This study compares the final results of the Naive Bayes method and the certainty factor method in diagnosing preeclampsia from the symptoms experienced by these pregnant women. The Naive Bayes approach reaches a decision by processing statistical data and probabilities derived from the predicted likelihood that a pregnant woman shows symptoms of preeclampsia, while the certainty factor method determines the certainty value of a preeclampsia diagnosis based on the calculation of the CF value. The comparison shows that the certainty factor method provides more accurate diagnostic results than the Naive Bayes method. This is because the CF method requires a value between a minimum of 0.2 and a maximum of 1 for each rule on the factors/symptoms involved, while the Naive Bayes method only requires values of 0 and 1 for each factor causing preeclampsia in pregnant women.


Introduction
Preeclampsia is a hypertensive disorder in pregnant women that significantly affects morbidity and is one of the causes of death in pregnant women and fetuses [1], [2]. According to the World Health Organization (WHO), the Maternal Mortality Ratio (MMR) measures deaths of women during pregnancy or in the period around delivery, up to 42 days after the end of pregnancy, from any cause related to the pregnancy or its management, but not from injury or accident [3]. The MMR and the Infant Mortality Ratio (IMR) are among the benchmarks for the health and welfare of the people in a country [4]. WHO reports, drawing on various sources, that maternal deaths occur mainly during and after childbirth and that 75% are caused directly by bleeding, infection, or high blood pressure during pregnancy [5]. According to WHO data, the prevalence of preeclampsia is 1.8-18% in developing countries and 1.3-6% in developed countries. Cases of preeclampsia are therefore more frequent in developing countries than in developed countries, because preventive treatment of pregnant women with preeclampsia is provided faster in developed countries [6]. In Indonesia alone, the MMR for the last ten years was 459 maternal and fetal deaths per 100,000 births, with preeclampsia occurring in around 3% to 10% of all pregnancies. The MMR in Indonesia as a developing country is still relatively high. Data from the Inter-Census Population Survey (SUPAS) recorded an MMR of 305 over the last five years; that is, 305 maternal deaths caused by pregnancy, from pregnancy through 42 days after delivery, per 100,000 live births [7]. In Cilacap Regency, data from the Cilacap Regency Health Office show that over the last two years the MMR was 15 cases and the IMR was 155 cases.
Meanwhile, the maximum targets in the Regional Medium-Term Development Plan (RPJMD) of Cilacap Regency are an MMR of 19 cases and an IMR of 139 cases [8]. Measured against this target, the MMR in Cilacap Regency is still quite high even though it is below the maximum set [9]. Relevant institutions in Cilacap Regency are therefore concerned with continuing to reduce the MMR and IMR so that community welfare improves. The risk of maternal death can be identified from the mother's general condition during the 40 weeks of gestation [10].
One way to identify this risk is through health examinations of pregnant women at available health facilities [11]. Such identification reduces the risk of death of pregnant women and fetuses, because the risk can be predicted from the symptoms experienced during pregnancy and handled promptly and correctly in the most dangerous period, namely the period around delivery [12]. An expert system can be described simply as a transfer of knowledge from an expert to a computer through an information system that can be used without restrictions of time and place [13]. The expert system asks for facts, which are used for knowledge inference and then processed to reach conclusions or decisions that narrow down to a single result derived from those facts [14]. The conclusion is treated as the result of a consultation with an expert: it provides non-expert advice and explains possible solutions and their consequences [15].
Several studies have applied the Naive Bayes method and the certainty factor method to detect various diseases. Hanny mapped the spread of acute respiratory tract infections (ARI) using the Naive Bayes method. Classification was carried out on ARI data so that the community can respond to the spread of ARI and medical personnel can be helped to meet the targets set for eradicating it. The result of that study is a visualization that maps the spread of ARI based on Naive Bayes classification [16]. Yovita et al. implemented the Naive Bayes method in an expert system for diagnosing dysmenorrhea. The diagnosis concludes whether the dysmenorrhea suffered by a woman falls into the category of primary or secondary dysmenorrhea using Naive Bayes classification. The analysis shows a classification accuracy of 90% for the ten data items tested [17]. Muhammad et al. used the Naive Bayes algorithm to determine the credit given to prospective customers. The algorithm predicts and classifies customers as potentially problematic or non-problematic so that the company does not lose money on customers who are likely to cause bad loans in the future [18]. Khairina et al. applied the certainty factor method to an expert system for diagnosing ENT diseases. The expert in that study is an ENT specialist who provided complete and detailed information about the causes and symptoms experienced by patients with problems of the ears, nose, and throat.
The result of that study is a website-based information system that diagnoses ENT diseases from the symptoms selected by the patient and returns information about the ENT disease that matches the selected symptoms [19].
Based on these previous studies, the authors are interested in comparing the certainty factor method and the Naive Bayes method in diagnosing preeclampsia in pregnant women. The results of tracing preeclampsia with the two methods are used to design and develop an expert system. This is done by exploring expert knowledge, which serves as the knowledge base in the expert system development environment [20]. The consulting environment has a user interface, annotation facilities, and an inference engine connected to the development environment [21]. After expert knowledge has been extracted, the next step in designing the expert system for diagnosing preeclampsia in pregnant women is to form rules from the facts in the knowledge base, which are later used in the tracing process [22]. The conclusions/decisions given are non-expert; if there are doubts about the results, they can be discussed with real experts [23]. It is hoped that the developed expert system will help reduce the Maternal Mortality Ratio (MMR) by preventing the death of pregnant women and babies as early as possible. The research uses the certainty factor and Naive Bayes methods to find which is more effective in recommending a category of preeclampsia (severe, moderate, or mild) based on the factors/symptoms. The expected benefit of this research is to provide stakeholders with fast and accurate information for diagnosing the category of preeclampsia from the factors/symptoms experienced by pregnant women.

Research Methods
This section describes the certainty factor method, the Naive Bayes method, the data on factors that cause preeclampsia, the rule data used by the two methods for tracing preeclampsia, and the flowcharts for each method being compared.

Naïve Bayes Method
The Naive Bayes method is best known and most widely used for classification. In the expert system developed here, it is used to classify data on the symptoms experienced by pregnant women in order to estimate the chance of preeclampsia, which delays normal delivery if not treated early, and to reach a conclusion about preeclampsia from the highest posterior score [24], [25]. The Naive Bayes approach suits an expert system for the early detection of preeclampsia because it defines rules that use probability to produce an appropriate decision/recommendation [26]. The Naive Bayes calculation of disease probability goes through the following stages [28]:
a. Calculate the average of each class using equation (1) to find the initial value for each class involved [29], where:
Qd = the value of the data records in the training data that have a = aj and p = pi
X = 1 / number of class/disease types
r = number of symptoms/parameters
q = the value of the data records in the training data that have a = aj for each class/disease
b. Determine the likelihood value for each class using equation (2) [30].
c. Determine the posterior value for each class involved using equation (3) [31].
The final result of the Naive Bayes method is a classification of the classes involved in estimating the chance of preeclampsia, obtained by comparing the final posterior values of each class [32]. The classification result is the class with the highest posterior value among those compared [33].
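The equations for these stages are not reproduced in the text. As a reading aid only, the block below gives a common textbook formulation of Naive Bayes with an m-estimate, written with the symbols described in stage a. Treating Qd and q as record counts, and the use of a class prior P(a_j), are assumptions rather than the authors' stated equations.

```latex
% Assumed m-estimate of the conditional probability of symptom p_i in class a_j
P(p_i \mid a_j) = \frac{Q_d + r\,X}{q + r}

% Likelihood of class a_j over the r symptoms
L(a_j) = \prod_{i=1}^{r} P(p_i \mid a_j)

% Posterior up to a constant shared by all classes;
% the diagnosis is the class with the largest value
P(a_j \mid p_1, \dots, p_r) \propto P(a_j)\, L(a_j)
```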

Certainty Factor Method
The certainty factor method traces a conclusion by starting from the observed symptoms [28]. The tracing measures the certainty of a set of facts or rules [34]. In this case, the facts are the symptoms experienced by pregnant women from the first trimester to the last trimester. The data are collected to build rules for tracing preeclampsia [35]. The certainty factor (CF) value is calculated to show the degree of confidence in the facts of an event [36]. One reason for choosing the certainty factor method to diagnose preeclampsia in pregnant women is that it can measure both certainty and uncertainty when the expert system reaches a decision [37]. The measure of certainty of a fact is denoted by MB (Measure of increased Belief), while the measure of uncertainty is denoted by MD (Measure of increased Disbelief) [19]. The stages of the CF value search process are as follows [38]:
a. Determine the value of CF
The final result of the certainty factor method is a certainty value for a decision, namely the disease affecting the pregnant woman [11]. The accuracy of the calculation is maintained because the method processes only two data items in one calculation [39], [40]. Figure 2 shows the stages of the certainty factor method: determining the CF value for each premise of the rule used, then determining the combined CF value for one or more premises, and finally determining the CF value for the same conclusion, namely the diagnosis of preeclampsia [41].
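The equations for these stages are not reproduced in the text. As a reading aid, the block below gives the standard certainty factor relations from the expert-system literature. The subscripted names (user, expert, rule, combine, 1, 2) are illustrative, and the combination rule shown is the form for two non-negative certainty values, which is assumed here because the CF values in this study lie between 0.2 and 1.

```latex
% Certainty factor from the measures of belief and disbelief
CF(H, E) = MB(H, E) - MD(H, E)

% Certainty of one premise: user certainty scaled by expert certainty
CF_{rule} = CF_{user} \times CF_{expert}

% Combining two certainty values for the same conclusion
CF_{combine}(CF_1, CF_2) = CF_1 + CF_2\,(1 - CF_1), \qquad CF_1, CF_2 \ge 0
```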

Preeclampsia
The data on symptoms/factors causing preeclampsia used in this study are shown in Table 1, while Table 2 describes the elements grouped under each symptom in Table 1. Table 3 shows examples of the rule data used to diagnose preeclampsia, based on the data in Tables 1 and 2. The rules in Table 3 are formed from the knowledge base obtained after consulting with experts, namely obstetricians and midwives. The diagnosis is divided into four categories: severe preeclampsia with the symbol (B), moderate preeclampsia with the symbol (S), mild preeclampsia with the symbol (R), and undetected preeclampsia with the symbol (T).

Result and Discussion
This section compares the calculations of the certainty factor method and the Naive Bayes method; the results of the two approaches are compared in terms of accuracy. Both calculations use the example rule for diagnosing preeclampsia in Table 3.
B 0.17*0.17*0.17*0.17*0.17*0.17 : 0.00024
S 0*0*0.11*0.11*0.11*0 : 0
R 0*0*0*0.125*0.125*0.125 : 0
T 0*0*0*0*0*0.143 : 0
d. The posterior value for the class of severe preeclampsia is 0.00024, for moderate preeclampsia 0, for mild preeclampsia 0, and for undetected preeclampsia 0.
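The ranking in step d can be checked with a few lines of Python. This is a minimal sketch, not the authors' implementation: the factor lists are copied from the worked example, the names `likelihoods` and `class_score` are hypothetical, class priors are left out because they are not given in the text, and assigning the first row to class B is an assumption based on step d. Multiplying six factors of 0.17 directly gives about 2.4e-5, so the absolute value differs from the reported 0.00024; the sketch relies only on the ranking, which is unaffected.

```python
from math import prod

# Likelihood factors per class, copied from the worked example in the text.
# Assigning the first row to class B is an assumption based on step d.
likelihoods = {
    "B": [0.17, 0.17, 0.17, 0.17, 0.17, 0.17],  # severe
    "S": [0, 0, 0.11, 0.11, 0.11, 0],           # moderate
    "R": [0, 0, 0, 0.125, 0.125, 0.125],        # mild
    "T": [0, 0, 0, 0, 0, 0.143],                # undetected
}


def class_score(factors):
    """Multiply the per-symptom factors of one class (no prior applied)."""
    return prod(factors)


# Score every class, then pick the class with the largest score.
scores = {label: class_score(factors) for label, factors in likelihoods.items()}
diagnosis = max(scores, key=scores.get)

print(diagnosis)
```

Any class with a single zero factor scores zero, which is why only the severe class remains a candidate in this example.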

Certainty Factor Method
a. Calculations using the certainty factor method begin by finding the user CF and expert CF values for each of the factors/symptoms that cause preeclampsia using equation (4). Table 7 shows the user CF and expert CF values for each factor causing preeclampsia.
b. Once the user CF and expert CF values are known, the CF Combine value, which is determined by more than one premise, is calculated using equation (6). Table 8 shows the resulting CF values for symptom 1 (G1), symptom 2 (G2), and symptom 3 (G3) according to the rules in Table 3.
c. The last step is to determine the CF Combine value for each rule in the expert system for early detection of preeclampsia in pregnant women using equation (7) [45].
Figure 3 compares the CF Combine 1 and CF Combine 2 values from the previous calculation. The graph shows that CF Combine 2, drawn as a red line, gives the certainty factor value for the category of severe preeclampsia. This value is better than CF Combine 1, drawn as a blue line, for all factors/symptoms involved in each rule of the expert system for early detection of preeclampsia in pregnant women.
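Steps a to c can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names `premise_cf`, `combine`, and `rule_certainty` are hypothetical, the (user CF, expert CF) pairs are made-up values within the 0.2 to 1 range and are not taken from Table 7, and the combination rule is the standard form for non-negative certainty values.

```python
from functools import reduce


def premise_cf(user_cf, expert_cf):
    """Certainty of one premise: user certainty scaled by expert certainty."""
    return user_cf * expert_cf


def combine(cf_old, cf_new):
    """Combine two non-negative certainty values for the same conclusion."""
    return cf_old + cf_new * (1.0 - cf_old)


def rule_certainty(premises):
    """Fold the premise certainties of one rule into a single CF value."""
    cfs = [premise_cf(user, expert) for user, expert in premises]
    return reduce(combine, cfs)


# Hypothetical (user CF, expert CF) pairs for symptoms G1, G2, G3.
premises = [(0.8, 0.6), (0.6, 0.8), (1.0, 0.4)]
result = rule_certainty(premises)

print(round(result, 5))  # 0.83776
```

Because each call to `combine` takes exactly two values, a rule with several premises is evaluated pairwise, one premise at a time, and the combined value never exceeds 1.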

Comparison Results
The results of the comparison of the Naive Bayes method and the certainty factor method are as follows:
a. With the Naive Bayes method, the class with the largest probability value is severe preeclampsia, with a value of 0.00024 according to the results in Table 6.
b. With the certainty factor method, the disease detected early is also preeclampsia in the severe category, with the CF Combine value shown in Figure 3, where the curve peaks at a CF C2 value of 0.6912.
Based on the calculations above, the certainty factor method is more accurate than the Naive Bayes method in the early detection of preeclampsia in pregnant women. This is because the certainty factor method requires a value for each rule on all symptoms/factors causing preeclampsia in order to determine the CF Combine value. In contrast, the Naive Bayes method only requires a value of 0 or 1 for all factors/symptoms involved in the expert system rule base [46].

Conclusion
Based on the background of the problem, the methods compared, the calculations for each method, and the final results, it can be concluded that the certainty factor method is more accurate than the Naive Bayes method for the early detection of preeclampsia in pregnant women. The reason is that the certainty factor method requires a certainty value between a minimum of 0.2 and a maximum of 1 for the user CF and expert CF values, while the Naive Bayes method only requires values of 0 and 1 for each factor/symptom involved. By implementing the certainty factor method, the expert system for early detection of preeclampsia produces a more accurate diagnosis from a tracing process that follows the symptoms experienced by the patient.