Classification of Medicine Types Based on Patient Symptoms Using the K-Nearest Neighbors (KNN) Method
Abstract
This research applies the K-Nearest Neighbors (KNN) algorithm to classify medicine types from patient symptoms, using a Kaggle dataset with 200 rows and 6 columns. After preprocessing (handling missing values, encoding categorical variables, and splitting the data into training and testing sets), exploratory data analysis (EDA) was performed to understand the dataset's structure. The KNN model was evaluated with k values of 1, 2, and 3. The best result was obtained at k=3, with an accuracy of 77.50%, average precision of 0.76, recall of 0.69, and F1-score of 0.66. Accuracy was lower for k=1 (67.50%) and k=2 (65.00%), so k=3 is the most effective of the tested values for this dataset. These results suggest that KNN is a viable method for classifying medicine types based on symptoms, although a larger dataset is recommended to improve accuracy.
Keywords: K-Nearest Neighbors (KNN), classify, medicine, exploratory data analysis (EDA), preprocessing
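The evaluation procedure summarized in the abstract (split the data, classify each test row by majority vote among its k nearest training rows, compare accuracy for k = 1, 2, 3) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the labels `type_a` and `type_b` are made up, and the 80/20 split, the Euclidean distance, and the tie-breaking rule are all assumptions.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest training rows."""
    # Squared Euclidean distance gives the same neighbor ordering as Euclidean distance.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_X, train_y)
    )
    nearest = [label for _, label in dists[:k]]
    counts = Counter(nearest)
    best = max(counts.values())
    # Assumed tie-break: among the top-voted labels, the closest neighbor wins.
    return next(label for label in nearest if counts[label] == best)


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)


# Synthetic stand-in data (NOT the paper's dataset): two numeric features,
# two hypothetical labels, 200 rows in total.
rng = random.Random(0)
rows = [([rng.gauss(0, 1), rng.gauss(0, 1)], "type_a") for _ in range(100)]
rows += [([rng.gauss(2, 1), rng.gauss(2, 1)], "type_b") for _ in range(100)]
rng.shuffle(rows)

split = int(0.8 * len(rows))  # assumed 80/20 train/test split
train, test = rows[:split], rows[split:]
train_X, train_y = [x for x, _ in train], [y for _, y in train]
test_X, test_y = [x for x, _ in test], [y for _, y in test]

for k in (1, 2, 3):
    print(f"k={k}: accuracy={accuracy(train_X, train_y, test_X, test_y, k):.2%}")
```

With the tie-break assumed here, k=2 always produces the same predictions as k=1, because a 1-1 vote falls back to the nearest neighbor. Other tie-breaking rules can make the two differ.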
This work is licensed under a Creative Commons Attribution 4.0 International License.
The Authors submitting a manuscript do so on the understanding that if accepted for publication, the copyright of the article shall be assigned to JNATIA (Jurnal Nasional Teknologi Informasi dan Aplikasinya) as the publisher of the journal. Copyright encompasses exclusive rights to reproduce and deliver the article in all forms and media, as well as translations. The reproduction of any part of this journal (printed or online) will be allowed only with written permission from JNATIA (Jurnal Nasional Teknologi Informasi dan Aplikasinya). The Editorial Board of JNATIA (Jurnal Nasional Teknologi Informasi dan Aplikasinya) makes every effort to ensure that no wrong or misleading data, opinions, or statements be published in the journal.