Labeling Indonesia COVID-19 Data Using K-Means Clustering with Optimization

Duman Care Khrisne; I Made Arsa Suyadnya; AA Ngurah Cakra Khana

Duman Care Khrisne, Department of Electrical Engineering, Faculty of Engineering, Udayana University
I Made Arsa Suyadnya, Department of Electrical Engineering, Faculty of Engineering, Udayana University
AA Ngurah Cakra Khana, Department of Electrical Engineering, Faculty of Engineering, Udayana University

Abstract

Coronavirus disease 2019 (COVID-19) has spread throughout the world, causing a pandemic that affects social life, education, and tourism, particularly in Indonesia. The government has implemented various policies to reduce the rate of cases in Indonesia. Data play a very important role in determining such policies and regulations, but the available data for Indonesia are still limited and have not been labeled. This study labels Indonesian COVID-19 data using K-Means clustering, a method that partitions the data, here 16,936 records, into groups. The number of groups is determined with the Elbow method and optimized with the Davies-Bouldin Index. The result is the set of clusters used as labels for the Indonesian COVID-19 data: the Elbow method combined with the Davies-Bouldin Index yields 4 clusters, with 15,315 members in cluster 0, 1,191 in cluster 1, 222 in cluster 2, and 208 in cluster 3.
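
The selection procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It runs on synthetic two-dimensional points rather than the COVID-19 data, uses Euclidean distance, and seeds the centroids with a deterministic farthest-first rule, chosen here only to make the run reproducible. The function names (farthest_first, kmeans, inertia, davies_bouldin) are hypothetical. For each candidate k, the sketch reports the within-cluster sum of squared errors, which is the quantity inspected by the Elbow method, and the Davies-Bouldin Index, for which lower values indicate more compact and better separated clusters.

```python
import math
import random


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from the chosen seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


def inertia(points, centroids, labels):
    """Within-cluster sum of squared distances (the Elbow-method curve)."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def davies_bouldin(points, centroids, labels):
    """Davies-Bouldin Index; assumes no two centroids coincide."""
    k = len(centroids)
    scatter = []  # mean distance of each cluster's members to its centroid
    for c in range(k):
        dists = [math.dist(p, centroids[c]) for p, lab in zip(points, labels) if lab == c]
        scatter.append(sum(dists) / len(dists) if dists else 0.0)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Synthetic stand-in data: four Gaussian blobs of 50 points each.
rng = random.Random(42)
centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
points = [(rng.gauss(cx, 1), rng.gauss(cy, 1)) for cx, cy in centers for _ in range(50)]

results = {}
for k in range(2, 8):
    centroids, labels = kmeans(points, k)
    results[k] = (inertia(points, centroids, labels), davies_bouldin(points, centroids, labels))
    print(f"k={k}  SSE={results[k][0]:.1f}  DBI={results[k][1]:.3f}")

best_k = min(results, key=lambda k: results[k][1])
print("k with the lowest Davies-Bouldin Index:", best_k)
```

In this sketch the Elbow curve is only printed, since locating the bend is usually a visual judgment, while the final k is taken as the candidate with the lowest Davies-Bouldin Index. The labels returned by kmeans for that k would then serve as the cluster labels for each record.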
