Web-based Application for Classification Using Naïve Bayes and K-means Clustering (Case Study: Tic-tac-toe Game)
Abstract
A database can consist of numerical and non-numerical attributes. However, several data processing algorithms, such as K-means clustering, can be applied only to datasets with numerical attributes. Data generalization with the Naïve Bayes and K-means clustering methods is usually carried out in the WEKA (Waikato Environment for Knowledge Analysis) application. Although the strength of WEKA lies in its increasingly complete and sophisticated algorithms, the success of data mining still depends on the knowledge of the human implementer. Collecting high-quality data, understanding the modeling, and choosing appropriate algorithms are all needed to guarantee the accuracy of the expected formulations. In this paper, we propose a simple web-based application that can be used like WEKA. The methodology used in this study includes several stages. The first stage is data preparation, in which the tic-tac-toe game dataset is converted to CSV (comma-separated values) format. The next stage converts the data from non-numeric to numeric form, specifically for clustering with the K-means algorithm. Afterward, the distances between data points are calculated, followed by data clustering. The final stage summarizes these processes and results. The experimental results show that clustering can be performed on categorical attributes once they have been transformed into numerical form by the web-based application.
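The stages described above (CSV input, non-numeric to numeric conversion, distance calculation, clustering) can be sketched roughly as follows. This is only an illustrative sketch, not the application's actual code: the numeric encoding of the board symbols, the sample rows, the function names, and the choice of seeding the centroids with the first k rows are all assumptions made here for the example.

```python
import csv
import io
import math

# Assumed mapping from board symbols to numbers (x, o, blank).
# The abstract does not state which numeric values the application uses.
ENCODING = {"x": 1.0, "o": -1.0, "b": 0.0}

# Hypothetical rows: nine board cells followed by a class label.
SAMPLE_CSV = """x,x,x,x,o,o,x,o,o,positive
x,x,x,x,o,o,o,x,o,positive
o,x,x,b,o,x,x,o,o,negative
o,x,x,b,o,x,o,b,o,negative
"""


def load_boards(csv_text):
    """Parse CSV text and convert the nine categorical cells to numbers."""
    vectors, labels = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        vectors.append([ENCODING[cell] for cell in row[:9]])
        labels.append(row[9])
    return vectors, labels


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans(vectors, k, max_iter=100):
    """Lloyd's K-means, seeded with the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assign every vector to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the algorithm has converged
        assignment = new_assignment
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


if __name__ == "__main__":
    vectors, labels = load_boards(SAMPLE_CSV)
    assignment, centroids = kmeans(vectors, k=2)
    # Summary stage: show each row's class label next to its cluster.
    for label, cluster in zip(labels, assignment):
        print(label, "-> cluster", cluster)
```

Seeding with the first k rows keeps the sketch deterministic; a real application would more likely pick the initial centroids at random or let the user choose them. Note also that mapping symbols to numbers imposes an ordering on the categories, so the resulting distances depend on the encoding chosen.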