Implementation of Document Clustering in Online News Using K-Means Clustering
Abstract
Advances in technology have produced an explosion in the number of news documents on the internet, so clustering is needed to divide these documents into groups that match the categories of online news. Document clustering can improve the effectiveness of information retrieval; it rests on the hypothesis that, once a collection of documents has been clustered, relevant documents tend to fall in the same cluster. This research clusters online news about Covid-19 taken from three online news websites and stored in XML files. K-Means clustering is used to group the news with an open-source application, Carrot2 Workbench, which generated nine clusters for the query "Covid-19" entered in the application.
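To make the grouping step concrete, the following is a minimal sketch of K-Means applied to text, assuming TF-IDF weighting and cosine similarity. It is not the Carrot2 implementation and not the configuration used in this research: the four sample headlines, the helper names, and the choice of the first k documents as starting centroids are all illustrative assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into unit-length TF-IDF vectors (dicts)."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF keeps every weight positive.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def mean_vector(vectors):
    """Unit-length mean of a group of sparse vectors (the cluster centroid)."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {t: w / norm for t, w in total.items()} if norm else {}


def kmeans(vectors, k, max_iter=20):
    """Assign each vector to one of k clusters; returns a list of labels."""
    # Assumption: the first k documents seed the centroids so the run is
    # reproducible. Real tools usually pick the seeds differently.
    centroids = [dict(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: the most similar centroid wins.
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # converged: no document changed cluster
        labels = new_labels
        # Update step: recompute each centroid from its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


# Hypothetical headlines, not data from the study.
headlines = [
    "covid vaccine trial results announced",
    "government lockdown policy extended city",
    "vaccine trial shows covid immunity",
    "lockdown policy city economy impact",
]
docs = [h.lower().split() for h in headlines]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)  # [0, 1, 0, 1]
```

With these sample headlines the two vaccine stories share one label and the two lockdown stories share the other. On real news data the number of clusters, the preprocessing (stemming, stop-word removal), and the seeding strategy would all need to be chosen deliberately.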