Implementation of Document Clustering in Online News Using K-Means Clustering

  • Yogiswara Dharma Putra, Department of Electrical and Computer Engineering, Post Graduate Program, Udayana University
  • Ni Wayan Sri Ariyani, Department of Electrical and Computer Engineering, Post Graduate Program, Udayana University
  • Ida Bagus Alit Swamardika, Department of Electrical and Computer Engineering, Post Graduate Program, Udayana University

Abstract

The development of technology has caused an explosion in the number of news documents on the internet, so clustering is needed to divide these documents into groups that match the categories of online news. Document clustering can increase the effectiveness of information retrieval, based on the hypothesis that relevant documents tend to fall in the same cluster once a collection of documents has been clustered. This research clusters online news about Covid-19 taken from three online news websites and stored in XML files. K-Means clustering is used to group the online news with an open-source application, Carrot2 Workbench, which generated nine clusters for the query "Covid-19" entered in the Carrot2 application.
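As a rough illustration of the grouping step described in the abstract, the sketch below runs K-Means over TF-IDF vectors of a few short texts. It is not the Carrot2 Workbench pipeline used in the study: the four headlines are invented for the example, the function names (`tfidf_vectors`, `farthest_first`, `kmeans`) are hypothetical, and deterministic farthest-first seeding is substituted for random initialisation so that the result is reproducible.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Turn each document into an L2-normalised sparse TF-IDF dict."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch) avoids zero weights.
        vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def centroid(members):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = defaultdict(float)
    for vec in members:
        for term, weight in vec.items():
            total[term] += weight
    return {t: w / len(members) for t, w in total.items()}


def farthest_first(vectors, k):
    """Deterministic seeding: start at the first vector, then repeatedly
    add the vector farthest from all centroids chosen so far."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    return centroids


def kmeans(vectors, k, max_iterations=20):
    """Plain K-Means: alternate assignment and centroid update steps."""
    centroids = farthest_first(vectors, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # Assignments stopped changing, so the run has converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels


# Invented toy headlines, not data from the study.
docs = [
    "covid vaccine rollout begins",
    "vaccine doses covid hospitals",
    "rupiah exchange rate weakens",
    "exchange rate rupiah stock market",
]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)
```

With these toy inputs the two vaccine headlines land in one cluster and the two exchange-rate headlines in the other. A real news collection would also need the usual preprocessing (tokenisation, stop-word removal, stemming) before vectorisation.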

Published

2020-12-13

How to Cite

DHARMA PUTRA, Yogiswara; ARIYANI, Ni Wayan Sri; SWAMARDIKA, Ida Bagus Alit. Implementation of Document Clustering in Online News Using K-Means Clustering. International Journal of Engineering and Emerging Technology, [S.l.], v. 5, n. 2, p. 61-65, dec. 2020. ISSN 2579-5988. Available at: <https://ojs.unud.ac.id/index.php/ijeet/article/view/60059>. Date accessed: 25 apr. 2024. doi: https://doi.org/10.24843/IJEET.2020.v05.i02.p10.