Penerapan Metode Clustering Text Mining Untuk Pengelompokan Berita Pada Unstructured Textual Data (Application of the Text Mining Clustering Method for Grouping News in Unstructured Textual Data)

  • Nyoman Gede Yudiarta, Sekretariat Daerah Provinsi Bali
  • Made Sudarma, Universitas Udayana
  • Wayan Gede Ariastina, Universitas Udayana

Abstract

Good governance means a government whose programs are known to and benefit its people. In the Bali Provincial Government, the body responsible for disseminating information is the Bureau of Public Relations of the Bali Regional Secretariat, which does so through the media it owns. News entered into the Bureau's website was not assigned a category at input time, which makes it difficult to tell which news items belong to which category. Clustering is one method for solving this problem, and K-Means is one of the algorithms used for clustering. This study focuses on designing a system that groups news data into categories using K-Means. To make clustering easier, the documents are preprocessed first; preprocessing consists of case folding, tokenization, filtering, and stemming. Tf-Idf is then used to weight the terms obtained from the preprocessed documents. Experiments on data sets of 50, 100, 200, 300, 400, and 500 items show that the K-Means algorithm can cluster the news with satisfactory accuracy: an average precision of 73.11%, a recall of 69.65%, and a purity of 0.80 across all test data. Comparing the individual test sets, the 50-item test has the highest average precision (76.92%) and recall (79.58%), while the highest purity, 0.83, occurs in the 300-item test.
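
The pipeline described in the abstract (preprocessing, Tf-Idf weighting, K-Means, purity) can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions: the four English headlines, their category labels, the stop-word list, and the suffix-stripping `stem` helper are hypothetical placeholders (the study works on Indonesian news and would need a proper stemmer), the weight is the plain tf × ln(N/df) variant, distance is squared Euclidean, and the initial centroids are fixed rather than random so the run is reproducible.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a full list for the news language.
STOP_WORDS = {"the", "a", "in", "of", "on", "for", "and", "to"}

def stem(token):
    # Placeholder stemmer (assumption): strips at most one common English suffix.
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    tokens = re.findall(r"[a-z]+", text.lower())              # case folding + tokenization
    return [stem(t) for t in tokens if t not in STOP_WORDS]   # filtering + stemming

def tfidf_vectors(docs_tokens):
    # One dimension per vocabulary term, weighted by tf * ln(N / df).
    vocab = sorted({t for doc in docs_tokens for t in doc})
    n = len(docs_tokens)
    df = Counter(t for doc in docs_tokens for t in set(doc))
    vectors = []
    for doc in docs_tokens:
        tf = Counter(doc)
        vectors.append([tf[t] * math.log(n / df[t]) for t in vocab])
    return vectors

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, initial_centroids, max_iter=20):
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def purity(labels, truth):
    # Fraction of documents that belong to the majority class of their cluster.
    total = 0
    for cluster in set(labels):
        classes = Counter(t for lab, t in zip(labels, truth) if lab == cluster)
        total += classes.most_common(1)[0][1]
    return total / len(labels)

# Hypothetical toy data, not taken from the study.
docs = [
    "Governor opens the arts festival in Denpasar",
    "Arts festival parade draws large crowds",
    "Province budget meeting discusses road funding",
    "Budget funding approved for road repairs",
]
truth = ["culture", "culture", "budget", "budget"]

tokens = [preprocess(d) for d in docs]
vectors = tfidf_vectors(tokens)
labels = kmeans(vectors, initial_centroids=[vectors[0], vectors[2]])
score = purity(labels, truth)
print(labels, score)
```

Purity is computed here by comparing cluster assignments with manually assigned categories, which is also what precision and recall would require; on a real corpus the seeds would normally be chosen at random or by a seeding heuristic, and results would vary between runs.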

Published
2018-12-05
How to Cite
YUDIARTA, Nyoman Gede; SUDARMA, Made; ARIASTINA, Wayan Gede. Penerapan Metode Clustering Text Mining Untuk Pengelompokan Berita Pada Unstructured Textual Data. Majalah Ilmiah Teknologi Elektro, [S.l.], v. 17, n. 3, p. 339-344, dec. 2018. ISSN 2503-2372. Available at: <https://ojs.unud.ac.id/index.php/JTE/article/view/41047>. Date accessed: 02 june 2020. doi: https://doi.org/10.24843/MITE.2018.v17i03.P06.