Comparison of BoW and TF-IDF Feature Extraction for SMS Classification Using Naive Bayes

  • I Komang Dwiprayoga, Udayana University
  • Made Agung Raharja, Udayana University

Abstract

Short Message Service (SMS) has become one of the most popular communication media. However, the ease and speed of sending SMS messages are also exploited by irresponsible parties to send spam. These spam messages not only annoy users but can also cause financial losses and theft of personal data. The purpose of this research is to compare the performance of two feature extraction methods, TF-IDF and Bag of Words (BoW), when paired with the Multinomial Naive Bayes machine learning algorithm. The research stages are loading the dataset, data balancing, data preprocessing, feature extraction, modeling with the machine learning algorithm, and then testing and comparing the confusion matrices of the models built on each feature extraction method. The results show that BoW feature extraction performs better than TF-IDF feature extraction, with an accuracy of 94.44%.
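The comparison described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the toy messages, the whitespace tokenizer, the smoothed IDF formula ln((1 + n) / (1 + df)) + 1, the unnormalized TF-IDF weights, and the Laplace smoothing value alpha = 1 are all assumptions chosen for illustration, and the data balancing and preprocessing stages are omitted.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    return text.lower().split()

def bow_vectors(docs, vocab):
    # Bag of Words: raw term counts per document; out-of-vocabulary words are ignored.
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vectors

def idf_weights(train_docs, vocab):
    # Assumed smoothed IDF: ln((1 + n) / (1 + df)) + 1, fitted on training data only.
    n = len(train_docs)
    token_sets = [set(tokenize(doc)) for doc in train_docs]
    return [math.log((1 + n) / (1 + sum(term in s for s in token_sets))) + 1
            for term in vocab]

def tfidf_vectors(docs, vocab, idf):
    # TF-IDF: term count times IDF weight (no length normalization in this sketch).
    return [[c * w for c, w in zip(vec, idf)] for vec in bow_vectors(docs, vocab)]

def train_nb(X, y, alpha=1.0):
    # Multinomial Naive Bayes with Laplace smoothing (alpha is an assumed value).
    log_prior, log_like = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        totals = [sum(col) + alpha for col in zip(*rows)]
        denom = sum(totals)
        log_like[c] = [math.log(t / denom) for t in totals]
    return log_prior, log_like

def predict_nb(model, X):
    log_prior, log_like = model
    preds = []
    for x in X:
        scores = {c: log_prior[c] + sum(v * w for v, w in zip(x, log_like[c]))
                  for c in log_prior}
        preds.append(max(scores, key=scores.get))
    return preds

def confusion_matrix(y_true, y_pred, labels):
    # Rows are true labels, columns are predicted labels.
    return [[sum(1 for t, p in zip(y_true, y_pred) if t == a and p == b)
             for b in labels] for a in labels]

# Invented toy messages for illustration only; NOT the dataset used in the paper.
train_texts = ["win free prize now", "free cash claim now",
               "are you coming home", "see you at lunch"]
train_labels = ["spam", "spam", "ham", "ham"]
test_texts = ["claim your free prize", "lunch at home"]
test_labels = ["spam", "ham"]

vocab = sorted({t for doc in train_texts for t in tokenize(doc)})
idf = idf_weights(train_texts, vocab)

# Train and evaluate one model per feature extraction method.
bow_model = train_nb(bow_vectors(train_texts, vocab), train_labels)
bow_pred = predict_nb(bow_model, bow_vectors(test_texts, vocab))

tfidf_model = train_nb(tfidf_vectors(train_texts, vocab, idf), train_labels)
tfidf_pred = predict_nb(tfidf_model, tfidf_vectors(test_texts, vocab, idf))

labels = ["ham", "spam"]
bow_cm = confusion_matrix(test_labels, bow_pred, labels)
tfidf_cm = confusion_matrix(test_labels, tfidf_pred, labels)
print("BoW confusion matrix:", bow_cm)
print("TF-IDF confusion matrix:", tfidf_cm)
```

On data this small both feature sets classify the two test messages identically, so the sketch only shows the mechanics of the comparison; it says nothing about the 94.44% accuracy reported above, which depends on the paper's dataset and preprocessing.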

Published
2025-02-01
How to Cite
DWIPRAYOGA, I Komang; RAHARJA, Made Agung. Komparasi Ekstraksi Fitur BoW dan TF-IDF untuk Klasifikasi SMS Menggunakan Naive Bayes. Jurnal Nasional Teknologi Informasi dan Aplikasinya, [S.l.], v. 3, n. 2, p. 247-254, Feb. 2025. ISSN 3032-1948. Available at: <https://ojs.unud.ac.id/index.php/jnatia/article/view/115986>. Date accessed: 30 Jan. 2025. doi: https://doi.org/10.24843/JNATIA.2025.v03.i02.p03.