Improving The Accuracy of Sentiment Analysis using Slang Words Lexicon and Spelling Correction
Abstract
Text pre-processing has long been studied as a way to improve the accuracy of Natural Language Processing models. In this paper, we propose a technique for text sentiment classification that adds two pre-processing steps, a slang word lexicon and spelling correction, to annotate and normalize informal Indonesian text. This study aims to improve the accuracy of sentiment analysis models by strengthening text pre-processing methods. We compared the performance of these pre-processing methods using two popular classification algorithms, Support Vector Machine (SVM) and Naïve Bayes, and three feature extraction methods: term presence, Bag of Words, and TF-IDF. The models were trained and tested on a dataset of 1705 Twitter posts from Indonesian users about COVID-19. Results show