Aspect Based Sentiment Analysis on Shopee Application Reviews Using Support Vector Machine
Abstract
Shopee is one of the e-commerce platforms operating in Indonesia. User feedback is needed to improve the quality of e-commerce services and user satisfaction. The research process comprises data scraping, labeling, text pre-processing, TF-IDF term weighting, aspect classification, and sentiment classification. The novelty of this research is the use of a Support Vector Machine (SVM) trained with stochastic gradient descent (SGD) to classify Indonesian-language application reviews by aspect category, covering 7 dimensions of service quality, and by sentiment, so that the website built in this research can display the aspects and sentiments of an input review. This research also builds an Indonesian normalization dictionary to standardize the terms used and thereby increase model accuracy. Testing of the aspect classification yielded a precision of 90%, recall of 88.73%, accuracy of 88.57%, and F1-score of 89%. The sentiment classification yielded a precision of 96.15%, recall of 91.91%, accuracy of 94.28%, and F1-score of 93.98%. In addition, the test results (accuracy, F1-score, precision, recall) show that lemmatization performs better than stemming, and that TF-IDF term weighting performs better than the other methods tested (raw term frequency, log-frequency weighting, binary term weighting).
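To make the term weighting and classification steps concrete, the following is a minimal sketch, not the authors' implementation. It assumes the plain `tf × log10(N/df)` variant of TF-IDF, a binary (positive/negative) sentiment task, and a linear SVM fitted by SGD on the L2-regularized hinge loss. The toy reviews, the function names (`term_weights`, `train_linear_svm_sgd`, `predict`), and the hyperparameters (`epochs`, `lr`, `lam`) are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def term_weights(docs, scheme="tfidf"):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        if scheme == "raw":        # raw term frequency
            vec = {t: float(c) for t, c in tf.items()}
        elif scheme == "log":      # log-frequency weighting
            vec = {t: 1.0 + math.log10(c) for t, c in tf.items()}
        elif scheme == "binary":   # binary term weighting
            vec = {t: 1.0 for t in tf}
        elif scheme == "tfidf":    # assumed variant: tf * log10(N / df)
            vec = {t: c * math.log10(n_docs / df[t]) for t, c in tf.items()}
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        weighted.append(vec)
    return weighted


def score(w, b, x):
    """Linear decision value w.x + b for a sparse vector x."""
    return sum(w.get(t, 0.0) * v for t, v in x.items()) + b


def train_linear_svm_sgd(vectors, labels, epochs=50, lr=0.1, lam=0.01):
    """Binary linear SVM via SGD on hinge loss + L2 penalty; labels are +1/-1."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * score(w, b, x)
            # Gradient step for the L2 penalty: shrink every weight.
            for t in w:
                w[t] *= 1.0 - lr * lam
            # Gradient step for the hinge loss, active only inside the margin.
            if margin < 1.0:
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
                b += lr * y
    return w, b


def predict(w, b, x):
    return 1 if score(w, b, x) >= 0 else -1


# Hypothetical, already pre-processed Indonesian reviews (toy data).
docs = [
    ["pengiriman", "cepat", "bagus"],
    ["aplikasi", "bagus", "mudah"],
    ["pengiriman", "lambat", "kecewa"],
    ["aplikasi", "error", "kecewa"],
]
labels = [1, 1, -1, -1]  # +1 = positive sentiment, -1 = negative sentiment

vectors = term_weights(docs, scheme="tfidf")
w, b = train_linear_svm_sgd(vectors, labels)
predictions = [predict(w, b, x) for x in vectors]
print(predictions)
```

Changing the `scheme` argument to `"raw"`, `"log"`, or `"binary"` swaps in the other three weighting methods named in the abstract, which is the kind of comparison the reported results refer to. Aspect classification over 7 categories would need a multi-class extension, for example one binary classifier per category, which this sketch does not cover.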
![Creative Commons License](http://i.creativecommons.org/l/by/4.0/88x31.png)
This work is licensed under a Creative Commons Attribution 4.0 International License.
The Authors submitting a manuscript do so on the understanding that if accepted for publication, the copyright of the article shall be assigned to Jurnal Lontar Komputer as the publisher of the journal. Copyright encompasses exclusive rights to reproduce and deliver the article in all forms and media, as well as translations. The reproduction of any part of this journal (printed or online) will be allowed only with written permission from Jurnal Lontar Komputer. The Editorial Board of Jurnal Lontar Komputer makes every effort to ensure that no wrong or misleading data, opinions, or statements be published in the journal.