Topic Modeling of Hotel Reviews Using the BERTopic Method with the c-TF-IDF Procedure

  • I Komang Tryana Mertayasa Universitas Udayana
  • I Dewa Made Bayu Atmaja Darmawan Universitas Udayana


User reviews on travel guidance services are textual data that can be useful to other users. By identifying the topics discussed in user reviews of hotel products, travel guidance service providers can group those reviews by topic. Topic modeling methods can be used to group textual data into topics. This study applies the BERTopic method to model topics in user reviews of hotel products on the travel guidance service TripAdvisor. It uses secondary data in the form of hotel reviews from the TripAdvisor site. Topic modeling with BERTopic proceeds through document embedding, dimensionality reduction (UMAP), clustering (HDBSCAN), and c-TF-IDF topic representation. The BERTopic model produced 78 topics with a topic coherence of 0.07287 and a topic diversity of 0.496154. When the number of topics to be generated is lowered, both topic coherence and topic diversity decrease.
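The final pipeline step and one of the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the embedding, UMAP, and HDBSCAN steps have already assigned each review to a cluster, and it uses hypothetical toy reviews. The function names (c_tf_idf, top_words, topic_diversity) are illustrative. The weighting follows the c-TF-IDF formula from the BERTopic paper, W(t, c) = tf(t, c) * log(1 + A / tf(t)), where A is the average number of words per class. Raw counts are used for tf here, whereas library implementations may normalize them. Topic diversity is taken in its common definition: the fraction of unique words among the top-k words of all topics.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_topic):
    """Return {topic: {term: weight}} for pre-clustered, tokenized documents."""
    # Treat all documents of a cluster as one class-level bag of words.
    class_counts = {
        topic: Counter(token for doc in docs for token in doc)
        for topic, docs in docs_by_topic.items()
    }
    # tf(t): frequency of each term across all classes.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    # A: average number of words per class.
    avg_words = sum(total_counts.values()) / len(class_counts)
    return {
        topic: {
            term: tf * math.log(1 + avg_words / total_counts[term])
            for term, tf in counts.items()
        }
        for topic, counts in class_counts.items()
    }


def top_words(weights, k):
    """Pick the k highest-weighted terms per topic (ties broken alphabetically)."""
    return {
        topic: [t for t, _ in sorted(w.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for topic, w in weights.items()
    }


def topic_diversity(topics, k):
    """Fraction of unique words among the top-k words of all topics."""
    words = [word for topic in topics for word in topic[:k]]
    return len(set(words)) / len(words)


# Hypothetical toy clusters standing in for HDBSCAN output.
clusters = {
    0: [["clean", "room", "comfortable", "bed"], ["room", "clean", "quiet"]],
    1: [["friendly", "staff", "helpful"], ["staff", "friendly", "room"]],
}
weights = c_tf_idf(clusters)
tops = top_words(weights, k=4)
print(tops)
print(topic_diversity(list(tops.values()), k=4))  # 0.875
```

In this toy example the word "room" appears in the top words of both topics, which is what pulls the diversity below 1. A value near 0.5, as reported above, would indicate that roughly half of the top words are shared between topics.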


How to Cite
MERTAYASA, I Komang Tryana; DARMAWAN, I Dewa Made Bayu Atmaja. Pemodelan Topik Pada Ulasan Hotel Menggunakan Metode BERTopic Dengan Prosedur c-TF-IDF. Jurnal Nasional Teknologi Informasi dan Aplikasinya (JNATIA), [S.l.], v. 1, n. 1, p. 307-316, nov. 2022. Available at: <>. Date accessed: 26 jan. 2023.
