Mapping Digital Sentiment Landscapes of Hotel Reviews: A Machine Learning-Based Cross-Platform Analysis
DOI:
https://doi.org/10.29408/edumatic.v10i1.33701
Keywords:
class imbalance, cross-platform analysis, multinomial naïve bayes, sentiment analysis, tf-idf
Abstract
The expansion of online travel agencies (OTAs) has produced large volumes of user-generated hotel reviews, offering important resources for sentiment analysis of consumer perceptions. However, prior studies largely rely on single-platform datasets and focus on classification performance, with limited attention to cross-platform sentiment consistency and the impact of data imbalance. This study aims to analyse and compare sentiment patterns across Traveloka, Tiket.com, and Accor, while evaluating a machine learning framework under imbalanced data conditions. It adopts a quantitative experimental design using 3,000 Indonesian-language reviews collected via web scraping. The independent variable is the review text, and the dependent variable is the sentiment class (positive or negative). Data were preprocessed, transformed using TF-IDF, and classified using Multinomial Naïve Bayes, with performance evaluated by accuracy, precision, recall, and F1-score. The results show that positive sentiment consistently dominates across all platforms, with Accor achieving the highest classification performance, followed by Tiket.com and Traveloka. However, very high recall values for the positive class indicate substantial class imbalance, which biases predictions towards the majority class and reduces sensitivity to negative sentiment. This study provides empirical evidence of cross-platform sentiment consistency and highlights the importance of addressing data imbalance in sentiment modelling.
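The link between class imbalance and very high positive-class recall can be illustrated with a small sketch. The labels and counts below are hypothetical and are not taken from the study's data; the helper names `per_class_metrics` and `evaluate` are likewise illustrative assumptions. The sketch computes accuracy and per-class precision, recall, and F1-score for a classifier that favours the majority class.

```python
def per_class_metrics(tp, fp, fn):
    """Return (precision, recall, f1) for one class from its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def evaluate(y_true, y_pred, labels=("positive", "negative")):
    """Return overall accuracy and a dict of per-class (precision, recall, f1)."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        report[label] = per_class_metrics(tp, fp, fn)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, report


# Hypothetical imbalanced test set: 90 positive and 10 negative reviews.
y_true = ["positive"] * 90 + ["negative"] * 10
# Hypothetical predictions: every positive is found, but 8 of the 10
# negative reviews are mislabelled as positive.
y_pred = ["positive"] * 90 + ["positive"] * 8 + ["negative"] * 2

accuracy, report = evaluate(y_true, y_pred)
print(f"accuracy: {accuracy:.2f}")
for label, (precision, recall, f1) in report.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case, accuracy stays high and positive-class recall is perfect, while negative-class recall is low. This is why per-class metrics, rather than accuracy alone, are needed to see the effect of imbalance.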
License
Copyright (c) 2026 Muhammad Kholid Ridwan, Yudie Irawan, Raden Rhoedy Setiawan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All articles in this journal are the sole responsibility of the authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.



