Mapping Digital Sentiment Landscapes of Hotel Reviews: A Machine Learning-Based Cross-Platform Analysis

Authors

DOI:

https://doi.org/10.29408/edumatic.v10i1.33701

Keywords:

class imbalance, cross-platform analysis, multinomial naïve bayes, sentiment analysis, tf-idf

Abstract

The expansion of online travel agencies (OTAs) has produced large volumes of user-generated hotel reviews, an important resource for sentiment analysis of consumer perceptions. However, prior studies largely rely on single-platform datasets and focus on classification performance, with limited attention to cross-platform sentiment consistency and the impact of data imbalance. This study aims to analyse and compare sentiment patterns across Traveloka, Tiket.com, and Accor, while evaluating a machine learning framework under imbalanced data conditions. It adopts a quantitative experimental design using 3,000 Indonesian-language reviews collected via web scraping. The independent variable is the review text, and the dependent variable is the sentiment class (positive/negative). Reviews were preprocessed, transformed into TF-IDF features, and classified using Multinomial Naïve Bayes, with performance evaluated by accuracy, precision, recall, and F1-score. The results show that positive sentiment consistently dominates across all platforms, with Accor achieving the highest classification performance, followed by Tiket.com and Traveloka. However, very high recall values for the positive class indicate substantial class imbalance, which biases predictions toward the positive class and reduces sensitivity to negative sentiment. This study provides empirical evidence of cross-platform sentiment consistency and highlights the importance of addressing data imbalance in sentiment modelling.
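To illustrate why very high positive-class recall can signal imbalance rather than a good model, the following minimal sketch computes accuracy and per-class precision, recall, and F1 from scratch. The 90/10 class split, the predictions, and the helper names `per_class_metrics` and `accuracy` are hypothetical and chosen only for illustration; they are not the study's data, code, or results.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class, treating `label` as the target."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced test set: 90 positive and 10 negative reviews.
y_true = ["positive"] * 90 + ["negative"] * 10

# Hypothetical biased classifier: labels every positive review correctly
# but recognises only 2 of the 10 negative reviews.
y_pred = ["positive"] * 90 + ["negative"] * 2 + ["positive"] * 8

acc = accuracy(y_true, y_pred)
pos = per_class_metrics(y_true, y_pred, "positive")
neg = per_class_metrics(y_true, y_pred, "negative")

print(f"accuracy: {acc:.2f}")
print("positive: precision={:.2f} recall={:.2f} f1={:.2f}".format(*pos))
print("negative: precision={:.2f} recall={:.2f} f1={:.2f}".format(*neg))
```

In this toy case accuracy is 0.92 and positive recall is 1.00, yet negative recall is only 0.20. This is why per-class metrics, rather than accuracy alone, are needed to judge a classifier trained on imbalanced reviews.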

Published

2025-04-02

How to Cite

Ridwan, M. K., Irawan, Y., & Setiawan, R. R. (2025). Mapping Digital Sentiment Landscapes of Hotel Reviews: A Machine Learning-Based Cross-Platform Analysis. Edumatic: Jurnal Pendidikan Informatika, 10(1), 110–119. https://doi.org/10.29408/edumatic.v10i1.33701