A Multilingual BERT-based classification of reviews for enhanced visitors’ experience analysis


Abstract

Cultural organizations can now use online platforms to study users’ opinions and the most discussed topics relating to both general and specific cultural offerings. Although data acquisition tools are available, managing the resulting unstructured data remains a hurdle. To overcome this, we propose a classification model that transforms unorganized data into a structured thematic database. Our case study is the Italian city of Brescia. We build a language model that classifies online reviews into four semantic areas defined by the city’s key attractions, fine-tuning the pre-trained Multilingual BERT model on a multiclass classification task. The model shows promising results on traditional performance metrics. Additionally, we detect clusters of reviews by applying the HDBSCAN algorithm to the vector representations produced by the model. We use the Keyness statistic, a transformation of the chi-square statistic, to extract cluster-specific keywords. These keywords prove highly consistent with the characteristics and offerings of the key cultural attractions, further confirming the model’s good performance. The results show that policymakers and managers of cultural tourism institutions can use the proposed model to understand textual data and derive relevant insights about visitors’ experience at specific attractions of interest.
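
The keyword-extraction step can be illustrated with a minimal sketch. The abstract does not state which transformation of the chi-square statistic is used, so the code below assumes one common formulation: a signed chi-square computed on a 2x2 table of word counts in a target cluster versus all other clusters. The function name `keyness_chi2` and the toy token lists are hypothetical and are not taken from the article.

```python
from collections import Counter


def keyness_chi2(target_tokens, reference_tokens):
    """Signed chi-square keyness for each word in the target cluster.

    Assumed formulation (not confirmed by the article): a 2x2 chi-square
    on word counts, made negative when the word is under-represented
    in the target cluster.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_t = sum(target.values())      # total tokens in target cluster
    n_r = sum(reference.values())   # total tokens in reference clusters
    n = n_t + n_r

    scores = {}
    for word, a in target.items():
        b = reference.get(word, 0)  # word count in reference
        c = n_t - a                 # other words in target
        d = n_r - b                 # other words in reference
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:              # degenerate table, no evidence
            scores[word] = 0.0
            continue
        chi2 = n * (a * d - b * c) ** 2 / denom
        expected_a = (a + b) * n_t / n
        scores[word] = chi2 if a >= expected_a else -chi2
    return scores


# Toy, made-up tokens standing in for two clusters of reviews.
museum_cluster = "museum roman museum exhibition ticket museum".split()
other_clusters = "castle view castle ticket tower walk".split()

scores = keyness_chi2(museum_cluster, other_clusters)
top_keyword = max(scores, key=scores.get)
```

In this toy example, words concentrated in the target cluster receive large positive scores, while words spread evenly across clusters score near zero. In practice the token lists would come from the reviews assigned to each HDBSCAN cluster.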
