Exploring Internet hospital patient demand patterns from online consultation content using text clustering


Abstract

Background Patient-generated questions in Internet hospital online consultations are typically short, semantically incomplete, and expressed in non-standardized language, which makes demand recognition and knowledge discovery difficult. This study aims to identify the core categories of patient concerns by clustering real-world online consultation texts, and to explore their focal issues and how they evolve over time. The findings are expected to provide empirical evidence for optimizing medical resource allocation and improving service delivery models.
Methodology We used patient online consultation texts from an Internet hospital as the study material. A hybrid representation was constructed by combining TF-IDF-weighted Word2Vec semantic features with LDA topic features, and clustering was performed using the K-means++ algorithm. Clustering performance was evaluated using the Silhouette Coefficient (SC), Davies–Bouldin Index (DBI), and Calinski–Harabasz Index (CHI). In addition, topic analysis and time-series visualization were applied to reveal the distribution and evolution of patient demand themes.
Results The proposed model outperformed the baseline models on the evaluation metrics (SC = 0.5473, DBI = 0.7908, CHI = 10773.26), yielding more stable and interpretable clusters. Based on this framework, six major themes were identified: appointment and registration, doctor inquiry and consultation, examinations and tests, medication and inpatient medical records, fee settlement and insurance, and customer services and account management. Temporal evolution analysis further revealed that these themes exhibited stage-specific fluctuations and seasonal aggregation, indicating that the model captures both static structures and dynamic trends in patient needs.
Conclusion The multi-feature fusion clustering approach enables a more comprehensive exploration of patients' concerns in online consultations. It provides empirical evidence for understanding the service landscape of Internet hospitals and a reference for advancing precision service delivery and informing policy development in online healthcare.
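
The pipeline described in the Methodology can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the toy English sentences in `docs` are hypothetical stand-ins for the consultation texts, the helper `hybrid_features` and its parameters are invented names, and the word vectors come from a truncated SVD of the TF-IDF matrix as a placeholder for a trained Word2Vec model. Only the overall shape follows the abstract: TF-IDF-weighted word vectors are concatenated with LDA topic proportions, clustered with K-means++ initialization, and scored with SC, DBI, and CHI.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import normalize

# Hypothetical toy texts, not data from the study.
docs = [
    "how do I book an appointment with a cardiologist",
    "cannot register for an appointment tomorrow",
    "change my appointment registration time",
    "when will my blood test report be ready",
    "where can I see my test results and report",
    "is my scan report ready to download",
    "how do I pay the consultation fee with insurance",
    "insurance settlement failed for my fee payment",
    "refund the fee I paid twice",
]


def hybrid_features(texts, n_dims=4, n_topics=3, seed=0):
    """Concatenate TF-IDF-weighted word vectors with LDA topic proportions."""
    tfidf = TfidfVectorizer()
    X = tfidf.fit_transform(texts)  # shape: (n_docs, n_terms)

    # Placeholder word vectors from an SVD of the term-document matrix.
    # Assumption: a real pipeline would load trained Word2Vec vectors here.
    svd = TruncatedSVD(n_components=n_dims, random_state=seed)
    word_vecs = svd.fit_transform(X.T)  # shape: (n_terms, n_dims)

    # TF-IDF-weighted average of the word vectors in each document.
    weights = np.asarray(X.sum(axis=1)).reshape(-1, 1)
    doc_vecs = np.asarray(X @ word_vecs) / np.maximum(weights, 1e-12)

    # LDA topic proportions, fitted on raw term counts.
    counts = CountVectorizer().fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    topic_vecs = lda.fit_transform(counts)  # each row sums to 1

    # L2-normalize the semantic part so neither block dominates distances.
    return np.hstack([normalize(doc_vecs), topic_vecs])


features = hybrid_features(docs)

# K-means with k-means++ seeding; k=3 is chosen only for this toy corpus.
kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

sc = silhouette_score(features, labels)          # higher is better, in [-1, 1]
dbi = davies_bouldin_score(features, labels)     # lower is better, >= 0
chi = calinski_harabasz_score(features, labels)  # higher is better

print(f"SC={sc:.4f}  DBI={dbi:.4f}  CHI={chi:.2f}")
```

The three scores point in different directions: SC and CHI increase as clusters become tighter and better separated, while DBI decreases. On a corpus this small the scores say little in themselves; the sketch only shows how the pieces fit together.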
