Patient Satisfaction-Driven Doctor Recommendation Strategy Using HIS Data: A Real-World Machine Learning Approach
Abstract
Background: Patient satisfaction is a core indicator of healthcare quality, and doctor–patient communication is one of its primary determinants. Existing doctor recommendation models are typically designed for online or disease-specific contexts and rely on subjective ratings or limited clinical data, so they are poorly suited to routine in-person care. Scalable, objective approaches are needed that support personalized doctor recommendations without increasing clinical burden.

Objective: This study aims to develop and evaluate a doctor recommendation model that uses structured hospital information system (HIS) data. By combining clustering with a random forest classifier, the model predicts patient satisfaction from doctor and patient behavioral patterns, enabling personalized recommendations that fit real-world clinical workflows.

Method: We analyzed 49,372 outpatient visit records from a tertiary hospital in China and excluded patients with fewer than three visits. A random forest model was developed to predict patient satisfaction (negative, neutral, or positive) from 63 behavioral features derived from HIS data. Doctor types were identified through clustering. A satisfaction-driven recommendation strategy reassigned patients originally matched with doctor types associated with lower satisfaction to types associated with higher satisfaction within the same department. Model performance and recommendation effects were evaluated using AUC, recall, precision, and F1-score.

Results: The final dataset included 27,290 visits from patients with ≥3 encounters. The model achieved AUCs of 0.85 (negative), 0.82 (neutral), and 0.81 (positive), with validation and test accuracies of 0.77 and 0.78, respectively. After the recommendation strategy was applied, 319 cases shifted from negative or neutral to positive satisfaction, significantly more than the 79 cases that shifted in the opposite direction (McNemar–Bowker symmetry test, p < .001).
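As a rough plausibility check on the reported shift counts, the sketch below collapses satisfaction into positive versus non-positive and applies a plain two-category McNemar test to the 319 and 79 discordant cases. This is not the three-category McNemar–Bowker test reported in the abstract, which needs the full 3×3 transition table that is not given here. The function name `mcnemar_chi2` is illustrative.

```python
import math

def mcnemar_chi2(b: int, c: int) -> tuple:
    """Uncorrected McNemar statistic and p-value for discordant counts b and c."""
    if b + c == 0:
        raise ValueError("no discordant pairs")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# 319 cases moved to positive, 79 moved away from positive (counts from the abstract).
stat, p_value = mcnemar_chi2(319, 79)
print(f"chi2 = {stat:.2f}, p < .001: {p_value < 0.001}")
# prints: chi2 = 144.72, p < .001: True
```

Even in this simplified two-category form, the asymmetry between the two directions of change is far too large to attribute to chance, which is consistent with the reported p < .001.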
Doctor visit distribution also became more balanced after recommendation: the share of visits handled by Type A doctors fell from 50.1% toward a more even allocation across the four doctor types (p < .001), indicating improved workload distribution alongside the satisfaction gains.

Discussion and Conclusion: Objective HIS data support effective, personalized doctor recommendations and avoid the limitations of methods based on subjective ratings. The model adapts dynamically to real-time data and requires no additional workload or operational changes, which supports seamless implementation. This approach enhances patient experience while providing actionable insights for quality improvement.
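The reassignment step described in the Method can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a fitted classifier has already produced, for each visit, a predicted probability of positive satisfaction under each doctor type, and it simply picks the best-scoring type available in the same department. The names `Visit`, `recommend`, and `types_in_dept`, and all example values, are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    patient_id: str
    department: str
    doctor_type: str   # doctor type the patient was originally matched with
    p_positive: dict   # hypothetical model output: doctor type -> P(positive)

def recommend(visit: Visit, types_in_dept: dict) -> str:
    """Return the doctor type in the visit's department with the highest
    predicted probability of positive satisfaction. The original type is
    kept unless another type in the department scores strictly higher."""
    best = visit.doctor_type
    for t in types_in_dept[visit.department]:
        if visit.p_positive.get(t, 0.0) > visit.p_positive.get(best, 0.0):
            best = t
    return best

# Made-up values for illustration only.
types_in_dept = {"cardiology": ["A", "B", "C"]}
visits = [
    Visit("p1", "cardiology", "A", {"A": 0.40, "B": 0.65, "C": 0.55}),
    Visit("p2", "cardiology", "B", {"A": 0.30, "B": 0.70, "C": 0.60}),
]
recommended = [recommend(v, types_in_dept) for v in visits]
print(recommended)  # prints: ['B', 'B']
```

A greedy rule like this one ignores doctor capacity. A strategy that also balances workload across doctor types, as the Results describe, would need an additional constraint that this sketch does not model.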