Topic and sentiment in comments on diabetes-related Douyin short videos: a cross-sectional text-mining study

Abstract

Background: Short-form video platforms are increasingly used for diabetes-related health information, and comment sections may capture users’ information needs and affective responses.

Methods: We analysed publicly visible top-level comments on diabetes-related videos on Douyin (the Chinese counterpart of TikTok) using a cross-sectional text-mining design. Videos were drawn from a previously evaluated dataset (n = 276) and stratified into four quadrants by information quality (final consensus modified DISCERN score) and diffusion (Douyin Communication Index); six videos were selected from each quadrant (24 in total). All retrieved comments (raw, n = 3,933) were used for descriptive temporal summaries, while text-based analyses were conducted on the comments that remained valid after rule-based cleaning (n = 2,007). We performed Chinese word segmentation (jieba), stop-word removal, term-frequency analysis, keyword co-occurrence network analysis (co-occurrence threshold ≥ 6), LDA topic modelling (K = 5), and SnowNLP sentiment classification (negative < 0.35; neutral 0.35–0.65; positive > 0.65).

Results: High-frequency terms were concentrated on diabetes, blood glucose, fasting, doctors, and insulin. The most frequent co-occurring pairs included fasting–blood glucose (25) and diabetes–blood glucose (16). Topic modelling identified five topics; Topic 2 accounted for 89.0% of valid comments (1,786/2,007). Sentiment was predominantly neutral (92.18%, 1,850/2,007), with 6.83% positive (137/2,007) and 1.00% negative (20/2,007) comments. In the raw corpus, commenting activity peaked on Fridays (16.5%) and during 18:00–22:00 (29.4%), with a single hourly peak at 20:00 (254 comments).

Conclusions: Comment discourse centred on practical diabetes self-management, particularly reporting and interpreting glycaemic readings and asking related action-oriented questions. Although negative sentiment was relatively uncommon, such comments often described concrete confusion, worries, or difficulties in disease management. These findings may inform platform-level governance of health-related content and more targeted communication strategies for populations affected by diabetes.
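
The sentiment cut-offs and the co-occurrence threshold reported in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each comment has already been segmented (for example with `jieba.lcut`) and scored on a 0 to 1 scale (for example with `SnowNLP(text).sentiments`), and it assumes that a keyword pair is counted once per comment, which the abstract does not specify. The function names `label_sentiment` and `cooccurrence_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Cut-offs taken from the abstract: negative < 0.35, neutral 0.35-0.65, positive > 0.65.
NEG_CUTOFF = 0.35
POS_CUTOFF = 0.65


def label_sentiment(score: float) -> str:
    """Map a 0-1 sentiment score to one of three classes (hypothetical helper)."""
    if score < NEG_CUTOFF:
        return "negative"
    if score > POS_CUTOFF:
        return "positive"
    return "neutral"  # both boundary values, 0.35 and 0.65, fall here


def cooccurrence_pairs(tokenised_comments, min_count=6):
    """Count unordered keyword pairs and keep those seen in at least min_count comments.

    Assumption: a pair is counted once per comment, however often its terms repeat.
    """
    counts = Counter()
    for tokens in tokenised_comments:
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy input (invented for illustration, not study data).
toy_comments = [["fasting", "blood glucose"]] * 6 + [["diabetes", "insulin"]] * 2
toy_edges = cooccurrence_pairs(toy_comments)
toy_labels = [label_sentiment(s) for s in (0.10, 0.35, 0.50, 0.65, 0.90)]
```

On the toy input only the fasting and blood glucose pair reaches the threshold of 6, so it would be the sole edge in the resulting network. Scores exactly at 0.35 or 0.65 are labelled neutral, in line with the stated ranges.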
