Leveraging Social Media for Public Health: NLP Implementations for Blood Donation Data Analysis in Japan
Abstract
Background: Blood donation is crucial for healthcare systems, yet maintaining an adequate supply is a persistent challenge. Traditional methods for understanding public sentiment and donor behavior are often limited. Social media, particularly Twitter, offers a promising alternative for real-time insights. This study explores the viability of using Twitter data to analyze blood donation sentiment in Japan, considering the evolving perspectives of younger generations.
Methods: We replicated the results of a previous study using the Tohoku BERT model, then tested a refined Blood Donation Tweets for User Classification (BDT-UC) dataset and a customized version of the model intended to improve classification. We also compared several topic modeling methods, including Latent Dirichlet Allocation (LDA), Non-Negative Matrix Factorization (NMF), and BERT-based models, using two different preprocessing techniques. Finally, we integrated the classification into the topic modeling analysis for a final evaluation.
Results: Although the refined dataset yielded slightly lower overall classification performance (precision fell from 78.4% for the best evaluated model to 75.8% for the refined model), it improved the implementation results, producing more balanced labeling across the data. For topic modeling, BERT-based topic models, particularly those preprocessed with the MeCab library, achieved higher coherence and diversity scores than traditional methods. Additionally, there were significant differences when the dataset was processed by user category: coherence and diversity increased for the undetermined category, but coherence was notably lower for the other categories.
Conclusion: This study underscores the significance of initial classification and preprocessing for effective topic modeling, which affects the viability of extracting insights from Japanese social media data. The developed methodologies could support more effective analysis of blood donation groups and better-targeted donation campaigns.
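The abstract reports topic diversity scores without stating how they were computed. As a minimal sketch, assuming the common definition of topic diversity (the fraction of unique words among the top-k words of all topics), the metric could be calculated as below. The function name `topic_diversity`, the `top_k` parameter, and the toy English topics are hypothetical and are not taken from the study.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_k: int = 10) -> float:
    """Return the fraction of unique words among the top_k words of all topics.

    Assumed definition, not confirmed by the abstract: 1.0 means no word is
    shared between topics; values near 0 mean heavily overlapping topics.
    """
    # Collect the first top_k words of every topic into one flat list.
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        raise ValueError("topics must contain at least one word")
    return len(set(top_words)) / len(top_words)


# Hypothetical toy topics (English stand-ins for tokenized Japanese tweets).
example_topics = [
    ["blood", "donation", "center", "today"],
    ["blood", "campaign", "poster", "anime"],
    ["needle", "nervous", "first", "today"],
]

diversity = topic_diversity(example_topics, top_k=4)
print(f"Topic diversity: {diversity:.3f}")
```

In this toy input, "blood" and "today" each appear in two topics, so 10 of the 12 top words are unique. A per-category comparison like the one described in the results would apply the same function to the topics obtained for each user category separately.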