Disputing Your Roots: A Multi-Platform Computational Analysis of Consumer Reactions to Genetic Ancestry Testing
Abstract
Direct-to-consumer (DTC) genetic ancestry testing has grown rapidly, yet computational analysis of consumer reactions remains limited. This study presents a cross-platform computational analysis of consumer reactions to ancestry testing across 58,133 posts from Reddit, YouTube, and Google Play. We developed a six-category reaction taxonomy (acceptance, excitement, dispute, surprise, disappointment, identity crisis) and applied natural language processing methods including sentiment analysis, topic modeling, and predictive modeling. Acceptance (9.5%) and excitement (9.4%) were the most prevalent reactions, followed by dispute (8.6%). Platform differences emerged: Reddit showed the highest dispute rate (10.2%), while Google Play exhibited elevated excitement (29.6%). Dispute rates varied substantially by ancestry, with Turkish (23.5%), Greek (19.7%), and Scandinavian (18.5%) ancestries most frequently contested. Among posts containing both self-reported ethnicity and genetic results, concordance was 61.8%, quantifying the discrepancy between social and genetic definitions of ancestry. A logistic regression model predicting dispute expression achieved AUC = 0.79 and identified text length and negative sentiment as key predictors. These findings advance understanding of how consumers engage with genetic ancestry information online, with implications for DTC companies, genetic counselors, and researchers studying the social dimensions of consumer genomics.
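The rate and concordance figures reported above are simple proportions over labeled posts. The following is a minimal sketch of how such proportions could be computed, not the authors' pipeline. The record layout, the field names (`platform`, `reaction`, `self_reported`, `genetic`), the one-label-per-post simplification, and the toy data are all assumptions made for illustration.

```python
# Hypothetical toy records; field names and values are illustrative only
# and are not drawn from the study's data.
posts = [
    {"platform": "reddit", "reaction": "dispute",
     "self_reported": "greek", "genetic": "italian"},
    {"platform": "reddit", "reaction": "acceptance",
     "self_reported": "turkish", "genetic": "turkish"},
    {"platform": "youtube", "reaction": "excitement",
     "self_reported": None, "genetic": None},
    {"platform": "google_play", "reaction": "excitement",
     "self_reported": "irish", "genetic": "irish"},
]


def reaction_rate(records, reaction, platform=None):
    """Share of posts (optionally within one platform) labeled `reaction`."""
    subset = [r for r in records
              if platform is None or r["platform"] == platform]
    if not subset:
        return 0.0
    return sum(r["reaction"] == reaction for r in subset) / len(subset)


def concordance(records):
    """Share of posts whose self-reported and genetic ancestry match,
    counted only among posts that state both (an assumed definition)."""
    paired = [r for r in records if r["self_reported"] and r["genetic"]]
    if not paired:
        return 0.0
    return sum(r["self_reported"] == r["genetic"] for r in paired) / len(paired)


if __name__ == "__main__":
    print(f"Reddit dispute rate: {reaction_rate(posts, 'dispute', 'reddit'):.1%}")
    print(f"Concordance: {concordance(posts):.1%}")
```

Restricting the concordance denominator to posts that contain both fields mirrors the abstract's wording ("among posts containing both self-reported ethnicity and genetic results"). How a match is decided in practice, for example exact label equality versus regional overlap, is not specified in the abstract and is simplified here to string equality.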