Quantifying Social Media Narratives Using Large Language Models: A Scalable Workflow for Social Measurement
Abstract
Social science research faces an enduring tradeoff between the interpretive depth of qualitative methods and the statistical scale of quantitative approaches. This paper addresses that tension by introducing a scalable measurement workflow that transforms naturally occurring narrative text into reproducible quantitative variables, using large language models (LLMs) as configurable and auditable coding instruments. We analyze 6,814 public Reddit posts from caregiving communities to extract multidimensional measures of caregiving burden, emotional sentiment, and selected demographic attributes.

A key design feature of the framework is the inclusion of confidence scores for inferred attributes, which makes variation in inferability explicit within the measurement process. The resulting analyses show that LLM-assisted measurement can recover theoretically consistent patterns related to caregiving roles and life-course variation in burden, while also revealing structured differences in what information is narratively available for inference. In particular, demographic attributes vary sharply in inferability across posts, and analyses conditioned on higher-confidence inferences yield more pronounced and interpretable subgroup patterns.

As a methodological contribution, the study offers a transparent pipeline for integrating inductive pattern extraction from narrative data with deductive subgroup comparison, helping to close the interpretive-inferential loop. More broadly, the proposed workflow demonstrates how generative AI can be used to scale the analysis of meaning-rich narratives while remaining complementary to established qualitative and quantitative methods.
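To make the measurement design concrete, the sketch below illustrates one way an LLM coding response with per-attribute confidence scores could be parsed and thresholded, so that low-confidence demographic inferences are treated as "not inferable" rather than taken at face value. All names, the JSON response shape, and the 0.7 threshold are hypothetical illustrations, not the paper's actual implementation; the model call itself is stubbed out.

```python
import json

def parse_coding(raw_json: str, min_confidence: float = 0.7) -> dict:
    """Parse a (hypothetical) LLM coding response in which every coded
    attribute carries a 'value' and a 'confidence' in [0, 1].

    Attributes whose confidence falls below the threshold are recorded
    as None, making variation in inferability explicit in the data."""
    coded = json.loads(raw_json)
    out = {}
    for attr, entry in coded.items():
        conf = entry.get("confidence", 0.0)
        out[attr] = entry["value"] if conf >= min_confidence else None
    return out

# Stubbed model output for a single post (no API call; illustrative only).
stub_response = json.dumps({
    "burden": {"value": 8, "confidence": 0.90},
    "sentiment": {"value": -0.6, "confidence": 0.85},
    "caregiver_role": {"value": "adult child", "confidence": 0.75},
    "age_group": {"value": "40-49", "confidence": 0.40},
})

row = parse_coding(stub_response)
# 'age_group' falls below the 0.7 threshold, so it is coded as None
# rather than entering downstream subgroup comparisons.
```

Conditioning analyses on rows where an attribute is non-None then corresponds to the paper's "higher-confidence inference" subsamples.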