Synthesizing Social Media Posts: The Development and Validation of Synthetic Tweets for Misinformation Research

Kendall Moore
Benjamin Motz

Abstract

This paper describes the development and validation of a synthetic dataset of social media posts designed to support misinformation education research. Because existing datasets of real social media content are topically narrow, inconsistently labeled, and often unsuitable for training use, the authors used ChatGPT to generate a large bank of tweets, each exemplifying one of eight rhetorical manipulation tactics — including ad hominem attacks, emotional language, false dichotomies, and slippery slope arguments — or none in the case of control items. Posts underwent manual review and refinement to improve authenticity and reduce bias, yielding 374 final stimuli rendered as realistic tweet images. A validation study with 480 nationally representative U.S. participants assessed the posts across dimensions including trustworthiness, shareability, and emotional arousal. Results confirmed that the synthetic posts were distinguishable by tactic while remaining representative of authentic social media discourse, supporting their use as training stimuli in misinformation literacy interventions.

Version published to 10.31234/osf.io/uqx4k_v1 on OSF Preprints
Apr 13, 2026

Related articles

Tracing Bias to Its Sources: A Word Embedding Audit of Racism in South African News Outlets

This article has 3 authors:
1. Nnaemeka Ohamadike
2. Kevin Durrheim
3. Mpho Primus
This article has no evaluations. Latest version: Apr 9, 2026

TWON social media: a scalable MERN-Stack platform for experimental research in online social networks

This article has 9 authors:
1. Abdul Sittar
2. Michael Heseltine
3. Francois t’Serstevens
4. Natan Viteznik
5. Corinna Oschatz
6. Mateja Smiljanic
7. Alenka Gucek
8. Damian Trilling
9. Marko Grobelnik
This article has no evaluations. Latest version: Apr 9, 2026

Geographically aggregated psychological traits from linguistic analysis of Twitter data predict U.S. voter realignment since 2016

This article has 3 authors:
1. Michael Stewart Cohen
2. Mehak Sachdeva
3. Zening Duan
This article has no evaluations. Latest version: Apr 19, 2026
