Addressing Systematic Non-response Bias with Supervised Fine-Tuning of Large Language Models: A Case Study on German Voting Behaviour


Abstract

A major challenge for survey researchers is dealing with missing data, which restricts the scope of analysis and the reliability of the inferences that can be drawn. Recently, researchers have started investigating the potential of Large Language Models (LLMs) to role-play a pre-defined set of "characters" and simulate their survey responses with little or no additional training data and cost. Previous research has mostly focused on zero-shot LLM predictions. Often, however, other survey responses are at least partially available. This work investigates the viability and robustness of supervised fine-tuning on these responses to simulate systematic and random item-level non-response in the context of German voting behaviour. Our results show that when systematic item non-response is present, fine-tuned LLMs outperform traditional classification approaches on survey data. Fine-tuned LLMs also appear to be more robust to changes in the set of features the model can use to make predictions. Finally, fine-tuned LLMs match the performance of traditional classification methods when survey responses are missing completely at random.
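The fine-tuning setup summarised above can be sketched in code: observed responses become prompt/completion training pairs in which a respondent's known attributes form a "character" description, while respondents with a missing item yield prediction prompts. This is a minimal illustrative sketch; the field names, prompt wording, and data layout are assumptions, not taken from the article.

```python
# Hypothetical sketch of preparing partially observed survey data for
# supervised fine-tuning. Respondents with an observed vote become
# training examples; those with a missing vote become prediction prompts.

def build_finetuning_examples(respondents):
    """Split respondents into (training pairs, prediction prompts)."""
    train, predict = [], []
    for r in respondents:
        # Render the respondent's available features as a persona string.
        persona = ", ".join(f"{k}: {v}" for k, v in r["features"].items())
        prompt = f"A survey respondent ({persona}) is asked which party they voted for. Answer:"
        if r.get("vote") is not None:
            # Observed item response: usable as a fine-tuning target.
            train.append({"prompt": prompt, "completion": r["vote"]})
        else:
            # Item non-response: the fine-tuned model predicts this later.
            predict.append({"prompt": prompt})
    return train, predict

# Illustrative toy data (not from the article).
respondents = [
    {"features": {"age": 54, "region": "Bavaria"}, "vote": "CSU"},
    {"features": {"age": 29, "region": "Berlin"}, "vote": None},
]
train, predict = build_finetuning_examples(respondents)
```

The resulting `train` list would be passed to a standard fine-tuning pipeline, and `predict` prompts scored by the fine-tuned model to impute the missing items.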
