An Experimental and Qualitative Comparison of the Quality of Dating Advice from Parents Versus LLMs
Abstract
Adolescents’ romantic relationships are consistently rated among their most stressful experiences, ranging from breakups to decoding subtle cues on social media. Most adolescents turn to friends (74%) and parents (69%) for help. In parallel, nearly 75% of U.S. teens have experimented with Artificial Intelligence (AI) Large Language Models (LLMs). Here we report a preregistered (osf.io/md2ek), randomized experiment (N=404 adolescents aged 13-17) comparing the quality of relationship advice from parents with that generated by LLMs. In a within-subjects, repeated-measures design, participants read three realistic dating conflict scenarios, each drawn from a popular relationship advice application, and then read advice about how to respond. The advice was randomly assigned (and counterbalanced) to come from either an LLM or a parent. The LLM-provided advice came from one of two sources: a fine-tuned LLM (AskElle) trained to provide relationship advice, or an off-the-shelf LLM (either ChatGPT or Gemini). Participants then rated the responses for help with coping, likelihood of implementation, and trustworthiness. Data were analyzed using a random-intercept model estimated with a Bayesian machine-learning method (stochtree). The LLMs outperformed parents’ advice by approximately 0.3 standard deviations across the primary outcomes, all pr(ATE>0)>.99. Averaged across the three trials, the fine-tuned LLM was just as effective as the off-the-shelf LLMs. Secondary analyses (see Figure 1) showed that participants preferred the off-the-shelf LLMs when theirs was the first piece of advice seen, pr(Diff)>.99, but strongly preferred the fine-tuned LLM when it came third, after repeated exposure. Once participants could directly compare all three sources, they overwhelmingly preferred the fine-tuned LLM’s advice over both parents and off-the-shelf LLMs, pr(Diff)>.99.
Thus, in real-world contexts involving repeated consultation of an AI tool, generic LLM advice is likely to become less effective as adolescents seek more attuned relationship guidance. Qualitative, natural-language analyses identified the elements of advice that were most and least successful. Parents’ language tended to diminish adolescents’ feelings, dismissing their relationship problems as temporary and superficial. Off-the-shelf LLMs were overly accommodating and affirming, sometimes reinforcing more extreme appraisals and possibly entrenching young people in harmful patterns of rumination. The highest-performing answers validated participants’ feelings while challenging their appraisals of the situation, guiding teens toward a more optimistic appraisal accompanied by clear action steps with example scripts.