Evaluating an LLM’s Performance in Annotating Discourse Strategies
Abstract
Manual annotation remains essential for identifying complex pragmatic and discourse-level features in corpus linguistics, particularly the functional components of speech acts. While part-of-speech and semantic tagging can be automated with high accuracy, annotating discourse strategies is still challenging because such strategies are context-sensitive and lack consistent lexical realizations. These limitations hinder the scalability of function-to-form approaches and constrain the development of richly annotated corpora for pragmatics research and instruction. This study investigates whether a large language model (LLM), specifically ChatGPT-4, can support the functional annotation of refusal strategies in English. A corpus of written Discourse Completion Task (DCT) responses produced by Japanese university learners of English was annotated by the model, and the annotations were evaluated for reliability, agreement with human raters, accuracy, and generalizability. The results suggest that an LLM can substantially assist pragmatic annotation, improving both its scalability and its accuracy.
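As a concrete illustration of the workflow the abstract describes, the sketch below shows how one might prompt an LLM to assign a single refusal-strategy label to a DCT response and then measure agreement with a human rater using Cohen's kappa. The strategy labels, prompt wording, sample responses, and the use of the OpenAI chat completions API are illustrative assumptions, not details reported in the study.

```python
# Hypothetical sketch: LLM labeling of refusal strategies plus a
# human-rater agreement check. Labels and prompt are illustrative.
from openai import OpenAI
from sklearn.metrics import cohen_kappa_score

# Assumed strategy inventory; the study's actual taxonomy may differ.
STRATEGIES = ["direct refusal", "excuse/reason", "apology/regret", "alternative"]

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def annotate(response_text: str) -> str:
    """Ask the model to choose one refusal-strategy label for a DCT response."""
    prompt = (
        "Classify the refusal strategy in this learner response. "
        f"Answer with exactly one of: {', '.join(STRATEGIES)}.\n\n"
        f"Response: {response_text}"
    )
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output aids reliability checks
    )
    return reply.choices[0].message.content.strip().lower()


# Toy agreement check between one human rater and the LLM's labels.
human_labels = ["excuse/reason", "direct refusal", "apology/regret"]
llm_labels = [annotate(r) for r in [
    "Sorry, I have to study for an exam that day.",
    "No, I can't.",
    "I'm really sorry, I wish I could help.",
]]
print("Cohen's kappa:", cohen_kappa_score(human_labels, llm_labels))
```

In practice, agreement would be computed over the full annotated corpus rather than a handful of items, and a chance-corrected statistic such as kappa is preferable to raw percent agreement for categorical labels like these.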