GPT vs Human Legal Texts Annotations: A Comparative Study with Privacy Policies
Abstract
High-quality corpora of annotated privacy policies are scarce, yet essential for training, testing, and evaluating accurate machine learning models. However, building new corpora remains an error-prone and resource-intensive task that relies heavily on highly specialized, hard-to-find human annotators. Recent advances in Generative Pre-trained Transformers (GPTs) open the possibility of using them to annotate privacy policies with performance comparable to that of human annotators, thereby streamlining the process while reducing human resource demands. This paper presents a novel method for annotating privacy policies based on a codebook, a well-designed prompt, and the analysis of the logarithmic probabilities (logprobs) of a GPT's output tokens during the annotation process. We validated our method using the GPT-4o model and the well-known, open, multi-class, multi-label OPP-115 corpus, achieving performance comparable to 80% of human annotators in segment-level annotation and matching 90% of human annotators in full-text-level annotation. Furthermore, incorporating logprobs analysis allowed the method to match the performance of all human annotators in full-text-level annotation, suggesting that context enhances the task. These findings demonstrate the potential of our method to automate annotation with performance similar to that of human annotators while significantly reducing resource demands.
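To make the role of logprobs concrete, the following is a minimal Python sketch of one way token log probabilities could be turned into per-label confidence scores for multi-label annotation. It is an illustration under stated assumptions, not the procedure used in the paper: the function names, the 0.5 threshold, and the example logprob values are all hypothetical, and the abstract does not specify how the logprobs are actually analyzed.

```python
import math

def label_confidence(token_logprobs):
    """Joint probability of a label emitted as several tokens.

    Assumption: the label's confidence is the product of its token
    probabilities, i.e. exp(sum of the token logprobs).
    """
    return math.exp(sum(token_logprobs))

def select_labels(candidates, threshold=0.5):
    """Keep every label whose confidence meets the threshold (multi-label).

    `candidates` maps a label name to the list of logprobs of the tokens
    that spell it. The threshold value is a hypothetical choice.
    """
    scored = {label: label_confidence(lps) for label, lps in candidates.items()}
    return {label: p for label, p in scored.items() if p >= threshold}

# Hypothetical logprobs for two candidate categories of one policy segment.
example = {
    "First Party Collection/Use": [-0.05, -0.02],
    "Data Retention": [-1.9, -0.4],
}

selected = select_labels(example, threshold=0.5)
for label, prob in selected.items():
    print(f"{label}: {prob:.3f}")
```

With these made-up values only the first category clears the threshold. In practice the logprobs would come from the model's response rather than being hard-coded, and the threshold would need to be tuned against annotated data.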