Methodological Challenges in Content-Based Citation Analysis: Expertise, Reliability, and the Primacy of Citance Identification
Abstract
Content-based citation analysis seeks to capture the meaning and functions of citations but continues to face unresolved methodological challenges. This study analyzes a stratified sample of Library and Information Science publications to examine how citance segmentation and annotator expertise influence classification consistency. Using two annotators with different professional backgrounds, the findings show that agreement is high when citances are defined identically, but reliability decreases sharply once text boundaries diverge. Citance length, rather than subject category or citation density, emerges as the strongest predictor of disagreement. These results identify segmentation as a methodological rather than a purely technical issue, one that shapes both human and automated tagging outcomes. By highlighting the interplay between expertise effects and boundary definitions, the study underscores the need for clearer operational frameworks in citation analysis. The contribution lies in demonstrating that methodological refinements in citance identification are essential for improving reproducibility, enhancing hybrid human-machine approaches, and strengthening the validity of citation-based indicators in research evaluation.