Retrieval-Augmented Generation versus Fine-tuning for Turkish Cultural Question Answering: A Comprehensive Evaluation and Analysis

Abstract

This study presents the first comprehensive evaluation of Retrieval-Augmented Generation (RAG) systems versus fine-tuned language models for Turkish cultural question answering. Fine-tuned models achieve superior performance in short-context scenarios (F1 = 0.799), but they generalize poorly to long-document and cross-document settings. To address this, we introduce a RAG pipeline with Turkish-specific adaptations: morphology-aware chunking, cultural context weighting, and agglutination-aware answer normalization. These enhancements yield a 14.0% improvement in overall RAG performance. We construct a domain-specific Turkish cultural QA dataset, evaluate both approaches on it, and analyze the trade-offs between accuracy, scalability, and deployment feasibility. Our findings provide actionable guidelines for deploying QA systems in morphologically rich and culturally nuanced languages such as Turkish.
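To make the idea of agglutination-aware answer normalization concrete, the following is a minimal sketch of how a token-level F1 score might be computed after a simple Turkish-aware normalization step. It is an illustration under stated assumptions, not the method used in the study: the abstract does not specify the normalization rules, and the function names `normalize` and `token_f1` are hypothetical. The only rules applied here are Turkish-aware lowercasing and dropping the suffix that follows an apostrophe on proper nouns (for example, "Ankara'da" becomes "ankara").

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase with Turkish casing rules and strip apostrophe-attached suffixes.

    Assumption: a simplified stand-in for the normalization the abstract mentions.
    """
    # Python's str.lower() maps "İ" to "i" plus a combining dot and "I" to "i",
    # so handle the Turkish dotted/dotless pair explicitly first.
    text = text.replace("İ", "i").replace("I", "ı").lower()
    tokens = []
    for raw in text.split():
        # Turkish orthography separates case suffixes from proper nouns
        # with an apostrophe; keep only the part before it.
        stem = raw.replace("’", "'").split("'", 1)[0]
        stem = stem.strip(".,;:!?\"()")
        if stem:
            tokens.append(stem)
    return tokens


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

With this sketch, a prediction of "Ankara'da" scored against the reference "Ankara" counts as a full match, whereas plain string comparison would treat it as an error. A real system would likely need a morphological analyzer to handle suffixes on common nouns, which carry no apostrophe.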
