SynTran-fa: Generating Comprehensive Answers for Farsi QA Pairs via Syntactic Transformation

Abstract

Generating coherent and comprehensive responses remains a significant challenge for Question-Answering (QA) systems when working with short answers, especially for low-resource languages like Farsi. We present a novel approach that expands these short answers into complete, fluent responses, addressing the critical issue of limited Farsi resources and models. Our methodology is a two-stage process: first, we build a dataset by applying rule-based techniques to Farsi text; second, we apply a BERT-based ranking system to ensure fluency and comprehensibility. The resulting model demonstrates strong compatibility with existing QA systems, particularly those based on knowledge graphs. Notably, our system exhibits enhanced performance when integrated with large language models using Chain-of-Thought (CoT) prompting, leveraging detailed explanations rather than single-word answers. Our approach significantly improves response quality and coherence compared to baseline systems. We release our dataset to support further research in Farsi QA: https://huggingface.co/datasets/SLPL/syntran-fa
