The JANUS Framework: Leveraging Large Language Models for Qualitative Analysis and Manuscript Writing
Abstract
Large language models (LLMs) can accelerate qualitative analysis and scholarly writing, but their opacity and tendency to hallucinate undermine trust and reproducibility. JANUS is a human-in-the-loop framework that makes LLM-assisted research auditable, traceable, and FAIR. It combines a phased workflow—(0) context elicitation, (1) input preparation, (2) first-pass coding, (3) theme/archetype synthesis, (4) adversarial review, (5) manuscript drafting, and (6) FAIR packaging—with enforceable quality gates and JSON schemas. Every step produces machine-actionable artefacts: versioned codebooks, prompt logs (with parameters and hashes), reflexive memos, review reports, and claim–evidence links down to segment IDs. These are bundled as FAIR Digital Objects with PROV-O/DCAT metadata and, where appropriate, published as nanopublications (assertion, provenance, pub-info) plus an explicit AI-use disclosure box. Privacy and equity are built in (PII redaction, purpose limitation, and CARE alongside FAIR for Indigenous/local knowledge). JANUS positions LLMs as constrained assistants—never authors—while the human lead retains interpretation and final accountability. A worked example demonstrates how the framework reduces hallucination risk, preserves voice and context, and enables re-use of artefacts across projects and venues (e.g., journals, preprints, policy briefs). By coupling rigorous governance with lightweight practice, JANUS turns LLM outputs from ephemeral text into citable, inspectable, and reusable knowledge assets.
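To make the notion of a "machine-actionable prompt log with parameters and hashes" concrete, the sketch below shows one possible record shape. The field names and values here are illustrative assumptions, not the framework's actual JSON schema: each LLM call is logged with its model, parameters, a SHA-256 hash of the prompt text, and the segment IDs its output cites, so a claim in the manuscript can later be traced back to the exact call that produced it.

```python
import hashlib
import json
from datetime import datetime, timezone


def prompt_log_entry(prompt: str, model: str, params: dict, segment_ids: list) -> dict:
    """Build one illustrative prompt-log record (hypothetical schema).

    The prompt is stored as a SHA-256 hash rather than raw text, which
    supports auditability (did this exact prompt produce that output?)
    while keeping the log compact and redaction-friendly.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "parameters": params,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        # Claim-evidence links down to segment IDs, as the framework requires.
        "evidence_segment_ids": segment_ids,
    }


# Hypothetical usage: logging a first-pass coding call.
entry = prompt_log_entry(
    prompt="Code the following interview segment for themes of trust.",
    model="example-llm-v1",
    params={"temperature": 0.2},
    segment_ids=["INT03-S17"],
)
print(json.dumps(entry, indent=2))
```

Because each record is plain JSON, such logs can be validated against a schema, versioned alongside the codebook, and bundled into a FAIR Digital Object with PROV-O metadata.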