The JANUS Framework: Leveraging Large Language Models for Qualitative Analysis and Manuscript Writing
Abstract
Large language models (LLMs) can accelerate qualitative analysis and scholarly writing, but their opacity and tendency to hallucinate undermine trust and reproducibility. JANUS is a human-in-the-loop framework that makes LLM-assisted research auditable, traceable, and FAIR. It combines a phased workflow—(0) context elicitation, (1) input preparation, (2) first-pass coding, (3) theme/archetype synthesis, (4) adversarial review, (5) manuscript drafting, and (6) FAIR packaging—with enforceable quality gates and JSON schemas. Every step produces machine-actionable artefacts: versioned codebooks, prompt logs (with parameters and hashes), reflexive memos, review reports, and claim–evidence links down to segment IDs. These are bundled as FAIR Digital Objects with PROV-O/DCAT metadata and, where appropriate, published as nanopublications (assertion, provenance, pub-info) plus an explicit AI-use disclosure box. Privacy and equity are built in (PII redaction, purpose limitation, and CARE alongside FAIR for Indigenous/local knowledge). JANUS positions LLMs as constrained assistants—never authors—while the human lead retains interpretation and final accountability. A worked example demonstrates how the framework reduces hallucination risk, preserves voice and context, and enables re-use of artefacts across projects and venues (e.g., journals, preprints, policy briefs). By coupling rigorous governance with lightweight practice, JANUS turns LLM outputs from ephemeral text into citable, inspectable, and reusable knowledge assets.
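To make the notion of a "machine-actionable prompt log with parameters and hashes" concrete, the sketch below shows one possible record shape. The field names and values here are illustrative assumptions, not the framework's actual JSON schema: each LLM call is logged with its model, parameters, a SHA-256 hash of the prompt text, and the segment IDs its output cites, so a claim in the manuscript can later be traced back to the exact call that produced it.

```python
import hashlib
import json
from datetime import datetime, timezone


def prompt_log_entry(prompt: str, model: str, params: dict, segment_ids: list) -> dict:
    """Build one illustrative prompt-log record (hypothetical schema).

    The prompt is stored as a SHA-256 hash rather than raw text, which
    supports auditability (did this exact prompt produce that output?)
    while keeping the log compact and redaction-friendly.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "parameters": params,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        # Claim-evidence links down to segment IDs, as the framework requires.
        "evidence_segment_ids": segment_ids,
    }


# Hypothetical usage: logging a first-pass coding call.
entry = prompt_log_entry(
    prompt="Code the following interview segment for themes of trust.",
    model="example-llm-v1",
    params={"temperature": 0.2},
    segment_ids=["INT03-S17"],
)
print(json.dumps(entry, indent=2))
```

Because each record is plain JSON, such logs can be validated against a schema, versioned alongside the codebook, and bundled into a FAIR Digital Object with PROV-O metadata.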