eXa-LM: A Controlled Natural Language Bridge between Large Language Models and First-Order Logic Solvers

Abstract

Large language models (LLMs) have demonstrated logical reasoning abilities, but their inferences remain non-traceable and lack formal guarantees. We introduce eXa-LM, a controlled natural language (CNL) interface between LLMs and first-order logic solvers. By routing reasoning through a CNL, our approach creates an explicit, verifiable, and interpretable bridge between text and formal logic. It relies on three main components: (1) a reformulation prompt that constrains the LLM to produce a set of facts and rules in CNL, (2) the semantic analyzer eXaSem, which translates this CNL into a Prolog program made of extended Horn clauses, and (3) the logic engine eXaLog, which integrates a second-order meta-interpreter capable of inferring ontological properties. We evaluate eXa-LM on three standard benchmarks (PrOntoQA, ProofWriter, and FOLIO), comparing it to GPT-4, GPT-4o, Logic-LM, and LINC. Results show that eXa-LM matches or exceeds recent neuro-symbolic systems while providing full traceability of reasoning and intrinsic explainability. On FOLIO, eXa-LM achieves 92.9% accuracy, a 1.05-point gain over GPT-4o Chain-of-Thought. This approach demonstrates the feasibility of a transparent neuro-symbolic reasoning pipeline in which LLMs produce not direct inferences but formally controlled linguistic representations. eXa-LM opens the way to neuro-symbolic architectures that are safer, verifiable, and extensible, ultimately integrating hypothetical, abductive, or inductive reasoning. Program and data are publicly available at https://github.com/FFrydman/eXa-LM.
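To make the pipeline concrete, the sketch below mimics the CNL-to-Horn-clause step on two toy sentences and then runs naive forward chaining over the resulting clauses. This is a minimal illustration in Python, not the authors' eXaSem analyzer or eXaLog engine; the two sentence patterns, the clause encoding, and the `cnl_to_clause`/`forward_chain` helpers are all hypothetical simplifications (real eXaSem handles a far richer CNL and emits actual Prolog).

```python
import re

def cnl_to_clause(sentence):
    """Translate a toy CNL sentence into a (head, body) Horn clause.
    Hypothetical two-pattern grammar, far simpler than eXaSem's CNL."""
    # "Every cat is an animal." -> animal(X) :- cat(X).
    m = re.fullmatch(r"Every (\w+) is an? (\w+)\.", sentence)
    if m:
        return ((m.group(2), "X"), [(m.group(1), "X")])
    # "Tom is a cat." -> cat(tom).
    m = re.fullmatch(r"(\w+) is an? (\w+)\.", sentence)
    if m:
        return ((m.group(2), m.group(1).lower()), [])
    raise ValueError(f"sentence outside the toy CNL: {sentence}")

def forward_chain(clauses):
    """Naive forward chaining over unary Horn clauses until fixpoint."""
    facts = {head for head, body in clauses if not body}
    changed = True
    while changed:
        changed = False
        for (pred, _var), body in clauses:
            if not body:
                continue
            body_pred, _ = body[0]
            for fact_pred, arg in list(facts):
                if fact_pred == body_pred and (pred, arg) not in facts:
                    facts.add((pred, arg))  # e.g. derive animal(tom)
                    changed = True
    return facts
```

For example, translating `["Tom is a cat.", "Every cat is an animal."]` and chaining yields the derived fact `("animal", "tom")`, and every derivation step is inspectable, which is the traceability property the abstract emphasizes.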
