LemmaHead: RAG Assisted Proof Generation Using Large Language Models

Tianbo Yang
Mingqi Yang
Hongyi Zhao
Tianshuo Yang

Abstract

Developing the logic necessary to solve mathematical problems or write mathematical proofs is one of the more difficult objectives for large language models (LLMs). Currently, the most popular methods in the literature consist of fine-tuning the model on written mathematical content such as academic publications and textbooks, so that the model can learn to emulate the style of mathematical writing. In this project, we explore the effectiveness of using retrieval-augmented generation (RAG) to address gaps in the mathematical reasoning of LLMs. We develop LemmaHead, a RAG knowledge base that supplements queries to the model with relevant mathematical context, with particular focus on context from published textbooks. To measure our model’s performance in mathematical reasoning, our testing paradigm focuses on automated theorem proving: generating a proof of a given mathematical claim in the Lean formal language.
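
To make the retrieval step concrete, the following is a minimal sketch of how a query might be supplemented with retrieved context before it is sent to a model. It is not the LemmaHead implementation: the three passages, the bag-of-words cosine scoring, the prompt wording, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are all assumptions made for illustration. A real system would likely use learned embeddings over textbook passages.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base; these passages are illustrative
# and are not taken from LemmaHead or any specific textbook.
KNOWLEDGE_BASE = [
    "Definition: a natural number n is even if there exists k with n = 2 * k.",
    "Theorem: addition on the natural numbers is commutative, a + b = b + a.",
    "Lemma: every continuous function on a closed bounded interval attains a maximum.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(query_vec, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(claim, passages, k=1):
    """Prepend retrieved context to the claim (assumed prompt format)."""
    context = "\n".join(retrieve(claim, passages, k))
    return f"Context:\n{context}\n\nProve the following claim in Lean:\n{claim}\n"


if __name__ == "__main__":
    print(build_prompt("addition of natural numbers is commutative", KNOWLEDGE_BASE))
```

The resulting prompt would then be passed to the language model, and the returned Lean proof could be checked by the Lean compiler. Both of those steps are outside this sketch.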

Version published to 10.32388/rfa8le
Feb 7, 2025

Related articles

Graffiti3 : Compact Theory Libraries for Automated Mathematical Discovery

This article has 1 author:
1. Randy Davila
This article has no evaluations
Latest version Jan 19, 2026

EXa-LM: A Controlled Natural Language Bridge between Large Language Models and First-Order Logic Solvers

This article has 1 author:
1. Francis Frydman
This article has no evaluations
Latest version Dec 22, 2025

A Multimodal AI System: comparing LLMs and Theorem Proving Systems

This article has 2 authors:
1. Phillip G. Bradford
2. Henry Orphys
This article has no evaluations
Latest version Jan 19, 2026