A Process-Centric Survey of AI for Scientific Discovery Through the EXHYTE Framework

Abstract

Large language models (LLMs) and agent systems are increasingly transforming scientific discovery, driving progress across chemistry, biology, materials science, and physics. Yet most existing work and surveys remain fragmented, focusing on isolated tasks such as idea generation or experiment design without addressing how these components fit within the broader discovery process. To bridge this gap, we introduce the EXHYTE cycle, an iterative framework that formalizes scientific discovery as a sequence of Exploration, Hypothesis generation, and Testing. We assembled a corpus of recent studies, distilled recurring strategies that characterize how AI methods contribute to each EXHYTE substage, and organized the literature according to representative strategies and domain-specific advances. This process-centric perspective unifies diverse methodologies under a single structured workflow, identifies substages that are mature versus underexplored, and reveals complementarities that enable closed-loop discovery systems. It also clarifies the evolving division of labor between human researchers and AI systems, offering a roadmap for developing adaptive, autonomous frameworks for AI-driven scientific discovery. An accompanying website with paper summaries and an LLM-powered interactive survey based on EXHYTE is available at https://webapps.crc.pitt.edu/exhyte/.
