Closed-Loop Workflow of High-Entropy Materials Discovery: Efficient and Accurate Synthesizability Prediction via Domain-Specific Local LLMs


Abstract

High-entropy materials (HEMs) offer unprecedented opportunities for superior mechanical, thermal, and catalytic properties, but their vast chemical space makes experimental discovery resource-intensive. State-of-the-art commercial large language models (LLMs) notably fail at HEM synthesizability prediction, a critical bottleneck in materials development. We demonstrate that domain-specific fine-tuning transforms open-weight local LLMs into accurate predictors. Using a dataset of 321,083 inorganic compositions that includes 2,560 HEM examples, we fine-tuned three 4-bit-quantized models (gpt-oss-20b, Qwen3-14b, and DeepSeek-R1-Distill-Qwen-14b), which achieved balanced accuracies of 0.957, 0.961, and 0.956, respectively. Critically, these models run efficiently on accessible hardware (< 15 GB VRAM), eliminating costly API dependencies while ensuring data privacy and consistent reproducibility. This work could open new pathways toward autonomous closed-loop discovery, in which distributed local models enable rapid screening and iterative improvement through experimental feedback. Future collaborative efforts in open data sharing, particularly of negative results, would address the current fragmentation in synthesis reporting and accelerate community-wide HEM discovery.
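The abstract reports balanced accuracy without defining it. As a minimal sketch, assuming the standard binary definition (the mean of the recall on each class), the metric can be computed as below. The function name, the label encoding (1 = synthesizable, 0 = not), and the toy labels are hypothetical illustrations and are not taken from the paper's data or code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return 0.5 * (sensitivity + specificity)


# Hypothetical toy labels with a class imbalance (2 positives, 8 negatives).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy = {score:.3f}, plain accuracy = {plain:.3f}")
```

In this toy case the plain accuracy looks high because the majority class dominates, while the balanced accuracy is pulled down by the missed positive. That is one common reason to prefer the balanced form when one class is rare.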
