AI-Orchestrated Active Learning for Insulin Delivery Material Discovery
Abstract
We present an implementation of an AI-driven material discovery platform for insulin delivery patches, built around LangChain-orchestrated active learning that integrates three computational workflows: (1) Retrieval-Augmented Generation (RAG) literature mining using the Semantic Scholar API with ChromaDB vector storage and OpenAI embeddings, (2) PSMILES (Polymer SMILES) generation using GPT-4o with conversational memory and chemical validation, and (3) OpenMM molecular dynamics (MD) simulations of insulin-polymer systems using Langevin integrators with AMBER force fields. A closed-loop active learning orchestrator iteratively refines candidate materials using feedback from previous rounds. The RAG system embeds documents with text-embedding-3-small (1536-dimensional vectors) and stores them in persistent ChromaDB storage, enabling semantic similarity search over the scientific literature. The PSMILES generator uses conversation buffer memory with a 10-exchange context window and a multi-stage validation pipeline. MD simulations use LangevinMiddleIntegrator with a 2 fs timestep at physiological temperature (310 K) in the NPT ensemble, with pressure coupling provided by MonteCarloBarostat. The complete system demonstrates the use of large language models as orchestration agents for complex scientific workflows and provides a template for AI-accelerated materials discovery in biomedical applications.
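To make the 10-exchange context window concrete, the following is a minimal sketch of a windowed conversation buffer. It is a standard-library stand-in, not the platform's LangChain memory class: the name `ExchangeBuffer`, its methods, and the placeholder PSMILES strings are all assumptions introduced for illustration. Only the window size of 10 comes from the abstract.

```python
from collections import deque


class ExchangeBuffer:
    """Hypothetical windowed memory: keeps only the most recent exchanges."""

    def __init__(self, max_exchanges: int = 10):
        # Window size of 10 is taken from the abstract; everything else is assumed.
        # deque(maxlen=...) discards the oldest entry once the window is full.
        self._turns = deque(maxlen=max_exchanges)

    def add(self, prompt: str, response: str) -> None:
        """Record one prompt/response exchange."""
        self._turns.append((prompt, response))

    def context(self) -> str:
        """Render the retained exchanges as text to prepend to the next prompt."""
        return "\n".join(f"User: {p}\nModel: {r}" for p, r in self._turns)

    def __len__(self) -> int:
        return len(self._turns)


buffer = ExchangeBuffer()
for i in range(12):
    # Placeholder PSMILES-like strings, not output from the actual generator.
    buffer.add(f"request {i}", f"[*]CC{'C' * i}[*]")
```

After twelve exchanges the buffer holds only the last ten, so the two oldest requests no longer appear in the text returned by `buffer.context()`. A fixed window of this kind bounds prompt length at the cost of forgetting earlier candidates.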