Active learning-guided optimization of cell-free biosensors for lead testing in drinking water

Brenda M. Wang
Nicole Chiang
Holly M. Ekas
Dylan M. Brown
Garrett Dildine
Tyler J. Lucci
Siyuan Feng
Vanessa Bly
Jean-François Gaillard
Julius B. Lucks
Ashty S. Karim
Diwakar Shukla
Michael C. Jewett


Abstract

Point-of-use diagnostics based on allosteric transcription factors (aTFs) are promising tools for environmental monitoring and human health. However, biosensors relying on natural aTFs rarely exhibit the sensitivity and selectivity needed for real-world applications, and traditional directed evolution struggles to optimize multiple biosensor properties at once. To overcome these challenges, we develop a multi-objective, machine learning (ML)-guided cell-free gene expression workflow for engineering aTF-based biosensors. Our approach rapidly generates high-quality sequence-to-function data, which we transform into an augmented paired dataset to train an ML model using directional labels that capture how aTF mutations alter performance. We apply our workflow to engineer the aTF PbrR as a point-of-use diagnostic for lead contamination in water. We tune the sensitivity of PbrR to detect lead at the U.S. Environmental Protection Agency (EPA) action level and shift its selectivity away from zinc, a metal commonly found in water supplies. Finally, we show that the engineered PbrR functions in freeze-dried cell-free reactions, enabling a diagnostic capable of detecting lead in drinking water down to approximately 5.7 ppb. Our ML-driven, multi-objective framework, powered by directional tokens, can generalize to other biosensors and proteins, accelerating the development of synthetic biology tools for biotechnology applications.
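
As a rough illustration of the paired-dataset idea in the abstract, the sketch below turns per-variant measurements into ordered pairs carrying one directional label per objective. The variant names, property names, numeric values, and the `direction` and `make_pairs` helpers are all hypothetical, and this is not the authors' implementation; it only shows one plausible way to encode whether moving from one variant to another improves each property.

```python
from itertools import permutations

# Hypothetical measurements (made-up values, not from the preprint).
# Assumption: a higher lead response and a lower zinc response are desirable.
variants = {
    "WT": {"lead_response": 1.0, "zinc_response": 1.0},
    "mutA": {"lead_response": 2.5, "zinc_response": 0.8},
    "mutB": {"lead_response": 0.6, "zinc_response": 1.4},
}

# +1 means larger is better, -1 means smaller is better (assumed objectives).
OBJECTIVES = {"lead_response": +1, "zinc_response": -1}


def direction(a, b, sign, tol=0.0):
    """Label the change from value a to value b as better, worse, or same."""
    delta = sign * (b - a)
    if delta > tol:
        return "better"
    if delta < -tol:
        return "worse"
    return "same"


def make_pairs(data, objectives):
    """Build every ordered (source, target) pair with one label per objective."""
    pairs = []
    for src, dst in permutations(data, 2):
        labels = {
            name: direction(data[src][name], data[dst][name], sign)
            for name, sign in objectives.items()
        }
        pairs.append({"source": src, "target": dst, "labels": labels})
    return pairs


pairs = make_pairs(variants, OBJECTIVES)
for pair in pairs:
    print(pair["source"], "->", pair["target"], pair["labels"])
```

In this sketch, N measured variants yield N × (N − 1) ordered pairs, which is one sense in which pairing can augment a small sequence-to-function dataset. How the preprint actually constructs its pairs and labels may differ.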

Version published to bioRxiv on Aug 20, 2025 (DOI: 10.1101/2025.08.20.671382)

