Experimental evaluation of AI-driven protein design risks using safe biological proxies
Abstract
Advances in machine learning are enabling leaps forward in beneficial applications of protein engineering while also raising biosecurity concerns. Recently, Wittmann et al. described an in silico pipeline of generative AI tools that reformulates sequences of concern (SOCs) as synthetic homologs that may evade detection by the biosecurity screening software (BSS) used by nucleic acid synthesis providers. Experimental testing of synthetic homologs is required to ascertain the true severity of this vulnerability. We present a generalizable framework for assessing biosecurity risk, consisting of testing, evaluation, validation, and verification (TEVV) of AI-assisted protein design (AIPD). We determine that common AIPD models in use at the time this study was initiated (early 2024) are not yet powerful enough to reliably rewrite the sequence of a given protein while both maintaining activity and evading detection by BSS.