Machine learning driven acceleration of biopharmaceutical formulation development using Excipient Prediction Software (ExPreSo)



Abstract

Formulation development of protein biopharmaceuticals has become increasingly challenging due to new modalities and higher desired drug substance concentrations. Because drug substance supply is limited and a holistic set of analytical methods is required, only a small selection of excipients can be thoroughly tested in the wet lab. Until now, few in silico tools have been developed to refine the candidate excipients selected for wet lab testing. To fill this gap, we developed the Excipient Prediction Software (ExPreSo), a machine learning algorithm that suggests inactive ingredients based on the properties of the protein drug substance and the target product profile. A dataset of over 350 peptide/protein drug formulations with proven long-term stability was created. The dataset was augmented with predictive features including protein structural properties, protein language model embeddings, and drug product characteristics. Supervised machine learning was then used to create a model that suggests excipients for each drug substance in the dataset. ExPreSo successfully predicted the presence of the nine most prevalent excipients, with validation scores well above those of a random prediction and with minimal overfitting. A fast variant of ExPreSo using only sequence-based input features showed similar predictive power to slower models that relied on molecular modeling. Interestingly, an ExPreSo variant with only protein-based input features also showed good performance, indicating that the algorithm was resilient to the influence of platform formulations in the dataset. To our knowledge, this is the first time that machine learning has been used to suggest biopharmaceutical excipients. Overall, ExPreSo shows great potential to reduce the time, costs, and risks associated with excipient screening during formulation development.
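
The prediction task described in the abstract can be read as one binary question per excipient: given the features of a drug substance, is this excipient present in its formulation? The following is a minimal sketch of that framing only, not the ExPreSo implementation. The abstract does not name the model type, so a k-nearest-neighbour vote is swapped in as a simple stand-in. The dataset is synthetic, and the two features and the names `excipient_A` and `excipient_B` are hypothetical placeholders. Each classifier is scored with leave-one-out validation and compared against a majority-class baseline, which plays the role of the "random prediction" reference.

```python
import random
from math import dist


def make_toy_dataset(n=60, seed=0):
    """Build a synthetic dataset (NOT real formulation data).

    Each row is (features, labels): two hypothetical normalized
    descriptors and a 0/1 presence flag per hypothetical excipient.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        features = [rng.random(), rng.random()]
        labels = {
            "excipient_A": int(features[0] > 0.5),
            "excipient_B": int(features[1] > 0.5),
        }
        rows.append((features, labels))
    return rows


def knn_predict(train, x, excipient, k=5):
    """Predict presence (1) or absence (0) by majority vote of the k nearest rows."""
    nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
    votes = sum(row[1][excipient] for row in nearest)
    return int(2 * votes > k)


def leave_one_out_accuracy(rows, excipient, k=5):
    """Hold out each row in turn, train on the rest, and average the hits."""
    correct = 0
    for i, (x, labels) in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        correct += knn_predict(train, x, excipient, k) == labels[excipient]
    return correct / len(rows)


def majority_baseline(rows, excipient):
    """Accuracy obtained by always predicting the more common class."""
    positives = sum(labels[excipient] for _, labels in rows)
    return max(positives, len(rows) - positives) / len(rows)


rows = make_toy_dataset()
excipients = ("excipient_A", "excipient_B")
scores = {e: leave_one_out_accuracy(rows, e) for e in excipients}
baselines = {e: majority_baseline(rows, e) for e in excipients}

for e in excipients:
    print(f"{e}: accuracy={scores[e]:.2f}, baseline={baselines[e]:.2f}")
```

Treating each excipient as a separate binary target keeps the per-excipient scores directly comparable to a baseline. In a real setting, the held-out validation scheme and the metric would need to account for class imbalance and for closely related formulations appearing in both training and validation data.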
