Steering Vector Fields for Property-Controlled Molecular Generation with Chemical Language Models

Aleksandar Dimitrievikj
Jude Wells

Abstract

Chemical language models have recently become a powerful tool for the de novo generation of drug-like molecules represented as SMILES strings. A central challenge is steering generation toward compounds with favorable properties such as solubility and absorption. To this end, we investigate inference-time control of generative chemical language models using activation steering. Using contrastive activation addition, we seek to improve three relevant properties, molecular size, aqueous solubility (log S), and lipophilicity (log P), without changing the model weights. We compare two interventions: a single global vector that is added to the activation in the last transformer layer, and a novel vector field in which the added vector is computed as a function of the current hidden state. Across multiple protein targets and two pre-trained models, the global steering vector yields the desired results in just over half of our experiments, while the vector field achieves larger shifts at the cost of a lower validity rate.
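
The two interventions compared in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how the vector field is computed, so the Gaussian-kernel weighting below, along with every function name and the `alpha` and `bandwidth` parameters, is a hypothetical choice made only for illustration. The sketch assumes activations from the steered layer have already been collected for a set of molecules with the desired property (`pos_acts`) and a set without it (`neg_acts`).

```python
import numpy as np


def global_steering_vector(pos_acts, neg_acts):
    """Contrastive vector: mean activation of desirable minus undesirable examples."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)


def steer_global(hidden, vector, alpha=1.0):
    """Add the same scaled vector to the hidden state, whatever the state is."""
    return hidden + alpha * vector


def _local_mean(hidden, acts, bandwidth):
    """Kernel-weighted mean of `acts`; examples close to `hidden` weigh more.

    The Gaussian kernel is an assumption of this sketch, not taken from the paper.
    """
    sq_dist = ((acts - hidden) ** 2).sum(axis=1)
    logits = -sq_dist / (2.0 * bandwidth ** 2)
    weights = np.exp(logits - logits.max())  # subtract the max for numerical stability
    weights /= weights.sum()
    return weights @ acts


def steer_field(hidden, pos_acts, neg_acts, alpha=1.0, bandwidth=1.0):
    """State-dependent steering: the added vector is recomputed from `hidden`."""
    vector = (_local_mean(hidden, pos_acts, bandwidth)
              - _local_mean(hidden, neg_acts, bandwidth))
    return hidden + alpha * vector


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for last-layer activations (50 examples, 8 dimensions).
    pos_acts = rng.normal(loc=1.0, size=(50, 8))
    neg_acts = rng.normal(loc=-1.0, size=(50, 8))
    hidden = rng.normal(size=8)

    v = global_steering_vector(pos_acts, neg_acts)
    print("global shift norm:", np.linalg.norm(steer_global(hidden, v) - hidden))
    print("field shift norm: ", np.linalg.norm(steer_field(hidden, pos_acts, neg_acts) - hidden))
```

In this sketch, a very large `bandwidth` makes the kernel weights nearly uniform, so the field reduces to the global vector; smaller values make the added vector depend more strongly on where the current hidden state lies. In a real model either function would be applied to the chosen layer's output at each generation step, leaving the weights untouched.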

Version published to bioRxiv on Sep 25, 2025 (DOI: 10.1101/2025.09.24.678080)

Related articles

Large Language Model Agent for Modular Task Execution in Drug Discovery

This article has 6 authors:
1. Janghoon Ock
2. Radheesh Sharma Meda
3. Srivathsan Badrinarayanan
4. Neha S. Aluru
5. Achuth Chandrasekhar
6. Amir Barati Farimani
This article has no evaluations. Latest version: Sep 17, 2025

Pretrained protein language models choose between sequence novelty and structural completeness

This article has 3 authors:
1. Arjuna M. Subramanian
2. Zachary A. Martinez
3. Matt Thomson
This article has no evaluations. Latest version: Oct 3, 2025

Pocket-based molecule generation with an SE(3)-equivariant language model leads to a potent and selective HPK1 inhibitor with in vivo efficacy

This article has 16 authors:
1. Bin Xi
2. Han Wang
3. Guanglong Sun
4. Bowen Zhang
5. Ruihan Mao
6. Yuyang Ge
7. Yang Wang
8. Jiangtao Zhang
9. Yiting Pan
10. Feng Zhou
11. Yuji Wang
12. Zhenming Liu
13. Daohua Jiang
14. Huting Wang
15. Wenbiao Zhou
16. Bo Huang
This article has no evaluations. Latest version: Sep 25, 2025
