MDMMO: Multi-Property Molecular Optimization based on Multi-Distribution Mapping
Abstract
Background: Traditional structure-activity relationship (SAR) studies are often inefficient. Molecular distribution mapping on unpaired data offers a potential solution, but current approaches struggle with multi-property optimization (MPO). Two limitations dominate: data sparsity, because few molecules satisfy strict multi-criteria constraints simultaneously, and the loss of structural information inherent in conventional latent space transformations.

Methods: To address these challenges, we propose MDMMO, a Multi-Distribution Mapping Network for Molecular Optimization. The framework introduces auxiliary intermediate distributions, combined with contrastive learning, to bridge the semantic gap between the source and target distributions. It also replaces the traditional disjoint mapping paradigm with a unified encoder-decoder architecture. This design removes the separate latent space transformation step and thereby preserves critical structural information during generation.

Results: Experiments on the ZINC250K dataset show that MDMMO outperforms state-of-the-art baselines across three MPO tasks of increasing difficulty. In particular, the model achieves substantially higher success rates in scenarios that require simultaneous optimization of bioactivity (e.g., DRD2, GSK3B) and drug-likeness (QED), while maintaining high structural similarity to the source molecules. Ablation studies further confirm that both the auxiliary distributions and the contrastive learning mechanism contribute to this performance.

Conclusions: MDMMO provides a principled and robust solution for data-sparse optimization problems. By establishing a structured pathway through chemical space, the framework offers a promising tool for accelerating lead compound optimization in drug discovery.
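To make the data-sparsity point concrete, the following minimal sketch shows how a success rate shrinks as more criteria must hold at once. It is an illustration only, not the evaluation code used for MDMMO: the `Candidate` class, the `meets` and `success_rate` helpers, the threshold values, and the example scores are all hypothetical, and the abstract does not state the actual cutoffs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Precomputed scores for one generated molecule (values are made up)."""
    qed: float         # drug-likeness score
    drd2: float        # predicted DRD2 activity
    gsk3b: float       # predicted GSK3B activity
    similarity: float  # structural similarity to the source molecule


# Hypothetical cutoffs, chosen only for illustration.
THRESHOLDS = {"qed": 0.6, "drd2": 0.5, "gsk3b": 0.5, "similarity": 0.4}


def meets(candidate: Candidate, criteria: list[str]) -> bool:
    """True if the candidate reaches the cutoff for every listed criterion."""
    return all(getattr(candidate, name) >= THRESHOLDS[name] for name in criteria)


def success_rate(candidates: list[Candidate], criteria: list[str]) -> float:
    """Fraction of candidates that satisfy all listed criteria at once."""
    if not candidates:
        return 0.0
    return sum(meets(c, criteria) for c in candidates) / len(candidates)


candidates = [
    Candidate(qed=0.8, drd2=0.7, gsk3b=0.6, similarity=0.5),
    Candidate(qed=0.7, drd2=0.6, gsk3b=0.2, similarity=0.5),
    Candidate(qed=0.9, drd2=0.1, gsk3b=0.1, similarity=0.6),
    Candidate(qed=0.3, drd2=0.9, gsk3b=0.9, similarity=0.3),
]

# Each added criterion can only keep or shrink the set of successes.
easy = success_rate(candidates, ["qed", "similarity"])
medium = success_rate(candidates, ["qed", "drd2", "similarity"])
hard = success_rate(candidates, ["qed", "drd2", "gsk3b", "similarity"])
print(easy, medium, hard)
```

Because the criteria are combined with a logical AND, adding a property can never increase the number of qualifying molecules. This is the sense in which strict multi-criteria constraints leave few examples in the target distribution.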