siRNA Features - Reproducible Structure-Based Chemical Features for Off-Target Prediction

Abstract

Chemical modification is standard practice for small interfering RNAs (siRNAs) in therapeutic applications, but predicting the off-target effects of modified siRNAs remains a significant challenge. Current approaches often rely on sequence-based encodings, which do not fully capture the structural and protein–RNA interaction details critical for off-target prediction. In this study, we developed a framework to generate reproducible structure-based chemical features, incorporating both molecular fingerprints and computationally derived siRNA–AGO2 complex structures. Using data from an RNA-Seq off-target study, we generated over 30,000 siRNA–gene data points and systematically compared nine feature representation strategies. Among the resulting datasets, Dataset 3, which used extended connectivity fingerprints (ECFPs) to encode siRNA and mRNA features, achieved the highest predictive performance. An energy-minimized dataset (7R), representing siRNA–AGO2 structural alignments, was the second-best performer, underscoring the value of incorporating reproducible structural information into feature engineering. Our findings demonstrate that combining detailed structural representations with sequence-based features enables the generation of robust, reproducible chemical features for machine learning models, offering a promising path forward for off-target prediction and siRNA therapeutic design.
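
To make the fingerprint idea concrete, the following is a minimal sketch of an ECFP-style circular fingerprint written in plain Python. It is not the authors' pipeline and not a full ECFP implementation: the function name `circular_fingerprint`, the toy molecule encoding (element symbols plus a bond list), the SHA-256 hashing, and the 2048-bit length are all assumptions made for illustration, and bond orders, charges, stereochemistry, and duplicate-substructure removal are ignored. In practice a cheminformatics toolkit would be used instead.

```python
import hashlib


def _stable_hash(text):
    """Deterministic 64-bit hash (unlike built-in hash(), stable across runs)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


def circular_fingerprint(atoms, bonds, radius=2, n_bits=2048):
    """Simplified ECFP-style bit vector (illustrative only).

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs; bond order is ignored in this sketch
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Iteration 0: each atom's identifier comes from its element and degree.
    ids = [_stable_hash(f"{atoms[i]}|{len(neighbors[i])}")
           for i in range(len(atoms))]
    features = set(ids)

    # Each further iteration folds in the sorted identifiers of the neighbors,
    # so an identifier describes a neighborhood one bond larger than before.
    for _ in range(radius):
        ids = [
            _stable_hash(f"{ids[i]}|{sorted(ids[j] for j in neighbors[i])}")
            for i in range(len(atoms))
        ]
        features.update(ids)

    # Fold the collected identifiers into a fixed-length bit vector.
    bits = [0] * n_bits
    for feature in features:
        bits[feature % n_bits] = 1
    return bits


# Toy example: ethanol heavy atoms, written with two different atom orderings.
ethanol = circular_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_renumbered = circular_fingerprint(["O", "C", "C"], [(0, 1), (1, 2)])

# Hypothetical pair encoding: concatenate two fingerprints into one vector.
pair_features = ethanol + ethanol_renumbered
```

Because each identifier depends only on an atom's chemical neighborhood and not on atom numbering, the two ethanol encodings yield the same bit vector, which is the property that makes such features reproducible. Concatenating one fingerprint per molecule, as in `pair_features`, is one simple way to build a paired feature vector; whether the study combined siRNA and mRNA fingerprints this way is not stated in the abstract.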
