ProteinSight: A Volumetric Deep Learning Model for Carotenoid-Binding Site Prediction
Abstract
Carotenoproteins play essential roles across all domains of life, yet identifying them from sequence or structure remains a significant challenge because they lack conserved motifs. To address this gap, we present ProteinSight, a deep learning pipeline that identifies potential binding sites for carotenoids and related isoprenoids. Our approach uses a 3D U-Net architecture for semantic segmentation of physicochemical property maps and serves as a proof of concept for a new generation of structure-based protein function predictors. On a rigorously curated test set, ProteinSight functions as a highly sensitive and specific detector, reliably distinguishing positive from negative control proteins. Furthermore, we demonstrate its utility for hypothesis generation by predicting previously uncharacterized, plausible interaction sites on human serum albumin. ProteinSight offers a scalable framework that could accelerate the discovery of novel carotenoproteins from large-scale structural data, opening new avenues for functional annotation and bioengineering.
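To make the phrase "physicochemical property maps" concrete, the following is a minimal sketch of how atoms might be binned onto a multi-channel 3D grid of the kind a volumetric segmentation network takes as input. The abstract does not describe the actual featurization, so everything here is an assumption for illustration: the function name `voxelize`, the two channels, the nearest-voxel binning, the grid size, and the property values are all hypothetical.

```python
from math import floor

# Hypothetical property channels; the abstract does not list the real ones.
CHANNELS = ("hydrophobicity", "charge")


def voxelize(atoms, origin, grid_size, voxel_size=1.0):
    """Bin atoms onto a dense grid indexed as grid[channel][i][j][k].

    atoms: iterable of (x, y, z, props), where props maps channel name to value.
    origin: (x, y, z) of the grid's minimum corner, in the same units as the atoms.
    grid_size: number of voxels along each axis.
    voxel_size: voxel edge length.
    """
    n = grid_size
    grid = [
        [[[0.0] * n for _ in range(n)] for _ in range(n)]
        for _ in CHANNELS
    ]
    for x, y, z, props in atoms:
        # Nearest-voxel assignment (an assumed scheme, chosen for simplicity).
        i = floor((x - origin[0]) / voxel_size)
        j = floor((y - origin[1]) / voxel_size)
        k = floor((z - origin[2]) / voxel_size)
        if not (0 <= i < n and 0 <= j < n and 0 <= k < n):
            continue  # skip atoms that fall outside the box
        for c, name in enumerate(CHANNELS):
            grid[c][i][j][k] += props.get(name, 0.0)
    return grid


# Toy input with made-up coordinates and property values.
atoms = [
    (0.5, 0.5, 0.5, {"hydrophobicity": 1.0, "charge": 0.0}),
    (0.7, 0.2, 0.9, {"hydrophobicity": 0.5, "charge": -1.0}),
    (2.1, 3.4, 1.0, {"hydrophobicity": -0.5, "charge": 1.0}),
    (9.0, 9.0, 9.0, {"hydrophobicity": 1.0, "charge": 0.0}),  # outside the box
]
grid = voxelize(atoms, origin=(0.0, 0.0, 0.0), grid_size=4)
```

In a setup like this, a segmentation network would receive the channel-by-voxel grid and output a per-voxel score, with high-scoring voxels marking candidate binding regions. Real pipelines commonly smooth each atom over neighboring voxels rather than assigning it to a single cell; that refinement is omitted here for brevity.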