Sequence alignment and 3D structure similarity searches are necessary to refine phylogeny trees for the identification of gene ancestors: the case of IGF system
Abstract
In eutherians, the different elements of the insulin/insulin-like growth factor (IGF) system (21 genes including ligands, receptors, binding proteins, proteases, inhibitors of proteases…) interact with each other. In the present work, we asked when these elements appeared in the course of evolution and whether they appeared at the same evolutionary node or one after the other. For this purpose, we considered phylogenetic relationships extracted from two versions of Ensembl (release 80, 2015, and release 115, September 2025). Moreover, we used both sequence similarity searches (BLASTP/PHIBLASTP analyses) and 3D structure similarity searches (Foldseek Search Server analyses) against structures that were either experimental (in the Protein Data Bank, PDB) or predicted (in the AlphaFold Protein Structure Database, AFDB). We showed that insulin-like/IGF peptides and their receptors appeared in non-vertebrate species, as did the protein encoded by the non-vertebrate ecdysone-inducible gene L2 (Impl2), whereas IGF-binding proteins (IGFBPs) first appeared in the vertebrate ancestor. Importantly, through 3D structure and sequence similarity searches, homologs of IGFBP-1/2/5 were detected in more distant species than recognized to date: in lamprey for IGFBP-1/2 and in amphioxus for IGFBP-5. We also showed that the metzincin proteases pregnancy-associated plasma protein A (PAPP-A) and PAPP-A2, which degrade several IGFBPs, might have appeared in non-vertebrates and that their two inhibitors, proMBP and STC, probably appeared before both proteases. Overall, the combined use of similarity search tools demonstrated, or confirmed, that the ancestors of insulin-like peptides and of their receptors appeared before the divergence of vertebrate and non-vertebrate species, whereas the strategies for regulating the IGF/insulin ligands might differ between the two clades.
From a methodological point of view, for 11 elements of the IGF system, the most distant homologs were found through sequence and 3D structure similarity searches.