Predicting unknown viral hosts with Dynamic Positive-Unlabeled learning

Gabriele Pignalberi
Andrea Tonelli
Stefano Giagu
Moreno Di Marco

Abstract

Most emerging infectious diseases originate from animals (i.e., they are zoonoses), but our knowledge of host-pathogen links remains scant. AI models have been used to predict unknown zoonotic hosts, but they face challenges from biased data and the absence of confirmed negative host-pathogen associations. Here, we introduce the Dynamic Positive-Unlabeled (DPU) learning framework, an extension of classical Positive-Unlabeled learning that enables Graph Neural Networks to predict missing links in incomplete networks. DPU learning combines two components: a propensity score model that estimates the likelihood of observing existing links, and a classifier that predicts true link existence. This approach corrects predictions to account for sampling bias and recognizes that missing links may result from either a true absence of association or gaps in data collection. We applied DPU learning to predict associations between 5,330 wild mammalian species and 33 viral families worldwide, leveraging phylogeographic relationships between mammals, observed mammal-virus association patterns, mammalian life-history traits, and genetic features of the viruses. The approach demonstrated high validation performance, providing unbiased and accurate estimation of pathogen distribution across species. DPU learning emerges as a valuable tool to support strategic, data-driven surveillance activities for proactive zoonotic risk mitigation.
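The interplay between a propensity score and a link classifier can be illustrated with the classical propensity-weighted formulation of Positive-Unlabeled learning. The sketch below is not the authors' DPU implementation, which the abstract does not specify; it assumes the standard factorization in which the probability of observing a link equals the probability that the link truly exists multiplied by the propensity of recording it. The function names and the example values are hypothetical.

```python
import numpy as np


def observed_link_probability(link_prob, propensity):
    """P(link is observed) under the assumed factorization:
    P(observed) = P(link truly exists) * P(recorded | link exists)."""
    return np.asarray(link_prob, dtype=float) * np.asarray(propensity, dtype=float)


def posterior_true_link_given_unobserved(link_prob, propensity):
    """P(link truly exists | link was NOT observed), by Bayes' rule.

    Assumes link_prob * propensity < 1, so the denominator is non-zero.
    """
    link_prob = np.asarray(link_prob, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    # A true link goes unobserved with probability (1 - propensity).
    numerator = link_prob * (1.0 - propensity)
    # Total probability of not observing a link, whether or not it exists.
    denominator = 1.0 - link_prob * propensity
    return numerator / denominator


# Hypothetical example: the same classifier score for two host-virus pairs,
# one from a well-sampled host and one from a poorly sampled host.
well_sampled = posterior_true_link_given_unobserved(0.5, 0.8)
poorly_sampled = posterior_true_link_given_unobserved(0.5, 0.1)
print(well_sampled, poorly_sampled)
```

Under these assumptions, an unobserved link for a poorly sampled host keeps a higher probability of being a real association than the same unobserved link for a well-sampled host, which is the sense in which a missing link may reflect a gap in data collection rather than a true absence.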

Version published to 10.21203/rs.3.rs-7187859/v1 on Research Square
Aug 5, 2025

Related articles

ViraHInter: a dual-modal artificial intelligence framework for predicting virus-host interactions

This article has 13 authors:
1. Siqi Sun
2. Weiqiang Bai
3. Fei Wang
4. Sheng Xu
5. Jialin Wang
6. Lifeng Qiao
7. Juan Li
8. Zhuyi Guo
9. Xiangyun Hou
10. Lei Bai
11. Bowen Zhou
12. Edward Holmes
13. Weifeng Shi
This article has no evaluations. Latest version Apr 13, 2026

Predicting Influenza Virus Host Tropism and Zoonotic Spillover Risk from Protein Sequences

This article has 8 authors:
1. Alexandra K. Longest
2. Taylor M. Grace
3. Minh Tran
4. Blake Northrop
5. Ashlyn Donohue
6. Ahmad Said
7. Stephanie L. Guertin
8. Brian E. Root
This article has no evaluations. Latest version May 24, 2026

Uncertainty-aware graph representation learning with positive-unlabeled classification for biomarker discovery in peripheral artery disease

This article has 4 authors:
1. Venkat Siva Radha Krishna Ayyalasomayajula
2. Max L. Senders
3. Jelmer M. Wolterink
4. Kak Khee Yeung
This article has no evaluations. Latest version May 13, 2026
