A Prospective Real-time Early Warning System to Anticipate Onsets and Peaks of Respiratory Diseases Outbreaks at the State Level in the U.S. A Transfer Learning Approach Leveraging Digital Traces
Abstract
Respiratory disease outbreaks burden U.S. healthcare systems with over one million hospitalizations annually, yet current surveillance systems lag 1-2 weeks behind real-time conditions, preventing timely intervention. We developed a machine learning early warning system that combines Google search trends with traditional epidemiological data, using ensemble voting algorithms to predict the timing of outbreak onsets and peaks across multiple respiratory pathogens. The system applies anomaly detection and transfer learning to simultaneously monitor syndromic influenza-like illness (ILI) and hospitalizations caused by respiratory syncytial virus (RSV) or influenza across all 50 U.S. states. During operational real-time deployment from August 2024 through the 2024-2025 season, the system detected 98.0% of outbreak onsets with a 5-week average lead time and 97.0% of peaks with a 2-week average lead time, achieving positive predictive values exceeding 82%. This framework shifts public health response from reactive to proactive epidemic preparedness by reducing historical timing uncertainty from 10-20 weeks to consistent 2-6 week prediction windows, providing a scalable approach for monitoring both seasonal outbreaks and emerging respiratory threats.
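To make the reported quantities concrete, here is a minimal sketch of how ensemble voting over weekly alarms, lead time, and positive predictive value could be computed. It is illustrative only and is not the authors' implementation. The abstract does not specify the voting rule, the individual detectors, or the evaluation window, so the function names, the `min_votes` threshold, the `window` parameter, and the toy alarm series below are all assumptions.

```python
from typing import List, Optional, Sequence


def majority_vote(alarms: Sequence[Sequence[int]], min_votes: int) -> List[int]:
    """Combine per-detector weekly alarms (0/1) into one ensemble alarm series.

    Hypothetical rule: the ensemble fires in a week when at least
    `min_votes` detectors fire in that week.
    """
    n_weeks = len(alarms[0])
    if any(len(a) != n_weeks for a in alarms):
        raise ValueError("all detectors must cover the same weeks")
    return [int(sum(a[w] for a in alarms) >= min_votes) for w in range(n_weeks)]


def lead_time(ensemble: Sequence[int], event_week: int, window: int) -> Optional[int]:
    """Weeks between the earliest alarm inside the window and the event.

    Returns None when no alarm fired within `window` weeks before
    `event_week` (a missed event under this assumed definition).
    """
    start = max(0, event_week - window)
    for w in range(start, event_week + 1):
        if ensemble[w]:
            return event_week - w
    return None


def positive_predictive_value(
    ensemble: Sequence[int], event_weeks: Sequence[int], window: int
) -> float:
    """Share of alarms that fall within `window` weeks before some event."""
    alarm_weeks = [w for w, a in enumerate(ensemble) if a]
    if not alarm_weeks:
        return float("nan")
    hits = sum(any(0 <= e - w <= window for e in event_weeks) for w in alarm_weeks)
    return hits / len(alarm_weeks)


# Toy data (made up): three hypothetical detectors over ten weeks,
# with a true outbreak onset at week index 8.
detector_alarms = [
    [0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
]
ensemble = majority_vote(detector_alarms, min_votes=2)
onset_lead = lead_time(ensemble, event_week=8, window=6)
ppv = positive_predictive_value(ensemble, event_weeks=[8], window=6)
print(ensemble, onset_lead, ppv)
```

In this toy series only week 3 collects two or more votes, so the ensemble raises a single alarm five weeks before the onset. Requiring agreement between detectors suppresses the isolated alarms at weeks 2, 4, and 9, which is the usual motivation for voting when precision matters.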