Hybrid Statistical Learning for COVID-19 Testing Operations: A Multitask Modeling of Positivity, Viral Load and Turnaround Time

Subhrajit Saha
Sagnik Acharrya
Debashis Chatterjee

Abstract

Timely and reliable COVID-19 testing requires more than accurate diagnostics: laboratories must triage infection risk, quantify viral burden among positives, and de-bottleneck turnaround times (TATs). Using a de-identified cohort of 15,524 PCR tests from a large pediatric health system in 2020, we develop a hybrid pipeline that jointly models (i) test positivity, (ii) cycle-threshold (Ct) among positives, and (iii) operational TAT components (collection → receipt and receipt → verification). Methodologically, we combine interpretable generalized additive models with elastic-net GLMs and gradient-boosted trees for classification; semiparametric regression and quantile methods (including quantile forests) for Ct; and accelerated failure-time and quantile regression for TAT. We embed time-aware validation, calibration, explainability, and fairness audits, and estimate policy effects of drive-through collection using orthogonalized Double Machine Learning with causal forests for heterogeneity. On rolling time splits, the stacked classifier attains AUROC ≈ 0.80 with good low-range calibration; age and pandemic day are the dominant nonlinear drivers of positivity, Ct shows pronounced heterogeneity across settings, and TATs exhibit heavy-tailed, persistent variability. While naive ATE estimates for drive-through are fragile under missingness/overlap, subgroup causal forests highlight operational segments with plausible gains. The result is a deployment-ready blueprint that integrates predictive performance with governance (calibration, shift robustness, fairness) for laboratory operations.
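The abstract reports AUROC on rolling time splits. As a rough illustration of what that kind of evaluation involves, here is a minimal sketch in plain Python. It is not the authors' code: the function names `rolling_time_splits` and `auroc`, the expanding-window scheme, and the fold parameters are all assumptions made for this example, and the preprint may use a different splitting rule.

```python
from typing import Iterator, Sequence, Tuple


def rolling_time_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for rows already sorted by time.

    Hypothetical helper: each fold trains on everything before a cutoff
    and tests on the next block, so no future row leaks into training.
    """
    if n_folds < 1 or min_train < 1 or min_train >= n_samples:
        raise ValueError("need n_folds >= 1 and 0 < min_train < n_samples")
    block = (n_samples - min_train) // n_folds
    if block < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        cutoff = min_train + k * block
        # The last fold absorbs any remainder rows.
        end = n_samples if k == n_folds - 1 else cutoff + block
        yield range(0, cutoff), range(cutoff, end)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the Mann-Whitney U statistic, with midranks for ties."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both classes present")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)
```

In use, one would fit a classifier on each `train` range, score the matching `test` range, and call `auroc` per fold. Reporting the per-fold values rather than a single pooled number makes drift over the pandemic timeline visible.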

Version published to 10.21203/rs.3.rs-8595144/v1 on Research Square
Mar 2, 2026

Related articles

Development and Validation of a Machine Learning Model for Hepatitis C Virus Exposure: A Demographic Screening Approach for the US Population

This article has 5 authors:
1. Dorian G Ding
2. Taoyi Chen
3. Yu Sheng
4. Jeffrey S.H. Lin
5. Ye Yuan
This article has no evaluations
Latest version Apr 15, 2026

A Retrospective Evaluation of Backfill Estimation Models for Influenza, COVID-19, and RSV Hospital Admissions

This article has 3 authors:
1. Michal Ben-Nun
2. James Turtle
3. Pete Riley
This article has no evaluations
Latest version Feb 27, 2026

Explainable Machine Learning for Classification of HIV Viral Load Suppression in Resource-Limited Settings

This article has 4 authors:
1. Abraham Keffale Mengistu
2. Aynadis Worku Shime
3. Muluken Belachew Mengistie
4. Gizaw Hailiye Teferi
This article has no evaluations
Latest version Feb 27, 2026
