Improving Prevalence Estimates of Hepatitis C in Key Populations: A Simulated Data-Based Comparison of Missing Data Techniques
Abstract
Background: Hepatitis C virus (HCV) remains a major public health challenge, particularly among people who inject drugs (PWID). Missing data in surveillance systems can bias prevalence estimates and, in turn, the decisions based on them. This study compares Complete-Case Analysis (CCA) and Multiple Imputation (MI) for handling missing data when estimating HCV prevalence, using a simulated dataset derived from the UK’s Unlinked Anonymous Monitoring (UAM) Survey.

Methodology: We conducted a cross-sectional analysis of a simulated version of the UAM dataset, focusing on key demographic and behavioural variables. HCV prevalence was estimated using both CCA and MI. MI was performed using chained equations with five imputations. We then compared the prevalence estimates and associated confidence intervals produced by the two methods.

Results: HCV prevalence estimates obtained via MI were consistently higher than those from CCA, with narrower confidence intervals. CCA excluded a substantial proportion of cases because of missing data, introducing potential bias. MI preserved the sample size and yielded more robust estimates, particularly among subgroups with higher missingness.

Conclusion: Multiple Imputation outperformed Complete-Case Analysis in estimating HCV prevalence from the simulated UAM data. These findings highlight the importance of appropriate missing data methods in epidemiological surveillance and public health research.
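To make the comparison described in the abstract concrete, the following is a minimal Python sketch of the two estimation steps, not the authors' implementation. It assumes that the five per-imputation prevalence estimates are combined with Rubin's rules (the abstract does not state the pooling method) and that confidence intervals use a simple normal approximation rather than the t-based interval Rubin's rules normally call for. The function names `complete_case_prevalence` and `pool_rubin` are hypothetical, and all numbers in the usage example are invented for illustration; none come from the study.

```python
import math
from statistics import mean, variance


def complete_case_prevalence(outcomes):
    """Prevalence and 95% Wald CI from records with an observed outcome.

    `outcomes` holds 1 (HCV positive), 0 (negative) or None (missing).
    Records with a missing outcome are dropped, as in complete-case analysis.
    """
    observed = [y for y in outcomes if y is not None]
    n = len(observed)
    p = sum(observed) / n
    se = math.sqrt(p * (1 - p) / n)
    return p, (p - 1.96 * se, p + 1.96 * se), n


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and variances (assumed: Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Illustrative values only: prevalence estimates and squared standard errors
# from five hypothetical imputed datasets.
imputed_estimates = [0.52, 0.54, 0.53, 0.55, 0.51]
imputed_variances = [0.0004, 0.0004, 0.0004, 0.0004, 0.0004]

pooled, total_var = pool_rubin(imputed_estimates, imputed_variances)
half_width = 1.96 * math.sqrt(total_var)   # normal approximation (simplification)
print(f"MI pooled prevalence: {pooled:.3f} "
      f"(95% CI {pooled - half_width:.3f} to {pooled + half_width:.3f})")

cc_p, cc_ci, cc_n = complete_case_prevalence([1, 0, None, 1, None, 0])
print(f"CCA prevalence: {cc_p:.3f} from {cc_n} complete records")
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the pooled interval reflects uncertainty about the imputed values as well as sampling error. Generating the imputed datasets themselves (the chained-equations step) is outside this sketch.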