Hybrid Quantum–Classical Benchmarks for Synthetic DNA Risk Classification: A Fully Simulated Study

Abstract

Hybrid quantum–classical architectures are a promising direction for sequence modeling, yet their empirical behavior on realistic bioinformatics tasks remains poorly documented. Here, we present a fully simulated, reproducible benchmark for hybrid quantum machine learning (QML) applied to synthetic DNA disease-risk classification. A dataset of 5,000 sequences (200 bp each) across five balanced classes was generated using motif-injection rules with controlled noise, and 74 biological features were extracted from each sequence. Three baseline models (a CNN, an attention model, and a classical ensemble) were compared against two quantum-hybrid models incorporating a 4-qubit ZZFeatureMap + RealAmplitudes ansatz, executed exclusively on the Qiskit Aer simulator with 1,024 shots. The attention model achieved the highest test accuracy (51.7%), while the quantum-hybrid models performed comparably (51.1–51.3%), showing no measurable quantum advantage under these settings. The study establishes an honest, fully reproducible baseline for QML in genomics, highlights the current limitations of small-qubit encodings, and motivates future work with trainable variational circuits and larger biological datasets.
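As an illustration of the motif-injection step described in the abstract, the following is a minimal sketch of how a balanced synthetic dataset of this shape could be generated. It is not the authors' generator: the motifs in `CLASS_MOTIFS`, the 5% substitution rate, and the function names `make_sequence` and `make_dataset` are all assumptions chosen for the example, since the abstract specifies only the sequence length, the number of classes, and the dataset size.

```python
import random

BASES = "ACGT"

# Hypothetical motifs, one per class. The abstract does not list the
# motifs actually used, so these are placeholders for illustration.
CLASS_MOTIFS = {
    0: "TATAAA",
    1: "GGGCGG",
    2: "CACGTG",
    3: "TGACTCA",
    4: "CCAAT",
}


def make_sequence(label, length=200, noise_rate=0.05, rng=random):
    """Return a random DNA string with the class motif injected, then noised."""
    seq = [rng.choice(BASES) for _ in range(length)]

    # Inject the class-specific motif at a random valid position.
    motif = CLASS_MOTIFS[label]
    start = rng.randrange(0, length - len(motif) + 1)
    seq[start:start + len(motif)] = list(motif)

    # Controlled noise: resample each base with probability noise_rate
    # (assumed value). This may also corrupt the motif itself.
    for i in range(length):
        if rng.random() < noise_rate:
            seq[i] = rng.choice(BASES)

    return "".join(seq)


def make_dataset(n_per_class=1000, seed=0):
    """Build a shuffled, class-balanced list of (sequence, label) pairs."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    data = [
        (make_sequence(label, rng=rng), label)
        for label in CLASS_MOTIFS
        for _ in range(n_per_class)
    ]
    rng.shuffle(data)
    return data
```

With the default `n_per_class=1000`, `make_dataset()` yields 5,000 sequences of 200 bp across five balanced classes, matching the dataset dimensions stated in the abstract. Because noise can overwrite motif positions, raising `noise_rate` makes the classes harder to separate.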
