Statistical inference of the cellular origin of chronic myeloid leukemia using a discrete-parameter ABC–PMC framework
Abstract
Chronic myeloid leukemia (CML) arises from the BCR::ABL1 fusion gene, but the exact stage of cellular differentiation at which the first leukemic cell emerges remains uncertain. We develop a stochastic 27-compartment model of hematopoiesis (blood cell development) using a continuous-time multitype branching process to capture the dynamics of both healthy and cancer cells. To infer the origin of CML, we develop a discrete-parameter Approximate Bayesian Computation – Population Monte Carlo (ABC–PMC) algorithm, tailored to estimate the posterior distribution for the stage of differentiation at which the first cancer cell appeared. Applied to patient data, our method consistently identifies the stem cell compartment as the most likely source of CML. These findings improve understanding of disease initiation and demonstrate the power of discrete-parameter ABC–PMC for statistical inference in complex biological systems.
Author summary
Chronic myeloid leukemia is a blood cancer that begins when a genetic change called the BCR::ABL1 fusion gene appears in one cell. Although this disease has been widely studied, some questions remain, particularly about the exact stage of blood cell development at which the first cancer cell arises. In our study, we build a stochastic model based on a biological model of hematopoiesis, which represents how blood cells grow and mature through many stages, from stem cells to fully developed white blood cells. Using this mathematical model, we develop a statistical approach that can infer from patient data where in this hierarchy the disease most likely began. When we apply the method to clinical data from patients with chronic myeloid leukemia, it consistently points to the stem cell stage as the most probable origin. By linking biological data with mathematical modelling, our work offers new insight into how this cancer starts and shows how quantitative approaches can help answer questions that are difficult to test experimentally.