A complete human pancreatic cancer genome
Abstract
Cancer genome sequencing is essential for understanding tumor evolution and advancing precision medicine [1]. However, reference gaps and germline variants obscure the detection of small and large somatic variants and of methylation in repetitive regions [1–3]. Tumor cells commonly gain or lose chromosome arms through somatic structural changes that occur inside the highly repetitive satellite DNA sequences of the centromeres [4]. To identify the full spectrum of somatic variants, including complex rearrangements, we construct and curate near-complete, haplotype-resolved assemblies of the most recent common ancestor of an early-passage, broadly consented, hypodiploid pancreatic cancer cell line and matched normal tissues. The tumor assembly recapitulates all 35 tumor chromosomes observed by karyotyping, including multiple translocation-induced hybrid chromosomes. The hybrid chromosomes contain putative functional dicentric and fused centromeres, nested foldback inversions causing 14 breakpoints with a haplotype switch in a single event, and centromeric satellite tandem duplications of up to 136 kbp. Direct comparison of tumor and normal assembly haplotypes uncovers >7,000 variants altering >1 Mbp of sequence in repetitive regions that have been hidden by reference gaps and germline variants. Of the somatic small variants, 44% change representation because they alter germline variants on GRCh38, which affects mutational signatures and kataegis/omikli clusters. Most somatic LINE insertions originate from two hypomethylated non-reference germline LINE insertions, highlighting their impact on insertion mutation burden. These assemblies demonstrate that centromeric, acrocentric, and telomeric regions conventionally excluded from analysis harbor extensive somatic and epigenetic changes. Resolving complete tumor genomes enables a deeper understanding of cancer structural plasticity and the endpoints of breakage-fusion-bridge cycles. These assembled, curated paired normal-tumor benchmarks will serve as a critical foundation for developing future algorithms to characterize the most intractable regions of cancer genomes.