Gallbladder adenocarcinomas undergo subclonal diversification and selection from precancerous lesions to metastatic tumors

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    The authors collected human samples from a rare cancer type in which evolutionary features have not been well-defined. They describe the clonal evolution through sampling at precancerous, primary tumour, and metastatic stages. Whole exome sequencing was performed and one of the mutation types was confirmed with other techniques.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)


We aimed to elucidate the evolutionary trajectories of gallbladder adenocarcinoma (GBAC) using multi-regional and longitudinal tumor samples. Using whole-exome sequencing data, we constructed phylogenetic trees in each patient and analyzed mutational signatures. A total of 11 patients, including 2 rapid autopsy cases, were enrolled. The most frequently altered genes in primary tumors were ERBB2 and TP53 (54.5%), followed by FBXW7 (27.3%). Most mutations in frequently altered genes in primary tumors were detectable in concurrent precancerous lesions (biliary intraepithelial neoplasia [BilIN]), but a substantial proportion was subclonal. Subclonal diversity was common in BilIN (n=4). However, among subclones in BilIN, a certain subclone commonly shrank in concurrent primary tumors. In addition, selected subclones underwent linear and branching evolution, maintaining subclonal diversity. Combined analysis with metastatic tumors (n=11) identified branching evolution in nine patients (81.8%). Of these, eight patients (88.9%) had a total of 11 subclones expanded at least sevenfold during metastasis. These subclones harbored putative metastasis-driving mutations in cancer-related genes such as SMAD4, ROBO1, and DICER1. In mutational signature analysis, six mutational signatures were identified: 1, 3, 7, 13, 22, and 24 (cosine similarity >0.9). Signatures 1 (age) and 13 (APOBEC) decreased during metastasis, while signatures 22 (aristolochic acid) and 24 (aflatoxin) became relatively more prominent. Subclonal diversity arose early in precancerous lesions and clonal selection was a common event during malignant transformation in GBAC. However, selected cancer clones continued to evolve and thus maintained subclonal diversity in metastatic tumors.
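
The cosine-similarity threshold quoted above (>0.9) measures how closely one mutation spectrum matches another, for example an observed spectrum and its reconstruction from fitted signatures. The following is a minimal sketch of that measure only, assuming spectra are supplied as equal-length count vectors. The function name and the toy 6-channel vectors are illustrative assumptions (real spectra typically use 96 trinucleotide channels) and are not data from the study.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)

# Toy 6-channel spectra (hypothetical counts, for illustration only).
observed = [12, 30, 5, 8, 40, 5]
reconstructed = [10, 33, 4, 9, 38, 6]
similarity = cosine_similarity(observed, reconstructed)
print(f"cosine similarity = {similarity:.3f}")
```

A value near 1 means the two spectra have nearly the same shape regardless of their total mutation counts.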

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    Kang et al. have performed whole exome sequencing of gall bladder carcinomas and associated metastases, including analysis of rapid autopsy specimens in selected cases. They have also attempted to delineate patterns of clonal and subclonal evolution across this cohort. In cases where BilIN was identified, the authors show that subclones within these precursor lesions can expand and diversify to populate the primary tumor and metastatic sites. They also demonstrate subclonal variation and branching evolution across metastatic sites within the same patient, with the suggestion that multiple subclonal populations may metastasize together to seed different sites. Lastly, they highlight ERBB2 amplification as a recurrent event observed in gall bladder carcinomas.

    While these data add to the literature and start to examine important questions related to clonal evolution in a relatively rare malignancy, the authors' findings are very descriptive and it is hard to draw many generalizable conclusions from their data. In addition, the presentation of their figures is somewhat confusing and difficult to interpret. For example, they do not separate their clonal analyses by disease site and by time in a readily interpretable manner, as in some instances of Figure 2 and Figure 3 the clone maps are from different sites collected at the same time point, while others show some samples at different time points. Depicting these hierarchies in a more organized and clearly understandable manner would help readers more easily interpret the authors' findings. In addition, the clinical implications of these clonal hierarchies and their heterogeneity are unclear, as the authors do not relate the observed evolution to intervening therapies and may not be powered to do so with this dataset.

    Thank you for the constructive and valuable comments about 1) figures and 2) clinical implications.

    1. We agree with your opinion that Figures 2 and 3 are confusing. Reflecting on your comment, Figures 2 and 3 have been modified. Now, the time point at which the tissue was obtained and the anatomical location of the tissue are readily visible in the redesigned figures.

    2. From a clinical point of view, we believe that our study highlights the importance of precise genomic analysis of multi-regional and longitudinal samples in individual cancer patients. In current oncology practice, cancer panel data are used to identify druggable mutations, usually from a single tumor sample. However, we found that only some of the mutations were clonal while a substantial proportion were subclonal, and subclonal mutations are usually not effective drug targets. For example, in the GB-S2 patient, sequencing of the GB tissue alone would have prompted ERBB2-targeted treatment, had a suitable clinical trial been available, because ERBB2 p.V777L is pathogenic. However, our clonal evolution analysis suggests that an ERBB2-targeting strategy may not be effective against subclones without the ERBB2 p.V777L mutation, especially those from regional metastasis. We have added a description of this point to the Discussion section (Page 13, Line 12-15).

    Additional areas that would require clarification include:

    1. There are very few details on how the authors performed their subclone analysis to identify major subclones, and what each of the clusters in Supplemental Figure 1 represents. In addition, they do not describe how they determined that the highlighted mutations in Table 2 were drivers for metastasis and subclonal expansion. Were these the only genes that exhibited increased allele frequencies in metastatic sites, or were other statistical criteria used?

    Thank you for the important comment about 1) clone analysis and 2) highlighted mutations in Table 2.

    1. Mutations were timed as clonal or subclonal through PyClone (Roth A et al., Nat Methods. 2014) clustering (Figure 1—figure supplement 1). Phylogenetic trees were constructed using the mutation clusters identified with PyClone as an input of CITUP (Malikic S et al., Bioinformatics. 2015) (Figures 2 and 3). We added the sentence "See Supplementary File 1 to check the matching information for the PyClone clusters and the CITUP clones." to the supplementary figure legend.

    2. A full list of mutations constituting a CITUP clone can be found in Supplementary File 1. Among the mutations, previously reported cancer-associated genes harboring them were selected manually and listed in Table 2. References for each gene are introduced in the 'Evolutionary trajectories and expansion of subclones during regional and distant metastasis' section.

    2. The authors do not discuss the relevance of variation in mutational signatures observed with disease progression/metastasis, e.g., is there any significance that signature 22 (aristolochic acid) and signature 24 (aflatoxin) are increased in metastases? In addition, when comparing their data to previously published reports in Figure 1B and Figure 4A, it would be helpful if the authors discussed possible reasons for some of the large differences in mutational or signature frequencies across datasets. For example, do the authors think the frequency of ERBB2 alterations is so much higher in their cohort than in prior reports due to methodological/data reasons or due to differences in patient population?

    Thank you for the constructive and valuable comments about 1) mutational signatures observed with disease progression/metastasis and 2) differences in mutational or signature frequencies across datasets.

    1. During the revision process, signatures 22 and 24 highlighted in the metastasis stage were validated by two additional tools, Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018) (Figure 4—figure supplement 3). Aristolochic acid is an ingredient of oriental herbal medicine (Debelle FD et al., Kidney Int. 2008, Hoang ML et al., Sci Transl Med. 2013). Given that all the patients in our cohort are Korean, and a recent study found that Korean cancer patients are frequently exposed to herbal medicines (Kwon JH et al., Cancer Res Treat 2019), one possible explanation is that some patients might have been exposed to herbal remedies containing aristolochic acid. On the other hand, aflatoxin is known to be contained in soybean paste and soy sauce, which are widely used in Korean food (Ok HE et al., J Food Prot. 2007). Considering that the signatures 22 and 24 are found not in early carcinogenesis but in late carcinogenesis and metastasis (Figure 4B and Figure 4—figure supplement 3), the two carcinogens appear to have little impact on the early stage of cancer development, but their impacts might be highlighted in overt cancer cells. Further investigation is required because it is difficult to determine the etiology of signatures 22 and 24 with this limited patient data. We updated this part in the Discussion section (Page 13, Line 4-7).

    2. In the two previous genomics studies on GBAC, the prevalence of ERBB2 alteration was 7.9% (Narayan RR et al., Cancer. 2019) and 9.4% (Li M et al., Nat Genet. 2014), respectively. Compared with these data, our data is characterized by relatively higher ERBB2 alterations (54.5%: amplification in 27.3% and SNV in 27.3%) (Figure 1B). A higher prevalence of ERBB2 alteration was also reported in other studies on GBAC, with corresponding rates of 28.6% (amplification and overexpression, Nam AR et al., Oncotarget. 2016) and 36.4% (amplification only, Lin J et al., Nat Commun. 2021). The variations in ethnicity and culture might have contributed to the differences. This part is described in the Discussion section (Page 11, Line 19-23). In addition, the discrepancy in Figure 4A might be attributed to the difference in analyzed samples: our study included precancerous and metastatic lesions while the other two studies uniformly analyzed primary tumors.

    Reference for reply 1)

    • Kwon JH, Lee SC, Lee MA, Kim YJ, Kang JH, Kim JY, et al. Behaviors and Attitudes toward the Use of Complementary and Alternative Medicine among Korean Cancer Patients. Cancer Res Treat. 2019;51(3):851-60.

    3. The authors try to describe and draw conclusions about the possibility of metastasis to metastasis spread in p.6, lines 6-10 "In our study, of 7 patients with 2 or more metastatic lesions, evidence of metastasis-to-metastasis spread was found in 2 patients (28.6%). In GB-A1 (Figure 2A), it appears that CBD, omentum 1-2, mesentery, and abdominal wall 2-4 lesions may originate from abdominal wall 1 (old) rather than from primary GBAC considering clone F." The authors conclude here that the spread arose from abdominal wall 1, but this lesion is only separated from the CBD lesion by 1 month. There is no history given about whether this timing difference is significant or if it was simply due to clinically-driven differences in when each lesion was sampled. Given the proximity of the CBD lesion to the original gall bladder cancer, it seems just as likely that all of these distant lesions were seeded from the CBD lesion. If this is the case, the author's conclusion about "metastasis to metastasis" spread does not seem strongly supported. It would be helpful if the authors could clarify this point and/or provide additional data to strengthen this conclusion.

    We appreciate your valuable comment. As addressed above, the manuscript has been modified to reflect your comments.

    Reviewer #2 (Public Review):

    Minsu Kang et al. analyzed 11 patients with gallbladder adenocarcinoma using multi-point sampling. Mutational analysis revealed evolutional patterns during progression where the authors found metastasis-to-metastasis spread and the migration of a cluster of tumor cells are common in gallbladder adenocarcinomas. The signature analysis detected signatures 22 (aristolochic acid) and 24 (aflatoxin) in metastatic tumors. Overall, the analyses are well-performed using established algorithms. However, the manuscript is highly descriptive. Therefore, it is very difficult to understand what the novel findings are.

    Major comments

    1. The sections "Evolutionary trajectories and expansion of subclones during regional and distant metastasis", "Polyclonal metastasis and intermetastatic heterogeneity", "Mutational signatures during clonal evolution", and "Discussion" are highly descriptive which makes it difficult to understand what the novel and/or important findings are. Those sections would profit from reorganization.

    Thank you for the important comment. We have reorganized the manuscript according to your comments.

    1. In the "Evolutionary trajectories and expansion of subclones during regional and distant metastasis" section, unnecessary sentences have been removed and Figures 2 and 3 have been changed to make it simpler to understand how subclones spread during metastasis.

    2. In the "Polyclonal metastasis and intermetastatic heterogeneity" section, after receiving feedback on statements that were conflicting (Reviewer #1's comment 4), we clarified the statements and removed any other extraneous sentences. Figures 2 and 3 have been changed to make it simpler to understand polyclonal metastasis and intermetastatic heterogeneity.

    3. In the "Mutational signatures during clonal evolution" section, after receiving comments that Figures 4B and 4C were confusing (Essential Revisions #6), we moved Figure 4B to Figure 4—figure supplement 2. Unnecessary sentences have been removed. We emphasized signatures 22 and 24 highlighted during metastasis. This result was validated by using two additional tools, Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018).

    4. In the Discussion section, duplicate descriptions and unnecessary extraneous explanations have been deleted. We emphasized that whereas aflatoxin and aristolochic acid had little impact on early cancer formation, their impacts could be more clearly seen in cancer cells that had already manifested (Page 13 Line 2-7). In addition, the limitations of the NGS test currently used in the clinical field were pointed out, and the clinical significance of this study was described (Page 13 Line 8-16).

    2. What would enhance this paper is more of a connection between the bioinformatics analysis and the biology. Although the authors analyzed multi-point sequencing data well, this paper lacks in-depth discussion. I understand that the results in the paper are "computationally" the most likely. However, the impact is lost by an incomplete connection to biology.

    As you commented, we analyzed the WES data obtained from patient samples by computational methods. In this study, we did not validate the various results using in vitro or in vivo models. However, we would like to emphasize the significance of our work because it is the first human study, covering the current theory of carcinogenesis from precancerous lesions to metastasis in GBAC. For example, polyclonal seeding has been previously confirmed in animal models (Cheung KJ et al., Science 2016). In humans, there have been reports in breast cancer (Ullah I et al., J Clin Invest. 2018) and colorectal cancer (Wei Q et al., Ann Oncol. 2017), but not in GBAC yet.

    3. In addition to the above concern, it is difficult to comprehend the cohort as the detailed information is lacking. I would suggest providing a brief table that contains the number of collected samples, frozen or FFPE, the clinical information, etc. by sample.

    Thank you for the constructive comment. Supplementary Table 1 was modified as you mentioned. It is now indicated from which organ, when, and by what method the tissue was obtained, what the tumor purity of the tissue was, and whether the tissue was fresh-frozen or FFPE. In addition, we updated the information about tissue acquisition sites in Figure 1A.

    4. The mutations with very low allele frequency (< 1%) are discussed in the manuscript. However, no validation data is provided. Please add a description of the accuracy of the mutation calling considering the following concerns.

    • FFPE samples are analyzed using the same method as frozen samples. FFPE contains much more artifacts. Is it adequate to use the same methods for both frozen and FFPE samples?

    Thank you for the valuable comment. We also considered the FFPE artifacts. However, we did not remove the possible artifacts. This part has been described above. Please see Essential Revisions #5.

    • How were those mutations with low allele frequency validated? Are those variants validated by other methods? Especially in FFPE.

    Thank you for the important comment. Firstly, we discarded any low-quality, unreliable reads and variants according to the pre-specified filtering criteria used in previous literature analyzed with the Genomon2 pipeline (Yokoyama A et al., Nature. 2019, Kakiuchi N et al., Nature. 2020, Ochi Y et al., Nat Commun. 2021). In the Method section, we have added an explanation for this part (Page 16 Line 5-12).

    As you commented, validation of low VAF mutation is required if the mutation is sample-specific. However, in this study, if a mutation in Supplementary File 1 has a low VAF in one sample, one of the other samples always has a higher VAF, which has passed our pre-specified filter. Therefore, validation is not required for that mutation. In addition, possible sequencing artifacts with low VAFs in FFPE tissues have been discussed above. Please see Essential Revisions #5.

    • Is the low variant allele frequency (0.2~1%) significantly higher than the background noise level?

    Thank you for the important comment. As you expected, FFPE samples had a higher number of sample-specific mutations than fresh-frozen ones in our study. However, we did not remove these mutations in the analysis of the FFPE samples. For a more detailed description, please see Essential Revisions #5.
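
    The reply to the second bullet above argues that a low-VAF call can be trusted when the same mutation passes the filter in another sample from the same patient. A minimal sketch of that cross-sample check follows. The threshold `MIN_VAF`, the sample names, and the VAF values are hypothetical and do not reproduce the study's actual filtering criteria.

```python
def supported_in_any_sample(vafs, min_vaf):
    """Return True if at least one sample carries the mutation at or above min_vaf."""
    return any(vaf >= min_vaf for vaf in vafs.values())

MIN_VAF = 0.05  # hypothetical threshold, not the study's filter

# One mutation observed across three samples of a single patient (made-up values).
shared_mutation = {"primary": 0.004, "liver_met": 0.21, "omentum_met": 0.0}
# A mutation seen only at low VAF in one sample (made-up values).
sample_specific = {"primary": 0.004, "liver_met": 0.0, "omentum_met": 0.0}

print(supported_in_any_sample(shared_mutation, MIN_VAF))   # low in primary, high in liver_met
print(supported_in_any_sample(sample_specific, MIN_VAF))   # never reaches the threshold
```

    Under this rule, the first mutation is retained in the primary tumor despite its low VAF there, while the second would still need independent validation.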

    5. The authors compared mutational signatures divided by stages or timings. How are the signatures calculated although each sample has a distinct number of somatic mutations? Did the authors correct the difference?

    Thank you for the helpful comment. We classified all the mutations according to the specific criteria (Page 9 Line 9-18). For example, in Figure 4B (before revision, Figure 4C), mutations were classified by the timing of development during clonal evolution. After that, we could calculate the relative contributions of mutational signatures in each group using the three tools, Mutalisk (Lee J et al., Nucleic Acids Res. 2018), Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018). Although the number of mutations is different for each group, no additional correction was required because we compared the relative contributions among the groups.
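
    The point about relative contributions can be illustrated with a short sketch: dividing each signature's attributed mutation count by the group total gives fractions that are comparable between groups of different size. The group names, signature keys, and counts below are hypothetical and are not output from Mutalisk, Signal, or MuSiCa.

```python
def relative_contributions(exposures):
    """Convert per-signature mutation counts into fractions that sum to 1."""
    total = sum(exposures.values())
    if total <= 0:
        raise ValueError("group has no attributed mutations")
    return {sig: count / total for sig, count in exposures.items()}

# Hypothetical attributed mutation counts for two groups of very different size.
early = {"sig1": 300, "sig13": 150, "sig22": 30, "sig24": 20}   # 500 mutations
late = {"sig1": 20, "sig13": 10, "sig22": 12, "sig24": 8}       # 50 mutations

early_rel = relative_contributions(early)
late_rel = relative_contributions(late)
for sig in early:
    print(f"{sig}: early {early_rel[sig]:.2f}, late {late_rel[sig]:.2f}")
```

    In this toy example the late group has ten times fewer mutations, yet its fractions for signatures 22 and 24 are higher, which is the kind of comparison the relative scale allows.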

    6. In distant metastasis tumors, signatures 22 and 24 are increased. Those two signatures are strongly associated with a specific carcinogen. Although the clinical information lacks, do the authors think that those patients were exposed to those chemicals after the diagnosis? Why do the authors think the two signatures increased in the metastatic tumors? Were those signatures validated by other methods?

    We appreciate your important and constructive comment.

    1. We think that the patients might have been exposed to aristolochic acid or aflatoxin before or after the cancer diagnosis. Aristolochic acid is an ingredient of oriental herbal medicine (Debelle FD et al., Kidney Int. 2008, Hoang ML et al., Sci Transl Med. 2013). Given that all the patients in our cohort are Korean, and a recent study found that Korean cancer patients are frequently exposed to herbal medicines (Kwon JH et al., Cancer Res Treat 2019), one possible explanation is that some patients might have been exposed to herbal remedies containing aristolochic acid. On the other hand, aflatoxin is known to be contained in soybean paste and soy sauce, which are widely used in Korean food (Ok HE et al., J Food Prot. 2007). Nevertheless, we believe that further investigation is required because it is difficult to determine the etiology of signatures 22 and 24 with this limited patient data.

    2. Summarizing the mutational signature results using the 3 different tools (Figure 4B and Figure 4—figure supplement 3), the signatures 22 and 24 are relatively rare in early carcinogenesis. However, the two signatures contributed more to late carcinogenesis and metastasis. Therefore, it is speculated that the two carcinogens appear to have little impact on the early stage of cancer development but might be highlighted in overt cancer cells. Further studies on this novel hypothesis are necessary.

    3. During the revision process, signatures 22 and 24 highlighted in the metastasis stage were validated by two additional tools, Signal (Degasperi A et al., Nat Cancer. 2020) and MuSiCa (Diaz-Gay M et al., BMC Bioinformatics. 2018) (Figure 4—figure supplement 3). We updated this part in the Result (Page 9 Line 18-21) and Discussion (Page 13 Line 2-7) sections.

    Reference for reply 1)

    • Kwon JH, Lee SC, Lee MA, Kim YJ, Kang JH, Kim JY, et al. Behaviors and Attitudes toward the Use of Complementary and Alternative Medicine among Korean Cancer Patients. Cancer Res Treat. 2019;51(3):851-60.

    7. Figures 2 are well-described. However, they are difficult for readers to fully understand. The colors for each clone are sometimes similar. The results of multi-time point and regional analyses in the cases with multiple sampling are not integrated. Driver mutations are separately described in the small phylogenetic trees. Evolutional patterns (linear or branching) are not described in the figures. Modifying the above concerns would improve the manuscript.

    We appreciate your important comment.

    1. In GB-S1, clones of similar colors were modified to be different colors.

    2. Figures 2 and 3 have been modified to make them easier to understand by separating time and space more clearly.

    3. Driver mutations are now indicated in both the phylogenetic tree and TimeScape result (Figures 2 and 3).

    4. Evolutional patterns (linear or branching) can be discovered by examining the phylogenetic tree in Figures 2 and 3. In addition, we described each patient's evolutionary pattern more clearly in the manuscript.

    8. "Among 6 patients having concurrent BilIN tissues, two patients were excluded from the further analysis because of low tumor purity in one patient and different mutational profiles between BilIN and primary GBAC in the other patient, suggesting different origins of the two tumors (Figure 1-figure supplement 2)." This seems cherry-picking. More explanation is necessary.

    • How is the tumor purity? Although the authors use 0.2% variant allele frequency as true mutation (for example Table 2), is the tumor purity lower than 0.2%?

    Thank you for the important comment. The calculated tumor purity of BilIN in the GB-S8 patient was 0.03 based on the WES data. We added this value to the manuscript (Page 6 Line 9) and Supplementary Table 1. Although variants were called in this case, the tumor purity was too low to estimate the allele-specific copy number, and thus sophisticated analysis as in other patients was not possible. In addition, the value of 0.2% in Table 2 is not the VAF, but cellular prevalence calculated by PyClone and CITUP. Although the value is low in the primary tumor, it is mentioned because it is high in metastatic lesions.

    • BilIN and GBAC of GB-S7 have some shared mutations. Why do the authors conclude that BilIN and GBAC have distinct origins? Do the authors think that those shared mutations are germline mosaic mutations?

    Thank you for the important comment.

    1. We think that the BilIN and GBAC of the GB-S7 patient are tumors of different origins because BilIN and GBAC of the GB-S7 patient have different truncal mutations (Figure 1—figure supplement 2C). This is a markedly different feature compared to BilIN and GBAC samples of other patients. We have added an explanation for this part to the Results section (Page 6 Line 9-11).

    2. We do not think that mosaicism occurred at the developmental stage. In addition, although some mutations were identified from both BilIN and GBAC, we cannot determine their importance because either one of the lesions had a very low VAF ranging from 0.001 to 0.04. If the mosaicism occurred only in the GB at the developmental stage, the VAF values of the shared mutations should be much higher than the current values, and the VAF values of the two BilIN and GBAC lesions should be similar.

    • Was the copy number profile compared between BilIN and GBAC?

    Thank you for the constructive comment. In this study, we obtained allele-specific copy numbers using Control-FREEC version 11.5 (Boeva V et al., Bioinformatics. 2012). The copy number of the mutations in the GB-S8 patient's BilIN could not be estimated by Control-FREEC due to low tumor purity (0.03). In the case of GB-S7, BilIN and GBAC were determined to be of a different tumor origin and thus disregarded from the analysis.
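
    The distinction drawn in this reply between VAF, cellular prevalence, and tumor purity can be made concrete with the standard expected-VAF relation. The sketch below assumes that cellular prevalence means the fraction of tumor cells carrying the mutation, and defaults to a heterozygous mutation in a diploid region. The function and parameter names are illustrative and are not part of PyClone, CITUP, or Control-FREEC.

```python
def expected_vaf(purity, cellular_prevalence=1.0, multiplicity=1,
                 tumor_copies=2, normal_copies=2):
    """Expected variant allele frequency of a somatic mutation.

    purity: fraction of cells in the sample that are tumor cells.
    cellular_prevalence: fraction of tumor cells carrying the mutation (assumed definition).
    multiplicity: number of mutated copies per carrying cell.
    tumor_copies / normal_copies: total copy number at the locus.
    """
    mutant_alleles = purity * cellular_prevalence * multiplicity
    total_alleles = purity * tumor_copies + (1 - purity) * normal_copies
    return mutant_alleles / total_alleles

# Fully clonal mutation in a sample with tumor purity 0.03 (the GB-S8 BilIN value).
vaf_low_purity = expected_vaf(purity=0.03)
# Mutation with cellular prevalence 0.2% in a hypothetical pure tumor sample.
vaf_rare_subclone = expected_vaf(purity=1.0, cellular_prevalence=0.002)

print(f"{vaf_low_purity:.4f}")
print(f"{vaf_rare_subclone:.4f}")
```

    Under these assumptions, even a fully clonal mutation is expected at a VAF of only about 1.5% when purity is 0.03, and a cellular prevalence of 0.2% is a different quantity from a VAF of 0.2%.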

  2. Evaluation Summary:

    The authors collected human samples from a rare cancer type in which evolutionary features have not been well-defined. They describe the clonal evolution through sampling at precancerous, primary tumour, and metastatic stages. Whole exome sequencing was performed and one of the mutation types was confirmed with other techniques.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

  3. Reviewer #1 (Public Review):

    Kang et al. have performed whole exome sequencing of gall bladder carcinomas and associated metastases, including analysis of rapid autopsy specimens in selected cases. They have also attempted to delineate patterns of clonal and subclonal evolution across this cohort. In cases where BilIN was identified, the authors show that subclones within these precursor lesions can expand and diversify to populate the primary tumor and metastatic sites. They also demonstrate subclonal variation and branching evolution across metastatic sites within the same patient, with the suggestion that multiple subclonal populations may metastasize together to seed different sites. Lastly, they highlight ERBB2 amplification as a recurrent event observed in gall bladder carcinomas.

    While these data add to the literature and start to examine important questions related to clonal evolution in a relatively rare malignancy, the authors' findings are very descriptive and it is hard to draw many generalizable conclusions from their data. In addition, the presentation of their figures is somewhat confusing and difficult to interpret. For example, they do not separate their clonal analyses by disease site and by time in a readily interpretable manner, as in some instances of Figure 2 and Figure 3 the clone maps are from different sites collected at the same time point, while others show some samples at different time points. Depicting these hierarchies in a more organized and clearly understandable manner would help readers more easily interpret the authors' findings. In addition, the clinical implications of these clonal hierarchies and their heterogeneity are unclear, as the authors do not relate the observed evolution to intervening therapies and may not be powered to do so with this dataset.

    Additional areas that would require clarification include:
    1. There are very few details on how the authors performed their subclone analysis to identify major subclones, and what each of the clusters in Supplemental Figure 1 represents. In addition, they do not describe how they determined that the highlighted mutations in Table 2 were drivers for metastasis and subclonal expansion. Were these the only genes that exhibited increased allele frequencies in metastatic sites, or were other statistical criteria used?

    2. The authors do not discuss the relevance of variation in mutational signatures observed with disease progression/metastasis, e.g., is there any significance that signature 22 (aristolochic acid) and signature 24 (aflatoxin) are increased in metastases? In addition, when comparing their data to previously published reports in Figure 1B and Figure 4A, it would be helpful if the authors discussed possible reasons for some of the large differences in mutational or signature frequencies across datasets. For example, do the authors think the frequency of ERBB2 alterations is so much higher in their cohort than in prior reports due to methodological/data reasons or due to differences in patient population?

    3. The authors try to describe and draw conclusions about the possibility of metastasis to metastasis spread in p.6, lines 6-10 "In our study, of 7 patients with 2 or more metastatic lesions, evidence of metastasis-to-metastasis spread was found in 2 patients (28.6%). In GB-A1 (Figure 2A), it appears that CBD, omentum 1-2, mesentery, and abdominal wall 2-4 lesions may originate from abdominal wall 1 (old) rather than from primary GBAC considering clone F." The authors conclude here that the spread arose from abdominal wall 1, but this lesion is only separated from the CBD lesion by 1 month. There is no history given about whether this timing difference is significant or if it was simply due to clinically-driven differences in when each lesion was sampled. Given the proximity of the CBD lesion to the original gall bladder cancer, it seems just as likely that all of these distant lesions were seeded from the CBD lesion. If this is the case, the author's conclusion about "metastasis to metastasis" spread does not seem strongly supported. It would be helpful if the authors could clarify this point and/or provide additional data to strengthen this conclusion.

  4. Reviewer #2 (Public Review):

    Minsu Kang et al. analyzed 11 patients with gallbladder adenocarcinoma using multi-point sampling. Mutational analysis revealed evolutional patterns during progression where the authors found metastasis-to-metastasis spread and the migration of a cluster of tumor cells are common in gallbladder adenocarcinomas. The signature analysis detected signatures 22 (aristolochic acid) and 24 (aflatoxin) in metastatic tumors. Overall, the analyses are well-performed using established algorithms. However, the manuscript is highly descriptive. Therefore, it is very difficult to understand what the novel findings are.

    Major comments
    1. The sections "Evolutionary trajectories and expansion of subclones during regional and distant metastasis", "Polyclonal metastasis and intermetastatic heterogeneity", "Mutational signatures during clonal evolution", and "Discussion" are highly descriptive which makes it difficult to understand what the novel and/or important findings are. Those sections would profit from reorganization.

    2. What would enhance this paper is more of a connection between the bioinformatics analysis and the biology. Although the authors analyzed multi-point sequencing data well, this paper lacks in-depth discussion. I understand that the results in the paper are "computationally" the most likely. However, the impact is lost by an incomplete connection to biology.

    3. In addition to the above concern, it is difficult to comprehend the cohort as the detailed information is lacking. I would suggest providing a brief table that contains the number of collected samples, frozen or FFPE, the clinical information, etc. by sample.

    4. The mutations with very low allele frequency (< 1%) are discussed in the manuscript. However, no validation data is provided. Please add a description of the accuracy of the mutation calling considering the following concerns.
    • FFPE samples are analyzed using the same method as frozen samples. FFPE contains much more artifacts. Is it adequate to use the same methods for both frozen and FFPE samples?
    • How were those mutations with low allele frequency validated? Are those variants validated by other methods? Especially in FFPE.
    • Is the low variant allele frequency (0.2~1%) significantly higher than the background noise level?

    5. The authors compared mutational signatures divided by stages or timings. How are the signatures calculated although each sample has a distinct number of somatic mutations? Did the authors correct the difference?

    6. In distant metastasis tumors, signatures 22 and 24 are increased. Those two signatures are strongly associated with a specific carcinogen. Although the clinical information lacks, do the authors think that those patients were exposed to those chemicals after the diagnosis? Why do the authors think the two signatures increased in the metastatic tumors? Were those signatures validated by other methods?

    7. Figures 2 are well-described. However, they are difficult for readers to fully understand. The colors for each clone are sometimes similar. The results of multi-time point and regional analyses in the cases with multiple sampling are not integrated. Driver mutations are separately described in the small phylogenetic trees. Evolutional patterns (linear or branching) are not described in the figures. Modifying the above concerns would improve the manuscript.

    8. "Among 6 patients having concurrent BilIN tissues, two patients were excluded from the further analysis because of low tumor purity in one patient and different mutational profiles between BilIN and primary GBAC in the other patient, suggesting different origins of the two tumors (Figure 1-figure supplement 2)." This seems cherry-picking. More explanation is necessary.
    • How is the tumor purity? Although the authors use 0.2% variant allele frequency as true mutation (for example Table 2), is the tumor purity lower than 0.2%?
    • BilIN and GBAC of GB-S7 have some shared mutations. Why do the authors conclude that BilIN and GBAC have distinct origins? Do the authors think that those shared mutations are germline mosaic mutations?
    • Was the copy number profile compared between BilIN and GBAC?