The novel, recurrent mutation in the TOP2A gene results in the enhanced topoisomerase activity and transcription deregulation in glioblastoma

Abstract

Background

High grade gliomas (HGGs) are aggressive, primary brain tumors with poor clinical outcomes. We aim to better understand glioma pathobiology and find potential therapeutic susceptibilities.

Methods

We designed a custom panel of 664 cancer- and epigenetics-related genes, and employed targeted next generation sequencing to study the genomic landscape of somatic and germline variants in 182 gliomas of different malignancy grades. mRNA sequencing was performed to detect transcriptomic abnormalities.

Results

In addition to known alterations in the TP53, IDH1, ATRX, and EGFR genes found in this cohort, we identified a novel, recurrent mutation in the TOP2A gene, coding for topoisomerase 2A, occurring only in glioblastomas (GBM, WHO grade IV gliomas). Biochemical assays with recombinant proteins demonstrated stronger DNA binding and DNA supercoil relaxation activities of the variant proteins. GBM patients carrying the mutated TOP2A had shorter overall survival than those with the wild-type TOP2A. Computational analyses of transcriptomic data showed that GBMs with the mutated TOP2A have different transcriptomic patterns, suggesting higher transcriptomic activity.

Conclusion

We identified a novel TOP2A E948Q variant that strongly binds to DNA and is more active than the wild-type protein. Our findings suggest that the discovered TOP2A variant is a gain-of-function mutation.

Key points

  • The most frequent genetic alterations in high grade gliomas are reported.

  • A new mutation in the TOP2A gene was found in 4 patients from the Polish population.

  • An E948Q substitution changes TOP2A activities towards DNA.

  • The recurrent TOP2A variant is a gain-of-function mutation.

Importance of the study

Glioblastoma is a deadly disease. Despite recent advancements in genomics and innovative targeted therapies, glioblastoma therapy has not improved. Insights into glioblastoma biology may improve diagnosis, prognosis, and treatment prediction, leading to better outcomes. We performed targeted sequencing of 664 cancer genes and identified a new variant of the TOP2A gene, encoding topoisomerase 2A, in glioblastomas. The TOP2A protein variant shows a higher affinity towards DNA and causes transcriptional alterations, suggesting a higher de novo transcription rate.

Article activity feed

  1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

    Reply to the reviewers

    Reviewer #1 (Evidence, reproducibility and clarity (Required)):

    **Summary:**

    Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Please place your comments about significance in section 2.

    In this paper the authors used a targeted approach to identify rare mutations in a cohort of glioma patients. Using this approach they identified a recurrent mutation in the TOP2A gene, which encodes topoisomerase 2A, and suggest that this mutation creates a more effective protein that binds DNA strongly and may be more enzymatically active. RNA-seq analysis of TOP2A WT and TOP2A mut tumor samples suggests different transcription patterns and points to possible splicing defects. The most recurrent variant (E948Q) is described in depth, and some experimental information shows that this variant might be a gain-of-function mutation.

    **Major comments:**

    • Are the key conclusions convincing? The validation of both the methodology and the presence of previously undescribed TOP2A variations in HGG is done quite successfully. Interesting evidence about the relevance of the most frequent mutation is provided. However, although computational and biochemical assays were performed, the lack of details about the statistics of the in vitro experiments (no p-values, sample sizes, or numbers of repetitions are provided in Figures 4 and 5) weakens the authors' conclusions about the properties of the mutated topoisomerase.

    Ad. In the revised version we provide more details about the in vitro experiments, including statistics where applicable, sample sizes, and the number of repetitions. Fig. 4 shows the results of two repetitions (so we cannot calculate statistics), but I would like to stress that we independently tested two fragments of the protein and the results were similar, so our conclusion was justified. However, we agree with the reviewer that a statistical analysis of those biochemical tests is required. We have already started to produce a new batch of recombinant proteins and will add repetitions to reinforce our claims. We will provide the details of the statistical analysis once all experiments are performed.

    • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Claims about the function of the E948Q variant should be revised. The data are not presented in a convincing way, and the language is inconsistent: the results are worded cautiously ("We conclude that the E948Q TOP2A protein is functional, and MIGHT have a higher activity than the WT protein"), whereas the rest of the paper strongly supports the claims about TOP2A activity.

    Ad. We will provide more data on the biochemical features of the TOP2A variant to confirm the impact of the E948Q substitution on enzyme activities, which would allow stronger conclusions and present our results in a more convincing way. The language of the manuscript has been critically revised and modified (see the version with tracked changes).

    • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. In line with the data presented in the paper, additional experiments that show catalytic changes of the E948Q variant must be added. EMSA shows differences in DNA binding capacity compared to the WT form; however, the DNA supercoil relaxation activity is not that different, at least in the way the results are presented. The authors suggest that the TOP2A mutation is a driver mutation, but no in vitro validation of this claim is shown. Can this mutation alone, or in combination with e.g. tumor suppressors, transform normal cells into cancer cells? Do cell lines expressing this mutation (compared to parental TOP2A WT expressing cells) display increased transcription? Increased invasion?

    Ad. In the revised version we moderated our conclusions and we do not state that the mutated TOP2A is an oncogenic driver. We suggest that this mutation (and possibly other TOP2A mutations, as we analyzed the impact of other variants on the TOP2A protein function) contributes to gliomagenesis. This conclusion is based not only on the changes in biochemical properties, but also on the observed impact of the mutation on transcription and patient survival. We expanded the analysis of TOP2A mutations and expression levels in TCGA datasets, and those new results support our conclusions about the pathogenic nature of *TOP2A* overexpression and mutations (Supplementary Fig. 4). We believe that in this situation there is no rationale for a classical oncogenic driver experiment.

    Due to the rarity of TOP2A mutations, it is impossible to find a patient-derived cell line with such a defect. We attempted to overexpress TOP2A in glioma cells, but apparently some autoregulation prevents overexpression of this protein in cells with endogenous TOP2A expression. Therefore, we cannot verify whether cell lines expressing this variant (compared to parental TOP2A WT expressing cells) have increased transcription. Moreover, such experiments are costly and would require a substantial time investment.

    I would like to stress that modeling some events in cell cultures is difficult, and that in GBMs we found a link between the mutated *TOP2A* and increased transcription, along with decreased expression of splicing factors.

    We have attempted a CRISPR/Cas9-mediated knock-in in glioma cells, but without success. This is a difficult and time-consuming procedure. Although we agree in principle on the rationale for such an experiment, we think that the current data are consistent and convincing. If the reviewers find it necessary, we may attempt to create glioma cell lines with a TOP2A knock-out and overexpression of the mutated *TOP2A* gene and study them functionally, but this would require more time.

    • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. If the authors can complement the in vitro experiments already presented with additional ones supporting their hypothesis, this should be feasible. The authors can use patient-derived glioma cells or glioma cell lines manipulated to express either the parental TOP2A WT enzyme or the identified mutated form.

    Ad. Due to the rarity of TOP2A mutations, it is impossible to find a patient-derived cell line with such a defect. Our findings partly relied on frozen historical samples, so it is not possible to develop patient-derived cell lines. As mentioned above, we can create a TOP2A knock-out cell line and overexpress a wild-type or mutated version, but there is no certainty that TOP2A-deficient cells would survive (this is an essential enzyme) or that such manipulation would be feasible.

    • Are the data and the methods presented in such a way that they can be reproduced? Yes, the authors provide a quite detailed explanation of the methods implemented to reach each one of the results they are presenting.

    • Are the experiments adequately replicated and statistical analysis adequate? No, there is no information about the statistical analysis or number of replicates in any of the in vitro experiments performed. This information should be added to the manuscript.

    Ad. In the revised version the requested information was added where possible, and additional repetitions of the biochemical experiments are currently in progress.

    **Minor comments:**

    • Specific experimental issues that are easily addressable.

    • Are prior studies referenced appropriately? Yes, authors clearly address the state of the art regarding previous NGS methodologies and let us know the advantages and novelty of their approach.

    • Are the text and figures clear and accurate? There are some discrepancies in the strength of the language used in different sections of the paper to describe the conclusions that can be inferred from the results shown. While they are all valid, the authors should revise this.

    Ad. The text of the manuscript has been unified and revised.

    • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? First of all, describe the statistical analysis used in every figure, and include the number of biological and technical replicates. I would also suggest changing the title or the scope of the discussion: there is too much focus on TOP2A in the introduction, neglecting all the technical NGS work that actually led to several new variants being described. This may be confusing when it collides with a conclusion whose first half is heavily focused on the potential implications of at least another 3 proteins in which genetic alterations were described. Given that there is not much experimental work showing the relevance of TOP2A mutations in HGG, nor strong enough evidence of the variant's function, I would suggest changing the scope of the title somewhat.

    Ad. The description of the results and the discussion have been revised to include additional data and discussion on technicalities and other findings not related to TOP2A. We performed additional computational analyses of *TOP2A* expression/mutations in the TCGA datasets. We believe that the planned experiments on genetically modified cell lines would provide additional support for our claims. We think that the revised version strikes a good balance between landscape/NGS content and TOP2A content.

    Reviewer #1 (Significance (Required)):

    • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The authors describe a methodology that proved to be sensitive and specific enough to allow them to detect rare genetic alterations in patient glioma samples. This information could be valuable for describing new driver mutations or inferring genetic pathway alterations that could be potential therapeutic targets. As the authors state at the beginning of the paper, given the poor therapeutic approaches currently existing for HGG, information of this kind could still be highly useful and provide a better outcome to a specific cohort of patients.

    On a personal note, I think there is too much speculation about how TOP2A mutations could be interesting from a biological point of view (the authors refer to evidence about the implications of this mutation in other forms of cancer), but since no experimental validation is provided in glioma cells, it is difficult to conclude that this gain-of-function mutation of the enzyme could have a relevant role in HGG and thus make these variants a potential therapeutic target. No experiments were conducted in glioma cells that express TOP2A variants; it would be interesting to see whether the mutation affects the migratory/invasive phenotype, as described in other cancer types or as suggested by the analysis of the genetic pathways activated in the HGG patient samples harboring the TOP2A mutation. In addition, there is no evidence of a possible role of TOP2A mutations as driver mutations, which is an interesting aspect that could be further explored with both computational and experimental approaches.

    Ad. As mentioned above, there are no glioma cells that express TOP2A variants, and we are not convinced that such an experiment will be feasible, taking into account the essential role of TOP2A. We will attempt to perform experiments with CRISPR/Cas9 knock-in cell lines and functional validation, but so far we have not accomplished a knock-in in glioma cells. We will try to knock out the endogenous TOP2A using CRISPR and express the TOP2A WT or E948Q variant from plasmids encoding these proteins, but we cannot predict whether TOP2A KO cells would survive. If we manage to produce such cells, we will investigate the proliferation, migration and invasion of cells expressing the TOP2A WT or mutated variant.

    We agree with the reviewer that our previous conclusions were too strong, and in the revised version we moderated our claims. We do not say that the mutated TOP2A is an oncogenic driver. We suggest that this mutation (and possibly other TOP2A mutations, as we analyzed the impact of other variants on the TOP2A protein structure) contributes to gliomagenesis.

    The data in Fig. 1A suggest that TOP2A has a mutational hotspot at position E948 in our dataset. In the revised version of the manuscript we have extended the RNA-seq analysis of our datasets and TCGA PanCancer datasets to search for TOP2A mutations/overexpression. We found that another computational prediction, using the CADD algorithm, places TOP2A E948Q in the top 1% of the most deleterious variants in the human genome (CADD score >20). This result was added to Supplementary Table 2.
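
The CADD threshold quoted in this reply can be read through the PHRED-like scaling that CADD uses for its scaled scores: a scaled score of 10 corresponds to the top 10% of ranked variants, 20 to the top 1%, and 30 to the top 0.1%. A minimal Python sketch of that conversion follows; the function name is hypothetical and is not part of the authors' pipeline.

```python
def cadd_phred_to_top_fraction(phred_score: float) -> float:
    """Convert a PHRED-scaled CADD score to the fraction of top-ranked variants.

    Assumes the PHRED-like scaling: score = -10 * log10(rank / total).
    """
    return 10 ** (-phred_score / 10)


# A scaled score of 20 marks the boundary of the top 1% of ranked variants.
print(cadd_phred_to_top_fraction(20))
```

Any variant with a scaled score above 20 therefore ranks within the top 1%, which is the sense in which the reply describes E948Q.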

    • Place the work in the context of the existing literature (provide references, where appropriate). The quality of the paper is high and in line with other studies in the literature that perform genome and transcriptome analysis of tumor samples. Only the experimental validation lacks data supporting the "in silico" findings.

    Ad. We would like to point out that we provided the results of experimental, biochemical validation (2 assays) showing that the variant TOP2A proteins have different properties. The association of transcriptional dysregulation with gliomas bearing the TOP2A variant was not an *in silico* prediction but the result of the analysis of real tumor samples.

    As stated above, we are ready to perform further biological validation if the editors find it necessary.

    • State what audience might be interested in and influenced by the reported findings. Computational biologists are the right audience to target this paper. If additional experimental work further validating their initial bioinformatic findings is added to the manuscript then probably a wider population could be targeted.

    Ad. As stated above, we are now working on providing more replicates of the biochemical assays, and we are ready to perform further biological validation if the editors find it necessary. I would like to stress that genome editing by knock-in is not always possible or feasible, and this type of experiment is time-consuming and costly.

    • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Brain tumors, immunotherapy, cancer stem cells, tumor microenvironment, tumor heterogeneity. I do not have sufficient expertise to evaluate the bioinformatic analysis and software/programs used to analyze the NGS data.

    Reviewer #2 (Evidence, reproducibility and clarity (Required)):

    By exon-targeted resequencing of 664 genes frequently mutated in cancer, the authors identify novel mutations associated with glioma in a cohort of 182 Polish and Canadian samples. Most of these novel mutations have been identified as potential rare germline mutations, somatic mosaicism or loss-of-heterozygosity variants. Among them, the authors focus on mutations in the TOP2A gene, which encodes one of the two type II topoisomerase paralogs present in humans. Based on a limited number of in vitro experiments, the authors conclude that the recurrent TOP2A variant E948Q displays increased DNA binding and topoisomerase activity. Therefore, the authors suggest that the TOP2A E948Q variant is a gain-of-function mutation.

    **Major comments:**

    • The authors show an interesting plethora of new exon mutations associated with high grade glioma. Nevertheless, the characterization of the TOP2A E948Q variant, which is the main focus of the study, although very interesting and potentially clinically relevant, remains incomplete. Classifying the TOP2A E948Q glioma variant as a gain-of-function mutation would require improving the statistical power of the presented experiments (increasing the number of replicates). With the existing experimental evidence, the increased DNA binding and activity of the TOP2A E948Q variant should be considered preliminary, especially in the case of the 431-1193 aa fragment. I would consider it mandatory to increase the number of experimental replicates and to analyse statistical significance in the DNA binding experiments and DNA relaxation assays with the TOP2A 431-1193 aa fragment. A more detailed biochemical characterization should be performed. A titration of different amounts of protein should be included in these experiments, and at least two batches of purified proteins should be analysed. Decatenation assays should also be performed to characterize the activity of the mutant protein in more detail. Recapitulating the DNA binding and activity results with other TOP2A variants obtained in this study would also significantly reinforce the authors' claims. This improved biochemical characterization should not take longer than two months.

    Ad. We would like to stress that while two replicates are presented, we tested two forms of the TOP2A protein and the results were similar, confirming our conclusions. But we agree that additional replicates would strengthen our claims. Therefore, we are in the process of producing another batch of recombinant proteins to increase the number of replicates and calculate statistics for the biochemical assays (binding and relaxation assays). We will perform a titration of different amounts of the protein using two batches of purified proteins.

    The occurrence of other TOP2A variants is low (each was identified in only a single patient sample), therefore we will perform experimental validation only for E948Q. However, we performed an additional computational analysis of the other TOP2A variants, showing the influence of each substitution on DNA binding by docking the DNA fragment into the TOP2A binding pocket (Supplementary Table 4).

    • To increase the significance of the results, I would encourage authors to include experiments showing the functional impact of this TOP2A mutation in cells. The connection with transcriptomic alterations is merely correlative, and would be greatly strengthened by functional experiments in cellular models. To draw definitive conclusions regarding the changes in transcription, I would encourage authors to complement the results with experiments that point to the physiological impact of TOP2A variants within the cell. Overexpression of WT and E948Q variants in a cell model and transcriptomic analysis would be desirable, but validation in these experimental models of some of the target genes identified as deregulated in patients could suffice. These experiments could be accomplished in no more than 3-4 months.

    Ad. We agree that the connection of the TOP2A mutation with transcriptomic alterations is correlative, and would be greatly strengthened by functional experiments in cellular models. If we develop a TOP2A E948Q knock-in cell line or TOP2A KO cell line with E948Q over-expression, we are planning to evaluate transcriptomic changes on selected genes by qPCR or whole transcriptome by RNAseq. We estimate that developing a stable CRISPR/Cas9 cell line may take up to 6 months.

    We provided additional results showing that the connection of the TOP2A mutation with transcriptomic alterations may be due to different expression of splicing factors (Supplementary Fig. 6).

    • Some of the methods are not presented in sufficient detail. Regarding the DNA and RNA sequencing experiments, I consider it necessary to specify the DNA fragmentation method, the reference for the indexed adapters, and the ligation and amplification procedures (ligase reference, number of PCR cycles, etc.). It would be helpful to clarify or reference the "special oligonucleotide probes" that are mentioned. Finally, a reference for the "special beads" and the number of cycles in the final amplification are needed. The sequences of the primers used for TOP2A cloning and mutagenesis should be included. The reference for the "site mutagenesis kit" used is missing. When studying the survival rate of glioma patients depending on TOP2A expression levels, it should be clarified what is considered HIGH or LOW expression (i.e., which percentiles are used).

    Ad. We expanded the description of methodological aspects of DNA and RNA sequencing experiments. This description was revised and more details are provided in the revised version. Regarding cloning and mutagenesis, we added a table with primer sequences (Supplementary Table 5). We did not use any kit for cloning and mutagenesis. Standard methods and primers with modified nucleotides were used.

    We have included information about the partitioned groups in the survival analyses in the caption of Figure 2: “D - Kaplan-Meier overall survival curve for patients with high (> TOP2A mRNA median expression x 1.25) or low (

    • There is a major concern about how the experiments are replicated and about the statistical analysis, which is nonexistent in some cases. Indeed, Figures 4 and 5 do not present any statistical analysis, and it is therefore hard to draw any conclusion. In Figure 4b, the results for the 890-996 aa fragment look qualitatively clear, but this is not the case for the 431-1193 aa fragment. More replicates and statistical analysis are mandatory, together with a protein titration. The replicates should be performed with at least two independent batches of protein purifications. The individual values of each experiment should be included in the graph to provide a better understanding of experimental variability. All this also applies to Figure 5.

    Ad. We will increase the number of replicates for the binding and relaxation assays. We will perform a titration of different amounts of protein in these experiments using two batches of purified proteins.

    **Minor comments:**

    • The effect on transcription of co-occurrence of TOP2A mutations with other mutations could also be analysed with the already available data. Also, a more detailed analysis of genome-wide transcription could also be used to at least partially address the proposed hypotheses of increased transcriptional rate or splicing aberrations.

    Ad. We don’t have enough samples with the TOP2A mutation to analyze the effect on transcription of co-occurrence of TOP2A with other mutations.

    We addressed the hypothesis of an increased transcriptional rate or splicing aberrations by performing additional analyses of the RNA-seq data to confirm splicing aberrations. Indeed, we found that splicing machinery genes were down-regulated in the E948Q TOP2A glioma samples (Supplementary Fig. 6).

    • There is no reference for the following argument: "As the identified germline variants were exceptionally rare in the general population ... it is likely that these variants are pathogenic". I also find few references supporting the suggested high frequency of altered genes in gliomas compared to other cancer types. I miss specific works relating TOP2 activity to transcriptional regulation.

    Ad. The appropriate references are provided to back-up these statements.

    • At several points in the text there are quantitative and comparative statements that should be backed up by the actual numbers (e.g. "The results of the targeted sequencing indicate a high frequency of altered genes", "The most altered gene was TP53, followed by IDH1...", "Other genes that were found to be frequently altered included KDM6B...", "These partial results combined with a low frequency of this variant in the Polish population suggest a somatic mutation"). The same thing applies to the co-occurrence of mutations, in which the percentage of co-occurrence and significance is not indicated. This lack of detail in the description is also observed in the description of the transcriptomic alterations in which no detail is provided regarding how many of the 105 analyzed samples correspond to low or high gliomas.

    Ad. We apologize that the frequencies of mutated genes were not specified. This information is included in the main text of the revised version. We now provide a gnomAD frequency for all variants of interest, confirming the low frequency in the population (AF

    Regarding the total number of samples in the transcriptomic analysis, we provided an updated supplementary table covering also samples that were used for transcriptomic analyses (Supplementary Table 1).


    • For the TOP2A mutation analysis, it is sometimes not clear when the analysis is done with the 9 mutated samples and when with the 4 recurrent TOP2A E948Q variants. For example, the analyses in figures 2b and 2c are done with 9 samples, while figure 2e is based on the 4 E948Q variants. At least this is what I have deduced from the main text; it should be clarified in the figure legend.

    Ad. This information has been included in the captions of Figure 2B, 2C and 2E and now we specify how many samples were used in each analysis.

    • Fig1. In figure 1b it would be interesting to color-code patients by glioma grade. This would also apply to Figure S1a, S1c, 2a, S3 and S4. In figure 1D it would be very informative to distinguish mutations that passed the quality control or not with different colors.

    Ad. Following the reviewer’s suggestions, we have added this information, and the oncoplot figures derived from the germline analysis now have a distinct color for each glioma grade. In figure 1D, all of the presented mutations passed quality control in terms of sequencing quality. One additional criterion that was used for all genomic results (except some of the TOP2A variants) was a criterion of 20% variant penetration (20% of reads at the position had to come from the alternative allele). We corrected the description in the Supplementary Table to “passed 20% penetration criterion”. The rationale for the exception for *TOP2A* variants was that in one of the E948Q samples this value was ~13%, and we did not want to lose this sample from the analysis, given the rarity of the mutation.
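
The 20% criterion described in this reply amounts to a threshold on the fraction of reads supporting the alternative allele. The Python sketch below illustrates that check under stated assumptions: the class, function names, and read counts are hypothetical and chosen only to show one call that passes and one at about 13% that does not; this is not the authors' variant-calling code.

```python
from dataclasses import dataclass

MIN_ALT_FRACTION = 0.20  # the 20% criterion described in the reply


@dataclass
class VariantCall:
    gene: str
    alt_reads: int    # reads supporting the alternative allele
    total_reads: int  # all reads covering the position


def alt_fraction(call: VariantCall) -> float:
    """Fraction of reads at the position that carry the alternative allele."""
    if call.total_reads == 0:
        return 0.0
    return call.alt_reads / call.total_reads


def passes_criterion(call: VariantCall, threshold: float = MIN_ALT_FRACTION) -> bool:
    """True when at least `threshold` of the reads come from the alternative allele."""
    return alt_fraction(call) >= threshold


# Hypothetical read counts, chosen only to illustrate the two cases.
calls = [
    VariantCall("TP53", alt_reads=45, total_reads=100),
    VariantCall("TOP2A", alt_reads=13, total_reads=100),
]
for call in calls:
    print(call.gene, round(alt_fraction(call), 2), passes_criterion(call))
```

A call at roughly 13% fails the default threshold, which is why the reply reports handling some TOP2A variants as an explicit exception rather than lowering the threshold for all genes.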


    • Fig2. In figures 2b and 2c the statistical significance of the differences between TOP2A and the rest of the genotypes should be included. Looking at Figures 2d and 2e, it is surprising how similar the overall survival of the HIGH TOP2A mRNA expression group (500 days, fig 2d) is to the overall survival of the TOP2A WT samples (400 days, fig 2e). Here I would include a graph that summarizes the TOP2A mRNA expression levels of each group in fig 2d and 2e.

    Ad. We agree that the median overall survival of patients with high TOP2A mRNA expression is similar to that of *TOP2A* WT patients in our cohort. It is worth noting, however, that the two datasets were produced using different library protocols and different methodology, so the levels cannot be expected to be equal. We think that adding two more graphs, as suggested, adds another layer of information to this section of the analysis. We have included two boxplots depicting TOP2A mRNA RPKMs, and it is now clear that the medians of the high TOP2A mRNA and TOP2A mutant (E948Q) groups are more closely related, despite the fact that we only have a few patients with the mutation.
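
For readers unfamiliar with the unit used in these boxplots, RPKM (reads per kilobase of transcript per million mapped reads) normalizes a gene's read count by both gene length and sequencing depth. A minimal Python sketch of the standard definition follows; the function name and the input numbers are illustrative assumptions, not values from the study.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling factors.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Illustrative numbers only.
print(rpkm(read_count=500, gene_length_bp=2_000, total_mapped_reads=10_000_000))
```

Because the value depends on the total mapped reads of each library, datasets produced with different library protocols are not expected to give directly comparable levels, which is the caveat raised in the reply.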

    • Fig3. It would be interesting to include the same simulation for the rest of TOP2A mutations as supplementary figure.

    Ad. We agree that the other *TOP2A* SNPs could potentially affect DNA binding. We focused on the recurrent mutation and did not analyze those occurring in a single patient. In the revised version we included predictions of whether these variants could affect TOP2A DNA binding. For WT TOP2A and the variants, we calculated the Gibbs free energy (ΔG). This information can be found in Supplementary Table 4. We have extended the description in the Results section “The TOP2A E948Q substitution may affect protein-DNA interactions”.

    • Fig4 and Fig5. Include statistical analysis and dots representing individual replicates.

    Ad. For Fig 4 we have two replicates for each of the two protein fragments, so we cannot present statistics now. As mentioned above, we are preparing a new batch of proteins and will perform more repetitions of the EMSA and relaxation assays. For Fig 5 we have 3 replicates, but despite a trend there is no statistical significance. We intend to perform more replicates with a separate protein preparation. After including the additional repetitions, we will present the results as dots representing individual replicates.

    • Fig6. In Figure 6d I would increase the size differences in the dots representing the gene counts, as it is not easily perceived with current parameters.

    Ad. The dot size in Fig 6d did not reflect the true meaning. To make it easier to understand, we changed a plot type to a barplot, which now represents the number of differentially expressed genes involved in each pathway.

    • FigS2. In figure S2B, it would be informative to establish which dots are significantly above or below the diagonal.

    Ad. The purpose of this figure was to show which oncogenic signaling pathways from the TCGA cohorts were affected in our cohort. The pathway's size (shown on the abscissa in S2B) is a variable used to normalize the calculation. The RTK-RAS and NOTCH pathways contain hundreds of genes, whereas other pathways, such as the NRF2 oncogenic pathway, contain only a few. We then counted how many genes in each pathway were mutated in our cohort (shown on the ordinate in S2B). We used logarithms on both axes for visualization purposes, but this has no effect on the enrichment of these pathways, which is shown in the color-coded legend.
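
The normalization described in this reply can be sketched as follows. This is only an illustration under stated assumptions: the pathway sizes and mutated-gene counts are made-up numbers, the function name is hypothetical, and the fraction of mutated genes is one plausible reading of normalizing by pathway size, not necessarily the enrichment measure the authors computed.

```python
import math


def pathway_summary(pathway_sizes, mutated_counts):
    """Return per-pathway log10 plot coordinates and the fraction of mutated genes."""
    summary = {}
    for name, size in pathway_sizes.items():
        mutated = mutated_counts.get(name, 0)
        summary[name] = {
            "log10_size": math.log10(size),
            # log10 is undefined for zero, so pathways without mutated genes get None
            "log10_mutated": math.log10(mutated) if mutated > 0 else None,
            "fraction_mutated": mutated / size,
        }
    return summary


# Made-up values for illustration only.
sizes = {"RTK-RAS": 100, "NRF2": 4}
mutated = {"RTK-RAS": 20, "NRF2": 2}
for name, row in pathway_summary(sizes, mutated).items():
    print(name, row)
```

Taking logarithms changes only where each pathway is drawn on the plot; the size-normalized fraction is computed from the raw counts, consistent with the reply's point that the log axes do not affect the enrichment.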

    • FigS3. How were the samples shown selected from the total?

    Ad. In this plot we show only somatic variants that were found in at least two different patients. We apologize that this information was missing, and we have added it to the figure's caption.
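The selection rule stated here (keep only variants seen in at least two different patients) can be sketched as a small filter. The function name `recurrent_variants`, the patient identifiers, and the variant labels are hypothetical placeholders and are not taken from the study data.

```python
from collections import defaultdict


def recurrent_variants(calls, min_patients=2):
    """Return variants observed in at least `min_patients` distinct patients.

    `calls` is an iterable of (patient_id, variant) pairs; repeated calls of
    the same variant in one patient are counted once.
    """
    patients_by_variant = defaultdict(set)
    for patient_id, variant in calls:
        patients_by_variant[variant].add(patient_id)
    return {
        variant: sorted(patients)
        for variant, patients in patients_by_variant.items()
        if len(patients) >= min_patients
    }


# Toy input: variant_1 is shared by two patients, variant_2 is called twice
# in the same patient and therefore does not count as recurrent.
toy_calls = [
    ("P1", "variant_1"),
    ("P2", "variant_1"),
    ("P1", "variant_2"),
    ("P1", "variant_2"),
]
print(recurrent_variants(toy_calls))
```

Counting distinct patients rather than raw calls keeps a variant that is detected repeatedly in one sample from being mistaken for a recurrent one.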

    • FigS4. I would include a line with the TOP2A mutation to have an idea of how these mutations are distributed between groups.

    Ad. Based on the feedback of the reviewer, this figure has been modified and improved. A new row has been added to the figure, displaying TOP2A mutations alongside other highly frequent mutations in other genes.

    Reviewer #2 (Significance (Required)):

    In this work the authors have identified new mutations associated with gliomas by targeted exome sequencing using an important cohort of 182 samples. Among these new mutations, epigenetic enzymes and modifiers are found. These results potentially increase the repertoire of putative molecular targets for future cancer therapies. The authors focus on mutations in the TOP2A gene that provide stronger DNA binding and DNA relaxation capacity in vitro. Although further characterization is needed, tumours harbouring this kind of mutation could show a higher level of sensitivity to TOP2 drugs, providing potentially interesting clinical implications. Although the link between TOP2A expression and cancer prognosis is well established, the relevance of specific mutations is still largely unexplored.

    On the one hand, this work brings novelty to the glioma field, providing a series of putative new players in the development of this type of cancer. An audience interested in basic or clinical aspects of these tumours would be a good target for this work. On the other hand, this putative gain-of-function mutation of TOP2A represents an interesting aspect for the DNA topology and topoisomerases field. However, as stated above, a more detailed biochemical and functional characterization would be required to draw the attention of this audience.

    Scientifically, I have experience in the DNA topology and topoisomerases field, 3D genome organization and gene regulation. I have no experience in Gliomas or any other clinical aspect of cancer, so it is difficult for me to properly establish the potential impact of the newly discovered mutations. Technically I have no capacity to critically evaluate the aspects related to the targeted exome sequencing and the suitability of the analysis performed for mutation identification.

    **Referee Cross-commenting**

    I fully agree with the comments of the other reviewer, which are perfectly aligned with my own regarding the preliminary nature of the conclusions about the biochemical and functional characterization of the TOP2A mutations.

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #2

    Evidence, reproducibility and clarity

    By exon targeted resequencing of 664 genes frequently mutated in cancer, authors identify novel mutations associated to Glioma in a cohort of 182 Polish and Canadian samples. Most of these novel mutations have been identified as potential rare germline mutations, somatic mosaicism or loss-of-heterozygosity variants. Among them, authors focus on mutations associated to the TOP2A gene, which encodes one of the two Type II topoisomerases paralogs present in humans. By a limited number of in vitro experiments, authors conclude that TOP2A recurrent variant E948Q, displays increased binding to DNA and topoisomerase activity. Therefore, authors suggest that the TOP2A E948Q variant is a gain-of-function mutation.

    Major comments:

    • Authors show an interesting plethora of new exon mutations associated with High Grade Glioma. Nevertheless, the characterization of TOP2A E948Q variant, which is the main focus of the study, although very interesting and potentially clinically relevant, remains incomplete. Association of the TOP2A E948Q glioma variant with a gain-of-function mutation would require to improve the statistical power of the presented experiments (increase number of replicates). With the existing experimental evidence, the increased DNA binding and activity of the TOP2A E948Q variant should be considered as preliminary, especially in the case of 431-1193 aa fragment. I would consider mandatory to increase experimental replicates and to analyse statistical significance in the case of DNA binding experiments and DNA relaxation assays with the TOP2A 431-1193 aa fragment. A more detailed biochemical characterization should be performed. A titration of different amounts of protein should be included in these experiments, and at least two batches of purified proteins should be analysed. Decatenation assays should also be performed to characterize the activity of the mutant protein in more detail. Recapitulation of DNA binding and activity results with other TOP2A variants obtained in this study will significantly reinforce authors claims too. This improved biochemical characterization should not take longer than two months.
    • To increase the significance of the results, I would encourage authors to include experiments showing the functional impact of this TOP2A mutation in cells. The connection with transcriptomic alterations is merely correlative, and would be greatly strengthened by functional experiments in cellular models. To draw definitive conclusions regarding the changes in transcription, I would encourage authors to complement the results with experiments that point to the physiological impact of TOP2A variants within the cell. Overexpression of WT and E948Q variants in a cell model and transcriptomic analysis would be desirable, but validation in these experimental models of some of the target genes identified as deregulated in patients could suffice. These experiments could be accomplished in no more than 3-4 months.
    • Some of the methods are not presented with sufficient detail. Regarding the DNA and RNA sequencing experiments, I consider it necessary to specify the DNA fragmentation method, the reference for the indexed adapters, and the ligation and amplification procedures (ligase reference, number of PCR cycles, etc). It would be helpful to clarify or reference which are the "special oligonucleotide probes" that are mentioned. Finally, a reference for the "special beads" and the number of cycles in the final amplification are needed. The sequence of primers used for TOP2A cloning and mutagenesis should be included. The reference for the "site mutagenesis kit" used is missing. When studying the survival rate of glioma patients depending on TOP2A expression levels, it should be clarified what is considered HIGH or LOW expression (i.e. which percentiles are used).
    • There is a major concern about how the experiments are replicated and about the statistical analysis, which is nonexistent in some cases. Indeed, Figures 4 and 5 do not present any statistical analysis; it is therefore hard to draw any conclusion. In Figure 4b, the results for the 890-996 aa fragment look qualitatively clear, but this is not the case for the 431-1193 aa fragment. More replicates and statistical analysis are mandatory, together with a protein titration. The replicates should be performed with at least two independent batches of protein purifications. The individual values of each experiment should be included in the graph to provide a better understanding of experimental variability. All this also applies to Figure 5.

    Minor comments:

    • The effect on transcription of co-occurrence of TOP2A mutations with other mutations could also be analysed with the already available data. Also, a more detailed analysis of genome-wide transcription could also be used to at least partially address the proposed hypotheses of increased transcriptional rate or splicing aberrations.
    • There is no reference for the following argument: "As the identified germline variants were exceptionally rare in the general population ... it is likely that these variants are pathogenic". I also find a low number of references to support the suggested high frequency of altered genes in gliomas compared to other cancer types. I miss specific works relating TOP2 activity with transcriptional regulation.
    • At several points in the text there are quantitative and comparative statements that should be backed up by the actual numbers (e.g. "The results of the targeted sequencing indicate a high frequency of altered genes", "The most altered gene was TP53, followed by IDH1...", "Other genes that were found to be frequently altered included KDM6B...", "These partial results combined with a low frequency of this variant in the Polish population suggest a somatic mutation"). The same thing applies to the co-occurrence of mutations, in which the percentage of co-occurrence and significance is not indicated. This lack of detail in the description is also observed in the description of the transcriptomic alterations in which no detail is provided regarding how many of the 105 analyzed samples correspond to low or high gliomas.
    • For the TOP2A mutation analysis, it is sometimes not clear when the analysis is done with the 9 mutated samples and when with the 4 recurrent TOP2A E948Q variants. For example, in figures 2b and 2c the analyses are done with 9 samples, while figure 2e is based on the 4 E948Q variants. At least this is what I have deduced from the main text; it should be clarified in the figure legend.
    • Fig1. In figure 1b it would be interesting to color-code patients by glioma grade. This would also apply to Figure S1a, S1c, 2a, S3 and S4. In figure 1D it would be very informative to distinguish mutations that passed the quality control or not with different colors.
    • Fig2. In figures 2b and 2c the statistical significance of differences between TOP2A and the rest of the genotypes should be included. Looking at Figures 2d and 2e, it is surprising how similar the overall survival of the HIGH TOP2A mRNA expression group (500 days, fig 2d) is to the overall survival of the TOP2A WT samples (400 days, fig 2e). Here I would include a graph that summarizes the TOP2A mRNA expression levels of each group in fig 2d and 2e.
    • Fig3. It would be interesting to include the same simulation for the rest of TOP2A mutations as supplementary figure.
    • Fig4 and Fig5. Include statistical analysis and dots representing individual replicates.
    • Fig6. In Figure 6d I would increase the size differences in the dots representing the gene counts, as it is not easily perceived with current parameters.
    • FigS2. In figure S2B, it would be informative to establish which dots are significantly above or below the diagonal.
    • FigS3. How were the samples shown selected from the total?
    • FigS4. I would include a line with the TOP2A mutation to have an idea of how these mutations are distributed between groups.

    Significance

    In this work the authors have identified new mutations associated with gliomas by targeted exome sequencing using an important cohort of 182 samples. Among these new mutations, epigenetic enzymes and modifiers are found. These results potentially increase the repertoire of putative molecular targets for future cancer therapies. The authors focus on mutations in the TOP2A gene that provide stronger DNA binding and DNA relaxation capacity in vitro. Although further characterization is needed, tumours harbouring this kind of mutation could show a higher level of sensitivity to TOP2 drugs, providing potentially interesting clinical implications. Although the link between TOP2A expression and cancer prognosis is well established, the relevance of specific mutations is still largely unexplored.

    On the one hand, this work brings novelty to the glioma field, providing a series of putative new players in the development of this type of cancer. An audience interested in basic or clinical aspects of these tumours would be a good target for this work. On the other hand, this putative gain-of-function mutation of TOP2A represents an interesting aspect for the DNA topology and topoisomerases field. However, as stated above, a more detailed biochemical and functional characterization would be required to draw the attention of this audience.

    Scientifically, I have experience in the DNA topology and topoisomerases field, 3D genome organization and gene regulation. I have no experience in Gliomas or any other clinical aspect of cancer, so it is difficult for me to properly establish the potential impact of the newly discovered mutations. Technically I have no capacity to critically evaluate the aspects related to the targeted exome sequencing and the suitability of the analysis performed for mutation identification.

    Referee Cross-commenting

    I fully agree with the comments of the other reviewer, which are perfectly aligned with my own regarding the preliminary nature of the conclusions about the biochemical and functional characterization of the TOP2A mutations.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #1

    Evidence, reproducibility and clarity

    Summary:

    Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). Please place your comments about significance in section 2. In this paper the authors used a targeted approach to identify rare mutations in a cohort of glioma patients. Using this approach they identified a recurrent mutation in the TOP2A gene encoding Topoisomerase 2A, and suggest that this mutation creates a more effective protein that binds DNA strongly and may be more enzymatically active. RNAseq analysis of TOP2Awt and TOP2Amut tumor samples suggests different transcription patterns and points to possible splicing defects. The most recurrent variant (E948Q) is described in depth, and some experimental information shows this variant might be a gain-of-function mutation.

    Major comments:

    • Are the key conclusions convincing? The validation of both the methodology and the presence of never described TOP2A variations in HGG is done quite successfully. Interesting evidence about the relevance of the most frequent mutation is provided. However, although computational and biochemical assays were performed, the lack of details about the statistics of the in vitro experiments (no p-values, sample sizes or numbers of repetitions are provided in figures 4 and 5) weakens the conclusions claimed by the authors about the properties of the mutated topoisomerase.

    • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Claims about E948Q variant function should be revised. Data is not presented in a convincing way, and the language shifts from ambiguous in the results ("We conclude that the E948Q TOP2A protein is functional, and MIGHT have a higher activity than the WT protein") to strongly supportive of the claims about TOP2A activity in the rest of the paper.

    • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. In line with the data presented in the paper, additional experiments that show catalytic changes of the E948Q variant must be added. It is shown that there are differences in the DNA binding capacity by EMSA compared to the WT form; however, the DNA supercoil relaxation activity is not that different, at least the way the results are presented. The authors suggest that the TOP2A mutation is a driver mutation, but no in vitro validation of this claim is shown. Can this mutation alone or in combination with e.g. tumor suppressors transform normal cells to cancer cells? Do cell lines expressing this mutation (compared to parental TOP2A wt expressing cells) display increased transcription? Increased invasion?

    • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. If the authors can complement the already presented in vitro experiments with additional ones supporting their hypothesis, this should be feasible. The authors can use patient derived glioma cells or glioma cell lines manipulated to express either the parental TOP2A wt enzyme or the identified mutated form.

    • Are the data and the methods presented in such a way that they can be reproduced? Yes, the authors provide a quite detailed explanation of the methods implemented to reach each one of the results they are presenting.

    • Are the experiments adequately replicated and statistical analysis adequate? No, there is no information about the statistical analysis or number of replicates in any of the in vitro experiments performed. This information should be added to the manuscript.

    Minor comments:

    • Specific experimental issues that are easily addressable.

    • Are prior studies referenced appropriately? Yes, authors clearly address the state of the art regarding previous NGS methodologies and let us know the advantages and novelty of their approach.

    • Are the text and figures clear and accurate? There are some discrepancies between the strength of the language used in different sections of the paper to refer the conclusions they can infer from the results they are showing. While they are all valid, authors should revise it.

    • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? First of all, describe the statistical analysis used in every figure and include the number of biological and technical replicates. I would also suggest changing the title or the scope of the discussion: there is too much focus on TOP2A in the introduction, neglecting all the technical NGS work that actually led to several new variants being described. This may be confusing when it collides with a conclusion whose first half is heavily focused on describing the potential implications of at least another 3 proteins where genetic alterations were described. Given that there is not much experimental work showing the relevance of TOP2A mutations in HGG, nor strong enough evidence of the variant's function, I would suggest changing the scope of the title a bit.

    Significance

    • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The authors describe a methodology that proved to be sensitive and specific enough in order to allow them to detect rare genetic alterations in patient glioma samples. This information could be valuable to describe new driver mutations or infer in genetic pathway alterations that could be potential therapeutic targets. As the authors state at the beginning of the paper, given the poor therapeutical approaches existing for HGG currently, information of this kind could still be highly useful and provide a better outcome to a specific cohort of patients.

    On a personal note, I think there is too much speculation about how TOP2A mutations could be interesting from a biological point of view (authors referred to evidence about implications of this mutation in other forms of cancer) but since no experimental validation is provided in glioma cells, it is difficult to conclude that this enzyme gain-of-function mutation could have a relevant role in HGG and thus make these variants a potential therapeutic target. There are no experiments conducted in glioma cells that express TOP2A variants, it would be interesting to see if it has an effect in the migratory/invasive phenotype like described in other cancer types or like it is suggested by analysis of the genetic pathways activated in the HGG patients samples harboring TOP2A mutation. In addition, there is no evidence of the TOP2A mutations possible role as a driver mutation, which is an interesting aspect that could be further explored from both a computational and an experimental approach.

    • Place the work in the context of the existing literature (provide references, where appropriate). The quality of the paper is high and in line with other studies in the literature that perform genome and transcriptome analysis of tumor samples. It is only the experimental validation that is lacking data supporting the "in silico" findings.

    • State what audience might be interested in and influenced by the reported findings. Computational biologists are the right audience to target this paper. If additional experimental work further validating their initial bioinformatic findings is added to the manuscript then probably a wider population could be targeted.

    • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Brain tumors, immunotherapy, cancer stem cells, tumor microenvironment, tumor heterogeneity. I do not have sufficient expertise to evaluate the bioinformatic analysis and software/programs used to analyze the NGS data.