Highly frequent undesired insertional mutagenesis during Drosophila genome editing

Abstract

CRISPR/Cas9 based genome editing employing Homology Directed Repair (HDR) from template vector sequences is a widely used technique to enable precise insertions, deletions or modifications to genes. Here, we describe an undesired and highly frequent editing event when using conventional CRISPR/Cas9 plus HDR methods for Drosophila melanogaster germline genome editing. We find that the template vector employed for HDR repair unwantedly and commonly inserts into the genome. We observe this deviation from the desired edit at multiple genomic locations, with different HDR vectors and with multiple genome editing designs. To avoid these events, we have generated a novel HDR template vector that enables animals with these undesired insertions to be identified and excluded. Our results suggest that HDR based genome edited animals must be carefully screened for unwanted vector template genomic integration in order to avoid misleading interpretations of genome editing outcomes.

Article activity feed

  1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

    Learn more at Review Commons


    Reply to the reviewers

    1. General Statement

    We thank all three reviewers for their careful and constructive evaluation of our manuscript. We are pleased that the reviewers recognised the importance of the work we describe and found the experimental approach sound.

    This manuscript reports that undesired insertion of the plasmid backbone, including vector sequences not intended to be part of the genome edit, occurs at high frequency during CRISPR/Cas9-mediated HDR in Drosophila. We document this phenomenon across multiple independent genome editing projects, using three different plasmid backbones and targeting distinct genomic loci, demonstrating that it is not an isolated or project-specific artefact. We further introduce pVID, a new donor vector incorporating a ZsGreen negative selection marker that allows straightforward identification and exclusion of lines carrying undesired insertions, providing a practical solution to avoid this genome editing issue.

    In response to the reviewers' comments, we have revised the manuscript to: (i) correct and contextualise prior descriptions of this problem, incorporating the references suggested by Reviewer 2; (ii) add a table summarising gRNA characteristics for all editing projects; (iii) expand the discussion of the underlying DNA repair mechanisms, the potential influence of Cas9 source choice, and the relevance of the findings beyond Drosophila; (iv) confirm the stability of problematic template vector insertions across multiple generations; and (v) improve figure clarity, correct typographical errors, and clarify several passages flagged by the reviewers. All responses are described in detail below.

    2. Point-by-Point Description of the Revisions

    Reviewer 1

    Major Comment 1 — DNA repair pathways underlying backbone capture • I think the authors should discuss potential DNA repair pathways (e.g., NHEJ, MMEJ) underlying plasmid backbone capture in more detail. Did you check for knockouts within your screened transformants? That could provide insight into the underlying mechanisms.

    Response: We screened the humanized TDP-43 line for tbph knockouts, since our aim was to fully knock out the Drosophila gene and insert the human ortholog. However, we did not screen any of the other lines described in the manuscript for indels caused by NHEJ, since the dsRed selection we employed would not enable us to recover lines without insertion events. We hypothesise that when one of the two gRNAs is less efficient than the other, a single homologous recombination event occurs and the vector template is inserted. However, the underlying mechanism is still unclear, and could involve NHEJ, HDR or a combination of these mechanisms, as has previously been observed (44). We have expanded on potential mechanisms inducing HDR template vector insertion events in the discussion of the revised manuscript.

    Major Comment 2 — gRNA characteristics and design parameters • It would be important to describe gRNA characteristics and general design parameters (GC content, distance from cut to intended edit, homology arm length) and analyze whether these correlate with correct HDR vs. plasmid insertion. A table summarizing these details could help reveal potential trends.

    Response: At the reviewer's suggestion, we have added a table (Table 1) describing all the characteristics of the gRNAs to the Materials and Methods section. Unfortunately, no commonality was immediately apparent to us.

    Major Comment 3 — Single versus dual gRNA strategies • Did the authors consider exploring whether using a single gRNA reduces backbone insertion frequency compared to dual-gRNA strategies? I understand that two gRNAs are needed for your strategy, but it would be interesting to know whether these outcomes are linked to the dual-gRNA design.

    Response: As stated in the discussion, we theorize that one of the two gRNAs used in our strategies cuts more efficiently and thereby causes a single homologous recombination event and insertion of the vector template. It is possible that a strategy using only one gRNA would cause less insertion of the vector template; however, this may come at the cost of gene editing efficiency. Indeed, when Ge et al. (17) compared using one versus two gRNAs to induce HDR, they observed more reliable repair events when two gRNAs were used.

    Major Comment 4 — Stability of backbone insertions across generations • Did you evaluate whether backbone insertions are stable across generations or prone to rearrangement?

    Response: We maintained several of the lines reported in this paper stably across multiple generations, and we have added this observation to the manuscript.

    Major Comment 5 — Broader applicability in non-model organisms and therapeutic settings • A broader discussion of the potential applications of this approach in non-model insects, mammalian cells, or therapeutic settings where HDR is inefficient would be valuable.

    Response: While we only investigated this effect in the creation of CRISPR/Cas9 Drosophila melanogaster models, it is very possible that this could also affect other model organisms or cells. We encourage the use of HDR template negative selection markers in all uses of HDR-mediated CRISPR/Cas9 genome editing.

    Major Comment 6 — Cas9 promoter and expression level • The authors also mentioned using a validated Cas9 line (ref #23). What promoter drives Cas9 expression in this line? Did you consider testing different promoters? Since timing of Cas9 expression can be critical, promoter choice may have influenced the results and should be discussed.

    Response: We used the nos promoter to drive Cas9 expression, as this promoter is active in germ cells and is known to give better efficiency than other germline promoters such as vasa (Port et al. 2014, Ref #23). However, it is conceivable that the high Cas9 concentration in this line could induce a higher rate of double-strand breaks and thus template vector insertion. We agree it would be interesting to test other Cas9 sources, though this would likely come at the cost of overall editing efficiency. As we describe, the use of pVID now allows negative selection against HDR template vector insertion even with this Cas9 source. We have expanded upon the potential use of other Cas9 sources in the revised discussion.

    Reviewer 2

    Major comments

    None

    Minor Comment 1 — Line 38: prior descriptions of backbone insertion in Drosophila Line 38: "this type of unwanted template vector insertion in the case of Drosophila genome editing has to our knowledge not been previously described." Insertion of vector sequences after CRISPR editing in Drosophila and strategies to mitigate such events have been previously described in multiple studies. The authors need to incorporate these into their manuscript. https://doi.org/10.1242/bio.20147682, https://doi.org/10.1080/19336934.2020.1832416, https://doi.org/10.1534/g3.116.032557.

    Response: We are very grateful to the reviewer for pointing out these prior observations of vector insertion events, of which we were not aware. This prior work has now been fully incorporated and referenced in the revised manuscript, and we have removed the erroneous statement. We feel this manuscript validates and quantifies the extent of HDR template insertion across multiple genome editing strategies and templates and, with pVID, provides a solution to this vexing problem.

    Minor Comment 2 — Line 79: PAM sequence sentence I have difficulties understanding the following sentence: Line 79: "At this location, on both sides of the insertion, the PAM sequence of the target region was edited to match the PAM sequence of the template donor plasmid." I assume what is meant here is that in the donor vector the PAM sequence was mutated to prevent recutting, but that means this sequence is no longer a PAM. Please rephrase for added clarity.

    Response: The PAM sequence was indeed edited in the template donor plasmid to prevent re-cutting, and we are referring to this edited version of the PAM sequence in this sentence. We have revised the sentence to clarify that the PAM sequences have been edited.

    Minor Comment 3 — Figure 2: panel D arrangement In Figure 2 panel D is arranged between panels E and F.

    Response: Thank you for pointing this out. We have corrected this error.

    Minor Comment 4 — Primer positions in figures In Figure 2 it would be useful to also indicate the position of the primers used in 2d in the schematic in 2e. The same applies to Fig. 3a and 4a.

    Response: We have added the position of the primers in Figure 2. Since the primers target the plasmid backbone in all projects included in this manuscript, we have chosen to show their position in only one figure (Figure 2).

    Minor Comment 5 — Lines 89–90: duplicated sentence Lines 89, 90: Duplication of the same sentence.

    Response: Thank you, we have corrected this mistake.

    Minor Comment 6 — VGAT editing: consecutive editing and sgRNA placement Editing of the VGAT gene: In this case correct editing and plasmid insertions could be found on the same chromosomes. This might be caused by concatemer formation of repair intermediates (as has been described in multiple systems) or by consecutive editing events. Can you please specify whether the donor vector was designed to prevent consecutive editing? I'm also a bit confused about the locations of the sgRNA target sites according to Fig. 3a. It appears that part of the insertion (i.e. the ALFA tag) was encoded on the homology arm and not between the target sites. While such strategies have been described, they are often avoided as the efficiency of insertion decreases with increasing distance to the cut site. Was it not possible to use a sgRNA better matching the insertion cassette?

    Response: For Vgat genome editing, we followed an existing strategy that has been proven effective, reusing the same gRNAs and overall approach to replace the 9×V5 tag with a 1×ALFA tag (Certel et al. 2022, Ref #28).

    Minor Comment 7 — Line 133: mini-white marker unreliability Line 133: Please describe why the mini-white marker was unreliable.

    Response: In our first design of the pVID vector, we used mini-white as the negative selection marker. However, in a number of white-eyed lines, we could still confirm the undesired insertion of the HDR template vector. We speculate that expression of mini-white (which we confirmed was not mutated) was repressed in these lines by an unknown mechanism. Since Nyberg et al. (2020, Ref #35) also proposed using mini-white as a negative vector selection marker, we wanted to mention this problem with mini-white negative selection, though we remain unsure of the exact cause. In any case, the use of exogenous ZsGreen in pVID, as described in the manuscript, fully resolved the issue and allowed reliable detection of template vector insertion events.

    Minor Comment 8 — Line 161: "varying frequency" Not sure I understand the sentence in line 161: If 54% of lines had vector insertion, what does the "varying frequency" refer to?

    Response: We have edited this sentence to clarify that 54% of lines had vector insertion.

    Minor Comment 9 — pVID availability in methods Consider highlighting the availability of pVID also in the methods section that described this plasmid.

    Response: This has been added to the methods section.

    Reviewer 3

    No edits suggested.

    We thank Reviewer 3 for their positive assessment of the manuscript and for confirming that no revisions are required.

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #3

    Evidence, reproducibility and clarity

    The manuscript "Highly frequent undesired insertional mutagenesis during Drosophila genome editing" by Kallstig et al. revolves around Homology-Directed Repair (HDR) and the surprisingly high frequency of plasmid backbone insertions into the genome.

    In brief, the authors describe three independent experiments in which the intended homology regions were inserted together with plasmid backbone sequences into the Drosophila genome. Each experiment was designed with a slightly different setup: the first aimed to generate a humanized version of the TAR DNA-binding protein 43 (hTDP-43), while the second introduced an ALFA tag into the Vesicular GABA transporter (VGAT) gene. In the first experiment, the pCR4 vector served as the backbone, whereas the second experiment relied on the pHSG298 vector. Both experiments resulted in relatively high frequencies of incorrectly edited genomes - 18% and even 66%, respectively. The authors hypothesized that the rate of undesired events could be even higher if the targeted gene is non-essential. To test this, the third experiment focused on mutagenesis of the Glutamate Receptor IIA (GluRIIA) gene, which is homozygous viable even in protein-null mutants. Indeed, the frequency of incorrect edits was approximately 11:1 (more than 90%). These findings suggest that plasmid backbone insertion is a common and important issue in HDR-based genome editing in Drosophila.
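
    As a quick check of the arithmetic behind the "11:1 (more than 90%)" figure, the ratio of incorrect to correct edits can be converted to a fraction of all recovered lines. This is only an illustrative sketch: the function name is hypothetical, and the counts 11 and 1 stand in for the approximate ratio quoted above, not for the actual numbers of lines scored in the manuscript.

```python
def fraction_incorrect(incorrect: int, correct: int) -> float:
    """Return the fraction of recovered lines that carry an incorrect edit.

    `incorrect` and `correct` are counts (or the two sides of a ratio).
    """
    total = incorrect + correct
    if total <= 0:
        raise ValueError("at least one line must have been scored")
    return incorrect / total


# Illustrative values only: the approximate 11:1 ratio quoted in the report.
share = fraction_incorrect(11, 1)
print(f"{share:.1%}")  # 91.7%
```

    A ratio of 11:1 therefore corresponds to 11 of every 12 lines, consistent with "more than 90%".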

    To address this problem, the authors designed a new vector. While the classical eye color marker (e.g., dsRED) serves for positive identification of HDR recombination, a second fluorescent marker (ZsGreen), encoded in the plasmid backbone and also expressed in the compound eye, enables clear detection of undesired plasmid backbone insertions.
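
    The two-marker screening logic described above can be sketched as a small decision table. This is a minimal illustration under stated assumptions, not the authors' protocol: the names `Outcome` and `classify_line` are hypothetical, marker scoring is reduced to two booleans, and a line flagged as a candidate would still need molecular confirmation (e.g. PCR and sequencing) before being treated as a correct edit.

```python
from enum import Enum


class Outcome(Enum):
    CANDIDATE_CORRECT_HDR = "DsRed only: candidate correct edit, confirm molecularly"
    BACKBONE_INSERTION = "DsRed and ZsGreen: template vector backbone inserted, exclude"
    NO_INSERTION = "no marker: no insertion event recovered"
    UNEXPECTED = "ZsGreen only: not expected by this scheme, investigate"


def classify_line(dsred: bool, zsgreen: bool) -> Outcome:
    """Classify a fly line from its two eye-marker scores.

    dsred   -- positive marker located between the homology arms
    zsgreen -- negative marker located on the plasmid backbone
    """
    if dsred and zsgreen:
        return Outcome.BACKBONE_INSERTION
    if dsred:
        return Outcome.CANDIDATE_CORRECT_HDR
    if zsgreen:
        return Outcome.UNEXPECTED
    return Outcome.NO_INSERTION


# Hypothetical line names and scores, for illustration only.
scores = {"line_A": (True, False), "line_B": (True, True)}
for name, (dsred, zsgreen) in scores.items():
    print(name, classify_line(dsred, zsgreen).name)
```

    Only lines in the DsRed-only class would be carried forward; lines expressing both markers are the ones the negative selection is meant to exclude.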

    The study is clearly written, and the plasmids are sufficiently well described in the figures. The reproducibility is somewhat limited by the use of different plasmids in combination with different target genes. Nevertheless, the number of analyzed insertions was high enough to convincingly illustrate the issue.

    Significance

    I find this manuscript to be a valuable description of an existing problem, together with a potentially efficient method for detecting undesired plasmid insertions. From an experimental perspective, I consider the comparison of three different vector backbones combined with different target genes to be rather difficult. On the other hand, as an experimental biologist, I completely understand the logic and the history of the problem-solving process. Undesired insertions were identified by different approaches (PCR and sequencing), and the authors clearly kept this issue in mind. When the problem persisted in the second experiment, and was even more pronounced in the third experiment (involving a non-lethal gene), they developed a vector that makes the screening process more efficient. Altogether this is a valuable technical study worthy of reporting.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #2

    Evidence, reproducibility and clarity

    Summary In this manuscript Källstig, Ruchti, McCabe and colleagues report frequent undesired editing outcomes after CRISPR gene knock-ins in Drosophila. Using Cas9 for the targeted induction of DNA double strand breaks and plasmids with long homology arms as donor molecules, they find that the whole plasmid inserts with high frequency at multiple loci. To detect such events they generate a plasmid with a dominant marker encoded on the plasmid backbone, which can be used to enrich for correct insertions by negative selection.

    Major comments

    Minor comments

    Line 38: "this type of unwanted template vector insertion in the case of Drosophila genome editing has to our knowledge not been previously described." Insertion of vector sequences after CRISPR editing in Drosophila and strategies to mitigate such events have been previously described in multiple studies: https://doi.org/10.1242/bio.20147682, https://doi.org/10.1080/19336934.2020.1832416, https://doi.org/10.1534/g3.116.032557. The authors need to incorporate these into their manuscript.

    I have difficulties understanding the following sentence: Line 79: "At this location, on both sides of the insertion, the PAM sequence of the target region was edited to match the PAM sequence of the template donor plasmid." I assume what is meant here is that in the donor vector the PAM sequence was mutated to prevent recutting, but that means this sequence is no longer a PAM. Please rephrase for added clarity.

    In Figure 2 panel D is arranged between panels E and F.

    In Figure 2 it would be useful to also indicate the position of the primers used in 2d in the schematic in 2e. The same applies to Fig. 3a and 4a.

    Lines 89, 90: Duplication of the same sentence.

    Editing of the VGAT gene: In this case correct editing and plasmid insertions could be found on the same chromosomes. This might be caused by concatemer formation of repair intermediates (as has been described in multiple systems) or by consecutive editing events. Can you please specify whether the donor vector was designed to prevent consecutive editing? I'm also a bit confused about the locations of the sgRNA target sites according to Fig. 3a. It appears that part of the insertion (i.e. the ALFA tag) was encoded on the homology arm and not between the target sites. While such strategies have been described, they are often avoided as the efficiency of insertion decreases with increasing distance to the cut site. Was it not possible to use a sgRNA better matching the insertion cassette?

    Line 133: Please describe why the mini-white marker was unreliable.

    Not sure I understand the sentence in line 161: If 54% of lines had vector insertion, what does the "varying frequency" refer to?

    Consider highlighting the availability of pVID also in the methods section that described this plasmid.

    Significance

    This manuscript describes vector backbone insertions as a frequent complication of CRISPR knock-in experiments in Drosophila and introduces a cloning vector with a selectable marker on the plasmid backbone that allows counter selection of such undesired events. The manuscript is very well written and the experiments are overall well designed.

    Insertion of vector sequences during homologous recombination (often referred to as "ends-in" recombination events) has been described on multiple occasions in a wide variety of model systems. In Drosophila, the system used here, such events have also been described by multiple groups (see comments above). Furthermore, plasmids designed to allow counter-selection of such events have also been described previously (e.g. Addgene plasmids 157991, 80801).

    In summary, this manuscript highlights once more an important complication in genome engineering experiments, but does not significantly advance the knowledge in the field beyond the existing literature and the described plasmid is largely redundant with preexisting plasmids designed for the same purpose. While this overall severely limits the significance of this work, it does provide important replication of previous work.

  4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #1

    Evidence, reproducibility and clarity

    CRISPR/Cas9 genome editing has improved the ability to introduce precise genetic modifications in multiple organisms such as Drosophila melanogaster. By coupling Cas9-induced double-strand breaks with homology-directed repair (HDR), researchers can replace, insert, or delete genomic sequences with high specificity.

    In this work, the authors explore significant concerns about the fidelity and outcomes of HDR-based editing. They identify a recurring issue: unintended insertion of the entire donor template vector into the genome. These undesired events are observed across multiple genes, indicating that the problem is not locus- or construct-specific. These insertions can occur at high frequencies, complicating efforts to establish accurate transgenic lines. They not only mask intended edits but may also introduce unpredictable phenotypes unrelated to the desired genetic modification.

    The authors addressed the problem of frequent donor plasmid insertions during CRISPR/Cas9 HDR in Drosophila by redesigning their HDR template vectors. They incorporated a GFP marker into the plasmid backbone alongside a DsRed cassette. This design allowed them to distinguish correct HDR events, which carried only DsRed, from aberrant plasmid integrations, which carried both DsRed and GFP. By screening flies for marker expression, they could rapidly identify and exclude incorrect insertions.

    Please, see below my comments:

    • I think the authors should discuss potential DNA repair pathways (e.g., NHEJ, MMEJ) underlying plasmid backbone capture in more detail. Did you check for knockouts within your screened transformants? That could provide insight into the underlying mechanisms.
    • It would be important to describe gRNA characteristics and general design parameters (GC content, distance from cut to intended edit, homology arm length) and analyze whether these correlate with correct HDR vs. plasmid insertion. A table summarizing these details could help reveal potential trends.
    • Did the authors consider exploring whether using a single gRNA reduces backbone insertion frequency compared to dual-gRNA strategies? I understand that two gRNAs are needed for your strategy, but it would be interesting to know whether these outcomes are linked to the dual-gRNA design.
    • Did you evaluate whether backbone insertions are stable across generations or prone to rearrangement?
    • A broader discussion of the potential applications of this approach in non-model insects, mammalian cells, or therapeutic settings where HDR is inefficient would be valuable.
    • The authors also mentioned using a validated Cas9 line (ref #23). What promoter drives Cas9 expression in this line? Did you consider testing different promoters? Since timing of Cas9 expression can be critical, promoter choice may have influenced the results and should be discussed.

    Significance

    This paper will appeal primarily to researchers in the fields of functional genomics, insect genetics, and genome engineering, particularly those working with Drosophila or other model organisms where CRISPR/Cas9 is widely used. It is also of interest to scientists engaged in vector biology, agricultural pest control, and translational applications of genome editing, as the findings touch on broader issues of editing accuracy and unintended repair outcomes.

    The main advance of the study is the clear demonstration that unintended donor plasmid backbone insertions are not rare artifacts, but frequent and systematic events during CRISPR/Cas9-mediated HDR in Drosophila. By integrating a GFP marker into the plasmid backbone alongside the intended DsRed marker, the authors provide a straightforward and practical method to identify, separate, and exclude these erroneous events. This approach both highlights the hidden pitfalls of HDR-based editing and offers an effective solution, thereby improving the reliability of CRISPR applications. Beyond Drosophila, the work advances the field by underscoring the need for careful design and validation of donor constructs, with potential implications for genome editing strategies in other organisms where HDR efficiency and fidelity remain key challenges.