An expanded toolkit for Drosophila gene tagging using synthesized homology donor constructs for CRISPR-mediated homologous recombination

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This manuscript will be of general interest to Drosophila researchers, whose work has long relied on the tools generated by the Gene Disruption Project (GDP). This manuscript provides a notable update on the work of the GDP. In it, the authors demonstrate the efficacy of new, streamlined transformation vectors, which they use to generate several hundred novel gene-specific Gal4 driver lines using CRISPR technology. The new vectors promise to allow the GDP to complete its goal of creating null mutations for every gene in the fly genome. The elegant functionality of the new vectors will also likely be of interest to workers outside of Drosophila.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

Abstract

Previously, we described a large collection of Drosophila strains that each carry an artificial exon containing a T2AGAL4 cassette inserted in an intron of a target gene based on CRISPR-mediated homologous recombination. These alleles permit numerous applications and have proven to be very useful. Initially, the homologous recombination-based donor constructs had long homology arms (>500 bp) to promote precise integration of large constructs (>5 kb). Recently, we showed that in vivo linearization of the donor constructs enables insertion of large artificial exons in introns using short homology arms (100–200 bp). Shorter homology arms make it feasible to commercially synthesize homology donors and minimize the cloning steps for donor construct generation. Unfortunately, about 58% of Drosophila genes lack a suitable coding intron for integration of artificial exons in all of the annotated isoforms. Here, we report the development of a new set of constructs that allow the replacement of the coding region of genes that lack suitable introns with a KozakGAL4 cassette, generating a knock-out/knock-in allele that expresses GAL4 in a pattern similar to that of the targeted gene. We also developed custom vector backbones to further facilitate and improve transgenesis. Synthesis of homology donor constructs in custom plasmid backbones that contain the target gene sgRNA obviates the need to inject a separate sgRNA plasmid and significantly increases the transgenesis efficiency. These upgrades will enable the targeting of nearly every fly gene, regardless of exon–intron structure, with a 70–80% success rate.

Article activity feed

  1. Author Response:

    Reviewer #1 (Public Review):

    The manuscript by Kanca et al. presents a variety of valuable resources for the use of the Drosophila research community. As an update to the ongoing work of the Drosophila Gene Disruption Project, it includes hundreds of new transgenic fly lines each of which simultaneously knocks out a targeted gene and generates a driver that expresses the Gal4 transcription factor specifically in the pattern of that gene. The "KozakGal4" approach described supplements previous approaches of the GDP, including the powerful "CRIMIC" method, which inserts a synthetic exon containing a T2AGal4 module into an intron of the targeted gene. In the KozakGal4 method, the coding sequence of the native gene is completely replaced by Gal4, which the authors point out will allow them to target genes lacking (suitable) introns. In the KozakGal4 method, gene replacement is accomplished by targeted excision of the native gene using CRISPR-based technology and subsequent incorporation of a Gal4-encoding cassette by homologous recombination. The vectors developed by the authors to effect gene replacement are elegantly optimized to include all components necessary for native gene excision and efficient recombination of Gal4. These components include the guide RNAs (sgRNAs) that cleave flanking regions of the native gene, an sgRNA that liberates the Gal4 cassette from the vector, and short synthetic homology arms that provide effective, site-specific recombination. Importantly, the vectors are designed so that all gene-specific components can be synthesized in a single fragment that can be readily incorporated into the vector backbone followed by insertion of the Gal4 cassette.

    Overall, the technical advances described in the manuscript are impressive and the utility of the method is well demonstrated. The one exception is in the validation of Gal4 expression fidelity. As the authors note, fidelity could be compromised if regulatory information is removed along with sequences in and around a targeted gene. In addition, the introduction of new DNA at a particular locus may alter the regulation of gene expression. In any case, establishing the fidelity of expression of KozakGal4 lines is important and the data presented on this point is both confusing and incomplete. Rather than directly comparing the expression of selected KozakGal4 lines against the expression of the endogenous gene (e.g. by immunostaining, in situ hybridization, or by comparing tissue-specific reporter expression against expression in microarray-derived datasets such as Fly Atlas or modEncode), the authors use two indirect methods to demonstrate fidelity. One method uses VNC scRNAseq data together with the expression patterns of T2AGal4 lines that target genes co-expressed (at least in certain cell types) with the KozakGal4 line, while the other method uses phenotypic rescue by driving UAS-cDNA transgenes. The demonstrations are at best suggestive, and the rescue results presented are minimal, with no description of phenotypes, methods used to assay them, or quantification of rescue. There is thus insufficient information to form a judgment about fidelity and a more direct demonstration is needed.

    We appreciate that the manuscript can be strengthened by adding supporting evidence about the fidelity of GAL4 expression to the expression pattern of the targeted gene. Directly comparing the GAL4 expression pattern to the expression pattern of the gene is a complex issue. The seemingly straightforward experiment of comparing the GAL4-UAS reporter fluorescent protein expression pattern to antibody staining of the targeted gene product suffers from multiple technical and practical issues: 1) The majority of the genes that we targeted are understudied and do not have a readily available antibody that would work for immunostaining. 2) Even if antibodies were available, and even if they were completely specific, the staining pattern would likely differ from the GAL4-UAS reporter expression pattern because the subcellular localization of the gene product differs from that of the reporter. 3) The GAL4-UAS system introduces a very high level of signal amplification compared to the expression of the gene product. We reported the extent of this difference in the Lee et al. 2018 eLife paper, where we used RMCE to convert the same MiMIC lines to EGFP protein trap alleles or T2AGAL4 gene trap alleles. The signals that we could detect in larval or adult brains looked qualitatively different. Comparing the expression pattern of the targeted gene's product to the KozakGAL4-UAS reporter signal would suffer from the same issue.
    To overcome these issues, we decided to compare the GAL4 mRNA expression pattern of KozakGAL4 alleles to the mRNA expression pattern of the targeted gene. We employed smiFISH (single molecule Fluorescent In-Situ Hybridization) in 3rd instar larval brains for 8 genes. We crossed the KozakGAL4 alleles of these genes to yw flies and co-stained for GAL4 mRNA and the targeted gene's mRNA. In the 7 cases in which we could reliably detect the mRNA of the targeted gene, the GAL4 mRNA expression pattern overlapped with that of the targeted gene, suggesting that the transcriptional regulation of KozakGAL4 in the locus reflects the transcriptional regulation of the targeted gene. We note that the signal-to-noise ratio is quite low for some of the in situ hybridization results. Hence, we attenuated the language about the expression patterns of KozakGAL4 alleles reflecting the expression domain of the targeted genes by adding the caveat that regulatory elements in the coding regions and UTRs would be removed in these alleles. We include the smiFISH results as a supplementary figure and have added a paragraph describing the methodology to the text.

    The manuscript could be strengthened in a couple of other spots as well. There is little to no description in either the Introduction or Results/Discussion of similar knock-out/knock-in approaches, although gene-specific knock-ins of Gal4 have been generated in Drosophila using homologous recombination for some time, typically into the site of ATG start codons. CRISPR technology has only facilitated this approach, which has also been used to create gene-specific Cre knock-ins in rodents. This is of potential interest since the authors mention that their approach can be generalized for use in other animals. A short overview of existing knock-in approaches and their limitations relative to KozakGal4 would therefore be useful. Also, the authors motivate the need for the KozakGal4 method by asserting that over 50% of Drosophila genes lack "suitable" coding introns for the integration of artificial T2AGal4 exons such as CRIMIC. This seems to unnecessarily overstate the actual need. The authors define a "suitable" gene as one that has an intron common to all its isoforms that is at least 100 nt long. The length requirement is justified based on the need for suitable sgRNA targets within the intron, but it's possible to use sgRNA targets outside the intron (as long as the homology domains replace this sequence). Also, the requirement of a sufficiently long intron common to all isoforms is quite stringent and could be relaxed if multiple T2AGal4 lines were made to target multiple isoforms. Presumably, multiple KozakGal4 lines will, in fact, also be required for genes that have multiple transcription start sites, if the expression patterns of all isoforms are to be reproduced. In general, there's no doubt about the utility of the KozakGal4 approach, but a more balanced presentation of its merits relative to other approaches seems warranted.

    We agree with the reviewers that requiring a coding intron of at least 100 nt in all annotated isoforms is a relatively stringent criterion for deeming a gene a suitable target for T2AGAL4 methods. This requirement can indeed be relaxed if the same gene is targeted with multiple T2AGAL4 alleles. Nevertheless, for the GDP, our aim is to generate genetic reagents for as many conserved genes as possible to make them accessible to the research community. Multiple T2AGAL4 alleles that target individual splice isoforms can be generated by the laboratories that work on those genes, using the methodology that we describe in this paper. We attenuated the language about the intron length requirements and included our justification for this requirement for the GDP in the text.
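
    The suitability criterion under discussion (a coding intron of at least 100 nt that is present in all annotated isoforms) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Intron` class, the function names, and the choice to treat two introns as "the same" only when their coordinates match exactly are all assumptions made for illustration, and the caller is assumed to pass coding introns only.

```python
from dataclasses import dataclass
from typing import List, Set

MIN_INTRON_NT = 100  # length threshold quoted in the review


@dataclass(frozen=True)
class Intron:
    """Hypothetical intron record; coordinates are 1-based and inclusive."""
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def shared_suitable_introns(isoforms: List[List[Intron]],
                            min_len: int = MIN_INTRON_NT) -> Set[Intron]:
    """Return introns found in every isoform that are at least min_len nt long.

    Assumption: introns are matched by exact coordinates.
    """
    if not isoforms:
        return set()
    common = set(isoforms[0])
    for introns in isoforms[1:]:
        common &= set(introns)  # keep only introns seen in all isoforms so far
    return {intron for intron in common if intron.length >= min_len}


def is_suitable_for_t2agal4(isoforms: List[List[Intron]],
                            min_len: int = MIN_INTRON_NT) -> bool:
    """True if at least one shared intron passes the length threshold."""
    return bool(shared_suitable_introns(isoforms, min_len))


# Illustrative gene: both isoforms share a 251 nt intron.
isoform_a = [Intron(1000, 1250), Intron(2000, 2059)]
isoform_b = [Intron(1000, 1250)]
print(is_suitable_for_t2agal4([isoform_a, isoform_b]))
```

    Under this sketch, relaxing the criterion as the reviewer suggests would amount to calling the check per isoform (or per subset of isoforms) rather than on the full set, which is the trade-off the response above addresses.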

    Reviewer #2 (Public Review):

    In this interesting paper, Kanca and coworkers present a set of updated constructs for the replacement of gene coding regions, for instance by a Gal4 expression cassette or a GFP protein trap allele, enabling multiple research applications with the generated fly strains. The novel design now allows for the CRISPR-based targeting of almost any gene in Drosophila. The authors apply these novel tools and generate hundreds of fly lines that complement the pool of already existing strains in the Drosophila Gene Disruption Project. The authors report a high success rate for their HDR-mediated gene targeting strategy and show that they can even target genes that previously proved to be difficult to engineer. The authors validate the expression patterns of a set of lines, supported even by single-cell sequencing experiments, and provide strong evidence that the updated toolkit functions as expected.

    What may confuse the reader is that different targeting strategies are presented with a strong focus on validating the expression cassettes used in combination with a specific targeting strategy (i.e., KozakGal4 or GFP protein trap). This leaves the reader with the impression that the insertion of a particular expression cassette would require a tailored targeting strategy, which is not the case. In fact, the majority of the paper deals with the description and extensive validation of small updates to already published methods for the generation of additional KO/Gal4 or eGFP trap lines. However, neither the updated knock-in/knock-out strategies described for the insertion of the KozakGal4 cassette at the beginning of the results section nor the experiments to GFP-tag proteins at different positions in the open reading frames (Figure 5) are of sufficient novelty and technical advancement.

    What really warrants publication is the very elegant and universal method described in Figure 4 that requires only a single vector to be injected into fly embryos. The method is suited to precisely engineer any gene at will in combination with any HDR template. The very smart vector design allows for the directed insertion of custom and commercially synthesized HDR constructs as well as of a specific guide required to target and cut the gene of interest. This makes the method versatile, fast, and cheaper, with the added benefit of being very efficient. This gRNA_int200 targeting strategy will be of broad interest, is straightforward to use and is expected to have a large impact, far beyond the fly community.

    We thank the reviewer for the constructive criticism and for seeing the benefits of our methodology. Although KozakGAL4 and GFP knock-ins in the genome are not conceptually new, combining them with our vector design makes the application of these concepts straightforward. Additionally, the extent of application and verification of GAL4 knock-ins was limited compared to what we include in this manuscript, which prompted us to include the KozakGAL4 and GFP knock-in methodology in this manuscript.