Computationally defined and in vitro validated putative genomic safe harbour loci for transgene expression in human cells

Abstract

Selection of the target site is an inherent question for any project aiming for directed transgene integration. Genomic safe harbour (GSH) loci have been proposed as safe sites in the human genome for transgene integration. Although several sites have been characterised for transgene integration in the literature, most of these do not meet criteria set out for a GSH and the limited set that do have not been characterised extensively. Here, we conducted a computational analysis using publicly available data to identify 25 unique putative GSH loci that reside in active chromosomal compartments. We validated stable transgene expression and minimal disruption of the native transcriptome in three GSH sites in vitro using human embryonic stem cells (hESCs) and their differentiated progeny. Furthermore, for easy targeted transgene expression, we have engineered constitutive landing pad expression constructs into the three validated GSH in hESCs.

Article activity feed

  1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

    Learn more at Review Commons


    Reply to the reviewers

    The authors do not wish to provide a response at this time.

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #3

    Evidence, reproducibility and clarity

    Summary:

    This paper seeks to identify genomic safe harbor loci in the human genome for the integration of transgenes. The authors use computational analysis to identify a set of potentially useful sites for transgene integration; they subsequently test a small subset of these identified locations in human embryonic stem cells to determine the impact of transgene integration on the transcriptome and the ability of these cells to differentiate into numerous cell types. They determine that the subset of sites they identify and test all seem promising as no major changes in transcription or differentiation were observed following integration.

    Major Comments:

    Overall, the conclusions of this paper are reasonably convincing, and the authors do a nice job of laying out the criteria for designation of genomic safe harbor loci and characterizing three of these loci. However, there are several places where the data and rationale for the experimentation could be clarified to make the conclusions more convincing. One major question is whether the identified safe genomic loci are actually better than traditionally used loci such as Rosa26 or CCR5. While the authors note in the discussion section that these traditional loci do not meet their criteria for a safe genomic integration site, they make no direct comparison between their new loci and these more traditional ones. A side-by-side comparison (for example, one showing fewer changes in the transcriptome) would make the data more convincing that these loci are better suited for genomic integration.

    The second major issue is that it is unclear how the authors picked seven loci for more extensive targeting out of the 25 initially identified. Without knowing the criteria used for these selections, it is challenging to know if there was a bias in the selection of sites for further analysis that could alter results, or if the other identified sites are truly acceptable targets. In addition, only three of those GSH sites were successfully targeted. As a result, it is hard to determine the validity of the authors' claim that they identified 25 unique GSH loci when they fully characterize only three of them. While it is not necessary to test all 25, it might be beneficial to test more than three before making these conclusions.

    The data and methods in the paper are generally presented in an understandable fashion, and the use of three biological replicates for characterization of the hESC lines seems reasonable.

    Minor Comments:

    There are several minor issues that, if addressed, could help strengthen the claims in this manuscript. First, it is unclear how the authors used BLAT to narrow their initial list of 49 safe loci down to 25. A more detailed explanation in the main text (rather than only in the methods) would aid reader understanding of the methodology. In addition, a deeper explanation of how differentially expressed (DE) genes were identified would be helpful. The authors state that many of the DE genes in their GSH-targeted loci were identical to those found in both control and untargeted cells. It is unclear what comparator was used in these experiments to identify those DE genes; clarification of this in the text would be helpful for the reader. In figure 2, the labeling of the panels is quite confusing, as panel F appears between panels E and D. Finally, in figure 3, while the three new cell lines are shown to be differentiated into various cell types, no control images are shown for comparison. These would be helpful to add.

    Significance

    Overall, the major advancement of this paper is the identification of numerous putative genomic safe harbor loci (GSH) for the integration of transgenes in the human genome. With the rapid development of novel gene therapy techniques, the characterization of locations in the genome that are acceptable for transgene integration with the lowest likelihood of unintended off-target or downstream consequences is important. As a result, these sites have the potential to be quite valuable for the gene therapy field and of great interest to many scientists. Thus far, few widely accepted safe locations for genomic integration have been identified, making these sites of interest to numerous labs. As someone who has generated numerous transgenic mouse lines using random integration, the ability to selectively target transgenes and know there will be minimal issues with silencing or off-target impacts is appealing. However, my knowledge of genomic and bioinformatics techniques is minimal, making it challenging for me to adequately assess that piece of this manuscript, though I believe it contains valuable information for the gene editing and gene therapy community.

    Referees cross-commenting

    I agree with comments from the other reviewers and think they are all very reasonable suggestions.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #2

    Evidence, reproducibility and clarity

    Autio MI and colleagues report their study aiming to identify novel genomic safe harbor (GSH) loci in the human genome. First, they conducted a computational analysis of publicly available data with a list of criteria previously suggested for GSH loci. Since expression units placed at GSH loci should be stably active, they also examined candidate loci using GTEx data and against chromatin regions in the active (A) compartment reported by Schmitt et al. They found 25 candidate loci after these analyses. Then, they successfully placed landing pad constructs at three loci in hESCs using CRISPR technology. They have demonstrated that expression of Clover from the CAG promoter is homogeneous and stable even after differentiation to neuronal, liver and cardiac cell lineages.

    Major points:

    1. They only examined expression from the CAG promoter unit. However, this does not guarantee stable expression from other promoters. Since the CAG promoter is very strong, it may be resistant to cellular silencing activity. For research purposes, tetracycline-regulatable promoters are often used, and it has been reported that although the CAG promoter is not silenced, the TRE promoter is silenced when an expression unit is placed at the AAVS1 locus (Ordovas L et al. Stem Cell Rep, 5: 918-931, 2015). Therefore, before concluding that these loci are GSH, expression from the TRE promoter should be tested.
    2. They examined off-target integration by PCR and Sanger sequencing of the top 5 predicted off-target sites. However, Southern blot analyses are needed to rule out off-target integrations. (This reviewer cannot evaluate the data from the copy number analysis using digital PCR.)

    Significance

    Identification of GSH loci will advance basic research and clinical applications.

    This reviewer is not good at bioinformatics and cannot evaluate the first half of this study.

    Referees cross-commenting

    I find the comments by the other reviewers important. As suggested, the functionality of the differentiated cells should be tested and demonstrated. Again, promoters other than CAG should be tested in those differentiated cells.

  4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #1

    Evidence, reproducibility and clarity

    Summary:

    The authors have taken a bioinformatic approach to the identification of safe harbour loci in the human genome, and then validated three of these in the H1 hES cell line. Overall, the rationale and data presented are clear, and the experiments appear to be reproducible.

    Minor concerns:

    1. Please expand on the rationale for selection of the seven sites that were chosen for initial targeting (i.e. what differentiated these from the other 18 sites as being suitable), and on why no successful edits were identified for 4 of these loci.
    2. Please add data that quantifies the number of cells expressing Clover in the differentiated cell types. Ideally, multiple markers for each lineage should be used.
    3. Functional studies of the differentiated cell types would add substantial value to this paper. In the absence of such data, additional marker proteins that reflect functional properties or the maturity of the derived cell types could be added.

    Significance

    Identification and characterization of new safe harbour sites offers potential for generation of research tools and potentially for clinical applications. Those working in the fields of iPSC-based disease modelling and pre-clinical gene therapy are likely to be interested in this work and the cell line resources developed.

    Reviewer expertise: iPSC, CRISPR/Cas, neuronal differentiation.