High-throughput tracking enables systematic phenotyping and drug repurposing in C. elegans disease models

Thomas J O'Brien
Ida L Barlow
Luigi Feriani
André EX Brown

Curated by eLife

eLife Assessment

This important study provides proof of principle that C. elegans models can be used to accelerate the discovery of candidate treatments for human Mendelian diseases by detailed high-throughput phenotyping of strains harboring mutations in orthologs of human disease genes. The data are compelling and support an approach that enables the potential rapid repurposing of FDA-approved drugs to treat rare diseases for which there are currently no effective treatments. The work will be of interest to all geneticists.

This article has been Reviewed by the following groups

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

Evaluated articles (eLife)
Evaluated articles (Arcadia Science)

Abstract

There are thousands of Mendelian diseases with more being discovered weekly and the majority have no approved treatments. To address this need, we require scalable approaches that are relatively inexpensive compared to traditional drug development. In the absence of a validated drug target, phenotypic screening in model organisms provides a route for identifying candidate treatments. Success requires a screenable phenotype. However, the right phenotype and assay may not be obvious for pleiotropic neuromuscular disorders. Here, we show that high-throughput imaging and quantitative phenotyping can be conducted systematically on a panel of C. elegans disease model strains. We used CRISPR genome-editing to create 25 worm models of human Mendelian diseases and phenotyped them using a single standardised assay. All but two strains were significantly different from wild-type controls in at least one feature. The observed phenotypes were diverse, but mutations of genes predicted to have related functions led to similar behavioural differences in worms. As a proof-of-concept, we performed a drug repurposing screen of an FDA-approved compound library, and identified two compounds that rescued the behavioural phenotype of a model of UNC80 deficiency. Our results show that a single assay to measure multiple phenotypes can be applied systematically to diverse Mendelian disease models. The relatively short time and low cost associated with creating and phenotyping multiple strains suggest that high-throughput worm tracking could provide a scalable approach to drug repurposing commensurate with the number of Mendelian diseases.

Version published to 10.7554/elife.92491.4 on eLife
Jan 8, 2025
Version published to 10.7554/elife.92491 on eLife
Jan 8, 2025
Version published to 10.7554/elife.92491.3 on eLife
Dec 13, 2024
eLife
Dec 12, 2024

eLife Assessment

This important study provides proof of principle that C. elegans models can be used to accelerate the discovery of candidate treatments for human Mendelian diseases by detailed high-throughput phenotyping of strains harboring mutations in orthologs of human disease genes. The data are compelling and support an approach that enables the potential rapid repurposing of FDA-approved drugs to treat rare diseases for which there are currently no effective treatments. The work will be of interest to all geneticists.

Read the original source
eLife
Dec 12, 2024

Reviewer #3 (Public review):

In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

Subsequently, the study focuses on the NALCN channel …

Reviewer #3 (Public review):

In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

Subsequently, the study focuses on the NALCN channel gene family, comparing the phenotypes of mutants of nca-1, unc-77, and unc-80. This initial characterization identifies three behavioral parameters that exhibit significant differences from the wild type and could serve as indicators for pharmacological modulation.

As a proof-of-concept, O'Brien et al. present a drug repurposing screen using an FDA-approved compound library, identifying two compounds capable of rescuing the behavioral phenotype in a model with UNC80 deficiency. The relatively short time and low cost associated with creating and phenotyping these strains suggest that high-throughput worm tracking could serve as a scalable approach for drug repurposing, addressing the multitude of Mendelian diseases. Interestingly, by measuring a wide range of behavioural parameters, this strategy also simultaneously reveals deleterious side effects of tested drugs that may confound the analysis.

Considering the wealth of data generated in this study regarding important human disease genes, it is regrettable that the data is not made accessible to researchers less versed in data analysis methods. This diminishes the study's utility. It would have a far greater impact if an accessible and user-friendly online interface were established to facilitate data querying and feature extraction for specific mutants. This would empower researchers to compare their findings with the extensive dataset created here.

Another technical limitation of the study is the use of single alleles. Large deletion alleles were generated by CRISPR/Cas9 gene editing. At first glance, this seems like a good idea because it limits the risk that background mutations, present in chemically-generated alleles, will affect behavioral parameters. However, these large deletions can also remove non-coding RNAs or other regulatory genetic elements, as found, for example, in introns. Therefore, it would be prudent to validate the behavioral effects by testing additional loss-of-function alleles produced through early stop codons or targeted deletion of key functional domains.

Comments on revisions:

In this final round of revisions, the authors have improved their manuscript and provide useful information about analysis procedures and code and updated figures.

Read the original source
eLife
Dec 12, 2024

Author response:

The following is the authors’ response to the previous reviews.

This important study provides proof of principle that C. elegans models can be used to accelerate the discovery of candidate treatments for human Mendelian diseases by detailed high-throughput phenotyping of strains harboring mutations in orthologs of human disease genes. The data are compelling and support an approach that enables the potential rapid repurposing of FDA-approved drugs to treat rare diseases for which there are currently no effective treatments. The authors should provide a clearer explanation of how the statistical analyses were performed, as well as a link to a GitHub repository to clarify how figures and tables in the manuscript were generated from the phenotypic data.

We have amended our description of the statistical analysis in the …

Author response:

The following is the authors’ response to the previous reviews.

This important study provides proof of principle that C. elegans models can be used to accelerate the discovery of candidate treatments for human Mendelian diseases by detailed high-throughput phenotyping of strains harboring mutations in orthologs of human disease genes. The data are compelling and support an approach that enables the potential rapid repurposing of FDA-approved drugs to treat rare diseases for which there are currently no effective treatments. The authors should provide a clearer explanation of how the statistical analyses were performed, as well as a link to a GitHub repository to clarify how figures and tables in the manuscript were generated from the phenotypic data.

We have amended our description of the statistical analysis in the materials and methods section of the manuscript. We have also updated the GitHub repository link to a dedicated repository for this study, this contains all of the code needed to generated all the figures made from the phenotypic data provided. Additionally, we have updated the Zenodo repository to contain both the code and datasets within the same file.

We have also updated the GitHub repository link to a dedicated repository for this manuscript, that contains all of the code needed to generate all figures from the phenotypic data provided. Additionally, we have updated the Zenodo repository link to contain both the code and datasets within the same folder structure.

Recommendations for the authors:

Reviewer #1 (Recommendations for the authors):

The authors have responded to previous review to improve the presentation of the work. The paper more than meets publication standards.

No response required.

Reviewer #2 (Recommendations for the authors):

The authors have addressed all of my questions and concerns. I'm happy to see this updated paper of record.

No response required.

Reviewer #3 (Recommendations for the authors):

Regarding the interactive heatmap

The html version and the panel in Figure 2C appear not to coincide visually. Maybe the features are ordered in a different way?

The html version of Figure 2C is for the entire feature set extract per strain and not the condensed Tierpsy256 set shown in the panel figure. We have now remade this figure to show this reduced feature set (aligning with what is shown in Figure 2C) and included both versions of the interactive heatmaps as static html files within the same repository.

Regarding data accessibility overall

More generally, the html file does not address my initial concern about the accessibility of the data to non-experts. Making the full dataset available was a necessary first step, but the hermetic nature of its format and the lack of a simple way to query the data remains an issue for me that limits the usefulness of this data to the broadest audience.

We agree, but unfortunately do not currently have the resources to build a public-facing database to facilitate this.

Read the original source
Version published to 10.7554/elife.92491.2 on eLife
Oct 8, 2024
eLife
Oct 7, 2024

eLife Assessment

This important study provides proof of principle that C. elegans models can be used to accelerate the discovery of candidate treatments for human Mendelian diseases by detailed high-throughput phenotyping of strains harboring mutations in orthologs of human disease genes. The data are compelling and support an approach that enables the potential rapid repurposing of FDA-approved drugs to treat rare diseases for which there are currently no effective treatments. The authors should provide a clearer explanation of how the statistical analyses were performed, as well as a link to a GitHub repository to clarify how figures and tables in the manuscript were generated from the phenotypic data.

Read the original source
eLife
Oct 7, 2024

Reviewer #1 (Public review):

Summary:

As the scientific community identifies increasing numbers of genes and genetic variants that cause rare human diseases, a challenge in the field quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often a bottleneck, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease gene knockout mutants. Overall, the work constitutes an elegant combination of previously developed …

Reviewer #1 (Public review):

Summary:

As the scientific community identifies increasing numbers of genes and genetic variants that cause rare human diseases, a challenge in the field quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often a bottleneck, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease gene knockout mutants. Overall, the work constitutes an elegant combination of previously developed high-volume imaging with highly detailed quantitative phenotyping (and some paring down to specific phenotypes) to establish proof of principle on how the combined applications can contribute to screens for compounds that may address specific genetic deficits, which can, in turn, suggest both mechanism and therapy.

In brief, the authors selected 25 genes for which loss of function is implicated in human neuro-muscular disease and engineered deletions in the corresponding C. elegans homologs. The authors then imaged morphological features and behaviors prior to, during, and after blue light stimuli, quantitating features, and clustering outcomes as they elegantly developed previously (PMID 35322206; 30171234; 30201839). In doing so, phenotypes in 23/25 tested mutants could be separated enough to distinguish WT from mutant and half of those with adequate robustness to permit high-throughput screens, an outcome that supports the utility of related general efforts to ID phenotypes in C. elegans disease orthologs. A detailed discussion of 4 ciliopathy gene defects, and NACLN-related channelopathy mutants reveals both expected and novel phenotypes, validating the basic approach to modeling vetted targets and underscoring that quantitative imaging approaches reiterate known biology.

The authors then screened a library of nearly 750 FDA-approved drugs for the capacity to shift the unc-80 NACLN channel-disrupted phenotype closer to the wild type. Top "mover" compounds shift outcome in the experimental outcome space; and also reveal how "side effects" can be evaluated to prioritize compounds that confer the fewest changes of other parameters away from the center.

Strengths:

Although the imaging and data analysis approaches have been reported and the screen is restricted in scope and intervention exposure, it is impressive, encouraging and important that the authors strongly combine tools to demonstrate how quantitative imaging phenotypes can be integrated with C. elegans genetics to accelerate the identification of potential modulators of disease (easily extendable to other goals). Generation of deletion alleles and documentation of their associated phenotypes (available in supplemental data) provide potentially useful reagents/data to the field. The capacity to identify "over-shooting" of compound applications with suggestions for scale back and to sort efficacious interventions to minimize other changes to behavioral and physical profiles is a strong contribution.

Weaknesses:

The work does not have major weaknesses, and in revision, the authors have expanded the discussion to potential utility and application in the field.

The authors have also taken into account minor modifications in writing.

Read the original source
eLife
Oct 7, 2024

Reviewer #2 (Public review):

Summary and strengths:

O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to make genotype-phenotype connections using C. elegans. Given the rate that rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach is one …

Reviewer #2 (Public review):

Summary and strengths:

O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to make genotype-phenotype connections using C. elegans. Given the rate that rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach is one such way. Overall, I have a few minor suggestions and some specific edits.

Weaknesses:

(1) Throughout the text on figures, labels are nearly impossible to read. I had to zoom into the PDF to determine what the figure was showing. Please make text in all figures a minimum of 10 point font. Similarly, Figure 2D point type is impossible to read. Points should be larger in all figures. Gene names should be in italics in all figures, following C. elegans convention.

(2) I have a strong bias against the second point in Figure 1A. Sequencing of trios, cohorts, or individuals NEVER identifies causal genes in the disease. This technique proposes a candidate gene. Future experiments (oftentimes in model organisms) are required to make those connections to causality. Please edit this figure and parts of the text.

(3) How were the high-confidence orthologs filtered from 767 to 543 (lines 128-131)? Also, the choice of the final list of 25 genes is not well justified. Please expand more about how these choices were made.

(4) Figures 3 and 4, why show all 8289 features? It might be easier to understand and read if only the 256 Tierpsy features were plotted in the heat maps.

(5) The unc-80 mutant screen is clever. In the feature space, it is likely better to focus on the 256 less-redundant Tierpsy features instead of just a number of features. It is unclear to me how many of these features are correlated and not providing more information. In other words, the "worsening" of less-redundant features is far more of a concern than "worsening" of 1000 correlated features.Reviewer #2 (Public review):

Summary and strengths:

O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to make genotype-phenotype connections using C. elegans. Given the rate that rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach is one such way. Overall, I have a few minor suggestions and some specific edits.

Weaknesses:

(1) Throughout the text on figures, labels are nearly impossible to read. I had to zoom into the PDF to determine what the figure was showing. Please make text in all figures a minimum of 10 point font. Similarly, Figure 2D point type is impossible to read. Points should be larger in all figures. Gene names should be in italics in all figures, following C. elegans convention.

(2) I have a strong bias against the second point in Figure 1A. Sequencing of trios, cohorts, or individuals NEVER identifies causal genes in the disease. This technique proposes a candidate gene. Future experiments (oftentimes in model organisms) are required to make those connections to causality. Please edit this figure and parts of the text.

(3) How were the high-confidence orthologs filtered from 767 to 543 (lines 128-131)? Also, the choice of the final list of 25 genes is not well justified. Please expand more about how these choices were made.

(4) Figures 3 and 4, why show all 8289 features? It might be easier to understand and read if only the 256 Tierpsy features were plotted in the heat maps.

(5) The unc-80 mutant screen is clever. In the feature space, it is likely better to focus on the 256 less-redundant Tierpsy features instead of just a number of features. It is unclear to me how many of these features are correlated and not providing more information. In other words, the "worsening" of less-redundant features is far more of a concern than "worsening" of 1000 correlated features.

Read the original source
eLife
Oct 7, 2024

Reviewer #3 (Public review):

In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

Subsequently, the study focuses on the NALCN channel …

Reviewer #3 (Public review):

In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

Subsequently, the study focuses on the NALCN channel gene family, comparing the phenotypes of mutants of nca-1, unc-77, and unc-80. This initial characterization identifies three behavioral parameters that exhibit significant differences from the wild type and could serve as indicators for pharmacological modulation.

As a proof-of-concept, O'Brien et al. present a drug repurposing screen using an FDA-approved compound library, identifying two compounds capable of rescuing the behavioral phenotype in a model with UNC80 deficiency. The relatively short time and low cost associated with creating and phenotyping these strains suggest that high-throughput worm tracking could serve as a scalable approach for drug repurposing, addressing the multitude of Mendelian diseases. Interestingly, by measuring a wide range of behavioural parameters, this strategy also simultaneously reveals deleterious side effects of tested drugs that may confound the analysis.

Considering the wealth of data generated in this study regarding important human disease genes, it is regrettable that the data is not made accessible to researchers less versed in data analysis methods. This diminishes the study's utility. It would have a far greater impact if an accessible and user-friendly online interface were established to facilitate data querying and feature extraction for specific mutants. This would empower researchers to compare their findings with the extensive dataset created here.

Another technical limitation of the study is the use of single alleles. Large deletion alleles were generated by CRISPR/Cas9 gene editing. At first glance, this seems like a good idea because it limits the risk that background mutations, present in chemically-generated alleles, will affect behavioral parameters. However, these large deletions can also remove non-coding RNAs or other regulatory genetic elements, as found, for example, in introns. Therefore, it would be prudent to validate the behavioral effects by testing additional loss-of-function alleles produced through early stop codons or targeted deletion of key functional domains.

Read the original source
eLife
Oct 7, 2024

Author response:

The following is the authors’ response to the original reviews.

Public Reviews:

Reviewer #1 (Public Review):

Summary:

As the scientific community identifies increasing numbers of genetic variants that cause rare human diseases, a challenge is how the field can most quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often challenging, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease …

Author response:

The following is the authors’ response to the original reviews.

Public Reviews:

Reviewer #1 (Public Review):

Summary:

As the scientific community identifies increasing numbers of genetic variants that cause rare human diseases, a challenge is how the field can most quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often challenging, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease gene knockout mutants.

Overall, the work constitutes an elegant combination of previously developed high-volume imaging with highly detailed quantitative phenotyping (and some paring down to specific phenotypes) to establish proof of principle on how the combined applications can contribute to screens for compounds that may address specific genetic deficits, which can suggest both mechanism and therapy.

In brief, the authors selected 25 genes for which loss of function is implicated in human neuro-muscular disease and engineered deletions in the corresponding C. elegans homologs. The authors then imaged morphological features and behaviors prior to, during, and after blue light stimuli, quantitating features, and clustering outcomes as they elegantly developed previously (PMID 35322206; 30171234; 30201839). In doing so, phenotypes in 23/25 tested mutants could be separated enough to distinguish WT from mutant and half of those with adequate robustness to permit high-throughput screens, an outcome that supports the utility of general efforts to ID phenotypes in C. elegans disease orthologs using this approach. A detailed discussion of 4 ciliopathy gene defects, and NACLN-related channelopathy mutants reveals both expected and novel phenotypes, validating the basic approach to modeling vetted targets and underscoring that quantitative imaging approaches reiterate known biology. The authors then screened a library of nearly 750 FDA-approved drugs for the capacity to shift the unc-80 NACLN channel-disrupted phenotype closer to the wild type. Top "mover" compound move outcome in the experimental outcome space; and also reveal how "side effects" can be evaluated to prioritize compounds that confer the fewest changes of other parameters away from the center.

Strengths:

Although the imaging and data analysis approaches have been reported and the screen is limited in scope and intervention exposure, it is important that the authors strongly combine individual approach elements to demonstrate how quantitative imaging phenotypes can be integrated with C. elegans genetics to accelerate the identification of potential modulators of disease (easily extendable to other goals). Generation of deletion alleles and documentation of their associated phenotypes (available in supplemental data) provide potentially useful reagents/data to the field. The capacity to identify "over-shooting" of compound applications with suggestions for scale back and to sort efficacious interventions to minimize other changes to behavioral and physical profiles is a strong contribution.

Weaknesses:

The work does not have major weaknesses, although it may be possible to expand the discussion to increase utility in the field:

(1) Increased discussion of the challenges and limitations of the approach may enhance successful adaptation application in the field.

It is quite possible that morphological and behavioral phenotypes have nothing to do with disease mechanisms and rather reflect secondary outcomes, such that positive hits will address "off-target" consequences.

This is possible and can only be determined with human data. We now discuss the possibility in the discussion.

The deletion approach is adequately justified in the text, but the authors may make the point somewhere that screening target outcomes might be enhanced by the inclusion of engineered alleles that match the human disease condition. Their work on sod-1 alleles (PMID 35322206) might be noted in this discussion.

We agree and now mention this work in the discussion. We are currently working on a collection of strains with patient-specific mutations.

Drug testing here involved a strikingly brief exposure to a compound, which holds implications for how a given drug might engage in adult animals. The authors might comment more extensively on extended treatments that include earlier life or more extended targeting. The assumption is that administering different exposure periods and durations, but if the authors are aware as to whether there are challenges associated with more prolonged applications, larger scale etc. it would be useful to note them.

More prolonged applications are definitely possible. We chose short treatments for this screen to model the potential for changing neural phenotypes once developmental effects of the mutation have already occurred. We now briefly discuss this choice and the potential of longer treatments in the discussion.

(2) More justification of the shift to only a few target parameters for judging compound effectiveness.

- In the screen in Figure 4D and text around 313, 3 selected core features of the unc-80 mutant (fraction that blue-light pause, speed, and curvature) were used to avoid the high replicate requirements to identify subtle phenotypes. Although this strategy was successful as reported in Figure 5, the pared-down approach seems a bit at odds with the emphasis on the range of features that can be compared mutant/wt with the author's powerful image analysis. Adding details about the reduced statistical power upon multiple comparisons, with a concrete example calculated, might help interested scientists better assess how to apply this tool in experimental design.

To empirically test the effect of including more features on the subsequent screen, we have repeated the analysis using increasing numbers of features. In a new supplementary figure we find increasing the number of features reduces our power to detect rescue. At 256 features, we would not be able to detect any compounds that rescued the disease model phenotype.

(3) More development of the side-effect concept. The side effects analysis is interesting and potentially powerful. Prioritization of an intervention because of minimal perturbation of other phenotypes might be better documented and discussed a bit further; how reliably does the metric of low side effects correlate with drug effectiveness?

Ultimately this can only be determined with clinical trial data on multiple drugs, but there are currently no therapeutic options for UNC80 deficiency in humans. We have included some extra discussion of the side effect concept.

Reviewer #2 (Public Review):

Summary and strengths:

O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at the movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to making genotype-phenotype connections using C. elegans. Given the rate at which rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach is one such way. Overall, I have a few minor suggestions and some specific edits.

Weaknesses:

(1) Throughout the text on figures, labels are nearly impossible to read. I had to zoom into the PDF to determine what the figure was showing. Please make text in all figures a minimum of 10-point font. Similarly, the Figure 2D point type is impossible to read. Points should be larger in all figures. Gene names should be in italics in all figures, following C. elegans convention.

We have updated all figures with larger labels and, where necessary, split figures to allow for better readability. We’ve also corrected italicisation.

(2) I have a strong bias against the second point in Figure 1A. Sequencing of trios, cohorts, or individuals NEVER identifies causal genes in the disease. This technique proposes a candidate gene. Future experiments (oftentimes in model organisms) are required to make those connections to causality. Please edit this figure and parts of the text.

We have removed references to causation. We were thinking of cases where a known variant is found in a patient where causality has already been established rather than cases of new variant discovery.

(3) How were the high-confidence orthologs filtered from 767 to 543 (lines 128-131)? Also, the choice of the final list of 25 genes is not well justified. Please expand more about how these choices were made.

We now explain the extra keyword filtering step. For the final filtering step, we simply examined the list and chose 25. There is therefore little justification to provide and we acknowledge these cannot be seen as representative of the larger set according to well-defined rules. The choice was based on which genes we thought would be interesting using their descriptions or our prior knowledge (“subjective interestingness” in the main text).

(4) Figures 3 and 4, why show all 8289 features? It might be easier to understand and read if only the 256 Tierpsy features were plotted in the heat maps.

In this case, we included all features because they were all tested for differences between mutants and controls. By consistently using all features for each fingerprint we can be sure that the features that are different that we want to highlight in box plots can be referred to in the fingerprint.

(5) The unc-80 mutant screen is clever. In the feature space, it is likely better to focus on the 256 less-redundant Tierpsy features instead of just a number of features. It is unclear to me how many of these features are correlated and not providing more information. In other words, the "worsening" of less-redundant features is far more of a concern than the "worsening" of 1000 correlated features.

This is a good point. We’ve redone the analysis using the Tierpsy 256 feature set and included this as a supplementary figure. We find that the same trend exists when looking at this reduced feature set.

Reviewer #3 (Public Review):

In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

Subsequently, the study focuses on the NALCN channel gene family, comparing the phenotypes of mutants of nca-1, unc-77, and unc-80. This initial characterization identifies three behavioral parameters that exhibit significant differences from the wild type and could serve as indicators for pharmacological modulation.

As a proof-of-concept, O'Brien et al. present a drug repurposing screen using an FDA-approved compound library, identifying two compounds capable of rescuing the behavioral phenotype in a model with UNC80 deficiency. The relatively short time and low cost associated with creating and phenotyping these strains suggest that high-throughput worm tracking could serve as a scalable approach for drug repurposing, addressing the multitude of Mendelian diseases. Interestingly, by measuring a wide range of behavioural parameters, this strategy also simultaneously reveals deleterious side effects of tested drugs that may confound the analysis.

Considering the wealth of data generated in this study regarding important human disease genes, it is regrettable that the data is not actually made accessible. This diminishes the study's utility. It would have a far greater impact if an accessible and user-friendly online interface were established to facilitate data querying and feature extraction for specific mutants. This would empower researchers to compare their findings with the extensive dataset created here. Otherwise, one is left with a very limited set of exploitable data.

We have now made the feature data available on Zenodo (https://doi.org/10.5281/zenodo.12684118) as a matrix of feature summaries and individual skeleton timeseries data (the feature matrix makes it more straightforward to extract the data from particular mutants for reanalysis). We have also created a static html version of the heatmap in Figure 2 containing the entire behavioural feature set extracted by Tierpsy. This can be opened in a browser and zoomed for detailed inspection. Mousing over the heatmap shows the names of features at each position making it easier to arrive at intuitive conclusions like ‘strain A is slow’ or ‘strain B is more curved’.

Another technical limitation of the study is the use of single alleles. Large deletion alleles were generated by CRISPR/Cas9 gene editing. At first glance, this seems like a good idea because it limits the risk that background mutations, present in chemically-generated alleles, will affect behavioral parameters. However, these large deletions can also remove non-coding RNAs or other regulatory genetic elements, as found, for example, in introns. Therefore, it would be prudent to validate the behavioral effects by testing additional loss-of-function alleles produced through early stop codons or targeted deletion of key functional domains.

We have added a note in the main text on limitations of deletion alleles. We like the idea of making multiple alleles in future studies, especially in cases where a project is focussed on just one or a few genes.

Recommendations for the authors

Reviewer #1 (Recommendations For The Authors):

Note that none of the above suggestions or the one immediately below are considered mandatory.

One additional minor point: The dual implication of mevalonate perturbations for NACLM deficiencies is striking. At the same time, the mevalonate pathway is critical for embryo viability among other things, which prompts questions about how reproductive physiology is integrated in this screen approach. It appears that sterilization protocols are not used to prepare screen target animals, but it would be useful to know if there were a signature associated with drug-induced sterility that might help identify one potential common non-interesting outcome of compound treatments in general. In this work, the screen treatment is only 4 hours, which is probably too short to compromise reproduction, but as noted above, it is likely users would intend to expose test subjects for much longer than 4-hour periods.

This is an interesting point. In its current form our screen doesn’t assess reproductive physiology. This is something that we will consider in ongoing projects.

Figures

Figure 1D might be omitted or moved to supplement.

We have removed 1D and moved figure 1E as a standalone table (Table 1) to improve readability.

Figure 2D "key" is hard to make out size differences for prestim, bluelight, and poststim -more distinctive symbols should be used.

We have increased the size of the symbols so that the key is easier to read.

Line 412 unc-25 should be in italics

Corrected

Reviewer #2 (Recommendations For The Authors):

Specific edits:

All of the errors below have been corrected.

Line 47, "loss of function" should be hyphenated because it is a compound adjective that modifies mutations.

Line 50, "genetically-tractable" should not be hyphenated because it is not a compound adjective. It is an adverb-adjective pair. Line 102 has the same grammatical issue.

Line 85, "rare genetic diseases" do not "affect nervous system function". The disease might have deficits in this function, but the disease does not do anything to function.

Line 86, it should be mutations not mutants. Mutations are changes to DNA. Mutants are individuals with mutations.

Throughout, wild-type should be hyphenated when it is used as a compound adjective.

Figure 4, asterisks is spelled incorrectly.

Reviewer #3 (Recommendations For The Authors):

- As stated in the public review, the utility of the study is limited by the lack of access to the complete dataset. The wealth of data produced by the study is one of its major outputs.

We have made the data publicly available on Zenodo. We appreciate the request.

- Describe the exact break-points of the different alleles, because it was not readily feasible to derive them from the gene fact sheets provided in the supplementary materials.

We have now provided the start position and total length of deletion for each gene in the gene fact sheets.

- Figure 1C: what does "Genetic homology"/"sequence identity" refer to? How were these values calculated?

UNC-49 is clearly not 95% identical to vertebrate GABAR subunits at the protein level.

We have changed the axis label to “BLAST % Sequence Identity” to clarify that these values are calculated from BLAST sequence alignments on WormBase and the Alliance Genom Resources webpages.

- Figure 1E : The data presented in Figure 1E appears somewhat unreliable. For example, a cursory check showed:

(1) Wrong human ortholog: unc-49 is a Gaba receptor, not a Glycine receptor as indicated in the second column.

(2) Wrong disease association: dys-1 is not associated with Bardet-Biedl syndrome; overall the data indicated in the table does not seem to fully match the HPO database.

(3) Inconsistent disease association: why don't the avr-14 and glc-2 (and even unc-49) profiles overlap/coincide given that they present overlapping sets of human orthologs.

Thank you for catching this! We have corrected gene names which were mistakenly pasted. We have also made this a standalone table (Table 1) for improved readability.

- Error in legend to figure 4I : "with ciliopathies and N2" > ciliopathies should be "NALCN disease".

- Error at line 301: "Figures 2E-H" should be "Figures 4E-H".

Corrected.

Read the original source
Version published to 10.7554/elife.92491.1 on eLife
Nov 28, 2023
eLife
Nov 27, 2023

eLife assessment

This important study provides proof of the principle that C. elegans models can be used to accelerate the discovery of candidate treatments for human Mendelian diseases by detailed high-throughput phenotyping of strains harboring mutations in orthologs of human disease genes. The data presented are solid and would potentially be convincing if complete data sets were to be made available to the scientific community. This approach enables the potential rapid repurposing of FDA-approved drugs to treat rare diseases for which there are currently no effective treatments.

Read the original source
eLife
Nov 27, 2023
Reviewer #1 (Public Review):

Summary:
As the scientific community identifies increasing numbers of genetic variants that cause rare human diseases, a challenge is how the field can most quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often challenging, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease gene knockout mutants.

Overall, the work constitutes an elegant combination of previously developed …
Reviewer #1 (Public Review):

Summary:
As the scientific community identifies increasing numbers of genetic variants that cause rare human diseases, a challenge is how the field can most quickly identify pharmacological interventions to address known deficits. The authors point out that defining phenotypic outcomes required for drug screen assays is often challenging, and emphasize how invertebrate models can be used for quick ID of compounds that may address genetic deficits. A major contribution of this work is to establish a framework for potential intervention drug screening based on quantitative imaging of morphology and mobility behavior, using methods that the authors show can define subtle phenotypes in a high proportion of disease gene knockout mutants.

Overall, the work constitutes an elegant combination of previously developed high-volume imaging with highly detailed quantitative phenotyping (and some paring down to specific phenotypes) to establish proof of principle on how the combined applications can contribute to screens for compounds that may address specific genetic deficits, which can suggest both mechanism and therapy.

In brief, the authors selected 25 genes for which loss of function is implicated in human neuro-muscular disease and engineered deletions in the corresponding C. elegans homologs. The authors then imaged morphological features and behaviors prior to, during, and after blue light stimuli, quantitating features, and clustering outcomes as they elegantly developed previously (PMID 35322206; 30171234; 30201839). In doing so, phenotypes in 23/25 tested mutants could be separated enough to distinguish WT from mutant and half of those with adequate robustness to permit high-throughput screens, an outcome that supports the utility of general efforts to ID phenotypes in C. elegans disease orthologs using this approach. A detailed discussion of 4 ciliopathy gene defects, and NACLN-related channelopathy mutants reveals both expected and novel phenotypes, validating the basic approach to modeling vetted targets and underscoring that quantitative imaging approaches reiterate known biology. The authors then screened a library of nearly 750 FDA-approved drugs for the capacity to shift the unc-80 NACLN channel-disrupted phenotype closer to the wild type. Top "mover" compound move outcome in the experimental outcome space; and also reveal how "side effects" can be evaluated to prioritize compounds that confer the fewest changes of other parameters away from the center.

Strengths:
Although the imaging and data analysis approaches have been reported and the screen is limited in scope and intervention exposure, it is important that the authors strongly combine individual approach elements to demonstrate how quantitative imaging phenotypes can be integrated with C. elegans genetics to accelerate the identification of potential modulators of disease (easily extendable to other goals). Generation of deletion alleles and documentation of their associated phenotypes (available in supplemental data) provide potentially useful reagents/data to the field. The capacity to identify "over-shooting" of compound applications with suggestions for scale back and to sort efficacious interventions to minimize other changes to behavioral and physical profiles is a strong contribution.

Weaknesses:
The work does not have major weaknesses, although it may be possible to expand the discussion to increase utility in the field:

Increased discussion of the challenges and limitations of the approach may enhance successful adaptation application in the field.

--It is quite possible that morphological and behavioral phenotypes have nothing to do with disease mechanisms and rather reflect secondary outcomes, such that positive hits will address "off-target" consequences.

--The deletion approach is adequately justified in the text, but the authors may make the point somewhere that screening target outcomes might be enhanced by the inclusion of engineered alleles that match the human disease condition. Their work on sod-1 alleles (PMID 35322206) might be noted in this discussion.

--Drug testing here involved a strikingly brief exposure to a compound, which holds implications for how a given drug might engage in adult animals. The authors might comment more extensively on extended treatments that include earlier life or more extended targeting. The assumption is that administering different exposure periods and durations, but if the authors are aware as to whether there are challenges associated with more prolonged applications, larger scale etc. it would be useful to note them.

More justification of the shift to only a few target parameters for judging compound effectiveness.
-In the screen in Figure 4D and text around 313, 3 selected core features of the unc-80 mutant (fraction that blue-light pause, speed, and curvature) were used to avoid the high replicate requirements to identify subtle phenotypes. Although this strategy was successful as reported in Figure 5, the pared-down approach seems a bit at odds with the emphasis on the range of features that can be compared mutant/wt with the author's powerful image analysis. Adding details about the reduced statistical power upon multiple comparisons, with a concrete example calculated, might help interested scientists better assess how to apply this tool in experimental design.

More development of the side-effect concept. The side effects analysis is interesting and potentially powerful. Prioritization of an intervention because of minimal perturbation of other phenotypes might be better documented and discussed a bit further; how reliably does the metric of low side effects correlate with drug effectiveness?
Read the original source
eLife
Nov 27, 2023

Reviewer #2 (Public Review):

Summary and strengths:

O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at the movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to making genotype-phenotype connections using C. elegans. Given the rate at which rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach …

Reviewer #2 (Public Review):

Summary and strengths:

O'Brien et al. present a compelling strategy to both understand rare disease that could have a neuronal focus and discover drugs for repurposing that can affect rare disease phenotypes. Using C. elegans, they optimize the Brown lab worm tracker and Tierpsy analysis platform to look at the movement behaviors of 25 knockout strains. These gene knockouts were chosen based on a process to identify human orthologs that could underlie rare diseases. I found the manuscript interesting and a powerful approach to making genotype-phenotype connections using C. elegans. Given the rate at which rare Mendelian diseases are found and candidate genes suggested, human geneticists need to consider orthologous approaches to understand the disease and seek treatments on a rapid time scale. This approach is one such way. Overall, I have a few minor suggestions and some specific edits.

Weaknesses:
(1) Throughout the text on figures, labels are nearly impossible to read. I had to zoom into the PDF to determine what the figure was showing. Please make text in all figures a minimum of 10-point font. Similarly, the Figure 2D point type is impossible to read. Points should be larger in all figures. Gene names should be in italics in all figures, following C. elegans convention.

(2) I have a strong bias against the second point in Figure 1A. Sequencing of trios, cohorts, or individuals NEVER identifies causal genes in the disease. This technique proposes a candidate gene. Future experiments (oftentimes in model organisms) are required to make those connections to causality. Please edit this figure and parts of the text.

(3) How were the high-confidence orthologs filtered from 767 to 543 (lines 128-131)? Also, the choice of the final list of 25 genes is not well justified. Please expand more about how these choices were made.

(4) Figures 3 and 4, why show all 8289 features? It might be easier to understand and read if only the 256 Tierpsy features were plotted in the heat maps.

(5) The unc-80 mutant screen is clever. In the feature space, it is likely better to focus on the 256 less-redundant Tierpsy features instead of just a number of features. It is unclear to me how many of these features are correlated and not providing more information. In other words, the "worsening" of less-redundant features is far more of a concern than the "worsening" of 1000 correlated features.

Read the original source
eLife
Nov 27, 2023

Reviewer #3 (Public Review):

In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

Subsequently, the study focuses on the NALCN channel …

Reviewer #3 (Public Review):

In this study, O'Brien et al. address the need for scalable and cost-effective approaches to finding lead compounds for the treatment of the growing number of Mendelian diseases. They used state-of-the-art phenotypic screening based on an established high-dimensional phenotypic analysis pipeline in the nematode C. elegans.

First, a panel of 25 C. elegans models was created by generating CRISPR/Cas9 knock-out lines for conserved human disease genes. These mutant strains underwent behavioral analysis using the group's published methodology. Clustering analysis revealed common features for genes likely operating in similar genetic pathways or biological functions. The study also presents results from a more focused examination of ciliopathy disease models.

Subsequently, the study focuses on the NALCN channel gene family, comparing the phenotypes of mutants of nca-1, unc-77, and unc-80. This initial characterization identifies three behavioral parameters that exhibit significant differences from the wild type and could serve as indicators for pharmacological modulation.

As a proof-of-concept, O'Brien et al. present a drug repurposing screen using an FDA-approved compound library, identifying two compounds capable of rescuing the behavioral phenotype in a model with UNC80 deficiency. The relatively short time and low cost associated with creating and phenotyping these strains suggest that high-throughput worm tracking could serve as a scalable approach for drug repurposing, addressing the multitude of Mendelian diseases. Interestingly, by measuring a wide range of behavioural parameters, this strategy also simultaneously reveals deleterious side effects of tested drugs that may confound the analysis.

Considering the wealth of data generated in this study regarding important human disease genes, it is regrettable that the data is not actually made accessible. This diminishes the study's utility. It would have a far greater impact if an accessible and user-friendly online interface were established to facilitate data querying and feature extraction for specific mutants. This would empower researchers to compare their findings with the extensive dataset created here. Otherwise, one is left with a very limited set of exploitable data.

Another technical limitation of the study is the use of single alleles. Large deletion alleles were generated by CRISPR/Cas9 gene editing. At first glance, this seems like a good idea because it limits the risk that background mutations, present in chemically-generated alleles, will affect behavioral parameters. However, these large deletions can also remove non-coding RNAs or other regulatory genetic elements, as found, for example, in introns. Therefore, it would be prudent to validate the behavioral effects by testing additional loss-of-function alleles produced through early stop codons or targeted deletion of key functional domains.

Read the original source
Arcadia Science
Sep 1, 2023

phonotypes

phenotypes

Read the original source
Arcadia Science
Sep 1, 2023

(1) at least two orthology prediction algorithms agree the human and worm genes are orthologs; (2) the WormBase (version WS270) (Harris et al., 2020) gene description includes either ‘neuro’ or ‘musc’ (this captures variants of neuronal, neural, muscle, muscular etc.);

I'm wondering about how varying these criteria would effect the number of/which genes were detected.

For criteria 1, what was the rationale for choosing agreement between >2 algorithms? From Fig 1C, it's hard to tell if there is a relationship between %homology and #of agreeing algorithms. What benefit do you get from using this cutoff? What are the tradeoffs? It might be helpful to include a figure similar to 1C, but including the full set of genes before filtering and to walkthrough the outcomes of different cutoffs.

Similarly, for criteria 2, what type of/how many …

(1) at least two orthology prediction algorithms agree the human and worm genes are orthologs; (2) the WormBase (version WS270) (Harris et al., 2020) gene description includes either ‘neuro’ or ‘musc’ (this captures variants of neuronal, neural, muscle, muscular etc.);

I'm wondering about how varying these criteria would effect the number of/which genes were detected.

For criteria 1, what was the rationale for choosing agreement between >2 algorithms? From Fig 1C, it's hard to tell if there is a relationship between %homology and #of agreeing algorithms. What benefit do you get from using this cutoff? What are the tradeoffs? It might be helpful to include a figure similar to 1C, but including the full set of genes before filtering and to walkthrough the outcomes of different cutoffs.

Similarly, for criteria 2, what type of/how many hits do you get if you don't select for 'neuro' or 'musc'? Is there any chance that, though you are using a behavioral readout, genes not annotated 'neuro'/'musc' might still contribute to a behavioral phenotype (e.g. via pleiotropy/epistasis)? Would be useful to include a statement of your thinking on this!

Read the original source
Version published to 10.1101/2023.08.25.554786 on bioRxiv
Aug 28, 2023

Large Language Models Enhance Molecular Diagnoses of Mendelian Disorders via A Novel Logic

This article has 15 authors:
1. Zefu Chen
2. Jihao Cai
3. Yongxin Yang
4. Sen Zhao
5. Guozhuang Li
6. Kexin Xu
7. Qing Li
8. Timothy Hospedales
9. Lina Zhao
10. Zhongmin Zhang
11. Zhihong Wu
12. Guixing Qiu
13. Terry Jianguo Zhang
14. Pengfei Liu
15. Nan Wu
This article has no evaluationsLatest version Dec 22, 2025
Pooled CRISPRi screening reveals fungal-specific vulnerabilities across environments and genetic backgrounds

This article has 11 authors:
1. Rebecca Shapiro
2. Lauren Wensing
3. Philippe Despres
4. Desiree Francis
5. Meea Fogal
6. Anthony Hendriks
7. Nicholas Gervais
8. Clara Fikry
9. Abdul-Rahman Adamu Bukari
10. Aleeza Gerstein
11. Christina Cuomo
This article has no evaluationsLatest version Jan 13, 2026
Decoding Complex Genotype-Phenotype Interactions by Discretizing the Genome

This article has 6 authors:
1. Jędrzej Kubica
2. Hetvi Jethwani
3. Krzysztof H. Banecki
4. Mauricio Moldes
5. Dariusz Plewczynski
6. Ben Busby
This article has no evaluationsLatest version Dec 17, 2025

This article has been Reviewed by the following groups

Discuss this preprint

Listed in

Abstract

Article activity feed

Related articles

Large Language Models Enhance Molecular Diagnoses of Mendelian Disorders via A Novel Logic

Pooled CRISPRi screening reveals fungal-specific vulnerabilities across environments and genetic backgrounds

Decoding Complex Genotype-Phenotype Interactions by Discretizing the Genome