Workflow sharing with automated metadata validation and test execution to improve the reusability of published workflows

Abstract

Background

Many open-source workflow systems have made bioinformatics data analysis procedures portable. Sharing these workflows provides researchers with easy access to high-quality analysis methods without requiring computational expertise. However, published workflows are not always guaranteed to be reliably reusable. Therefore, a system is needed to lower the cost of sharing workflows in a reusable form.

Results

We introduce Yevis, a system to build a workflow registry that automatically validates and tests workflows to be published. The validation and test are based on the requirements we defined for a workflow being reusable with confidence. Yevis runs on GitHub and Zenodo and allows workflow hosting without the need of dedicated computing resources. A Yevis registry accepts workflow registration via a GitHub pull request, followed by an automatic validation and test process for the submitted workflow. As a proof of concept, we built a registry using Yevis to host workflows from a community to demonstrate how a workflow can be shared while fulfilling the defined requirements.

Conclusions

Yevis helps in building a workflow registry to share reusable workflows without requiring extensive human resources. By following Yevis’s workflow-sharing procedure, one can operate a registry while satisfying the reusable workflow criteria. This system is particularly useful to individuals or communities that want to share workflows but lack the specific technical expertise to build and maintain a workflow registry from scratch.

Article activity feed

  1. We defined the Yevis metadata file, a JSON or YAML format file that contains structured workflow metadata (Figure 2).

    This looks very approachable. I made another comment about this, but I assume this file doesn't go to Zenodo with the repository.

    Have you explored providing a "scaffold" or recommended directory/file structure for the workflow repository and auto generating this file from that? That is potentially intractable, as I don't have a handle on all the edge cases.

    Currently, are there checks to ensure that the Yevis metadata file is accurate (i.e., this file says the workflow uses an MIT license, but the actual repository says some form of CC license)? If that check doesn't exist, would it put undue burden on the maintainer of the repository?
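
    For what it's worth, a cross-check like this seems scriptable. A minimal sketch, not part of Yevis: the metadata field name and file paths are assumptions, and it relies on GitHub's public license-detection API (`GET /repos/{owner}/{repo}/license`), which does exist:

    ```python
    """Hypothetical cross-check: does the license declared in a Yevis metadata
    file match the license GitHub detects for the repository?
    The metadata field name ("license") and CLI arguments are assumptions."""

    import sys
    import requests
    import yaml  # pip install pyyaml requests


    def github_license(owner: str, repo: str) -> str:
        # GitHub's license API returns the SPDX identifier it detects
        # from the repository's LICENSE file.
        r = requests.get(f"https://api.github.com/repos/{owner}/{repo}/license")
        r.raise_for_status()
        return r.json()["license"]["spdx_id"]


    def declared_license(metadata_path: str) -> str:
        with open(metadata_path) as f:
            meta = yaml.safe_load(f)
        return meta["license"]  # assumed field name in the Yevis metadata file


    if __name__ == "__main__":
        metadata_path, owner, repo = sys.argv[1:4]
        declared = declared_license(metadata_path)
        detected = github_license(owner, repo)
        if declared != detected:
            sys.exit(f"Mismatch: metadata says {declared}, GitHub detects {detected}")
        print(f"License OK: {declared}")
    ```

    A check like this could run in the registry's CI so the burden doesn't fall on the repository maintainer.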

  2. WorkflowHub asks submitters to take responsibility for workflows: when a workflow is registered on WorkflowHub, the license and author identity should be clearly stated, encouraging them to publish FAIR workflows. However, there is no obligation as to the correctness of the workflow syntax, its executability, or testing. Not placing too many responsibilities on workflow submitters keeps obstacles to submission low, which will likely increase the diversity of public workflows on WorkflowHub; however, it will also likely increase the number of one-off submissions, which one can assume are at higher risk for the workflow problems previously described. Unlike WorkflowHub, in nf-core, the community that operates the registry holds more accountability for published workflows. Workflow submitters are required to join the nf-core community, develop workflows according to their guidelines, and prepare them for testing. These requirements enable nf-core to collect workflows with remarkable reliability. However, the community’s effort tends to focus on maintaining more generic workflows that have a large number of potential users. Consequently, nf-core states that it does not accept submissions of bespoke workflows. This is an understandable policy, as maintaining a workflow requires domain knowledge of its content, and this is difficult to maintain in the absence of the person who developed the workflow.

    An interesting parallel to this is the differences in approach to package management by PyPI and CondaForge. In general, PyPI packages are submitted by the person maintaining the package and its build infrastructure (submitters take responsibility for the packages). CondaForge packages can be submitted by anyone, but the entity accepting them is centralized and goes through a rigorous process of standardization and validation. Could be an interesting area to further explore.

  3. Sharing workflows not only increases the transparency of research but also helps researchers by facilitating the reuse of programs, thereby making data analysis procedures more efficient.

    100% agree and I love this. One thing I'd like to see as part of this is better tooling to enable developers to compose bioinformatics workflows from existing sub-workflows. The nf-core community strives for this, but the current process of using workflows from the registry is error-prone.

  4. Even if a workflow can be executed, the correctness of its operation often cannot be verified because no tests have been provided.

    This is a great point. Sadly, even with Yevis, this issue is not super easy to solve (as you note further down the paper). Without a clear set of testing principles, I would imagine either the burden of validation would fall onto the maintainer of the repository or would be super manual.

    I'd love to see if you have suggestions on what a good testing suite for a bioinformatics workflow should include or if you could include basic strategies to test bioinformatics workflows. Though, I'd understand if that's outside the scope of this paper.
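
    One baseline strategy that travels well across workflow engines is to pin a small input dataset and compare the produced outputs against recorded checksums (with the caveat that outputs embedding timestamps or non-deterministic ordering need looser checks, such as row counts or schema comparisons). A hedged sketch, with hypothetical file paths:

    ```python
    """Minimal output-regression test for a workflow run: compare checksums
    of produced files against a recorded baseline. Paths are hypothetical."""

    import hashlib
    import json
    from pathlib import Path

    BASELINE = Path("tests/expected_checksums.json")  # e.g. {"variants.vcf": "<sha256>"}
    OUTPUT_DIR = Path("results")                      # where the workflow wrote outputs


    def sha256(path: Path) -> str:
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()


    def test_outputs_match_baseline():
        expected = json.loads(BASELINE.read_text())
        for relpath, digest in expected.items():
            produced = OUTPUT_DIR / relpath
            assert produced.exists(), f"missing output: {relpath}"
            assert sha256(produced) == digest, f"checksum changed: {relpath}"
    ```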

  5. Yevis runs on GitHub and Zenodo and allows workflow hosting without the need of dedicated computing resources.

    I love your usage of Zenodo. At Arcadia Science, as part of our publishing process, we cut a release of the project GitHub repository and upload that release to Zenodo. Currently, that process is quite manual, and the Zenodo integration you built as part of the yevis-cli repository is a great example for us to follow in automating it.
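
    For reference, the deposit step is scriptable against the Zenodo REST API, which is presumably what yevis-cli wraps. A minimal sketch; the token, archive name, and metadata values are placeholders:

    ```python
    """Sketch of automating a release-to-Zenodo upload via the Zenodo deposit API.
    Requires a personal access token with the deposit:write scope; the archive
    path and metadata below are placeholders."""

    import os
    import requests

    ZENODO = "https://zenodo.org/api"
    TOKEN = os.environ["ZENODO_TOKEN"]
    ARCHIVE = "my-workflow-1.0.0.tar.gz"  # hypothetical release archive

    params = {"access_token": TOKEN}

    # 1. Create an empty deposition.
    dep = requests.post(f"{ZENODO}/deposit/depositions", params=params, json={})
    dep.raise_for_status()
    dep = dep.json()

    # 2. Upload the release archive to the deposition's file bucket.
    with open(ARCHIVE, "rb") as fh:
        requests.put(f"{dep['links']['bucket']}/{ARCHIVE}", data=fh,
                     params=params).raise_for_status()

    # 3. Attach minimal metadata, then publish to mint a DOI.
    metadata = {"metadata": {
        "title": "my-workflow 1.0.0",
        "upload_type": "software",
        "description": "Release archive of my-workflow.",
        "creators": [{"name": "Doe, Jane"}],
    }}
    requests.put(f"{ZENODO}/deposit/depositions/{dep['id']}", params=params,
                 json=metadata).raise_for_status()
    requests.post(f"{ZENODO}/deposit/depositions/{dep['id']}/actions/publish",
                  params=params).raise_for_status()
    print("Published:", dep["links"]["html"])
    ```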

  6. Yevis-cli executes a test using a GA4GH Workflow Execution Service (WES) instance, a type of web service also described as workflow as a service (17, 21); therefore, the testing materials must be written along with the specification of the WES run request.

    I've never heard of the GA4GH Workflow Execution Service (WES) before; it seems super neat. In their documentation (potentially out-of-date), it seems like they only support CWL or WDL. First, it would be awesome if it could support Nextflow, as there seems to be a convergence on Nextflow as a workflow orchestration tool in the bioinformatics field.

    Also, if WES only supports CWL or WDL, how were you able to add an nf-core workflow to your demo registry in Figure 6?
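
    A note on the language question: the WES specification itself is workflow-language-agnostic. Each run request names its language via the `workflow_type` and `workflow_type_version` form fields, and a given WES instance advertises which types it accepts through `GET /service-info`, so an nf-core workflow is testable as long as the chosen WES instance supports Nextflow. A hedged sketch of what such a test run request looks like; the endpoint URL, file names, and parameter values are placeholders:

    ```python
    """Sketch of a WES test run: submit the workflow plus test parameters,
    then poll until a terminal state. All URLs and names are placeholders."""

    import json
    import time
    import requests

    WES = "http://localhost:1122/ga4gh/wes/v1"  # placeholder WES endpoint

    fields = {
        "workflow_url": "https://example.com/raw/main/main.nf",  # placeholder
        "workflow_type": "NFL",          # must be a type listed in GET /service-info
        "workflow_type_version": "DSL2",
        "workflow_params": json.dumps({"input": "tests/small.fastq.gz"}),
    }

    # WES expects multipart/form-data; send each field as a form part.
    run = requests.post(f"{WES}/runs", files={k: (None, v) for k, v in fields.items()})
    run.raise_for_status()
    run_id = run.json()["run_id"]

    # Poll the run status; anything other than COMPLETE counts as a failed test.
    terminal = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}
    while (state := requests.get(f"{WES}/runs/{run_id}/status").json()["state"]) not in terminal:
        time.sleep(10)

    assert state == "COMPLETE", f"workflow test ended in state {state}"
    ```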

  7. Because the submission method is restricted to Yevis-cli, the submitted workflow is guaranteed to pass validation and testing.

    One thing that wasn't super clear to me from reading the "Getting Started with Yevis" document was: How is the restriction implemented? What is preventing me from forking the registry repository and opening a PR from that work?

  8. First, Yevis-cli generates a template for the Yevis metadata file, which requires the URL of the main workflow description file as an argument.

    One thing that wasn't super clear to me: is this file committed to the original workflow repository as well? Or is it uploaded to Zenodo as part of the repo upload? I think the latter would be very useful to match the Yevis YAML to the repository at a single point in time.

  9. Therefore, we are able to technically build a Yevis registry in an on-premise environment.

    This is awesome and I wanted to give it a shout-out as it may go under-appreciated. Yevis may or may not be super approachable for a massive community, but it seems super useful for a small community/organization/company. At Arcadia, we're super dedicated to sharing our workflows/tools in reproducible ways and depositing our data to FAIR repositories. So, I can imagine us using Yevis as an internal repository to make sure our tools meet our bar of quality before we share it with the outside world.

  10. Conclusions

    Reviewer name: Alban Gaignard (Report on revision 1)

    The reading of the revised paper would have been easier if the updates had been provided in a different color, but thank you for taking the comments and remarks into account and clearly answering the raised issues. I also appreciated the extension of the discussion. However, I still have some concerns regarding the proposed approach.

    The proposed platform targets both workflow sharing and testing, as explicitly stated in the abstract: "the validation and test are based on the requirements we defined for a workflow being reusable with confidence". It is clear in the paper that tests are realized through the GitHub CI infrastructure, possibly delegated to a WES workflow execution engine. However, although I inspected Figure 3 as well as the wf_params.json and wf_params.yml provided on the demo website, this does not seem to be enough to answer questions such as: how are tests specified? How can a user inspect what has been done during the testing process? What does the system evaluate to assess that a test is successful? I tried to understand what was done during the testing process, but the test logs are not available anymore (Add workflow: human-reseq: fastqSE2bam · ddbj/workflow-registry@19b7516 · GitHub).

    Regarding the findability of the workflows, in line with FAIR principles, the discussion mentions a possible solution that would consist of hosting and curating metadata in another database. To tackle workflow discoverability between multiple systems accessible on the web, we could expect the Yevis registry to expose semantic annotations, leveraging Schema.org (or any other controlled vocabulary) for instance. This would also make sense since EDAM ontology classes are referred to in the Yevis metadata file (https://ddbj.github.io/workflow-registry-browser/#/workflows/65bc3bd4-81d1-4f2a8886-1fbe19011d81/versions/1.0.0).
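
    To make the Schema.org suggestion concrete, registry browser pages could embed JSON-LD describing each workflow, for example using the (pending) schema.org ComputationalWorkflow type that Bioschemas builds on. An illustrative sketch only, with placeholder values rather than real Yevis registry fields:

    ```python
    """Illustrative only: the kind of Schema.org JSON-LD a registry page could
    embed for each workflow. All values below are placeholders."""

    import json

    jsonld = {
        "@context": "https://schema.org",
        "@type": "ComputationalWorkflow",  # pending schema.org type used by Bioschemas
        "name": "fastqSE2bam",
        "version": "1.0.0",
        "license": "https://spdx.org/licenses/Apache-2.0",  # placeholder
        "creator": [{"@type": "Person", "name": "Workflow Author"}],
        "programmingLanguage": {"@type": "ComputerLanguage",
                                "name": "Common Workflow Language"},
        "url": "https://example.org/workflows/fastqSE2bam/1.0.0",  # placeholder
        "keywords": ["read mapping"],  # plain-text or EDAM-derived keywords
    }

    # A registry page would embed this as:
    # <script type="application/ld+json"> ... </script>
    print(json.dumps(jsonld, indent=2))
    ```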

  11. analysis

    Reviewer name: Samuel Lampa

    The Yevis manuscript makes a good case for the need to be able to easily set up self-hosted workflow registries, and the work is a laudable effort. From the manuscript, the implementation decisions seem to be done in a very thoughtful way, using standardized APIs and formats where applicable (Such as WES). The manuscript itself is very well written, with a good structure, close to flawless language (see minor comment below) and clear descriptions and figures.

    Main concern

    I have one major gripe, though, blocking acceptance: the choice to only support GitHub for hosting. There is a growing problem in the research world that more and more research depends on the single commercial actor GitHub, for seemingly no other reason than convenience. Although GitHub to date can be said to have been a somewhat trustworthy player, there is no guarantee for the future, and ultimately this leaves a lot of research in an unhealthy dependence on this single platform. As a small note of a recent change, there is the proposed removal of the promise not to track its users (see https://github.com/github/site-policy/pull/582).

    Such a central infrastructure component for research as a workflow registry has an enormous responsibility here, as it may greatly influence the choices of researchers in the years to come, by encouraging whatever is "easier" or more convenient to do with the tools and infrastructure available. With this in mind, I find it unacceptable for a workflow registry supporting open science and open-source work to only support one commercial provider. The authors mention that technically they are able to support any vendor, and also on-premise setups, which sounds excellent. I ask the authors to kindly implement this functionality. Especially the ability to run on-premise registries is key to encouraging research to stay free and independent from commercial concerns.

    Minor concerns

    1. I think the manuscript is missing a citation to this key workflow review, as a recent overview of the bioinformatics workflows field, for example together with the current citation [6] in the manuscript: Wratten, L., Wilm, A., & Göke, J. (2021). Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. Nature Methods, 18(10), 1161-1168. https://www.nature.com/articles/s41592-021-01254-9
    2. Although it might not have been the intention of the authors, the following sentence sounds unnecessarily subjective and appraising, without data to back it up (rather, this would be something for the users to evaluate):

    "The Yevis system is a great solution for research communities that aim to share their workflows and wish to establish their own registry as described." I would rather expect wording similar to: "The Yevis system provides a [well-needed] solution for ...", which I think might have been closer to what the authors intended as well. Wishing the authors the best of luck with this promising work!

  12. Background

    This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giad006), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

    Reviewer name: Kyle Hernandez

    Suetake et al. designed and developed a system to publish, validate, and test public workflows utilizing existing standards and integration with modern CI/CD tools. Their design wasn't myopic: they relied heavily on their own experiences, work from GA4GH, and interaction with the large workflow development communities. They were inspired by the important work from Goble et al. that applies the FAIR standards to workflows. As someone with a long history of workflow engine development, workflow development, and workflow reusability/sharing experience, I greatly appreciate this work. There are still unsolved problems, like guidelines on how to approach writing tests for workflows, but their system sits one level above this and focuses on ways to automate the validation, testing, reviewing/governance, and publishing into a repository to greatly reduce unexpected errors from users.

    I looked through the source code of their Rust-based client, which was extremely readable and developed to industry-level standards. I followed the README to set up my own repositories, configure the keys, and deploy the services successfully on the first walk-through. That speaks to the level of skill, testing, and effort in developing this system and is great news for users interested in using it. At some level it can seem like a "proof of concept", but it is one that is also usable in production with some caveats. The concept is important, and implementing this will hopefully inspire more folks to care about this side of workflow "provenance" and reproducibility. There are so many tools out there for CI/CD that are often poorly utilized by academia, and I appreciate the authors showing how powerful they can be in this space. The current manuscript is fine and will be of great interest to a wide-ranging set of readers; I only have some non-binding suggestions/thoughts that could improve the paper for readers:

    1. Based on your survey of existing systems, could you possibly make a figure or table that showcases the features supported/not supported by these different systems, including yours?
    2. Thoughts on security/cost safeguards? Perhaps beyond the scope, but it does seem like a governing group needs to define some limits to the testing resources and be able to enforce them. If I am a bad actor and programmatically open up 1000 PRs of expensive jobs, I'm not sure what would happen. Actions and artifact storage aren't necessarily free after some limit.
    3. What is the flow for simply updating to a new version of an existing workflow? (perhaps this could be in your docs, not necessarily this manuscript).
    4. CWL is an example of a workflow language that developers can extend to create custom "hints" or "requirements". For example, Seven Bridges does this in Cavatica, where a user can define AWS spot instance configs, etc. WDL has properties to configure GCP images. It seems like in these cases, tests should only be defined to work when running "locally" (not with some scheduler or specific cloud environment). But the authors do mention that tests will first run locally in the user's environment, so that does kind of get around this.
    5. For the "findable" part of FAIR, how possible is it to have "tags" of sort associated with a wf record so things can be more findable? I imagine when there is a large repository of many workflows, being able to easily narrow down to the specific domain interest you have could be helpful.