An accessible infrastructure for artificial intelligence using a Docker-based JupyterLab in Galaxy

This article has been reviewed by the following groups


Abstract

Artificial intelligence (AI) programs that train on large amounts of data require powerful compute infrastructure. The JupyterLab notebook provides an excellent framework for developing AI programs, but it needs to be hosted on such an infrastructure to enable AI programs to train on large datasets. An open-source, Docker-based, and GPU-enabled JupyterLab notebook infrastructure has been developed that runs on the public compute infrastructure of Galaxy Europe for rapid prototyping and developing end-to-end AI projects. Using such a notebook, long-running AI model training programs can be executed remotely. Trained models, represented in the standard Open Neural Network Exchange (ONNX) format, and other resulting datasets are created in Galaxy. Other features include GPU support for faster training, Git integration for version control, the option of creating and executing pipelines of notebooks, and the availability of multiple dashboards for monitoring compute resources. These features make the JupyterLab notebook highly suitable for creating and managing AI projects. A recent scientific publication that predicts infected regions in COVID-19 CT scan images is reproduced using multiple features of this notebook. In addition, ColabFold, a faster implementation of AlphaFold2, can also be accessed in this notebook to predict the 3D structure of protein sequences. The JupyterLab notebook is accessible in two ways: first as an interactive Galaxy tool, and second by running the underlying Docker container. In both cases, long-running training can be executed on Galaxy's compute infrastructure. The scripts to create the Docker container are available under the MIT license at https://github.com/anuprulez/ml-jupyter-notebook .

Contact

kumara@informatik.uni-freiburg.de

anup.rulez@gmail.com

Article activity feed

  1. Abstract

    Reviewer 2: Milot Mirdita

    Kumar et al. present a Docker-based integration of Jupyter Notebooks in the Galaxy workflow system that can utilize GPUs. This notebook is also available in the Galaxy Europe instance.

    I was able to create a Galaxy Europe account, find the newly introduced Galaxy tool, and submit a job. However, it remained stuck with the message "This job is waiting to run" and the job info "Stopped" for multiple hours. I was able to download the Docker image and run it on a local server with multiple NVIDIA GPUs. This resulted in a running JupyterLab; however, running the GPU-based examples resulted in driver mismatch errors/warnings (pynvml.nvml.NVMLError_LibRmVersionMismatch: RM has detected an NVML/RM version mismatch; kernel version 470.141.3 does not match DSO version 515.65.1 -- cannot find working devices in this configuration). Thus, the examples ran on CPU only. I did not try to resolve this issue and only repeated some examples.
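    For readers attempting the same local run, a minimal sketch of the commands involved, assuming the NVIDIA Container Toolkit is installed on the host (the image name and tag below are illustrative placeholders, not the authors' published ones):

    ```shell
    # Pull the image and start JupyterLab with GPU passthrough
    # (image name/tag are placeholders).
    docker pull anuprulez/ml-jupyter-notebook:latest
    docker run --rm --gpus all -p 8888:8888 anuprulez/ml-jupyter-notebook:latest

    # Sanity check from inside the container: the kernel driver and the
    # NVML userspace library must report matching versions, otherwise
    # frameworks fall back to CPU as described above.
    nvidia-smi
    ```

    The version-mismatch error quoted above typically means the host kernel driver is older than the NVML library shipped in the container; updating the host driver, or using a container built against the host's CUDA version, usually resolves it.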

    The authors show two use-cases for the GPU Jupyter Docker and provide a step-by-step tutorial for usage on Galaxy Europe. Shipping machine learning applications that utilize GPUs as Jupyter Notebooks has become popular recently and supporting these through well-known and freely accessible Galaxy servers, such as Galaxy Europe, would be of clear benefit to users. Additionally, it would be very valuable for method developers like me to easily deploy GPU-based methods to Galaxy servers.

    Major:

    • As mentioned before, I had issues getting a running Jupyter Lab on the Galaxy Europe server. Is this due to a limited number of GPUs or was this due to an error?
    • Our ColabFold Multiple Sequence Alignment server currently processes about 10-20k MSAs per day. We do not know how many of these are running on Google Colab or on users' local machines. However, a substantial number of predictions are running inside Google Colab. The authors claim that Google Colab's and Kaggle's resources are scarce. However, generally, users (with either free or pro accounts) are given an instance nearly immediately on Colab. I recognize that it is extremely difficult to compete with these commercial platform providers. However, providing a long-term, freely available, and securely funded platform with ML accelerators would be extremely beneficial for the whole community. I would like to see a discussion on what GPU resources are currently available to users of Galaxy Europe (and the whole Galaxy Project) and what plans exist to expand these in the future.
    • The size of the docker container (compressed ~10GB, uncompressed ~22GB) seems difficult to sustain. Both keeping up an up-to-date Docker image and ensuring the availability of older images for reproducibility looks difficult to me, especially with such fast moving dependencies such as machine learning frameworks. How do the authors plan to deal with this issue?
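    One way to hedge against tag drift and image deletion, sketched here with an illustrative image name, is to record and pull images by content digest rather than by mutable tag:

    ```shell
    # After pulling, record the immutable content digest of the image
    # (image name is a placeholder):
    docker inspect --format '{{index .RepoDigests 0}}' anuprulez/ml-jupyter-notebook:latest

    # Reproduce later by pulling that exact build, even if the tag has
    # since been rebuilt or removed (digest value is a placeholder):
    docker pull anuprulez/ml-jupyter-notebook@sha256:<digest>
    ```

    Publishing the digest alongside each release, or mirroring images to an archival registry, would complement whatever versioning strategy the authors choose.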

    Minor:

    • Please highlight the tutorial (https://training.galaxyproject.org/training-material/topics/statistics/tutorials/gpu_jupyter_lab/tutorial.html) on GitHub and inside the container readme (home_page.ipynb). It is very easy to overlook. I also nearly overlooked the example notebook repository (https://github.com/anuprulez/gpu_jupyterlab_ct_image_segmentation). I found it confusing that I could not find the two shown example use-cases inside the Docker container. I only later figured out that I had to clone the example repository into the running container.
    • The manuscript highlights various workflow methods (Elyra, Kubeflow, Airflow); however, it needs to clarify how the Galaxy workflow integration works. I saw that it is possible to pass the output of another Galaxy tool as input to this tool. I would appreciate a tutorial on how to make the GPU Jupyter Docker part of a Galaxy workflow with multiple tools running. I think the above-mentioned tutorials can be expanded to show how the output can be given to the next tool.
    • Docker Hub has introduced many business-model changes, such as deleting container images that are rarely used, which poses a challenge for reproducibility. I know that Dr Grüning is involved in the BioContainers project. I would recommend investigating whether it is possible to combine these efforts to make this GPU container and derived containers available long-term.
    • The Docker container is explicitly running as a root user, while the manuscript highlights the security benefits of Docker. The cited report by Baset et al. highlights the security benefits and the many security challenges that Docker containers pose. I suggest checking what security best practices for Docker containers are possible to implement, while still allowing GPUs to be exposed to users.
    • I recommend revising the manuscript for conciseness, with an additional focus on capitalization of words.
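    On the root-user point, a minimal Dockerfile sketch of running the notebook as an unprivileged user (base image, names, and IDs are illustrative): GPU access is granted by the NVIDIA runtime via `docker run --gpus all` and does not require root inside the container.

    ```dockerfile
    # Illustrative base image; match it to the host's CUDA/driver setup.
    FROM nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04

    # Create an unprivileged user instead of running as root.
    RUN groupadd --gid 1000 jupyter \
        && useradd --uid 1000 --gid 1000 --create-home jupyter

    USER jupyter
    WORKDIR /home/jupyter
    # ... install JupyterLab and launch it as this user ...
    ```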
  2. Abstract

    This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giad028), which carries out open, named peer review. These reviews are published under a CC-BY 4.0 license and were as follows:

    Reviewer 1: Philippe Boileau

    This manuscript introduces the new Docker-based JupyterLab framework in Galaxy, describing its core components and demonstrating its use in the reproduction of two analyses. The proposed framework is also thoroughly compared to competitors, like Google's Colab and Amazon's SageMaker. This tool is bound to have an impact on the life sciences: it democratizes computational analyses and facilitates reproducibility. I thank the authors for their important work. However, I think that this technical note should be reviewed for grammatical errors and faulty punctuation. I've identified some such issues in the comments below but wasn't able to address all of them. Included in the comments are other remarks which, if addressed, could strengthen some key takeaways.

    • The first sentence of the abstract states that AI programs require "powerful compute infrastructure" when applied to large datasets. I think readers would like to know how you qualify an infrastructure as "powerful". A brief definition could be included in the second sentence instead of repeating ". . . hosted on a powerful infrastructure . . . ".
    • Is it "JupyterLab" or "jupyterlab notebook"? The Project Jupyter site seems to use the former. Based on the documentation, JupyterLab is a web-based user interface that can open Jupyter notebooks (.ipynb files).
    • The statement "Artificial intelligence (AI) approaches such as machine learning (ML) and deep learning (DL) . . . " implies that ML and DL are distinct aspects of AI. This distinction is insinuated throughout the rest of the text. Isn't DL a subset of ML? I suggest replacing "ML and DL algorithms" with "ML algorithms" and specifying "DL algorithms" only as needed.
    • I believe there's a missing comma between "ecosystems" and "enabling" in the first sentence of the Docker container section.
    • Consider reformatting "A container runs . . . of the running software." to "A container runs an isolated environment with minimal interactions between it and the host OS. Running software in a container is more secure."
    • Related to the suggestion above: can you explain why this increased security is necessary? An example might help emphasize the importance of a secure container.
    • I think "Docker container inherits . . . " should be "The Docker container inherits . . . ". The same goes for "Docker container is decoupled . . . ".
    • Consider reformatting "Moreover, it can easily be extended by installing suitable packages only by adding their appropriate package names in its dockerfile." to "Moreover, the Docker container is easily extended: additional software packages can be installed by adding their names to the dockerfile."
    • Consider replacing "some of the popular ones are" with "including".
    • I believe there's an unneeded comma between ". . . platform for both" and "rapid prototyping. . . ".
    • I believe a word is missing in the last sentence of the Features of jupyterlab and notebook infrastructure section: ". . . an H5 file."
    • "google" and "amazon" should be capitalized.
    • Consider removing "and non-ideal" from the Related infrastructure section.
    • I believe the comma in ". . . but they come at a price, . . . " should be replaced by a colon.
    • I believe there's a missing comma between ". . . free of charge" and "similar to colab . . . ".
    • Why is sharing a session's resources across multiple notebooks more useful than operating each notebook in a separate session? Isn't the latter preferable when a notebook causes a session to crash?
    • "deep learning" in the Implementation section should be replaced by "DL" for consistency.
    • I think that readers would find a link to your tool on Galaxy Europe useful: https://usegalaxy.eu/root?tool_id=interactive_tool_ml_jupyter_notebook. The same is true for your tutorial: I think readers would find a URL in the text more easily than in the references. However, the tool failed to execute on usegalaxy.eu with the following error message: "This tool is restricted to authorized users". I was unable to follow the tutorial. Was this a one-off issue with the Galaxy servers?