Autoprot: Processing, Analysis and Visualization of Proteomics Data in Python

Julian Bender
Wignand W. D. Mühlhäuser
Johannes P. Zimmerman
Friedel Drepper
Bettina Warscheid

This article has been Reviewed by the following groups

Listed in

Evaluated articles (Arcadia Science)

Abstract

The increasing number of complex quantitative mass spectrometry-based proteomics data sets demands a standardised and reliable analysis pipeline. For this purpose, Python-based analysis, particularly through Jupyter notebooks, serves as a simple yet powerful tool. Nevertheless, the availability of Python software for standardised and accessible MS data analysis is limited, and this software is often constrained to using analysis functions written in Python. This excludes existing and well-tested software, for example written in R. Despite this, Python offers several interactive data visualisation modules that greatly enhance exploratory research and facilitate result communication with collaboration partners. Consequently, there is a need for an integrated and Jupyter-compatible Python analysis pipeline that incorporates R algorithms and interactive visualisation for proteomics data analysis.

Summary

We developed autoprot, a Python module for simplified analysis of quantitative mass spectrometry-based proteomics experiments processed with the MaxQuant software. It provides access to established functions written in both Python and R for statistical testing and data transformation. Moreover, it generates JavaScript-based interactive plots that can be integrated into interactive web applications. Thereby, autoprot offers standardised, fast and reliable proteomics data analysis while maintaining the high customisability required to tailor the analysis pipeline to specific experiments.

Availability and Implementation

Autoprot is implemented in Python ≥ 3.9 and can be downloaded from https://github.com/ag-warscheid/autoprot . Online documentation is available at https://ag-warscheid.github.io/autoprot/ .

Arcadia Science
Jul 22, 2024

Thanks for the remark. You are correct: in its current form, autoprot requires a local R installation for installing and running R code. Autoprot is used from within a Python installation but calls several well-established R packages for statistical analysis (through subprocess). This was a design choice to avoid re-implementing and maintaining (long-term) a complete set of statistical packages, which is beyond what we can do as an academic lab.

I agree that installing a Python-only version from pip would be a nice option if the R-based statistics are not needed. We will look into implementing this.

Regarding the dependencies, please see my reply to your second comment below.
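
The subprocess-based hand-off to R described in this reply can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not autoprot's actual code: the function names, the script name `hypothetical_test.R`, and the convention of passing input and output CSV paths as positional arguments are all invented for the example. It also shows how a missing R installation could be detected up front, which is the kind of check a Python-only install would need.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional, Sequence


def find_rscript() -> Optional[str]:
    """Return the path to the Rscript executable, or None if R is not on PATH."""
    return shutil.which("Rscript")


def build_r_command(
    rscript: str,
    script: Path,
    input_csv: Path,
    output_csv: Path,
    extra_args: Sequence[str] = (),
) -> list[str]:
    """Assemble the argument list for one R call (hypothetical calling convention)."""
    # --vanilla keeps the R session free of user profiles and saved workspaces,
    # so the call behaves the same regardless of the user's R configuration.
    return [rscript, "--vanilla", str(script), str(input_csv), str(output_csv), *extra_args]


def run_r_script(
    script: Path,
    input_csv: Path,
    output_csv: Path,
    extra_args: Sequence[str] = (),
) -> subprocess.CompletedProcess:
    """Run an R script in a subprocess and return the completed process."""
    rscript = find_rscript()
    if rscript is None:
        # A Python-only installation would end up here for R-based statistics.
        raise RuntimeError("Rscript not found on PATH; R-based statistics are unavailable")
    cmd = build_r_command(rscript, script, input_csv, output_csv, extra_args)
    # check=True raises CalledProcessError if R exits with a non-zero status.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# Building the command does not require R to be installed:
example_cmd = build_r_command(
    "Rscript", Path("hypothetical_test.R"), Path("in.csv"), Path("out.csv"), ["paired"]
)
```

In this sketch the Python side writes the data to a file, R reads it, and the result comes back as a second file, so the two languages only share file paths and command-line arguments.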

Arcadia Science
Jul 22, 2024

These tests are available in Python, with autoprot working as a shim that hands the arguments down to R (via subprocess). This indeed complicates maintaining an identical environment across different systems, especially across different operating systems. As a first step, autoprot automatically saves lists of the Python and R environments (with version numbers) during a run, which simplifies installing matching versions on a second system. In the long run, we aim to provide a containerized version of autoprot including a defined set of R packages.
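
Saving the Python and R environments with version numbers, as mentioned in this reply, could look something like the sketch below. It is an assumption-laden illustration rather than autoprot's implementation: the function names and the output file names `python_environment.txt` and `r_environment.txt` are hypothetical. It uses the standard-library `importlib.metadata` for the Python side and R's `installed.packages()` for the R side.

```python
import shutil
import subprocess
import sys
from importlib import metadata
from pathlib import Path
from typing import Optional

# R expression that prints one "package==version" line per installed R package.
R_EXPR = (
    'ip <- installed.packages(); '
    'cat(paste0(ip[, "Package"], "==", ip[, "Version"]), sep = "\\n")'
)


def python_environment() -> list[str]:
    """List installed Python distributions as sorted 'name==version' strings."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


def r_environment() -> Optional[list[str]]:
    """List installed R packages, or return None if R is unavailable or fails."""
    rscript = shutil.which("Rscript")
    if rscript is None:
        return None
    try:
        result = subprocess.run(
            [rscript, "-e", R_EXPR], capture_output=True, text=True, timeout=120
        )
    except (subprocess.SubprocessError, OSError):
        return None
    if result.returncode != 0:
        return None
    return sorted(line for line in result.stdout.splitlines() if line)


def write_snapshot(directory: Path, include_r: bool = True) -> dict[str, Path]:
    """Write the environment lists to text files (hypothetical file names)."""
    directory.mkdir(parents=True, exist_ok=True)
    written = {}
    py_file = directory / "python_environment.txt"
    header = f"# Python {sys.version.split()[0]}"
    py_file.write_text("\n".join([header, *python_environment()]) + "\n", encoding="utf-8")
    written["python"] = py_file
    r_pins = r_environment() if include_r else None
    if r_pins is not None:
        r_file = directory / "r_environment.txt"
        r_file.write_text("\n".join(r_pins) + "\n", encoding="utf-8")
        written["r"] = r_file
    return written
```

Calling `write_snapshot(Path("run_output"))` at the end of an analysis would leave two plain-text lists next to the results, which can then be compared against the package versions on a second system.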

Arcadia Science
Jul 12, 2024

Advanced statistical R algorithms are invoked through a dedicated R installation

From the figure below, it looks like these statistical tests are readily available in Python, or would be easy to write functions for. From an installation and maintenance standpoint, code that depends on two different languages is difficult to maintain.

Arcadia Science
Jul 12, 2024

It provides access to established functions written in both Python and R for statistical testing and data transformation.

From the way this preprint is written, it sounds like this piece of software is a Python package, but one that needs both Python and R installed to work? It would probably be best to either write the entire package in one language or keep the Python and R parts separate, so that each can be installed with pip or from CRAN/devtools depending on the language. Otherwise, keeping up with dependencies for both languages down the line could be difficult.

Version published to 10.1101/2024.01.18.571429v1 on bioRxiv
Jan 23, 2024

Related articles

Facilitating Analysis and Dissemination of Proteomics data through Metadata Integration in MaxQuant

This article has 5 authors:
1. Walter Viegener
2. Shamil Urazbakhtin
3. Daniela Ferretti
4. Jürgen Cox
5. Jinqiu Xiao
This article has no evaluations
Latest version Jun 25, 2025

PMScanR: an R package for the large-scale identification, analysis, and visualization of protein motifs

This article has 5 authors:
1. Jan Pawel Jastrzebski
2. Monika Gawronska
3. Wiktor Babis
4. Miriana Quaranta
5. Damian Czopek
This article has no evaluations
Latest version May 27, 2025

Click-qPCR: an ultra-simple tool for interactive qPCR data analysis

This article has 2 authors:
1. Azusa Kubota
2. Atsushi Tajima
This article has no evaluations
Latest version May 31, 2025
