Cellquant: a vibecoder’s guide to image analysis

This article has been reviewed by the following groups


Abstract

Quantitative fluorescence microscopy is central to modern cell biology, yet extracting reproducible measurements from images remains a bottleneck for biologists without programming experience. Here we present cellquant, a single-script command-line pipeline for multi-channel fluorescence images that performs cell segmentation, puncta quantification, colocalization analysis, and spatial proximity measurements. Because the interface is entirely text-based, the exact command used to generate any result can be recorded and re-executed. We validate cellquant on two biological systems. In human HCT116 cells, the pipeline quantified arsenite-induced stress granule formation. In budding yeast, simultaneous measurement of nucleolar morphology, colocalization, and spatial proximity across a temperature gradient revealed a coordinated sequence of nucleolar reorganization. Applying principal component analysis (PCA) and UMAP to the multi-parameter output of cellquant resolved a continuous cell-state transition across the temperature gradient, with condensate redistribution and nucleolar morphology defining orthogonal axes. The pipeline produces publication-ready quantification with visual quality control and statistically rigorous replicate analysis. All code, documentation, and example datasets are freely available.

Article activity feed

  1. The cellquant pipeline is distributed as a single Python script alongside an environment specification (environment.yml)

    I'm not sure there is any benefit to organizing this software as a single script rather than as a modern Python package with a modular structure and a single CLI entrypoint. Single large scripts (this one is ~2,300 lines) are hard for both humans and agents to work with: humans find them difficult to navigate, and two agents cannot work concurrently on unrelated parts of the code, since both would be editing the same file.
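    As an illustrative sketch only, the same functionality could be exposed through one entrypoint that dispatches to argparse subcommands, each implemented in its own module. The subcommand and option names below are hypothetical, not cellquant's actual interface:

    ```python
    # Hypothetical CLI entrypoint for a modular package layout; subcommand
    # and option names are invented for illustration, not cellquant's own.
    import argparse

    def build_parser():
        parser = argparse.ArgumentParser(prog="cellquant")
        sub = parser.add_subparsers(dest="command", required=True)

        seg = sub.add_parser("segment", help="run cell segmentation")
        seg.add_argument("--diameter", type=float, default=30.0,
                         help="expected cell diameter in pixels")

        pun = sub.add_parser("puncta", help="detect and quantify puncta")
        pun.add_argument("--min-size", type=int, default=3,
                         help="minimum punctum area in pixels")
        return parser
    ```

    Each subcommand's implementation would then live in its own module (e.g. a hypothetical `cellquant/segment.py`), so a human can navigate the code by file and two agents can edit unrelated analysis steps without touching the same file.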

  2. A complete command line interface (CLI) reference (CLI_REFERENCE.md) documents every argument with its default value, valid range, and interaction with cell-type presets. This reference is structured to be readable by both humans and chatbots, so that a user can paste it into a conversation and ask the AI to find the relevant parameter.

    This kind of documentation is a double-edged sword, as it creates two sources of truth that are hard to keep in sync with the code (this is true for both humans and agents, both of whom tend to forget to update the docs). I would suggest eliminating this document in favor of writing and maintaining thorough docstrings in the code itself, which humans can introspect in their IDE and which agents will "know" to read without any prompting.

  3. Distributing the pipeline as a single Python script within a minimal repository eliminates the installation complexity that often derails non-programmers before they reach the analysis itself.

    In my experience, the installation complexity referenced here usually stems from creating and managing virtual environments, and distributing this software as a single script in a GitHub repo will not eliminate that (indeed it arguably makes it worse). I would suggest that structuring this script as a Python package and distributing it via the usual channels (PyPI and conda-forge) would likely facilitate, rather than hinder, its adoption.
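    For concreteness, a minimal `pyproject.toml` under that suggestion might look like the following — the package name, dependency list, and module path are all assumptions for illustration:

    ```toml
    [build-system]
    requires = ["hatchling"]
    build-backend = "hatchling.build"

    [project]
    name = "cellquant"
    version = "0.1.0"
    description = "Multi-channel fluorescence image quantification pipeline"
    dependencies = ["numpy", "scikit-image"]

    [project.scripts]
    # installs a `cellquant` command wired to a main() in cellquant/cli.py
    cellquant = "cellquant.cli:main"
    ```

    Published to PyPI, installation would reduce to `pip install cellquant` inside any virtual environment, and a conda-forge recipe could build from the same metadata.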

  4. This is not a claim that AI can replace computational biologists. Complex pipelines, novel algorithms, and performance-critical applications will continue to require programming expertise. Rather, we argue that for standard analyses (segmentation, colocalization, puncta counting, spatial measurements), the combination of a well-designed CLI tool and an AI assistant is sufficient for a biologist to produce rigorous, reproducible quantification. The biologist’s expertise in recognizing correct segmentation and evaluating biological plausibility remains essential, complementing the AI’s ability to translate natural-language descriptions into parameter configurations.

    This is an important (and appreciated!) callout. Have you thought about the types of test questions a biologist should ask to check AI-generated results and analysis of the dataset?

  5. The pipeline does not currently support batch parallelization, processing images sequentially. For large screening datasets this may be a practical limitation, though the per-image processing time (seconds to minutes depending on image size and analysis modules) is acceptable for most experimental workflows.

    Nice work! This seems like a useful tool for making image analysis more accessible.

    I was curious whether there was a specific reason for the limitation you mention here: "The pipeline does not currently support batch parallelization, processing images sequentially."

    My first thought was that segmentation might be the main constraint (especially when using Cellpose). But it seems like one possible architecture would be to run segmentation first and save masks, then parallelize the downstream steps (puncta detection, colocalization, morphology measurements, etc.) across images or cells.

    That part of the pipeline seems like it could be relatively straightforward for Claude to parallelize with something like multiprocessing or joblib and might significantly improve throughput for larger datasets, since downstream analysis would no longer be tied to per-image segmentation.
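    A hedged sketch of that two-phase architecture — segmentation runs first and saves masks, then the independent per-image downstream analysis is fanned out over worker processes. All function names below are placeholders, not part of cellquant:

    ```python
    # Illustrative only: parallelize downstream analysis over precomputed
    # (image, mask) pairs, keeping GPU-bound segmentation out of the pool.
    from concurrent.futures import ProcessPoolExecutor

    def analyze_image(pair):
        """Per-image downstream analysis (puncta, colocalization, morphology)."""
        image_path, mask_path = pair
        # ... load the image and its precomputed mask, measure per cell ...
        return {"image": image_path, "mask": mask_path}  # placeholder result

    def run_batch(pairs, workers=4):
        # Each (image, mask) pair is independent, so process-level
        # parallelism is safe; results return as a list of per-image dicts.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(analyze_image, pairs))
    ```

    `joblib.Parallel` would work equally well here; the key design point is just that once masks are on disk, no downstream step depends on any other image.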

    Was something like this considered, or are there constraints (e.g., Cellpose behavior, I/O, memory usage) that make parallelization less practical?