A motif for domain-specific analysis applets that are easy to learn, reuse, test, and to compose into pipelines: application to vision science

Avraham A. Lepsky
Madeline K. Severson
Ruyang Wang
Xinyu Cheng
Ricardo L. Rodriguez
Rui Gong
Stephen D. Van Hooser

Abstract

Scientific progress depends on the analysis of primary data, yet the small, domain-specific programs that perform most scientific analyses are typically poorly documented, narrowly tested, and difficult to reuse outside the lab that created them. General-purpose pipeline tools address the problem of running steps in order but do not enforce documentation, testing, or standardized outputs. We describe a motif for building domain-specific analysis applets, which we call calculators, that constrains developer choices in order to produce code that is readable, tested, and reusable almost as a byproduct of following the template. Calculators operate on a typed, searchable database of documents, eliminating the need to explicitly wire inputs and outputs together; instead, each calculator searches the database for documents it can operate on and adds its results as new typed documents. Calculators must provide documentation in a standard location, self-tests that can be run and inspected interactively, adjustable input parameters, a single well-defined output document type, and a default plotting method. Sets of calculators compose naturally into pipelines whose outputs satisfy FAIR principles at every stage. We demonstrate the motif by implementing calculators for common analyses in vision science, including orientation and direction selectivity, contrast tuning, spatial and temporal frequency tuning, speed tuning, and Hartley reverse correlation. These calculators have been used in published work and are in active use across collaborating laboratories. We discuss the design principles of the motif, its advantages and limitations, and its applicability to domain-specific computation across neuroscience and beyond.
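
To make the motif concrete, the following is a minimal Python sketch of the pattern the abstract describes. It is not the authors' implementation: every name in it (`Document`, `Database`, `Calculator`, `OrientationIndex`, `SelectivityLabel`, the document type strings, and the `threshold` parameter) is hypothetical and chosen only for illustration. The orientation index used here is one common definition, 1 minus circular variance in orientation space; the preprint may use a different measure.

```python
import cmath
import math
from dataclasses import dataclass

@dataclass
class Document:
    doc_type: str            # the "type" that calculators search on
    data: dict
    depends_on: tuple = ()   # ids of the input documents (provenance)
    doc_id: int = -1

class Database:
    """Hypothetical in-memory stand-in for a typed, searchable document store."""
    def __init__(self):
        self._docs = []
    def add(self, doc):
        doc.doc_id = len(self._docs)
        self._docs.append(doc)
        return doc
    def search(self, doc_type):
        return [d for d in self._docs if d.doc_type == doc_type]

class Calculator:
    """Template: subclasses supply a docstring, types, parameters, and a self-test."""
    input_type = ""
    output_type = ""
    default_parameters = {}
    def calculate(self, doc, params):
        raise NotImplementedError
    def self_test(self):
        raise NotImplementedError
    def plot(self, doc):
        # Default "plot" is a text rendering so the sketch has no dependencies.
        return f"{doc.doc_type} #{doc.doc_id}: {doc.data}"
    def run(self, db, **overrides):
        params = {**self.default_parameters, **overrides}
        # Skip inputs that already have an output (simplification: ignores params).
        done = {d.depends_on for d in db.search(self.output_type)}
        new_docs = []
        for doc in db.search(self.input_type):
            if (doc.doc_id,) in done:
                continue
            out = Document(self.output_type, self.calculate(doc, params), (doc.doc_id,))
            new_docs.append(db.add(out))
        return new_docs

class OrientationIndex(Calculator):
    """Computes 1 - circular variance from a 'tuning_curve' document."""
    input_type = "tuning_curve"
    output_type = "orientation_index"
    def calculate(self, doc, params):
        angles, resp = doc.data["angles_deg"], doc.data["responses"]
        # Angles are doubled because orientation is periodic over 180 degrees.
        vec = sum(r * cmath.exp(2j * math.radians(a)) for a, r in zip(angles, resp))
        return {"index": abs(vec) / sum(resp)}
    def self_test(self):
        angles = [0, 45, 90, 135]
        flat = Document("tuning_curve", {"angles_deg": angles, "responses": [1, 1, 1, 1]})
        sharp = Document("tuning_curve", {"angles_deg": angles, "responses": [1, 0, 0, 0]})
        assert abs(self.calculate(flat, {})["index"]) < 1e-9
        assert abs(self.calculate(sharp, {})["index"] - 1.0) < 1e-9

class SelectivityLabel(Calculator):
    """Labels an 'orientation_index' document using an adjustable threshold."""
    input_type = "orientation_index"
    output_type = "selectivity_label"
    default_parameters = {"threshold": 0.5}
    def calculate(self, doc, params):
        selective = doc.data["index"] >= params["threshold"]
        return {"label": "selective" if selective else "unselective"}
    def self_test(self):
        doc = Document("orientation_index", {"index": 0.9})
        assert self.calculate(doc, self.default_parameters)["label"] == "selective"

def demo():
    db = Database()
    db.add(Document("tuning_curve",
                    {"angles_deg": [0, 45, 90, 135], "responses": [4, 0, 0, 0]}))
    # A pipeline is just an ordered list; no explicit input/output wiring is needed.
    for calc in (OrientationIndex(), SelectivityLabel()):
        calc.self_test()
        calc.run(db)
    return db
```

In this sketch the second calculator never refers to the first. It only searches for documents of type `orientation_index`, which is how the two compose into a pipeline without extra setup. Calling `run` again on the same database adds nothing, because each output document records the input it was derived from.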

Significance Statement

Scientists often must write small programs to analyze their own data. These programs are usually poorly documented, lightly tested, and hard for other labs to reuse. Mistakes in this kind of code have even caused well-known papers to be retracted. We describe a simple pattern for writing these programs, which we call a calculator. The pattern requires the programmer to include clear documentation, built-in tests, adjustable settings, and a standard form of output. Calculators work by searching a shared database for data they know how to handle, so many calculators can be chained together into a pipeline without extra setup. We show how this works by building calculators for common visual neuroscience analyses that other labs are already using.

Version published to 10.64898/2026.04.27.721136 on bioRxiv
Apr 30, 2026
