A motif for domain-specific analysis applets that are easy to learn, reuse, test, and to compose into pipelines: application to vision science


Abstract

Scientific progress depends on the analysis of primary data, yet the small, domain-specific programs that perform most scientific analyses are typically poorly documented, narrowly tested, and difficult to reuse outside the lab that created them. General-purpose pipeline tools address the problem of running steps in order but do not enforce documentation, testing, or standardized outputs. We describe a motif for building domain-specific analysis applets, which we call calculators, that constrains developer choices in order to produce code that is readable, tested, and reusable almost as a byproduct of following the template. Calculators operate on a typed, searchable database of documents, eliminating the need to explicitly wire inputs and outputs together; instead, each calculator searches the database for documents it can operate on and adds its results as new typed documents. Calculators must provide documentation in a standard location, self-tests that can be run and inspected interactively, adjustable input parameters, a single well-defined output document type, and a default plotting method. Sets of calculators compose naturally into pipelines whose outputs satisfy FAIR principles at every stage. We demonstrate the motif by implementing calculators for common analyses in vision science, including orientation and direction selectivity, contrast tuning, spatial and temporal frequency tuning, speed tuning, and Hartley reverse correlation. These calculators have been used in published work and are in active use across collaborating laboratories. We discuss the design principles of the motif, its advantages and limitations, and its applicability to domain-specific computation across neuroscience and beyond.
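The search-and-append behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Document`, `Database`, `Calculator`, `OrientationSelectivity`, and `SelectivityLabel`, the document type strings, and the `threshold` parameter are all hypothetical. The selectivity index used here (1 minus circular variance in orientation space) is one common choice and is assumed, not taken from the article. The default plotting method is left out to keep the sketch dependency-free.

```python
import cmath
import math
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Document:
    """A typed document (hypothetical schema for this sketch)."""
    doc_type: str
    data: dict
    depends_on: Optional[int] = None  # id of the input document, if any
    doc_id: int = -1                  # assigned by the database on insert


class Database:
    """Toy in-memory stand-in for a typed, searchable document database."""
    def __init__(self) -> None:
        self._docs: list = []

    def add(self, doc: Document) -> Document:
        doc.doc_id = len(self._docs)
        self._docs.append(doc)
        return doc

    def search(self, doc_type: str) -> list:
        return [d for d in self._docs if d.doc_type == doc_type]


class Calculator:
    """Template: subclasses document themselves in this docstring and
    declare one input type, one output type, and default parameters."""
    input_type = ""
    output_type = ""
    default_parameters: dict = {}

    def __init__(self, **parameters: Any) -> None:
        self.parameters = {**self.default_parameters, **parameters}

    def calculate(self, doc: Document) -> dict:
        raise NotImplementedError

    def self_test(self) -> None:
        raise NotImplementedError

    def run(self, db: Database) -> list:
        """Find inputs by type, skip ones already processed, add outputs."""
        done = {d.depends_on for d in db.search(self.output_type)}
        return [
            db.add(Document(self.output_type, self.calculate(doc), doc.doc_id))
            for doc in db.search(self.input_type)
            if doc.doc_id not in done
        ]


class OrientationSelectivity(Calculator):
    """Computes 1 - circular variance from an orientation tuning curve."""
    input_type = "tuning_curve"
    output_type = "orientation_selectivity"

    def calculate(self, doc: Document) -> dict:
        angles, responses = doc.data["angles_deg"], doc.data["responses"]
        total = sum(responses)
        if total <= 0:
            return {"index": 0.0}  # assumption: no response means no tuning
        # Angles are doubled because orientation is periodic over 180 degrees.
        vec = sum(r * cmath.exp(2j * math.radians(a))
                  for r, a in zip(responses, angles))
        return {"index": abs(vec) / total}

    def self_test(self) -> None:
        angles = [0, 45, 90, 135]
        flat = Document("tuning_curve", {"angles_deg": angles, "responses": [1.0] * 4})
        sharp = Document("tuning_curve", {"angles_deg": angles, "responses": [1.0, 0.0, 0.0, 0.0]})
        assert math.isclose(self.calculate(flat)["index"], 0.0, abs_tol=1e-9)
        assert math.isclose(self.calculate(sharp)["index"], 1.0)


class SelectivityLabel(Calculator):
    """Labels a cell as selective if its index reaches `threshold`."""
    input_type = "orientation_selectivity"
    output_type = "selectivity_label"
    default_parameters = {"threshold": 0.5}

    def calculate(self, doc: Document) -> dict:
        return {"selective": doc.data["index"] >= self.parameters["threshold"]}

    def self_test(self) -> None:
        t = self.parameters["threshold"]
        assert self.calculate(Document(self.input_type, {"index": t}))["selective"]
        assert not self.calculate(Document(self.input_type, {"index": t - 0.1}))["selective"]


# A pipeline is just a list of calculators run against the same database.
db = Database()
db.add(Document("tuning_curve", {"angles_deg": [0, 45, 90, 135],
                                 "responses": [1.0, 0.0, 0.0, 0.0]}))
pipeline = [OrientationSelectivity(), SelectivityLabel(threshold=0.5)]
for calc in pipeline:
    calc.self_test()
    calc.run(db)
```

In this sketch the second calculator is never told about the first. It finds the first calculator's output by document type alone, which is the sense in which inputs and outputs need no explicit wiring. Running the pipeline a second time adds nothing, because each calculator skips inputs for which an output document already exists.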

Significance Statement

Scientists often must write small programs to analyze their own data. These programs are usually poorly documented, lightly tested, and hard for other labs to reuse. Mistakes in this kind of code have even caused well-known papers to be retracted. We describe a simple pattern for writing these programs, which we call a calculator. The pattern requires the programmer to include clear documentation, built-in tests, adjustable settings, and a standard form of output. Calculators work by searching a shared database for data they know how to handle, so many calculators can be chained together into a pipeline without extra setup. We show how this works by building calculators for common visual neuroscience analyses that other labs are already using.
