Annotix: An Integrated Desktop Platform for Multi-Modal Data Annotation, Collaborative Labeling, and End-to-End Machine Learning Training

Abstract

The preparation of annotated datasets remains a critical bottleneck in the machine learning (ML) pipeline. Existing tools are fragmented across cloud-hosted services, self-hosted web applications, and lightweight desktop tools, none of which simultaneously addresses diverse annotation modalities, offline-first operation, integrated training, and serverless collaboration. We present Annotix, an open-source, cross-platform desktop application built on a Rust backend (Tauri 2) and React 19 frontend, designed to unify the entire ML data preparation workflow within a single privacy-preserving environment. To evaluate its practical utility, we conducted a controlled annotation efficiency study using 60 synthetic images (bounding box and mask tasks) annotated by three expert evaluators across Annotix, CVAT, and Label Studio, analyzed via Kruskal-Wallis with Dunn–Bonferroni post-hoc tests, and a heuristic usability evaluation over standardized tasks on real medical images (retinographies and otoscopies). Results demonstrate that Annotix achieves statistically significant annotation efficiency relative to established tools while offering substantially broader feature coverage, including 7 image annotation primitives, 19 ML training backends, ONNX-based inference-assisted labeling, and serverless P2P collaboration. Annotix provides a complete, privacy-preserving ML data preparation workflow suited to regulated domains such as medical imaging and ecological monitoring and is freely available under the MIT license.
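
For readers unfamiliar with the omnibus test named in the abstract, the following is a minimal sketch of a three-group Kruskal-Wallis comparison in plain Python. It is not the authors' analysis code: the function names are hypothetical, the statistic omits the tie correction that statistical packages usually apply, and the example timings are invented placeholders rather than data from the study. With three tools the test has 2 degrees of freedom, so the chi-square tail probability reduces to exp(-H/2). The Dunn–Bonferroni post-hoc step is not covered here.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(groups):
    """Return (H, p) for exactly three groups, without tie correction."""
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups (2 df)")
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)

    h = 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)
    p = math.exp(-h / 2)  # chi-square survival function for 2 degrees of freedom
    return h, p


# Placeholder per-image annotation times in seconds (invented, not study data).
tool_a = [21.0, 19.5, 23.2, 20.1]
tool_b = [25.4, 27.9, 24.8, 26.3]
tool_c = [24.1, 28.7, 26.9, 25.0]
h_stat, p_value = kruskal_wallis_three_groups([tool_a, tool_b, tool_c])
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

A significant omnibus result only indicates that at least one tool differs from the others; the pairwise post-hoc comparisons are what identify which tools differ.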
