BioEncoder: a metric learning toolkit for comparative organismal biology

Abstract

Deep learning (DL) has transformed biological image analysis. Tasks such as finding areas of interest within an image (segmentation) and/or sorting images into groups (classification) can now be automated with unprecedented efficiency and accuracy. However, conventional DL methods struggle with large-scale biodiversity datasets, which are characterized by unbalanced classes and small phenotypic differences between them. Here we present BioEncoder, a user-friendly toolkit for metric learning, which addresses these challenges by focusing on learning relationships between individual data points rather than on the separability of classes. This approach simplifies the training of robust image classification models, making advanced DL techniques more accessible to biologists. BioEncoder is distributed as a Python package designed for ease of use and flexibility across diverse datasets. It features taxon-agnostic data loaders, custom augmentation options, and streamlined hyperparameter adjustment through text-based configuration files. The toolkit has the potential to open new possibilities in biological image analysis, from phenomics and disease diagnosis to species identification and ecosystem monitoring. BioEncoder addresses the urgent need for toolkits that bridge the gap between complex DL methodologies and practical applications in biological research.
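
To make the idea of "learning relationships between individual data points" concrete, the sketch below computes a generic pairwise contrastive loss over a small batch of embeddings. It is an illustration only: the abstract does not say which loss BioEncoder uses, and the function names `euclidean` and `contrastive_loss`, the `margin` parameter, and the toy data are assumptions made for this example, not part of the BioEncoder package.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(embeddings, labels, margin=1.0):
    """Mean pairwise contrastive loss over every pair in a batch.

    Hypothetical helper for illustration, not BioEncoder's API.
    Same-label pairs are penalized by their squared distance;
    different-label pairs are penalized only when closer than `margin`.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two embeddings to form a pair")
    losses = []
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            losses.append(d ** 2)                      # pull together
        else:
            losses.append(max(0.0, margin - d) ** 2)   # push apart
    return sum(losses) / len(losses)


# Toy 2-D embeddings for four images from two made-up classes.
labels = ["A", "A", "B", "B"]
clustered = [(0.0, 0.0), (0.1, 0.0), (2.0, 0.0), (2.1, 0.0)]
scrambled = [(0.0, 0.0), (2.0, 0.0), (0.1, 0.0), (2.1, 0.0)]

loss_clustered = contrastive_loss(clustered, labels)
loss_scrambled = contrastive_loss(scrambled, labels)
```

An embedding in which images of the same class sit close together (`clustered`) yields a lower loss than one in which the classes are interleaved (`scrambled`). The loss never refers to a fixed set of output classes, only to distances between pairs of points, which is the property the abstract contrasts with class separability.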
