LAMPP: A benchmark for continuous evaluation of host phenotype prediction from shotgun metagenomic data
Abstract
Background
Predicting host phenotypes from shotgun metagenomic data is essential for translating microbiome research into clinical practice. Despite the development of numerous computational tools for this task, researchers often default to traditional machine learning methods such as Random Forest. This hesitancy to adopt newer methods stems from their complexity as well as the lack of standardized evaluations, as most tools are assessed on different datasets and compared against a limited set of methods.
Results
Here, we introduce LAMPP, a standardized benchmark for evaluating host phenotype prediction methods using gut metagenomic data. LAMPP features a diverse range of prediction tasks and enables consistent, comparative assessments across prediction tools. It is available for ongoing benchmarking at https://lampp.yassourlab.com/
Conclusions
Our systematic evaluation of existing tools shows that classic machine learning methods (e.g., Random Forest) perform competitively, offering both ease of use and state-of-the-art results. At the same time, it demonstrates that microbiome-based phenotype prediction remains a challenging problem. By providing a consistent platform for ongoing evaluation, LAMPP motivates the development of innovative tools that perform beyond the current state of the art.