Facilitating analysis of open neurophysiology data on the DANDI Archive using large language model tools

Jeremy F. Magland
Ryan Ly
Oliver Rübel
Benjamin Dichter

Abstract

The DANDI Archive is a key resource for sharing open neurophysiology data, hosting over 400 datasets in the Neurodata Without Borders (NWB) format. While these datasets hold tremendous potential for reanalysis and discovery, many researchers face barriers to reuse, including unfamiliarity with access methods and difficulty identifying relevant content. Here we introduce an AI-powered, agentic chat assistant and a notebook generation pipeline. The chat assistant serves as an interactive tool for exploring DANDI datasets. It leverages large language models (LLMs) and integrates with agentic tools to guide users through data access, visualization, and preliminary analysis. The notebook generator analyzes dataset structure with minimal human input, executing inspection scripts and generating visualizations. It then produces an instructional Python notebook tailored to the dataset. We applied this system to 12 recent datasets. Review by neurophysiology data specialists found the generated notebooks to be generally accurate and well-structured, with most notebooks rated as “very helpful.” This work demonstrates how AI can support FAIR principles by lowering barriers to data reuse and engagement.
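
The last step the abstract describes, producing an instructional notebook from what was learned about a dataset, can be illustrated with a minimal sketch. This is not the authors' implementation. It only shows how headings and code snippets might be assembled into a Jupyter notebook file using the standard library. The function names, the Dandiset ID `000000`, the title, and the cell contents are all hypothetical placeholders.

```python
import json


def markdown_cell(text):
    """Return a Markdown cell in Jupyter notebook format 4."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Return an unexecuted code cell in Jupyter notebook format 4."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_notebook(dandiset_id, title, sections):
    """Assemble a notebook dict from (heading, code) pairs.

    All arguments are hypothetical inputs for this sketch; in a real
    pipeline they would come from an automated inspection of the dataset.
    """
    cells = [markdown_cell(f"# Exploring Dandiset {dandiset_id}: {title}")]
    for heading, code in sections:
        cells.append(markdown_cell(f"## {heading}"))
        cells.append(code_cell(code))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Placeholder values only: the ID, title, and cell contents are made up.
nb = build_notebook(
    "000000",
    "Example dataset",
    [
        ("Load the data", "# code that opens one NWB file would go here"),
        ("Plot a signal", "# code that plots one recorded trace would go here"),
    ],
)

# Serialize to the JSON text that would be saved as an .ipynb file.
notebook_json = json.dumps(nb, indent=1)
```

Writing `notebook_json` to a file with an `.ipynb` extension should give a notebook that Jupyter can open. The pipeline described in the abstract would fill the code cells with working, dataset-specific access and plotting code in place of the placeholder comments.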

Version published to 10.1101/2025.07.17.663965 on bioRxiv
Jul 18, 2025

Related articles

Developing an Interactive Neuroimaging Education Resource with Neurodesk

This article has 17 authors:
1. Monika Dörig
2. Michèle Masson-Trottier
3. Thuy Thanh Dao
4. Kyle Mapue
5. Andrew Jahn
6. Fernanda L. Ribeiro
7. Ashley Stewart
8. Thomas Shaw
9. Michal Toth
10. Marla Pinkert
11. Paul Taylor
12. Angela Renton
13. Daniel A. Handwerker
14. Giulia Baracchini
15. Christopher Rorden
16. Aswin Narayanan
17. Steffen Bollmann
This article has no evaluations. Latest version: Jan 3, 2026

Classification of Decomposed Neural Data in Memory Networks and LLM-Based Stimuli Processing

This article has 5 authors:
1. Muhammad Shahzaib
2. Salma Zainab Farooq
3. Eric H. Schumacher
4. Shella Keilholz
5. Sadia Shakil
This article has no evaluations. Latest version: Jan 19, 2026

Generative learning with multimodal prompts as computational model for brain responses

This article has 4 authors:
1. Ya-Li Li
2. Xin Liu
3. Jichuan Zhang
4. Shengjin Wang
This article has no evaluations. Latest version: Dec 18, 2025
