NeuroConText: Contrastive Learning for Neuroscience Meta-Analysis with Rich Text Representation
Abstract
Brain meta-analysis is a common way to gather information about human brain function across the existing literature in order to formulate hypotheses and contextualize new findings. However, automated meta-analysis tools face several challenges: terminology is inconsistent across articles, and because these tools still rely on bag-of-words approaches, they struggle to analyze long texts and capture semantic meaning. In addition, sparse coordinate reporting in articles leaves the data incomplete, which distorts the activation distribution. This paper introduces NeuroConText, a predictive text-to-brain modeling framework designed to support brain meta-analysis by bridging neuroscience text, brain location coordinates, and brain images within a shared latent space. The framework follows the predictive brain meta-analysis paradigm: it learns a regression from text descriptions to whole-brain activation maps and also enables the retrieval of relevant studies through contrastive learning, optimizing a multi-objective loss that combines retrieval and reconstruction objectives. Furthermore, NeuroConText supports second-level statistical synthesis by providing the activations associated with the top-K retrieved studies, which can serve as input to coordinate-based meta-analysis (CBMA) methods. NeuroConText also leverages large language models (LLMs) to capture neuroscientific information from full-text articles, and uses an LLM-based text augmentation strategy to handle short-text inputs. Quantitative and qualitative analyses demonstrate NeuroConText's ability to enhance text-to-brain retrieval performance and to reconstruct brain maps from neuroscience texts. We also show that predictive brain meta-analysis tools can infer brain activations in regions that are discussed in articles but absent from the reported coordinates, potentially addressing the challenge of sparse coordinate reporting.
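As a rough illustration of the multi-objective idea described above, the sketch below combines a retrieval term with a reconstruction term and ranks studies for a text query. It is not the authors' implementation: the abstract does not specify the loss forms, so the symmetric InfoNCE contrastive term, the mean-squared-error reconstruction term, the `temperature` and `weight` parameters, and all function names are assumptions chosen for illustration. Embeddings and maps are plain lists of floats to keep the example dependency-free.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_softmax_at(logits, index):
    """Log-probability of `index` under a softmax over `logits`."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(value - m) for value in logits))
    return logits[index] - log_z


def contrastive_loss(text_emb, brain_emb, temperature=0.1):
    """Symmetric InfoNCE loss; row i of each input is a matched pair (assumed form)."""
    t = [l2_normalize(v) for v in text_emb]
    b = [l2_normalize(v) for v in brain_emb]
    n = len(t)
    sim = [[dot(t[i], b[j]) / temperature for j in range(n)] for i in range(n)]
    # Text-to-brain direction: each text should pick out its own brain map.
    text_to_brain = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Brain-to-text direction: each brain map should pick out its own text.
    brain_to_text = -sum(
        log_softmax_at([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (text_to_brain + brain_to_text)


def reconstruction_loss(predicted_maps, target_maps):
    """Mean squared error between predicted and target maps (assumed form)."""
    total, count = 0.0, 0
    for pred, target in zip(predicted_maps, target_maps):
        for p, y in zip(pred, target):
            total += (p - y) ** 2
            count += 1
    return total / count


def combined_loss(text_emb, brain_emb, predicted_maps, target_maps,
                  weight=1.0, temperature=0.1):
    """Retrieval term plus a weighted reconstruction term (hypothetical weighting)."""
    return (contrastive_loss(text_emb, brain_emb, temperature)
            + weight * reconstruction_loss(predicted_maps, target_maps))


def top_k(query_emb, brain_emb, k):
    """Indices of the k studies whose embeddings are most similar to the query."""
    q = l2_normalize(query_emb)
    scores = [dot(q, l2_normalize(b)) for b in brain_emb]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

In a workflow like the one the abstract outlines, the indices returned by `top_k` would identify the retrieved studies whose activations are then passed to a CBMA method. That hand-off is not shown here.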