Triad: Vision Foundation Model for 3D Magnetic Resonance Imaging

Xiaofeng Yang
Shansong Wang
Mojtaba Safari
Qiang Li
Chih-Wei Chang
Richard Qiu
Justin Roper
David Yu

Abstract

Vision foundation models (VFMs) are pre-trained on extensive image datasets to learn general representations for diverse types of data. These models can subsequently be fine-tuned for specific downstream tasks, significantly boosting performance across a broad range of applications. However, existing vision foundation models that claim to be applicable to various clinical tasks are mostly pre-trained on 3D computed tomography (CT), which benefits from the availability of extensive 3D CT databases. Significant differences between CT and magnetic resonance imaging (MRI) in imaging principles, signal characteristics, and data distribution may hinder the practical performance and versatility of these models in MRI-specific applications. Here, we propose Triad, a vision foundation model for 3D MRI. Triad adopts a widely used autoencoder architecture to learn robust representations from 131,170 3D MRI volumes and uses organ-independent imaging descriptions to constrain the semantic distribution of the visual modality. This pre-training dataset, Triad-131K, is currently the largest 3D MRI pre-training dataset. We evaluate Triad on three tasks (organ/tumor segmentation, organ/cancer classification, and medical image registration) in two settings (within-domain and out-of-domain) using 25 downstream datasets. When initialized with Triad's pre-trained weights, nnUNet-Triad improves segmentation performance by 2.51% over nnUNet-Scratch across 17 datasets, Swin-B-Triad improves classification performance by 4.04% over Swin-B-Scratch across five datasets, and SwinUNETR-Triad improves registration performance by 4.00% over SwinUNETR-Scratch across two datasets. Our study demonstrates that pre-training can improve performance when the data modalities and organs of upstream and downstream tasks are consistent. This work highlights the value of large-scale pre-training techniques for downstream tasks in 3D MRI. By open-sourcing Triad's weights, code, and data, we aim to enhance the adaptability and reliability of foundation models for 3D MRI in clinical tasks.
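
The weight-initialization step described in the abstract can be sketched as follows. This is a minimal, framework-free illustration under stated assumptions, not Triad's released code: the function `init_from_pretrained`, the parameter names, and the toy shapes are all hypothetical. It copies pre-trained parameters into a downstream model only where both name and shape match, leaving task-specific layers (for example, a new segmentation head) at their scratch initialization.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]
# A parameter is modelled as (shape, flat list of values); real frameworks use tensors.
Param = Tuple[Shape, List[float]]


def init_from_pretrained(
    model: Dict[str, Param], pretrained: Dict[str, Param]
) -> Tuple[Dict[str, Param], List[str]]:
    """Return a copy of `model` with matching pre-trained parameters copied in.

    A parameter is transferred only if its name exists in `pretrained`
    and the shapes are identical; everything else keeps its scratch values.
    """
    merged: Dict[str, Param] = {}
    skipped: List[str] = []
    for name, (shape, values) in model.items():
        source = pretrained.get(name)
        if source is not None and source[0] == shape:
            merged[name] = (shape, list(source[1]))  # copy pre-trained values
        else:
            merged[name] = (shape, list(values))  # keep scratch initialization
            skipped.append(name)
    return merged, skipped


# Hypothetical parameter names and toy shapes, for illustration only.
scratch_model = {
    "encoder.conv1.weight": ((2, 2), [0.0, 0.0, 0.0, 0.0]),
    "seg_head.weight": ((3, 2), [0.0] * 6),
}
pretrained_weights = {
    "encoder.conv1.weight": ((2, 2), [0.1, 0.2, 0.3, 0.4]),
    "seg_head.weight": ((5, 2), [1.0] * 10),  # different output size: not transferred
}

finetune_init, not_loaded = init_from_pretrained(scratch_model, pretrained_weights)
```

In a deep learning framework the same idea is usually expressed by filtering a checkpoint's parameter dictionary before loading it into the downstream network; the list of skipped names is worth logging so that silently untransferred layers are noticed.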

Version published to 10.21203/rs.3.rs-6129856/v1 on Research Square
Mar 10, 2025

Related articles

From Slices to Volumes: A Scalable Pipeline for Developing General-Purpose Brain MRI Foundation Models

This article has 8 authors:
1. Feng Su
2. Xiaoping Yi
3. Ye Cheng
4. Yongjie Ma
5. Wenqiang Zu
6. Qing Zhao
7. Gengdi Huang
8. Lei Ma
This article has no evaluations. Latest version: Apr 14, 2025

Generating 3D Optical Coherence Tomography from 2D Fundus Images via Diffusion Models

This article has 9 authors:
1. Bowen Liu
2. Yue Wu
3. Ruoyu Chen
4. Pusheng Xu
5. Peng Xiao
6. Zhen Tian
7. Binwei Huang
8. Mingguang He
9. Danli Shi
This article has no evaluations. Latest version: Apr 6, 2025

A Foundational Generative Model for Breast Ultrasound Image Analysis

This article has 23 authors:
1. Liwei Wang
2. Haojun Yu
3. Youcheng Li
4. Nan Zhang
5. Zihan Niu
6. Xuantong Gong
7. Yanwen Luo
8. Haotian Ye
9. Siyu He
10. Quanlin Wu
11. Wangyan Qin
12. Mengyuan Zhou
13. Jie Han
14. Jia Tao
15. Ziwei Zhao
16. Di Dai
17. Di He
18. Dong Wang
19. Binghui Tang
20. Ling Huo
21. James Zou
22. Qingli Zhu
23. Yong Wang
This article has no evaluations. Latest version: Apr 17, 2025
