Towards Robust and Scalable Mixture of Experts Architectures for Large Language and Vision Models
Abstract
The advent of foundation-scale deep learning models, characterized by unprecedented model sizes and multi-modal capabilities, has revitalized interest in Mixture of Experts (MoE) architectures due to their potential for efficient conditional computation and scalability. However, robustness challenges—including routing instability, expert overload, and vulnerability to distributional shifts and adversarial attacks—pose significant barriers to reliable deployment in large language and vision models. This survey presents a comprehensive and mathematically rigorous overview of robust MoE methods in the era of foundation models. We systematically examine foundational theories, algorithmic advances in capacity-aware routing and auxiliary regularization, and state-of-the-art training strategies designed to enhance robustness and scalability. Empirical evaluations across diverse language, vision, and multi-modal benchmarks highlight the strengths and limitations of current approaches. We further identify critical open problems spanning theoretical guarantees, differentiable routing optimization, multi-modal consistency, and efficient training under resource constraints. By synthesizing recent developments and articulating future directions, this survey aims to provide a unified framework for advancing robust MoE research and to facilitate the broader adoption of MoE architectures in next-generation AI systems.
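To make the routing and regularization concepts referenced above concrete, the following is a minimal, hedged sketch of a top-k MoE gate with a load-balancing auxiliary loss in the commonly used Switch-Transformer style (fraction of tokens per expert times mean gate probability per expert). It is an illustration under assumed conventions, not the specific formulation of any method surveyed here; the class name TopKGate and the parameter aux_loss_weight are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKGate(nn.Module):
    """Illustrative top-k gate with an auxiliary load-balancing loss.

    Hypothetical sketch: names and defaults are assumptions, and the
    auxiliary loss follows the widely used f_i * P_i formulation rather
    than any single surveyed method.
    """

    def __init__(self, d_model: int, num_experts: int, k: int = 2,
                 aux_loss_weight: float = 1e-2):
        super().__init__()
        self.w_gate = nn.Linear(d_model, num_experts, bias=False)
        self.num_experts = num_experts
        self.k = k
        self.aux_loss_weight = aux_loss_weight

    def forward(self, x: torch.Tensor):
        # x: (num_tokens, d_model)
        logits = self.w_gate(x)                       # (tokens, experts)
        probs = F.softmax(logits, dim=-1)
        topk_vals, topk_idx = probs.topk(self.k, dim=-1)
        # Renormalize the selected gate values so they sum to 1 per token.
        gates = topk_vals / topk_vals.sum(dim=-1, keepdim=True)

        # Auxiliary load-balancing loss: f_i is the fraction of routing
        # assignments sent to expert i (each token contributes k selections),
        # P_i is the mean gate probability for expert i; a uniform assignment
        # minimizes num_experts * sum_i f_i * P_i.
        assignment = torch.zeros_like(probs).scatter(1, topk_idx, 1.0)
        tokens_per_expert = assignment.mean(dim=0)        # f_i
        mean_prob_per_expert = probs.mean(dim=0)          # P_i
        aux_loss = self.num_experts * torch.sum(
            tokens_per_expert * mean_prob_per_expert)

        return topk_idx, gates, self.aux_loss_weight * aux_loss
```

In this sketch the auxiliary term is returned scaled so it can simply be added to the task loss; capacity-aware variants discussed later in the survey additionally cap the number of tokens each expert may receive and drop or re-route the overflow.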