Scalable and Interpretable Mixture of Experts Models in Machine Learning: Foundations, Applications, and Challenges
Abstract
Mixture of Experts (MoE) models have emerged as a powerful framework in machine learning, combining multiple specialized expert networks through a gating mechanism to enable scalable, efficient, and adaptive computation. This survey provides a comprehensive and mathematically rigorous overview of efficient and explainable MoE architectures, encompassing their theoretical foundations, optimization properties, and generalization guarantees. We explore a broad range of applications across natural language processing, computer vision, reinforcement learning, healthcare, and industrial domains, illustrating the versatility and empirical effectiveness of MoE models. A central focus is placed on explainability: we formalize attribution methods that leverage the modular structure of MoE, discuss quantitative metrics for interpretability, and examine strategies to enhance transparency and trustworthiness. Finally, we identify key open challenges and promising research directions, aiming to bridge the gap between scalable model design and human-centric interpretability. This survey serves as a foundational resource for advancing the development of efficient, explainable, and robust Mixture of Experts models in modern machine learning.
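To make the core mechanism concrete, the following is a minimal sketch of the MoE computation the abstract describes: a softmax gate produces a distribution over experts, and the model output is the gate-weighted combination of the expert outputs. All dimensions, weight initializations, and linear experts here are illustrative assumptions, not details taken from the survey.

```python
import numpy as np

# Hypothetical sizes for illustration only.
rng = np.random.default_rng(0)
d_in, d_out, n_experts = 4, 3, 2

# Each expert is a simple linear map; the gate scores experts per input.
expert_W = rng.standard_normal((n_experts, d_in, d_out))
gate_W = rng.standard_normal((d_in, n_experts))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_forward(x):
    # Gating weights g(x): a probability distribution over experts.
    g = softmax(x @ gate_W)                       # (batch, n_experts)
    # Expert outputs f_i(x) for every expert i.
    outs = np.einsum('bd,edo->beo', x, expert_W)  # (batch, n_experts, d_out)
    # MoE output: sum_i g_i(x) * f_i(x).
    return np.einsum('be,beo->bo', g, outs)

x = rng.standard_normal((5, d_in))
y = moe_forward(x)
print(y.shape)  # (5, 3)
```

The gate values double as a natural attribution signal: because each prediction is an explicit convex combination of expert outputs, inspecting `g` reveals which expert was responsible for a given input, which is the modular structure the survey's explainability discussion builds on.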