Architectural Diversity in Mixture of Experts: A Comparative Study

Abstract

This work presents the integration of Mixture of Experts (MoE) architectures into the LEMUR neural network dataset to enhance model diversity and scalability. The MoE framework employs multiple expert networks and a gating mechanism for dynamic routing, enabling efficient computation and improved specialization across tasks. Eight MoE variants were implemented and benchmarked on CIFAR-10, achieving up to 93% accuracy with optimized routing, regularization, and training strategies. This integration provides a foundation for benchmarking expert-based models within LEMUR and supports future research in adaptive model composition and automated machine learning. The project and its plugins are available as open-source software under the MIT license at https://github.com/ABrain-One/nn-dataset.
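To illustrate the routing idea the abstract describes, the sketch below implements a minimal Mixture-of-Experts layer with top-1 gating in pure Python. The dimensions, expert count, linear experts, and the `MoELayer` class itself are illustrative assumptions, not the LEMUR implementation: a gating network scores the experts, the input is routed to the highest-scoring expert, and only that expert runs.

```python
import math
import random

random.seed(0)

def softmax(xs):
    # Numerically stable softmax over a list of gate scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

class MoELayer:
    """Minimal MoE sketch: linear experts plus a linear gate (assumed design)."""

    def __init__(self, in_dim, out_dim, n_experts):
        # Each expert is an independent linear map; the gate is another
        # linear map that produces one score per expert.
        self.experts = [
            [[random.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
            for _ in range(n_experts)
        ]
        self.gate = [[random.gauss(0, 0.1) for _ in range(in_dim)]
                     for _ in range(n_experts)]

    @staticmethod
    def _matvec(w, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    def forward(self, x):
        # Gate scores -> routing probabilities; pick the top-1 expert,
        # run only that expert, and scale its output by the gate weight.
        probs = softmax(self._matvec(self.gate, x))
        k = max(range(len(probs)), key=lambda i: probs[i])
        y = self._matvec(self.experts[k], x)
        return [probs[k] * v for v in y], k

layer = MoELayer(in_dim=4, out_dim=3, n_experts=8)
out, chosen = layer.forward([1.0, -0.5, 0.2, 0.7])
print(len(out), chosen)
```

Because only the selected expert is evaluated, per-input compute stays roughly constant as more experts are added, which is the efficiency property the abstract refers to; practical variants typically add load-balancing regularization so the gate does not collapse onto a single expert.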
