SymTensor: Symbolic and Adaptive Tensor Partitioning by Unified Parallelism for Deep Learning
Abstract
The rapid expansion of deep learning models in scale and structural diversity has made distributed training essential. Designing efficient parallelization strategies requires balancing computation, communication, and memory. However, existing methods struggle to coordinate multiple parallelization strategies across different model components and to adapt to changing models. This paper proposes SymTensor, a strategy generation method based on a principled tensor-level cost model that does not rely on predefined rules. SymTensor unifies different forms of parallelism in a single system and formulates a symbolic model that jointly analyzes computation, communication, and memory costs. It employs an adaptive tensor partitioning algorithm to minimize total cost, and it adapts to changes in model architectures, operator types, and input shapes. Our experiments on representative foundation models validate that SymTensor-generated strategies achieve up to more than 2x the training performance of strategies generated by the state-of-the-art Megatron-LM. Our tensor-based, symbolic-cost-driven solution provides strong efficiency, adaptability, and practicality for large-scale distributed training.
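To make the idea concrete, the following is a minimal sketch, not SymTensor's actual algorithm or API, of what a symbolic cost model for tensor partitioning can look like: it scores a few candidate 1-D partitions of a single matrix multiplication by analytical compute, communication, and memory costs, then selects the cheapest. All function names, formulas, and weights here are illustrative assumptions.

    # Hypothetical sketch of a per-operator symbolic cost model; all
    # formulas and constants below are assumptions for exposition only.
    from dataclasses import dataclass

    @dataclass
    class Cost:
        compute: float        # per-device FLOPs
        communication: float  # bytes moved between devices
        memory: float         # per-device bytes resident

        def total(self, w_comm: float = 1.0, w_mem: float = 0.0) -> float:
            # Weighted sum; a real system would calibrate weights to hardware.
            return self.compute + w_comm * self.communication + w_mem * self.memory

    def matmul_partition_costs(m: int, k: int, n: int, devices: int,
                               dtype_bytes: int = 2) -> dict:
        """Cost of C[m,n] = A[m,k] @ B[k,n] under three 1-D partitions."""
        flops = 2 * m * k * n
        return {
            # Split rows of A (data-parallel-like): no result communication
            # for this op, but B is replicated on every device.
            "split_m": Cost(flops / devices, 0.0,
                            (m * k / devices + k * n + m * n / devices) * dtype_bytes),
            # Split columns of B (tensor-parallel-like): A replicated, C sharded.
            "split_n": Cost(flops / devices, 0.0,
                            (m * k + k * n / devices + m * n / devices) * dtype_bytes),
            # Split the contraction dim k: partial sums must be all-reduced.
            "split_k": Cost(flops / devices,
                            m * n * dtype_bytes,  # simplified all-reduce volume
                            (m * k / devices + k * n / devices + m * n) * dtype_bytes),
        }

    if __name__ == "__main__":
        costs = matmul_partition_costs(m=4096, k=4096, n=4096, devices=8)
        for name, c in costs.items():
            print(f"{name}: total={c.total(w_comm=4.0):.3e}")
        best = min(costs.items(), key=lambda kv: kv[1].total(w_comm=4.0))
        print("chosen:", best[0])

A full system in this spirit would build such cost expressions for every tensor in the model graph and search the joint partitioning space, which is where an adaptive algorithm becomes necessary; this sketch only illustrates the single-operator scoring step.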