Effects of Poor Workload Partitioning on System Performance for Chiplet-Based Systems

Peter Mbua
Peter Forcha
Christophe Bobda

Abstract

Chiplet-based architectures represent a paradigm shift in post-Moore’s Law computing, offering substantial cost and yield advantages through functional disaggregation. However, the heterogeneity of inter-chiplet communication introduces performance challenges that conventional partitioning strategies fail to address. In this work, we comprehensively characterize how poor workload partitioning degrades communication performance in chiplet-based systems. Through a detailed experimental analysis, we demonstrate that suboptimal workload partitioning can increase inter-chiplet communication latency by up to a factor of 10 and inflate network congestion beyond sustainable levels as systems scale. Our findings show that, compared to naive partitioning approaches, optimized partitioning strategies can achieve an 87.4% reduction in inter-chiplet traffic, improve system throughput by a factor of 8.75, and enhance energy efficiency by a factor of 10.3. We further characterize how these effects scale with system size: communication overhead can consume 85% of the execution time in poorly partitioned 16-chiplet systems, compared to only 35% in well-partitioned configurations. This work provides essential insights into the communication-aware design space of chiplet systems and validates the critical importance of sophisticated workload partitioning algorithms.
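
To make the notion of inter-chiplet traffic concrete, the following is a minimal sketch, not the authors' method or data. It assumes a hypothetical six-task graph with made-up communication weights and two chiplets, and compares a naive round-robin placement against a placement that keeps heavily communicating tasks together. The names `EDGES`, `inter_chiplet_traffic`, `naive`, and `aware` are illustrative assumptions, and the resulting numbers describe only this toy graph.

```python
from typing import Dict, Tuple

# Hypothetical toy task graph: (task_u, task_v) -> units of data exchanged.
# These weights are invented for illustration and do not come from the article.
EDGES: Dict[Tuple[str, str], int] = {
    ("a", "b"): 100,
    ("b", "c"): 100,
    ("c", "d"): 10,
    ("d", "e"): 100,
    ("e", "f"): 100,
    ("a", "f"): 10,
}


def inter_chiplet_traffic(edges: Dict[Tuple[str, str], int],
                          placement: Dict[str, int]) -> int:
    """Sum the weights of edges whose endpoints are on different chiplets."""
    return sum(w for (u, v), w in edges.items() if placement[u] != placement[v])


# Naive placement: assign tasks to the two chiplets in round-robin order.
naive = {task: i % 2 for i, task in enumerate("abcdef")}

# Communication-aware placement (assumed): keep heavy edges on one chiplet.
aware = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

naive_traffic = inter_chiplet_traffic(EDGES, naive)
aware_traffic = inter_chiplet_traffic(EDGES, aware)
reduction = 1 - aware_traffic / naive_traffic

print(f"naive: {naive_traffic}, aware: {aware_traffic}, reduction: {reduction:.1%}")
```

In this toy case the round-robin placement cuts every edge, while the clustered placement cuts only the two light edges. Real chiplet systems would also need to account for load balance, link bandwidth, and topology, which this sketch ignores.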

Version published to 10.3390/electronics15061139
Mar 10, 2026
Version published to 10.20944/preprints202602.0486.v2
Feb 27, 2026
Version published to 10.20944/preprints202602.0486.v1
Feb 6, 2026

Related articles

Beyond All-Reduce: Event-Driven Model Parallelism Without Collective Communication Primitives (EBD2N)

This article has 4 authors:
1. Ernesto Leite
2. Fabrice Mourlin
3. Youakim Badr
4. Pierre Paradinas
This article has no evaluations. Latest version: Mar 5, 2026

The Impact of Process Competition on Energy Consumption: Analysis and Modeling

This article has 6 authors:
1. Joberto Martins
2. Eduardo Gomes Campos
3. Rafaela Sousa de Alencar Lacerda
4. Adnei Willian Donatti
5. Charles C. Miers
6. Tereza C. M. B. Carvalho
This article has no evaluations. Latest version: Feb 18, 2026

Embodied Foundation Models at the Edge: A Survey of Deployment Constraints and Mitigation Strategies

This article has 12 authors:
1. Utkarsh Grover
2. Ravi Ranjan
3. Mingyang Mao
4. Trung Tien Dong
5. Satvik Praveen
6. Zhenqi Wu
7. Morris Chang
8. Tinoosh Mohsenin
9. Yi Sheng
10. Agoritsa Polyzou
11. Eiman Kanjo
12. Xiaomin Lin
This article has no evaluations. Latest version: Mar 17, 2026
