Effects of Poor Workload Partitioning on System Performance for Chiplet-Based Systems


Abstract

The emergence of chiplet-based architectures represents a paradigm shift in post-Moore’s Law computing systems, offering substantial cost and yield advantages through functional disaggregation. However, the heterogeneity of inter-chiplet communication introduces performance challenges that conventional partitioning strategies fail to address. In this work, we comprehensively characterize how poor workload partitioning degrades communication performance in chiplet-based systems. Through detailed experimental analysis, we show that suboptimal workload partitioning can increase inter-chiplet communication latency by up to a factor of 10 and push network congestion beyond sustainable levels as systems scale. Compared to naive partitioning approaches, optimized partitioning strategies achieve an 87.4% reduction in inter-chiplet traffic, improve system throughput by a factor of 8.75, and improve energy efficiency by a factor of 10.3. We further characterize how these effects scale with system size: communication overhead can consume 85% of the execution time in poorly partitioned 16-chiplet systems, compared to only 35% in well-partitioned configurations. This work provides insight into the communication-aware design space of chiplet systems and confirms the importance of sophisticated workload partitioning algorithms.
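To make the notion of inter-chiplet traffic concrete, the following is a minimal sketch of how it could be measured for a given task placement. It is an illustration only: the toy task graph, the traffic weights, the two placements, and the function name `inter_chiplet_traffic` are all assumptions introduced here, not the workloads, metrics, or partitioning algorithms used in the article.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]


def inter_chiplet_traffic(traffic: Dict[Edge, float],
                          placement: Dict[int, int]) -> float:
    """Sum the traffic on edges whose two tasks sit on different chiplets."""
    return sum(w for (u, v), w in traffic.items()
               if placement[u] != placement[v])


# Hypothetical workload: 8 tasks in two tightly coupled clusters (0-3, 4-7),
# joined by a single light edge. Weights are arbitrary units of traffic.
traffic: Dict[Edge, float] = {}
for cluster in (range(0, 4), range(4, 8)):
    for u in cluster:
        for v in cluster:
            if u < v:
                traffic[(u, v)] = 10.0
traffic[(3, 4)] = 1.0

# Naive placement: round-robin over 2 chiplets, ignoring communication.
naive = {task: task % 2 for task in range(8)}
# Communication-aware placement: keep each cluster on one chiplet.
aware = {task: task // 4 for task in range(8)}

naive_cut = inter_chiplet_traffic(traffic, naive)
aware_cut = inter_chiplet_traffic(traffic, aware)
reduction = 1.0 - aware_cut / naive_cut

print(f"naive: {naive_cut}, aware: {aware_cut}, reduction: {reduction:.1%}")
```

In this toy case the round-robin placement splits both clusters across the two chiplets, while the cluster-aligned placement cuts only the single light edge. The resulting reduction is a property of the invented graph and should not be read against the figures reported in the abstract.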
