Algorithmic Techniques for GPU Scheduling: A Comprehensive Survey

Robert Chab
Fei Li
Sanjeev Setia

Abstract

In this survey, we provide a comprehensive classification of GPU task scheduling approaches, categorized by their underlying algorithmic techniques and evaluation metrics. We examine traditional methods (greedy algorithms, dynamic programming, and mathematical programming) alongside machine learning techniques integrated into scheduling policies, and we evaluate the performance of these approaches across diverse applications. The survey focuses on the trade-offs among algorithmic techniques, the architectural and job-level factors that influence scheduling decisions, and the balance between user-level and service-level objectives. Our analysis shows that no single paradigm dominates; instead, the highest-performing schedulers blend the predictability of formal methods with the adaptability of learning, often moderated by queueing insights for fairness. We also discuss key challenges in optimizing GPU resource management and suggest potential solutions.

Version published to 10.3390/a18070385: Jun 25, 2025
Version published to 10.20944/preprints202505.0152.v1: May 5, 2025

Related articles

A Dynamic Traffic-Aware VC Partitioning Strategy and Optimization in CPU-GPU Heterogeneous Network-on-Chip
Authors: Juan Fang, Yiming Yan, Haoyu Cheng, Yuening Wang, Juncheng Chen
No evaluations. Latest version: Dec 11, 2025

Implementation and Performance Optimization of a DPDK Packet Gateway on Manycore CPUs
Author: Daisuke Sugisawa
No evaluations. Latest version: Jan 19, 2026

Near-Optimal Universal Scheduling for Moldable Tasks: The Fair Algorithm
Authors: Lucas Perotin, Thomas Verrecchia, Padma Raghavan
No evaluations. Latest version: Jan 14, 2026
