XTorch: A High-Performance C++ Framework for Deep Learning Training

Abstract

The deep learning ecosystem is predominantly driven by high-level Python frameworks like PyTorch and TensorFlow, which offer exceptional flexibility and ease of use. However, the reliance on a Python front-end can introduce significant performance overhead, particularly in data-intensive training pipelines, often necessitating multi-GPU setups to achieve acceptable training times. This paper introduces XTorch, a high-level C++ deep learning framework built atop LibTorch, designed to bridge the gap between Python’s usability and C++’s raw performance. XTorch provides a familiar API for datasets, transforms, and models while eliminating Python-related bottlenecks. We demonstrate its efficacy by training a Deep Convolutional Generative Adversarial Network (DCGAN) on the CelebA dataset. Our results show that XTorch, running on a single NVIDIA RTX 3090 GPU, completes a 5-epoch training run in 219 seconds. This is a 37% reduction in training time (roughly a 1.6x speedup) relative to a standard PyTorch implementation, which required 350 seconds using two RTX 3090 GPUs with DataParallel. This work shows that, on this benchmark, a native C++ framework can not only match but significantly outperform a common multi-GPU Python setup, offering a compelling case for reducing hardware costs and accelerating research and deployment.
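The timing comparison reported in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names `time_reduction` and `speedup_factor` are hypothetical and are not part of XTorch or LibTorch. It distinguishes the fractional reduction in wall-clock time from the speedup ratio, two figures that are easy to conflate.

```cpp
#include <cassert>

// Hypothetical helper (not an XTorch API): fraction of the baseline
// wall-clock time that was saved, e.g. 0.25 means "25% less time".
double time_reduction(double baseline_seconds, double new_seconds) {
    assert(baseline_seconds > 0.0);
    return (baseline_seconds - new_seconds) / baseline_seconds;
}

// Hypothetical helper (not an XTorch API): how many times faster the
// new run is than the baseline, e.g. 2.0 means "twice as fast".
double speedup_factor(double baseline_seconds, double new_seconds) {
    assert(new_seconds > 0.0);
    return baseline_seconds / new_seconds;
}
```

With the reported figures, `time_reduction(350.0, 219.0)` evaluates to about 0.374 (the 37% quoted in the abstract), while `speedup_factor(350.0, 219.0)` evaluates to about 1.6.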
