Extending a Moldable Computer Architecture to Accelerate DL Inference on FPGA

Abstract

Over the past years, the field of Machine and Deep Learning has seen strong development in both software and hardware, with an increase in specialised devices. One of the biggest challenges in this field is the inference phase, in which the trained model makes predictions on unseen data. Although computationally powerful, traditional computing architectures face limitations in efficiently handling inference requests, especially from an energy point of view. For this reason, the need arose to find alternative hardware solutions, among which are Field Programmable Gate Arrays (FPGAs): their key feature of reconfigurability, combined with parallel processing capability, low latency and low power consumption, makes these devices uniquely suited to accelerating inference tasks. In this paper, we present a novel approach to accelerating the inference phase of a Multi-Layer Perceptron (MLP) using BondMachine, an open-source framework for the design of hardware accelerators for FPGAs. Analyses of latency, energy consumption and resource usage, as well as comparisons with standard architectures and other FPGA approaches, are presented, highlighting the strengths and critical points of the proposed solution.
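For readers unfamiliar with the workload being accelerated, the following is a minimal sketch (not taken from the paper) of the kind of MLP forward pass whose inference an FPGA accelerator targets; the layer sizes, weights, and function names here are illustrative placeholders, not the network actually used in the study.

```python
import numpy as np

def relu(x):
    # Rectified linear activation, applied element-wise
    return np.maximum(0.0, x)

def mlp_inference(x, weights, biases):
    """Forward pass of a fully connected network: ReLU on the
    hidden layers, raw linear outputs on the final layer."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = relu(z) if i < len(weights) - 1 else z
    return a

# Hypothetical input/hidden/output widths for illustration only
rng = np.random.default_rng(0)
sizes = [4, 8, 3]
weights = [rng.standard_normal((m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

out = mlp_inference(rng.standard_normal(4), weights, biases)
print(out.shape)  # (3,)
```

On an FPGA, each layer's multiply-accumulate operations can be unrolled into dedicated parallel hardware, which is the source of the latency and energy advantages the abstract describes.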
