Extending a Moldable Computer Architecture to Accelerate DL Inference on FPGA

Abstract

Over the past years, the field of Machine and Deep Learning has seen strong development in both software and hardware, with an increase in specialised devices. One of the biggest challenges in this field is the inference phase, in which the trained model makes predictions on unseen data. Although computationally powerful, traditional computing architectures face limitations in efficiently handling inference requests, especially from an energy point of view. For this reason, the need arose to find alternative hardware solutions, among which are Field Programmable Gate Arrays (FPGAs): their key feature of reconfigurability, combined with parallel processing capability, low latency and low power consumption, makes these devices uniquely suited to accelerating inference tasks. In this paper, we present a novel approach to accelerating the inference phase of a Multi-Layer Perceptron (MLP) using BondMachine, an open-source framework for the design of hardware accelerators for FPGAs. Analyses of latency, energy consumption and resource usage, as well as comparisons with standard architectures and other FPGA approaches, are presented, highlighting the strengths and critical points of the proposed solution.
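For readers unfamiliar with the workload being accelerated, the following is a minimal sketch (not taken from the paper) of the kind of MLP forward pass whose inference an FPGA accelerator targets; the layer sizes, weights, and function names here are illustrative placeholders, not the network actually used in the study.

```python
import numpy as np

def relu(x):
    # Rectified linear activation, applied element-wise
    return np.maximum(0.0, x)

def mlp_inference(x, weights, biases):
    """Forward pass of a fully connected network: ReLU on the
    hidden layers, raw linear outputs on the final layer."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = relu(z) if i < len(weights) - 1 else z
    return a

# Hypothetical input/hidden/output widths for illustration only
rng = np.random.default_rng(0)
sizes = [4, 8, 3]
weights = [rng.standard_normal((m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

out = mlp_inference(rng.standard_normal(4), weights, biases)
print(out.shape)  # (3,)
```

On an FPGA, each layer's multiply-accumulate operations can be unrolled into dedicated parallel hardware, which is the source of the latency and energy advantages the abstract describes.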
