ROM-SRAM Hybrid Compute-in-Memory for Edge AI: Circuits, Architectures and Challenges

Abstract

The rapid growth of neural network parameters presents critical challenges for deploying artificial intelligence on edge devices with limited memory and power budgets. Compute-in-Memory (CiM) has emerged as a promising approach to overcoming the memory wall by integrating storage and computation. In mature Complementary Metal-Oxide-Semiconductor (CMOS) technology, the recently proposed hybrid CiM based on Read-Only Memory (ROM) and Static Random-Access Memory (SRAM) offers a promising way to further improve energy efficiency by eliminating off-chip weight fetches. This paper surveys ROM-based CiM circuits across various domains and analyzes their efficiency, accuracy, and scalability. We further explore ROM-SRAM hybrid CiM architectures, which balance density and flexibility through weight and structural adaptation for efficient fine-tuning and task migration. Key challenges include achieving larger on-chip capacity, scaling to large models, and supporting dynamic operations in long-sequence inference. Potential solutions such as 3D stacking, chiplet integration, software-level token pruning, and design space exploration methods are discussed. Finally, we highlight future prospects for expanding hybrid CiM architectures to broader edge artificial intelligence applications.