ROM-SRAM Hybrid Compute-in-Memory for Edge AI: Circuits, Architectures and Challenges

Abstract

The rapid growth of neural network parameters presents critical challenges for deploying artificial intelligence on edge devices with limited memory and power budgets. Compute-in-Memory (CiM) has emerged as a promising approach to overcoming the memory wall by integrating storage and computation. In mature Complementary Metal-Oxide-Semiconductor (CMOS) technology, the recently proposed hybrid CiM based on Read-Only Memory (ROM) and Static Random-Access Memory (SRAM) offers a promising way to further improve energy efficiency by eliminating off-chip weight fetches. This paper surveys ROM-based CiM circuits across various domains and analyzes their efficiency, accuracy, and scalability. We further explore ROM-SRAM hybrid CiM architectures, which balance density and flexibility through weight and structural adaptation for efficient fine-tuning and task migration. Key challenges include achieving larger on-chip capacity, scaling to large models, and supporting dynamic operations in long-sequence inference. Potential solutions such as 3D stacking, chiplet integration, software-level token pruning, and design space exploration methods are discussed. Finally, we highlight future prospects for expanding hybrid CiM architectures to broader edge artificial intelligence applications.