Variable Bit-Width All-Optical Content-Addressable Memory Enabled by Sb2Se3 for Similarity Search
Abstract
In the big-data-driven artificial intelligence era, similarity search, a core operation in machine learning and data mining, demands high speed, energy efficiency, and scenario adaptability. Conventional electronic content-addressable memories (ECAMs) suffer from inherent RC delay bottlenecks, whereas existing optical content-addressable memories (OCAMs) are restricted to fixed bit-widths and a limited set of distance metrics. In this work, we propose a variable bit-width all-optical CAM that leverages multi-segment modulators and the phase-change material (PCM) Sb2Se3. Its multi-segment memory unit (MSMU) compresses N-bit binary data into a single analog photonic unit. This supports direct data writing and loading without digital-to-analog converters (DACs), allows flexible trade-offs among precision, storage capacity, noise immunity, and energy consumption, and enables both Hamming and nonlinear distance metrics. A six-element, three-bit OCAM prototype was fabricated on a silicon nitride silicon-on-insulator (SiN-SOI) platform. Despite the absence of integrated high-speed phase shifters, the device achieves reliable optical data storage and retrieval. K-nearest neighbor (kNN) simulations based on experimentally derived statistical data, validated on the iris, wine, and breast cancer datasets, show that the three-bit operating mode achieves classification accuracy comparable to that of Manhattan and Euclidean distances at high signal-to-noise ratios (SNRs), while the one-bit mode exhibits strong noise robustness. Energy consumption is 364 fJ/bit in the three-bit mode and 890 fJ/bit in the one-bit mode. This work provides a high-speed, energy-efficient, and reconfigurable all-optical similarity search solution with experimentally verified device performance and dataset-validated applicability, showing strong potential for deployment in data-intensive machine learning and data-mining applications.
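The variable bit-width search described above can be illustrated with a purely digital sketch. It is not the authors' optical implementation and models no photonic behaviour or noise. All function names, the toy vectors, and the labels are hypothetical. The abstract does not specify the nonlinear distance metric, so the multi-bit mode here uses the absolute difference between quantization levels as a stand-in assumption.

```python
from collections import Counter

def quantize(value, lo, hi, bits):
    """Map a real value in [lo, hi] to an integer level in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    if hi == lo:
        return 0
    frac = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    return round(frac * levels)

def encode(vector, ranges, bits):
    """Quantize every feature of a vector to the chosen bit-width."""
    return [quantize(v, lo, hi, bits) for v, (lo, hi) in zip(vector, ranges)]

def hamming(a, b):
    """Number of differing bits, summed over all elements."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def level_distance(a, b):
    """Assumed stand-in for a multi-bit metric: sum of absolute level differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(query, stored, labels, k, metric):
    """Return the majority label among the k stored entries closest to the query."""
    ranked = sorted(range(len(stored)), key=lambda i: metric(query, stored[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical six-element toy data, not taken from the paper.
ranges = [(0.0, 1.0)] * 6
train = [
    [0.1, 0.2, 0.1, 0.9, 0.8, 0.9],
    [0.2, 0.1, 0.2, 0.8, 0.9, 0.8],
    [0.9, 0.8, 0.9, 0.1, 0.2, 0.1],
    [0.8, 0.9, 0.8, 0.2, 0.1, 0.2],
]
labels = ["A", "A", "B", "B"]
query = [0.15, 0.1, 0.2, 0.85, 0.9, 0.8]

predictions = {}
for bits, metric in ((1, hamming), (3, level_distance)):
    stored = [encode(v, ranges, bits) for v in train]
    q = encode(query, ranges, bits)
    predictions[bits] = knn_predict(q, stored, labels, k=3, metric=metric)
    print(f"{bits}-bit mode predicts class {predictions[bits]}")
```

Changing `bits` trades resolution per element against the number of distinguishable levels, which is the software analogue of the precision and noise-immunity trade-off the abstract attributes to the MSMU.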