DEV Community


The memory wall just met its match: intelligent SSDs

Intelligent storage is here. It's no longer just a concept for the future but a rapidly emerging reality, driven by the convergence of flash memory and artificial intelligence. For years, storage has been the quiet workhorse, passively holding data until a CPU or GPU requested it. But as AI models grow to trillions of parameters, the cost of shuttling data back and forth has become unsustainable. We've hit a memory wall, where the capacity of expensive High Bandwidth Memory (HBM) simply cannot keep pace with the data demands of large language models and retrieval-augmented generation (RAG).

The question is no longer about making storage faster, but about making it smarter. Two key open-source repositories are exploring the "how," and they signal a fundamental shift: rosspeili/computational_storage_landscape and ARPAHLS/lc0_vic.

From Passive Block to Active, Queryable Storage

The first repository, computational_storage_landscape, is a strategic guide to this emerging ecosystem. It positions KIOXIA Group as a primary lens and focuses on the technical feasibility of embedding Small Language Models (TinyLMs) directly into SSD controllers. This isn't just about faster reads and writes, but about offloading processing to where the data resides. By using extreme quantization, these TinyLMs can perform inference tasks at the edge of the storage device, dramatically reducing the data that needs to travel up the I/O stack to the host system.
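To make "extreme quantization" concrete, here is a minimal sketch of symmetric int8 weight quantization, the kind of technique that shrinks a model enough to fit the constrained memory of a controller. This is an illustrative toy, not code from either repository:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric int8 quantization: map float weights into [-127, 127]."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

# A toy weight matrix: int8 storage is 4x smaller than float32.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(q.nbytes, w.nbytes)  # 65536 262144 -- a 4x reduction
```

The rounding error per weight is bounded by half the scale, which is why aggressive quantization remains usable for inference at the storage edge.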

The core enabler here is the shift toward what the repo calls "intelligent, queryable storage". Instead of a drive just returning blocks of data, it becomes an active computational node capable of running search, filtering, and ranking functions on its own. This reflects a broader industry trend, with major players like IBM introducing Content-Aware Storage (CAS) architectures and the SNIA (Storage Networking Industry Association) launching Storage.AI initiatives to standardize data flows for AI workflows.
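The difference between a block device and a queryable one can be sketched in a few lines. The class and method names below are hypothetical stand-ins, assuming the drive exposes a filter-and-rank primitive so only matching records cross the I/O boundary:

```python
from dataclasses import dataclass

@dataclass
class Record:
    path: str
    text: str

class QueryableDrive:
    """Hypothetical sketch: drive-side search, filter, and rank,
    instead of returning raw blocks for the host to sift through."""

    def __init__(self, records):
        self.records = records

    def query(self, term: str, top_k: int = 3):
        # Filter on-device: only matching records travel up the I/O stack.
        hits = [r for r in self.records if term in r.text]
        # Rank by simple term frequency (placeholder for in-storage scoring).
        hits.sort(key=lambda r: r.text.count(term), reverse=True)
        return hits[:top_k]

drive = QueryableDrive([
    Record("/logs/a.txt", "error error timeout"),
    Record("/logs/b.txt", "ok"),
    Record("/logs/c.txt", "error once"),
])
print([r.path for r in drive.query("error")])  # ['/logs/a.txt', '/logs/c.txt']
```

The point is the contract, not the scoring: the host asks a question and receives ranked answers, rather than reading every block and doing the work itself.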

The Reference Implementation: Talking to Your Drive

But strategic maps are theoretical without a compass. This is where the second repository, lc0_vic (Logical Controller Zero / Virtual Intelligent Controller), becomes crucial. It's a working, open-source Python reference implementation for the exact ideas detailed in the landscape repo.

The project is a direct response to KIOXIA’s research on AiSAQ (All-in-Storage ANNS with Product Quantization). This algorithm allows for approximate nearest neighbor (ANN) vector search directly on flash, without the need to store indexes in costly DRAM. We call this the "Tiered filesystem retrieval" architecture, and we break it down like this:

  • L0: Metadata scanning, the first pass at understanding your data.
  • L1: Vector tier, where content is converted into searchable embeddings.
  • L2: Optional deep parsing (Skillware) for complex extraction (e.g., OCR, media parsing, and more).
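The tiers above can be sketched as a cheap-to-expensive pipeline. The function names and the bag-of-words "embedding" below are illustrative assumptions, not the repo's actual API; real L1 would use AiSAQ-style vector search over learned embeddings:

```python
def l0_scan(files, ext_allow=(".txt", ".md")):
    """L0: metadata-only filter -- no file contents are read yet."""
    return [f for f in files if f["name"].endswith(ext_allow)]

def l1_vector(candidates, query):
    """L1: rank survivors by a toy word-overlap score (stand-in for
    embedding similarity over a flash-resident vector index)."""
    q = set(query.lower().split())
    def score(f):
        words = set(f["content"].lower().split())
        return len(q & words) / (len(q | words) or 1)
    return sorted(candidates, key=score, reverse=True)

def l2_deep_parse(file):
    """L2: optional deep extraction (OCR, media parsing) -- stubbed here."""
    return {"path": file["name"], "summary": file["content"][:40]}

files = [
    {"name": "report.txt", "content": "quarterly revenue numbers"},
    {"name": "photo.jpg", "content": ""},
    {"name": "notes.txt", "content": "meeting about revenue forecast"},
]
candidates = l0_scan(files)                      # drops photo.jpg at the metadata tier
ranked = l1_vector(candidates, "revenue forecast")
print(ranked[0]["name"])                         # 'notes.txt'
```

Each tier discards work for the next: L0 touches only metadata, L1 touches vectors, and L2 runs only when deep extraction is actually needed.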

This architecture is orchestrated by a controller that creates a QueryPlan, enabling you to run natural language queries against your local file system. The user experience is simple: you can pip install the tool, run vic index to build your search index, and ask a question via vic ask. This elegantly proves the concept outlined in the first repo by making it tangible. As the repo notes, it's "more than a paper design," featuring full CI and integration tests to validate the logic.
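The shape of such a QueryPlan might look like the following. The field names and the escalation heuristic are purely illustrative assumptions, not lc0_vic's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class QueryPlan:
    """Hypothetical plan object: which tiers to run, and how many results to keep."""
    query: str
    tiers: list = field(default_factory=lambda: ["L0", "L1"])
    top_k: int = 5
    deep_parse: bool = False  # escalate to L2 only when the query demands it

def plan_query(question: str) -> QueryPlan:
    # Toy heuristic: escalate to the L2 tier when the question implies OCR or media work.
    needs_deep = any(w in question.lower() for w in ("image", "scan", "audio"))
    tiers = ["L0", "L1"] + (["L2"] if needs_deep else [])
    return QueryPlan(query=question, tiers=tiers, deep_parse=needs_deep)

print(plan_query("find my tax documents").tiers)               # ['L0', 'L1']
print(plan_query("find the scanned image of the deed").tiers)  # ['L0', 'L1', 'L2']
```

A controller that plans per-query keeps the common case cheap: most questions never touch the expensive deep-parsing tier.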

The Road Ahead for Intelligent Storage

The lc0_vic repository is explicit that it runs on the host computer today, but its research goal is to explore whether these retrieval contracts can be mapped to firmware or device-adjacent runtimes. This is the bridge between the two repos: the landscape repo provides the where (SSD controllers), and the lc0_vic repo provides the how (tiered retrieval and in-storage vector search).

The combination of these two projects paints a clear picture. As data centers accumulate exabytes of flash storage, the idea of a "smart SSD" that can pre-process data, run vector searches, and answer questions without waking the host CPU isn't just efficient, but inevitable. The era of silent storage is ending, and the era of conversational storage is only beginning.

We will be working on a lite demo of the reference implementation to showcase how you can query a local folder or SSD using natural language and get structured results with descriptions, rather than cold, keyword-based path matching.
