Kioxia transforms AI inference systems with innovative storage solution
- Kioxia launched the AiSAQ project to enhance the use of SSDs in AI inference systems.
- AiSAQ minimizes DRAM requirements by moving database vectors entirely into storage.
- This approach enables more scalable RAG AI solutions and improves model accuracy.
In April 2025, Kioxia's Rory Bolt presented the AiSAQ initiative, an open-source project aimed at advancing the role of solid-state drives (SSDs) in retrieval-augmented generation (RAG) AI solutions. The initiative seeks to shift the industry's focus from the intensive, costly generation of foundational models to scalable, efficient inference systems that harness existing data for practical applications. This transition aligns with growing demand for AI technologies that use real-time information while minimizing hardware requirements. Kioxia itself has invested in AI since 2017, applying machine vision to its NAND fabrication processes to track production trends and defect rates.

The report contrasts traditional methods, such as hard drives in large data centers, with the increasingly popular in-house solutions that leverage SSDs for training. These systems often run foundational large language models (LLMs) that are fine-tuned with up-to-date data to improve performance and reduce instances of misinformation, or hallucinations.

RAG can be implemented through a range of strategies. At one extreme, the database index and vectors reside entirely in DRAM, which can be prohibitively expensive given the memory required. Microsoft's DiskANN approach shifts a significant portion of the database content to SSDs, decreasing DRAM demands. Kioxia aims to push scalability further with AiSAQ, which relocates database vectors fully into storage, sparing DRAM as database sizes grow.

In early July 2025, Kioxia announced additional enhancements to the AiSAQ framework. The update introduces flexible controls that let system architects balance search performance against storage capacity, allowing better customization of RAG systems without hardware changes.
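To make the DRAM-versus-storage tradeoff concrete, the following is a minimal, hypothetical sketch (not Kioxia's actual AiSAQ implementation) of the general idea of storage-resident vectors: the database lives in a file accessed via `numpy.memmap`, so a brute-force nearest-neighbor search streams vectors from disk rather than holding them all in DRAM. Names like `build_store` and `search` are illustrative inventions.

```python
import numpy as np


def build_store(path, num_vectors=1000, dim=64, seed=0):
    """Write a synthetic vector database to disk (stand-in for a real index)."""
    rng = np.random.default_rng(seed)
    vecs = rng.standard_normal((num_vectors, dim)).astype(np.float32)
    mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(num_vectors, dim))
    mm[:] = vecs
    mm.flush()  # persist to storage
    return num_vectors, dim


def search(path, num_vectors, dim, query, k=5):
    """Brute-force top-k search over storage-resident vectors.

    Vectors are paged in from disk on demand; only the query and the
    distance array need to stay in memory, keeping DRAM use small
    relative to the database size.
    """
    mm = np.memmap(path, dtype=np.float32, mode="r", shape=(num_vectors, dim))
    dists = np.linalg.norm(mm - query, axis=1)  # streams pages from the file
    return np.argsort(dists)[:k]
```

A real system such as DiskANN or AiSAQ adds an approximate-nearest-neighbor graph and quantization on top of this layout so that only a handful of storage reads are needed per query, rather than a full scan.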
As a result, this approach improves scalability within RAG workflows and enhances model accuracy, marking a significant advance in AI storage solutions.