We are living in an era driven by artificial intelligence. Large models like ChatGPT and Sora are reshaping industry after industry. But these models have huge “appetites”: their massive parameter counts pose severe challenges to computing and storage systems. When a chip can compute faster than data can be supplied to it, the so-called “memory wall” bottleneck forms. High-Bandwidth Flash (HBF) technology is a key innovation born precisely to break through this wall, injecting powerful momentum into AI systems at a more economical cost.
What is HBF? How does it achieve the “best of both worlds”?
High-Bandwidth Flash is not an entirely new storage medium, but rather a clever architectural innovation. Its core idea is to combine the 3D NAND flash memory found in everyday devices (phones and solid-state drives) with the advanced packaging and interconnection technology of HBM, which is widely used in high-performance computing. Simply put, HBF aims to give high-capacity, low-cost flash memory data-transfer speeds close to those of high-end memory, thereby achieving an ideal balance between capacity, bandwidth, and cost. Achieving this goal relies primarily on two major technological breakthroughs.
The first is 3D stacking and TSV interconnection technology. The success of HBM lies in vertically stacking multiple DRAM dies like building blocks and connecting them with Through-Silicon Vias (TSVs), tiny vertical channels that enable high-speed communication between layers. HBF borrows this concept, stacking multiple NAND flash dies at high density. This design significantly shortens internal data paths, increases integration density, and lays the foundation for high bandwidth.
The second, and more critical, breakthrough is the parallel sub-array architecture. Traditional NAND flash offers large capacity, but the number of channels that can read and write data simultaneously is limited. This is like a wide road with very few entrances and exits, prone to congestion. HBF reworks the core structure of the flash memory, dividing it into a large number of storage sub-arrays that can operate independently and in parallel, each with its own read/write channel. When hundreds or thousands of such sub-arrays work simultaneously, it is equivalent to expanding a single-lane road into a network with thousands of lanes, letting the data deluge flow out unimpeded and achieving a leap in total bandwidth.
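A rough back-of-envelope model makes this arithmetic concrete. In the Python sketch below, the per-sub-array throughput, the sub-array count, and the channel-efficiency factor are all illustrative assumptions, not published HBF specifications:

```python
# Back-of-envelope model of how parallel sub-arrays multiply total read
# bandwidth. All figures are illustrative assumptions, not vendor specs.

def aggregate_read_bandwidth_gb_s(per_subarray_mb_s: float,
                                  num_subarrays: int,
                                  channel_efficiency: float = 0.8) -> float:
    """Total read bandwidth in GB/s when sub-arrays stream in parallel.

    channel_efficiency models interconnect/protocol overhead (assumed).
    """
    return per_subarray_mb_s * num_subarrays * channel_efficiency / 1000

# A single NAND sub-array might stream on the order of tens of MB/s (assumed).
one = aggregate_read_bandwidth_gb_s(per_subarray_mb_s=50, num_subarrays=1)
many = aggregate_read_bandwidth_gb_s(per_subarray_mb_s=50, num_subarrays=4096)

print(f"1 sub-array:     {one:.2f} GB/s")   # ~0.04 GB/s
print(f"4096 sub-arrays: {many:.0f} GB/s")  # ~164 GB/s, approaching HBM territory
```

No single sub-array is fast; the leap in total bandwidth comes purely from the multiplication across thousands of independent channels.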
To more clearly demonstrate HBF’s unique positioning, the table below compares its key characteristics with its “predecessors” HBM and traditional NAND SSDs:
| Characteristic | HBF | HBM | Traditional NAND SSD |
|---|---|---|---|
| Core Advantage | High capacity, high bandwidth, low cost | Extremely high bandwidth, ultra-low latency | High capacity, very low cost |
| Typical Capacity | Up to 512GB per stack | Approx. 24-48GB per stack | 1TB-2TB per drive |
| Bandwidth Level | Very high, close to HBM | Extreme | Relatively low |
| Cost per Unit | Relatively low | Very high | Very low |
| Best Application | AI inference, read-intensive tasks | AI training, high-performance computing | Data storage, archiving |
This comparison intuitively shows that HBF fills the market gap between HBM and traditional SSDs. It does not match HBM’s ultra-low latency or extreme write speed, but it offers far greater capacity at much lower cost. Compared to traditional SSDs, it delivers orders-of-magnitude higher bandwidth, making it purpose-built for scenarios that require fast reading of massive data.
Technical Characteristics, Advantages, and Challenges
The value of High-Bandwidth Flash lies in its unique combination of technical characteristics, which define both its current capabilities and its primary application directions. It is not a universal solution, but it excels within its domain. Its current status is best presented by contrasting its advantages with its challenges.
Three Core Advantages of HBF
- **Huge Capacity and Cost Advantage:** In the same physical space, a single HBF stack can provide up to 512GB of capacity, more than 10 times that of HBM. Built on NAND flash, which has a far lower cost per bit, it can significantly reduce the total cost of ownership of AI systems (see the sizing sketch after this list).
- **High Read Bandwidth and Energy Efficiency:** Through its parallel architecture, its read bandwidth can approach HBM levels, meeting the fast-read needs of tasks like AI inference. Meanwhile, its static power consumption is much lower than that of DRAM, which requires constant refreshing.
- **Precise Market Positioning:** It fills the gap between HBM and traditional SSDs, providing an ideal solution for read-intensive applications that are sensitive to capacity and cost.
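A minimal sizing sketch shows what that capacity gap means in practice. The 512GB HBF stack size comes from the text above, the HBM figure is the midpoint of the 24-48GB range in the comparison table, and the model size is an illustrative assumption:

```python
# Sizing sketch: stacks needed to hold one large model entirely in memory.
# HBF stack size (512GB) comes from the text; the HBM figure is the
# midpoint of the table's 24-48GB range; the model size is assumed.
import math

MODEL_SIZE_GB = 1000  # e.g., a ~500B-parameter model at 16-bit weights (assumed)
HBF_STACK_GB = 512
HBM_STACK_GB = 36

hbf_stacks = math.ceil(MODEL_SIZE_GB / HBF_STACK_GB)  # -> 2
hbm_stacks = math.ceil(MODEL_SIZE_GB / HBM_STACK_GB)  # -> 28

print(f"Stacks needed: HBF={hbf_stacks}, HBM={hbm_stacks}")
# An order-of-magnitude gap in stack count, before NAND's lower
# cost per bit relative to DRAM is even taken into account.
```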
Main Challenges Facing HBF
- **Write Speed and Endurance Limitations:** These are inherent characteristics of NAND flash. HBF’s write speed is much slower than HBM’s, and its cells endure only a limited number of erase/write cycles. It is therefore unsuitable for AI model training scenarios that require frequent data writing.
- **Higher Access Latency:** Its access latency is at the microsecond level. While this has little impact on many bulk-read tasks (see the arithmetic sketch after this list), it is still far higher than HBM’s nanosecond-level latency, ruling out applications that are extremely sensitive to latency.
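Simple arithmetic shows why microsecond latency is tolerable for bulk reads yet disqualifying for fine-grained access. In the Python sketch below, the latency and bandwidth figures are illustrative assumptions consistent with the orders of magnitude above, not measured values:

```python
# One read = fixed access latency + streaming transfer time.
# Latency and bandwidth figures are illustrative assumptions.

def read_time_us(size_mb: float, latency_us: float,
                 bandwidth_gb_s: float) -> float:
    transfer_us = (size_mb / 1000) / bandwidth_gb_s * 1_000_000
    return latency_us + transfer_us

# Streaming a 64MB slab of model weights (a bulk inference read, assumed):
bulk = read_time_us(size_mb=64, latency_us=20, bandwidth_gb_s=100)
# Fetching a tiny 4KB record (fine-grained random access):
tiny = read_time_us(size_mb=0.004, latency_us=20, bandwidth_gb_s=100)

print(f"64MB read: {bulk:.0f} us  (latency is ~3% of the total)")
print(f"4KB read:  {tiny:.2f} us (latency dominates almost entirely)")
```

For large sequential reads the fixed latency is amortized away by the transfer time; for small random accesses it is the whole cost, which is exactly where HBM remains irreplaceable.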
In summary, an accurate understanding of HBF’s current status is: it is a high-performance storage solution optimized for read-intensive tasks. It is not a replacement for HBM, but a powerful complement. Its value lies in leveraging its strengths and avoiding its weaknesses to solve specific problems.
Future Outlook for HBF
Based on its characteristics of “large capacity, high read bandwidth, low cost, and limited write endurance,” HBF’s future development path is very clear. It will not replace HBM’s position in training, but will carve out its own niche market and form a complementary ecosystem with existing technologies.
Core Application Scenarios
The killer applications for HBF are mainly concentrated in the following directions:
AI Edge Inference Servers. This is the most promising application scenario for HBF. Deploying AI models on edge servers for inference produces workloads that are almost purely reads: the pre-trained model parameters are fetched over and over again. This perfectly matches HBF’s strengths of high read bandwidth and huge capacity while sidestepping its weaknesses of slow writes and limited endurance. In addition, HBF’s low power consumption suits energy-sensitive edge environments well.
Forming a heterogeneous or hybrid memory architecture with HBM. In cloud data centers, HBF can serve as an effective capacity extension for HBM. In this model, HBM acts as a high-speed cache holding the “hot data” most urgently needed for the current computation, while the complete, massive AI model resides in HBF. Different parts of the model parameters are then prefetched at high speed from HBF into HBM as needed. This combination offers an attractive balance between performance, capacity, and total cost.
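A minimal sketch of this tiering pattern, assuming the model is prefetched at layer granularity. The class, tier sizes, and LRU eviction policy below are illustrative assumptions for exposition, not a vendor API:

```python
# Minimal sketch of the HBM-as-cache-over-HBF pattern described above.
# Tier sizes, layer granularity, and the LRU policy are illustrative
# assumptions for exposition, not a vendor API.
from collections import OrderedDict

class HybridModelStore:
    def __init__(self, hbm_capacity_layers: int):
        self.hbf = {}             # capacity tier: the complete model lives here
        self.hbm = OrderedDict()  # bandwidth tier: small LRU cache of hot layers
        self.capacity = hbm_capacity_layers

    def load_model(self, layers: dict) -> None:
        """Write the complete model into HBF once, at load time."""
        self.hbf = dict(layers)

    def prefetch(self, layer_id: str) -> None:
        """Stage an upcoming layer's weights from HBF into HBM ahead of use."""
        if layer_id in self.hbm:
            self.hbm.move_to_end(layer_id)       # already hot; refresh position
            return
        if len(self.hbm) >= self.capacity:
            self.hbm.popitem(last=False)         # evict the coldest layer
        self.hbm[layer_id] = self.hbf[layer_id]  # the HBF -> HBM copy

    def get(self, layer_id: str):
        """Serve reads from HBM; fall back to an on-demand fetch from HBF."""
        if layer_id not in self.hbm:
            self.prefetch(layer_id)              # slow path: fetch, then serve
        self.hbm.move_to_end(layer_id)
        return self.hbm[layer_id]

# Usage: while layer i computes out of HBM, layer i+1 is prefetched from HBF,
# hiding HBF's microsecond latency behind compute.
store = HybridModelStore(hbm_capacity_layers=4)
store.load_model({f"layer_{i}": f"<weights {i}>" for i in range(32)})
for i in range(32):
    store.prefetch(f"layer_{(i + 1) % 32}")
    _ = store.get(f"layer_{i}")
```

The design choice mirrors the text: reads against HBF are large, sequential, and overlapped with compute, so its microsecond latency and limited write endurance never sit on the critical path.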
Looking further ahead, HBF technology is expected to trickle down to end-user devices. As AI PCs and high-end smartphones demand stronger local AI capabilities, integrating HBF could enable these devices to run larger parameter models locally, reducing reliance on the cloud and better protecting user privacy.
Technology Development and Industrialization Process
In terms of technology development, leading companies have already laid out clear roadmaps. For example, SanDisk plans continuous iteration across three product generations, with goals that include pushing per-stack capacity beyond 512GB and doubling read bandwidth from current levels, while continuously optimizing energy efficiency.
The industrialization process has also begun. The collaboration between SK Hynix and SanDisk marks a key step for HBF moving from R&D to industrialization. The industry generally anticipates that HBF module samples will be available in the second half of 2026, and that the first AI inference servers integrating HBF will debut in early 2027. Market analysts predict that by 2030, HBF could grow into a market worth tens of billions of US dollars. Although smaller in scale than HBM, it will become an indispensable part of AI infrastructure.
In conclusion, HBF’s future lies in being a key piece in the AI storage ecosystem. Its development will closely revolve around read-intensive tasks, gradually empowering the next generation of AI applications from the cloud to the edge.