Stock Markets March 25, 2026

Memory Stocks Dip After Google Reveals TurboQuant Compression for AI

New algorithm promises steep cuts in key-value cache size and faster performance on H100 GPUs, pressuring NAND and DRAM-related names

By Leila Farooq

Shares of several major memory and storage companies fell Wednesday following Google's announcement of TurboQuant, a compression approach that can reduce key-value cache size to 3 bits without retraining and that showed marked memory and performance improvements in tests. The decline in select memory-related stocks occurred even as the Nasdaq 100 moved higher.

Key Points

  • Google introduced TurboQuant, a compression algorithm that can compress key-value cache to 3 bits without retraining while preserving model accuracy.
  • Testing on open-source models including Gemma and Mistral showed a 6x reduction in key-value memory size and up to an 8x performance increase on H100 GPUs compared to unquantized keys.
  • Major memory and storage stocks - SNDK, MU, WDC and STX - fell even as the Nasdaq 100 advanced, reflecting investor sensitivity to factors that could reduce hardware demand.

Shares of memory and storage companies moved lower Wednesday after Google disclosed TurboQuant, a compression algorithm aimed at shrinking memory needs for large language models and vector search workloads. The weakness among those stocks unfolded while the Nasdaq 100 posted gains.

Market moves: SanDisk Corporation (NASDAQ:SNDK) fell 5.7%, Micron Technology (NASDAQ:MU) slipped 3%, Western Digital (NASDAQ:WDC) lost 4.7% and Seagate Technology (NASDAQ:STX) declined 4%. These declines came amid strength in the broader technology index.

What Google announced: The company said TurboQuant is designed to address bottlenecks in the key-value cache, the store of attention keys and values that a model reuses for every previously processed token during inference. According to the announcement, TurboQuant can compress the key-value cache to 3 bits per value without requiring additional training or fine-tuning while maintaining model accuracy.

Google reported that tests on open-source models including Gemma and Mistral produced a roughly 6x reduction in key-value memory size. On H100 GPU accelerators, the algorithm demonstrated as much as an 8x performance increase compared with unquantized keys.
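To put those figures in rough context, here is a back-of-envelope sizing sketch. The model dimensions below are hypothetical, not from Google's announcement; the computed ratio is simply the raw bit-width ratio of a 16-bit baseline to 3 bits (16/3 ≈ 5.3x), and the reported ~6x figure would depend on baseline precision and overhead details that were not disclosed.

```python
# Illustrative KV-cache sizing; layer/head counts are hypothetical,
# chosen only to show the arithmetic, not taken from the announcement.
def kv_cache_bytes(layers, heads, head_dim, seq_len, bits_per_value):
    # Keys and values each hold layers * heads * head_dim numbers per token.
    num_values = 2 * layers * heads * head_dim * seq_len
    return num_values * bits_per_value / 8

fp16 = kv_cache_bytes(layers=32, heads=8, head_dim=128, seq_len=32_768, bits_per_value=16)
q3 = kv_cache_bytes(layers=32, heads=8, head_dim=128, seq_len=32_768, bits_per_value=3)
print(f"fp16: {fp16 / 2**30:.1f} GiB, 3-bit: {q3 / 2**30:.2f} GiB, ratio: {fp16 / q3:.1f}x")
# → fp16: 4.0 GiB, 3-bit: 0.75 GiB, ratio: 5.3x
```

Even at these modest hypothetical dimensions, a long-context cache runs to gigabytes per request, which is why per-value bit counts matter to accelerator memory budgets.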

How the method works: TurboQuant operates in two stages. First, it applies PolarQuant, a process that rotates data vectors to enable higher-quality compression. Second, it uses the Quantized Johnson-Lindenstrauss algorithm to reduce the residual error left after the initial quantization. Google noted that conventional vector quantization techniques often add an extra 1 to 2 bits per value of memory overhead, which can partially offset compression gains.
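The two-stage idea can be sketched generically: rotate the vectors, quantize coarsely, then quantize what is left over. The sketch below is only in the spirit of the description; PolarQuant's actual rotation and the Quantized Johnson-Lindenstrauss step are not public in the announcement, so a random orthogonal rotation and a uniform 3-bit grid stand in for both.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(dim):
    # QR decomposition of a Gaussian matrix gives a random orthogonal matrix,
    # a common stand-in for a structured rotation like PolarQuant's.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def quantize_uniform(x, bits):
    # Snap values to a uniform grid with 2**bits levels, then dequantize.
    levels = 2 ** bits
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (levels - 1)
    return np.round((x - lo) / scale) * scale + lo

def two_stage(x, rot, bits=3):
    y = x @ rot                                     # stage 1: rotate first
    coarse = quantize_uniform(y, bits)              # coarse 3-bit pass
    residual = quantize_uniform(y - coarse, bits)   # stage 2: quantize the residual
    return (coarse + residual) @ rot.T              # undo the rotation

dim = 64
keys = rng.standard_normal((16, dim))               # toy "key" vectors
recon = two_stage(keys, random_rotation(dim))
err = np.linalg.norm(keys - recon) / np.linalg.norm(keys)
print(f"relative reconstruction error: {err:.3f}")
```

Because the residual after the first pass spans a much narrower range, the second quantization pass shrinks the error substantially; that is the intuition behind chasing residual error with a second stage rather than spending more bits on the first.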

Research and benchmarking: TurboQuant is slated for presentation at ICLR 2026, and the PolarQuant component is scheduled for AISTATS 2026. The company tested the method across a range of benchmarks, including LongBench, Needle In A Haystack, ZeroSCROLLS, RULER and L-Eval.

Wider uses: Beyond model inference, Google highlighted applications for vector search functionality that underpin large-scale search engines, indicating the technique may affect systems that rely on dense vector representations.

Implications for memory suppliers: Memory-related equities have recorded substantial gains year to date, a trend that leaves them exposed to technological advances that could lower hardware demand. The announcement of a compression technique capable of materially reducing key-value memory footprints is an example of such a development, and it was met with immediate selling in several suppliers' shares.

Market context: While TurboQuant's testing results on specific open-source models and on H100 hardware were presented by Google, the company also acknowledged tradeoffs inherent in prior quantization approaches. The research schedule and benchmark coverage were disclosed as part of the announcement.

Investors watching the memory and storage sector will likely monitor further technical details, broader testing results and adoption signals to assess whether the algorithm materially alters hardware demand dynamics for AI deployments.

Risks

  • Reduced demand risk for memory and storage suppliers if TurboQuant or similar techniques are widely adopted, affecting revenue for companies in the semiconductor and storage sectors.
  • Market volatility risk for memory stocks, which have rallied significantly year to date and may be sensitive to technological developments that change capacity requirements.
  • Uncertainty around broader adoption and real-world performance outside the tested models and benchmarks, which could affect how quickly any demand shift materializes in enterprise procurement cycles.
