Microsoft has introduced the Maia 200, an AI accelerator specifically engineered for inference workloads, and said the chip will be used to run OpenAI's latest GPT-5.2 models. The company described the Maia 200 as the "most performant, first-party silicon from any hyperscaler," and highlighted a set of technical specifications intended to improve the economics of AI token generation.
The Maia 200 is manufactured on TSMC's 3nm process and includes native FP8/FP4 tensor cores, 216GB of HBM3e memory, and 272MB of on-chip SRAM. Microsoft asserts the design delivers more than 10 petaFLOPS at 4-bit precision and more than 5 petaFLOPS at 8-bit precision, all within a 750W power envelope. The company also positioned the new chip as more powerful than Amazon's third-generation Trainium and Google's seventh-generation TPU.
Initial deployment is already underway at Microsoft's Iowa datacenter, with plans to expand the Maia 200 to additional regions. Microsoft is also previewing a Maia software development kit that it says will help developers tune and optimize models for the new hardware.
Market response to the announcement was muted. Nvidia's stock was little changed on Monday after Microsoft disclosed the Maia 200. Investors appear to be reserving judgment on whether Microsoft's entry into the inference-chip space will materially erode Nvidia's leading position in the AI hardware market, a position that has underpinned the company's rapid growth over the past year.
For now, Microsoft's disclosures cover the chip's manufacturing node, its memory and on-chip SRAM capacities, its tensor-core precision support, claimed petaFLOPS figures at specified bit widths, the 750W power constraint, and the initial Iowa deployment, along with the developer toolkit being made available as part of the introduction. Observers and market participants will be watching how broadly the Maia 200 is rolled out and how quickly software ecosystems adapt to the new silicon.
Summary
Microsoft launched the Maia 200 inference accelerator built on TSMC's 3nm process with native FP8/FP4 tensor cores, 216GB of HBM3e memory, and 272MB of on-chip SRAM. The company claims performance exceeding 10 petaFLOPS at 4-bit precision and over 5 petaFLOPS at 8-bit within a 750W envelope, and has deployed the chip in its Iowa datacenter while previewing a Maia SDK for developers. Nvidia's stock saw little movement on Monday following the announcement.