Microsoft has introduced the Maia 200, an AI accelerator specifically engineered for inference workloads, and said the chip will be used to run OpenAI's latest GPT-5.2 models. The company described the Maia 200 as the "most performant, first-party silicon from any hyperscaler," and highlighted a set of technical specifications intended to improve the economics of AI token generation.
The Maia 200 is manufactured on TSMC's 3nm process and includes native FP8/FP4 tensor cores, 216GB of HBM3e memory, and 272MB of on-chip SRAM. Microsoft asserts the design delivers more than 10 petaFLOPS at 4-bit precision and more than 5 petaFLOPS at 8-bit precision, all within a 750W power envelope. The company also positioned the new chip as more powerful than Amazon's third-generation Trainium and Google's seventh-generation TPU.
Initial deployment is already underway at Microsoft's Iowa datacenter, with plans to expand the Maia 200 to additional regions. Microsoft is also previewing a Maia software development kit that it says will help developers tune and optimize models for the new hardware.
Market response to the announcement was muted. Nvidia's stock was little changed on Monday after Microsoft disclosed the Maia 200. Investors appear to be reserving judgment on whether Microsoft's entry into the inference-chip space will materially erode Nvidia's leading position in the AI hardware market, a position that has underpinned the company's rapid growth over the past year.
For now, Microsoft's disclosures cover the chip's manufacturing node, its memory and on-chip SRAM capacities, its tensor-core precision support, claimed petaFLOPS figures at specified bit widths, the 750W power constraint, and the initial Iowa deployment, along with the developer toolkit being made available as part of the introduction. Observers and market participants will be watching how broadly the Maia 200 is rolled out and how quickly software ecosystems adapt to the new silicon.
Summary
Microsoft launched the Maia 200 inference accelerator built on TSMC's 3nm process with native FP8/FP4 tensor cores, 216GB of HBM3e memory, and 272MB of on-chip SRAM. The company claims performance exceeding 10 petaFLOPS at 4-bit precision and over 5 petaFLOPS at 8-bit within a 750W envelope, and has deployed the chip in its Iowa datacenter while previewing a Maia SDK for developers. Nvidia's stock saw little movement on Monday following the announcement.