Microsoft Azure has set a new benchmark in AI infrastructure by launching the world’s first large-scale NVIDIA GB300 NVL72 supercomputing cluster, designed specifically to meet the computational demands of OpenAI. This next-generation cluster delivers unparalleled scale, performance, and efficiency for training and deploying frontier AI models.
Key Takeaways
- More than 4,600 NVIDIA Blackwell Ultra GPUs deployed across GB300 NVL72 rack-scale systems, each of which houses 72 GPUs.
- The cluster enables the training of models with hundreds of trillions of parameters, reducing model development time from months to weeks.
- Advanced cooling, networking, and memory architectures are engineered for performance and sustainability.
- The collaboration strengthens Microsoft and NVIDIA’s leadership in providing next-generation AI infrastructure for OpenAI.
The World’s First At-Scale GB300 NVL72 Cluster
Microsoft’s new Azure cluster is the first in the world to deploy NVIDIA’s latest GB300 NVL72 systems at supercomputing scale. Each rack houses 72 Blackwell Ultra GPUs and 36 Grace CPUs, whose combined memory forms a unified pool of up to 40TB of fast memory per rack. Each rack also offers up to 1.44 exaflops of FP4 Tensor Core performance.
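To put the per-rack figures in context, the short Python sketch below scales them up to a cluster of a given GPU count. It is a back-of-the-envelope illustration, not an official sizing: the function names are hypothetical, the per-rack values are the "up to" figures quoted above, and "more than 4,600 GPUs" is treated as a lower bound that is rounded up to whole racks.

```python
import math

# Per-rack figures quoted in the article (all "up to" values).
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36
FAST_MEMORY_TB_PER_RACK = 40
FP4_PETAFLOPS_PER_RACK = 1440  # 1.44 exaflops expressed in petaflops


def racks_needed(gpu_count):
    """Smallest number of whole racks that holds gpu_count GPUs."""
    return math.ceil(gpu_count / GPUS_PER_RACK)


def cluster_totals(gpu_count):
    """Scale the per-rack figures to a cluster built from whole racks.

    Assumption: every rack is fully populated, so the totals are
    simple multiples of the per-rack numbers.
    """
    racks = racks_needed(gpu_count)
    return {
        "racks": racks,
        "gpus": racks * GPUS_PER_RACK,
        "cpus": racks * CPUS_PER_RACK,
        "fast_memory_tb": racks * FAST_MEMORY_TB_PER_RACK,
        "fp4_petaflops": racks * FP4_PETAFLOPS_PER_RACK,
    }


if __name__ == "__main__":
    # 4,600 is the lower bound given in the Key Takeaways.
    for name, value in cluster_totals(4600).items():
        print(f"{name}: {value}")
```

Under these assumptions, 4,600 GPUs round up to 64 racks, which corresponds to 4,608 GPUs, 2,304 Grace CPUs, about 2,560TB of fast memory, and roughly 92 exaflops of peak FP4 performance. These are theoretical peaks derived from the quoted figures, not measured results.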
Advanced Networking and Performance
The cluster uses fifth-generation NVIDIA NVLink Switch fabric within racks, providing 130TB/s of aggregate bandwidth across the 72 GPUs in a rack, or roughly 1.8TB/s per GPU. For inter-rack communication, the NVIDIA Quantum-X800 InfiniBand network delivers 800Gbps of connectivity per GPU, keeping latency low and performance scalable across thousands of GPUs.
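The two networking figures use different units (terabytes per second for NVLink, gigabits per second for InfiniBand), so a direct comparison takes a little arithmetic. The Python sketch below works it through. It assumes the 130TB/s NVLink figure is the aggregate across all 72 GPUs in a rack and that every GPU has its own 800Gbps InfiniBand link; the function names are hypothetical and the results are peak values, not measured throughput.

```python
GPUS_PER_RACK = 72
NVLINK_TB_PER_S_PER_RACK = 130    # terabytes/s, assumed rack-wide aggregate
INFINIBAND_GBIT_PER_S_PER_GPU = 800  # gigabits/s per GPU


def nvlink_tb_per_s_per_gpu():
    """Even split of the rack's NVLink bandwidth across its GPUs."""
    return NVLINK_TB_PER_S_PER_RACK / GPUS_PER_RACK


def infiniband_gb_per_s_per_gpu():
    """Convert the per-GPU InfiniBand rate from gigabits to gigabytes."""
    return INFINIBAND_GBIT_PER_S_PER_GPU / 8


def infiniband_tb_per_s_per_rack():
    """Total scale-out bandwidth leaving one rack, in terabytes/s."""
    return GPUS_PER_RACK * infiniband_gb_per_s_per_gpu() / 1000


def intra_to_inter_rack_ratio():
    """How much more bandwidth is available inside a rack than out of it."""
    return NVLINK_TB_PER_S_PER_RACK / infiniband_tb_per_s_per_rack()


if __name__ == "__main__":
    print(f"NVLink per GPU:      {nvlink_tb_per_s_per_gpu():.2f} TB/s")
    print(f"InfiniBand per GPU:  {infiniband_gb_per_s_per_gpu():.0f} GB/s")
    print(f"InfiniBand per rack: {infiniband_tb_per_s_per_rack():.1f} TB/s")
    print(f"Intra/inter ratio:   {intra_to_inter_rack_ratio():.1f}x")
```

On these assumptions, each GPU has about 1.8TB/s of NVLink bandwidth inside the rack and 100GB/s of InfiniBand bandwidth beyond it, a gap of roughly 18x. That gap is one reason communication-heavy work is typically kept within a rack where possible.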
Recent MLPerf benchmarks highlight the GB300’s superior performance, showing up to 5x higher throughput on massive models compared to previous architectures. This positions Microsoft and OpenAI at the forefront of AI innovation, particularly for advanced reasoning and generative AI workloads.
Engineering for Scale and Sustainability
Building this cluster required Microsoft to redesign nearly every aspect of its AI data centers. Innovations include:
- Liquid cooling systems: Standalone heat exchanger units minimize water usage and maintain stability in dense computing environments.
- Next-generation power distribution: Custom solutions support the high energy needs and dynamic load balancing of accelerated AI workloads.
- Unified software orchestration: Enhanced scheduling and storage systems maximize resource efficiency at supercomputing scale.
Ushering in the Next Era of AI with OpenAI
This milestone is the result of years of collaboration between Microsoft and NVIDIA, reimagining everything from hardware to software stacks. The cluster gives OpenAI the infrastructure to build, train, and deploy the next generation of AI models at a scale previously thought unattainable.
Looking ahead, Microsoft plans to expand these GB300 deployments globally, promising faster model development cycles and the capacity to support ever-larger and more sophisticated AI systems. This launch cements Microsoft Azure as a leader in cloud-based AI infrastructure, powering the frontier of artificial intelligence for OpenAI and beyond.


