SANTA CLARA — NVIDIA and Microsoft Azure have introduced a new GPU-accelerated supercomputer in the Microsoft Azure cloud. The offering is intended to broaden access to supercomputing power that was previously available only to large enterprises and organizations.
The new offering, the Microsoft Azure NDv2 instance, can scale up to 800 NVIDIA V100 Tensor Core GPUs interconnected on a single Mellanox InfiniBand backend network. This lets users run complex AI and high-performance computing (HPC) applications at a scale that rivals on-premises supercomputers, which typically take far longer to deploy.
Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA, said, “Until now, access to supercomputers for AI and high-performance computing has been reserved for the world’s largest businesses and organizations.”
Girish Bablani, corporate vice president of Azure Compute at Microsoft Corp., said, “As cloud computing gains momentum everywhere, customers are seeking more powerful services.”
The Microsoft Azure NDv2 instance offers performance and cost advantages over traditional CPU-based computing for AI, machine learning, and HPC workloads. AI researchers, for example, can use multiple NDv2 instances to train complex conversational AI models in a fraction of the time previously required. Microsoft and NVIDIA engineers trained BERT, a popular conversational AI model, in approximately three hours using 64 NDv2 instances. The run relied on optimizations in NCCL, an NVIDIA CUDA X™ library for multi-GPU communication, and on high-speed Mellanox interconnects.
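To put the instance and GPU counts in proportion, the arithmetic can be sketched as follows. The figure of 8 V100 GPUs per NDv2 instance is an assumption that is not stated in this article, and the helper names are hypothetical, chosen only for illustration.

```python
# Rough sizing arithmetic for the figures quoted in the article.
# ASSUMPTION (not stated in the article): one NDv2 instance has 8 V100 GPUs.
GPUS_PER_INSTANCE = 8

MAX_CLUSTER_GPUS = 800    # from the article: up to 800 V100 GPUs on one network
BERT_RUN_INSTANCES = 64   # from the article: instances used to train BERT


def gpus_for(instances: int, gpus_per_instance: int = GPUS_PER_INSTANCE) -> int:
    """Total GPUs provided by a given number of instances."""
    return instances * gpus_per_instance


def instances_for(gpus: int, gpus_per_instance: int = GPUS_PER_INSTANCE) -> int:
    """Instances needed to reach a GPU count, rounded up."""
    return -(-gpus // gpus_per_instance)  # ceiling division


bert_gpus = gpus_for(BERT_RUN_INSTANCES)
max_instances = instances_for(MAX_CLUSTER_GPUS)
bert_share = bert_gpus / MAX_CLUSTER_GPUS

print(f"GPUs in the BERT run: {bert_gpus}")
print(f"Instances at full scale: {max_instances}")
print(f"Share of the maximum cluster used: {bert_share:.0%}")
```

Under that assumption, the BERT run would have used 512 GPUs, roughly two thirds of the 800-GPU maximum, which corresponds to 100 instances.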
Customers running HPC workloads on multiple NDv2 instances, such as atomic-scale simulation of materials for drug development, can also get results significantly faster than on traditional CPU-based nodes.
NDv2 users have access to GPU-optimized HPC applications, machine learning software, and deep learning frameworks from the NVIDIA NGC container registry and Azure Marketplace. Helm charts further simplify deploying the AI software on Kubernetes clusters.
The Microsoft Azure NDv2 instance is currently available in preview, and users can cluster instances to match their workload demands. The collaboration between Microsoft Azure and NVIDIA extends supercomputing capabilities to a much broader set of users.