SANTA CLARA — On March 18, 2024, Google Cloud and NVIDIA announced a significant expansion of their partnership aimed at empowering the machine learning (ML) community with enhanced technology to accelerate the development, scaling, and management of generative AI applications. The move represents a concerted effort to foster more accessible and open AI infrastructure.
Key elements of this partnership include Google’s adoption of NVIDIA’s latest Grace Blackwell AI computing platform and the integration of the NVIDIA DGX Cloud service into Google Cloud infrastructure. Additionally, Google will bring DGX Cloud platforms powered by NVIDIA H100 GPUs to its offerings, providing developers with powerful tools to train and deploy AI models using their preferred frameworks.
Thomas Kurian, CEO of Google Cloud, emphasized the comprehensive nature of the collaboration, spanning from hardware integration to software ecosystems. He highlighted the commitment to providing an accessible and open AI platform for ML developers.
Jensen Huang, founder and CEO of NVIDIA, stressed the importance of offering solutions that enable enterprises to harness generative AI efficiently. He underlined the significance of expanded infrastructure offerings and integrations in providing customers with scalable AI applications.
The integration efforts between NVIDIA and Google Cloud build upon their longstanding commitment to providing leading capabilities across the AI stack. Key components of the expansion include:
- Adoption of NVIDIA Grace Blackwell: This platform enables real-time inference on trillion-parameter large language models (LLMs). Google’s adoption of Grace Blackwell for internal deployments signifies a step toward offering Blackwell-powered instances to cloud customers.
- Grace Blackwell-powered DGX Cloud on Google Cloud: Google will introduce NVIDIA GB200 NVL72 systems to its cloud infrastructure, providing energy-efficient training and inference capabilities for LLMs. The availability of DGX Cloud on Google Cloud A3 VM instances, which are powered by NVIDIA H100 GPUs, further enhances the serverless experience for enterprise developers.
- Support for JAX on GPUs: Collaboration between Google Cloud and NVIDIA facilitates the use of JAX, a high-performance machine learning framework, on NVIDIA H100 GPUs. This widens access to large-scale LLM training within the ML community.
- Integration of NVIDIA NIM on Google Kubernetes Engine (GKE): NIM inference microservices, part of the NVIDIA AI Enterprise software platform, will be integrated into GKE, streamlining generative AI deployment in enterprises.
- Support for NVIDIA NeMo: Google Cloud’s support for the NVIDIA NeMo framework, via GKE and the Google Cloud HPC Toolkit, simplifies the deployment and scaling of generative AI models.
- Expansion of Vertex AI and Dataflow support for NVIDIA GPUs: Vertex AI now supports Google Cloud A3 VMs powered by NVIDIA H100 GPUs, providing MLOps teams with scalable infrastructure for AI applications. Dataflow has expanded support for accelerated data processing on NVIDIA GPUs.
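The JAX point above is about framework portability: the same JAX program compiles via XLA for whichever backend is present, whether an NVIDIA H100 GPU on a Google Cloud A3 VM or a local CPU. The sketch below is illustrative only (the function and array shapes are not from the announcement):

```python
# Minimal illustrative sketch: JAX picks up whatever accelerator backend is
# available (e.g. an NVIDIA GPU on a Google Cloud A3 VM) and falls back to
# CPU otherwise; the code below runs unchanged on either.
import jax
import jax.numpy as jnp

# Report the backend JAX selected ("gpu" on a GPU instance, "cpu" locally).
print(jax.default_backend())

@jax.jit  # XLA-compiles the function for the active backend
def matmul(a, b):
    return jnp.dot(a, b)

a = jnp.ones((2, 2))
b = jnp.ones((2, 2))
result = matmul(a, b)  # [[2., 2.], [2., 2.]]
```

Because the compilation target is chosen at runtime, ML teams can develop on CPU and scale the identical code to H100-backed instances.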
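For the Vertex AI item above, a training job on an H100-powered A3 VM is described by a worker pool specification. The sketch below builds such a spec as a plain dictionary; the machine and accelerator identifiers follow Google Cloud's published naming for A3 VMs, but the project, image, and exact values are assumptions to verify against the current Vertex AI documentation:

```python
# Illustrative worker pool spec for a Vertex AI custom training job on an
# A3 VM with NVIDIA H100 GPUs. Identifiers ("a3-highgpu-8g",
# "NVIDIA_H100_80GB") and the image URI are assumptions for illustration.
worker_pool_spec = {
    "machine_spec": {
        "machine_type": "a3-highgpu-8g",       # A3 VM class with 8x H100
        "accelerator_type": "NVIDIA_H100_80GB",
        "accelerator_count": 8,
    },
    "replica_count": 1,
    "container_spec": {
        # Hypothetical training container; supply your own image.
        "image_uri": "us-docker.pkg.dev/my-project/train/llm:latest",
    },
}

# In practice, this spec would be handed to the Vertex AI SDK, e.g.:
#   from google.cloud import aiplatform
#   job = aiplatform.CustomJob(display_name="h100-training",
#                              worker_pool_specs=[worker_pool_spec])
#   job.run()
```

The spec-as-data shape is what lets MLOps teams scale the same job definition from one replica to many without changing training code.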
The holistic partnership between Google Cloud and NVIDIA enables AI researchers, scientists, and developers to train, fine-tune, and serve sophisticated AI models with optimized tools and frameworks. Testimonials from companies like Runway, Palo Alto Networks, and Writer underscore the tangible benefits of this collaboration in enhancing model performance and lowering hosting costs.
The collaboration between Google Cloud and NVIDIA will be further showcased at GTC, the global AI conference, from March 18 to 21, providing an opportunity for industry professionals to delve deeper into the advancements in AI infrastructure.