MOUNTAIN VIEW – Google is introducing a new generation of open AI models designed to support responsible AI development and assist developers and researchers globally. Gemma, built on the same research and technology behind Google’s Gemini models, aims to help foster safe, innovative AI creation.
Developed by Google DeepMind and other teams across Google, Gemma is a family of lightweight yet powerful models, offering two sizes: Gemma 2B and Gemma 7B. These models are designed for easy deployment and are available worldwide starting today. Accompanying the model weights, Google is also releasing a suite of tools and resources to guide developers and support collaborative, ethical AI usage.
Key Features of Gemma:
- Model Sizes and Variants: Gemma comes in two sizes, 2B and 7B, each with both pre-trained and instruction-tuned versions.
- Responsible Generative AI Toolkit: A comprehensive toolkit with guidance and tools designed to help developers create safer AI applications.
- Wide Compatibility: Gemma supports inference and fine-tuning across major frameworks, including JAX, PyTorch, and TensorFlow through native Keras 3.0, with integrations for Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM.
- Cross-Platform Support: The models can run on devices ranging from laptops and workstations to Google Cloud, with easy deployment on Vertex AI and Google Kubernetes Engine (GKE).
- Optimization for Hardware: Gemma is optimized for multiple AI hardware platforms, including NVIDIA GPUs and Google Cloud TPUs, ensuring top-tier performance.
- Commercial Use Permissions: The terms of use allow for responsible commercial usage and distribution for organizations of all sizes.
Industry-Leading Performance for Its Size
Gemma 2B and 7B models share key technical components with the larger Gemini models, enabling them to perform exceptionally well for their size when compared to other open models. Notably, Gemma outperforms many larger models on critical benchmarks while maintaining stringent safety standards. These models can be easily run on laptops, desktops, or Google Cloud, making them accessible for a wide range of applications.
Built for Safety and Responsibility
Gemma was developed with a strong focus on safety. Google utilized automated techniques to filter sensitive personal information from training data and applied fine-tuning and reinforcement learning from human feedback (RLHF) to ensure the models behave responsibly. Extensive evaluations, including red-teaming and adversarial testing, were carried out to assess and minimize risks. The full details of these evaluations are available in the Gemma Model Card.
Additionally, the Responsible Generative AI Toolkit offers practical tools for developers to enhance safety in their applications. These include safety classification methodologies, a model debugging tool, and best practices for building safe models.
Optimized Across Tools and Platforms
Gemma models are adaptable to specific application needs, such as summarization or retrieval-augmented generation (RAG). Developers can fine-tune models on their own data using tools that work across popular frameworks, including Keras 3.0, PyTorch, JAX, and Hugging Face Transformers.
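As a minimal sketch of the Hugging Face Transformers path mentioned above: the snippet below loads a Gemma instruction-tuned checkpoint and generates a reply. The model IDs shown (`google/gemma-2b-it`, `google/gemma-7b-it`) follow the naming used on the Hugging Face Hub, where access requires accepting Gemma's terms of use; the `generate` helper and its parameters are illustrative, not part of any official API.

```python
# Hedged sketch: running a Gemma instruction-tuned model with the
# Hugging Face Transformers library. Requires `transformers` and `torch`
# installed, plus accepted Gemma terms of use on the Hugging Face Hub.

MODEL_ID = "google/gemma-2b-it"  # the 2B instruction-tuned variant; swap in
                                 # "google/gemma-7b-it" for the larger model

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a completion for `prompt` with a Gemma chat model."""
    # Imported lazily so the sketch stays importable even where the
    # (heavy, optional) transformers dependency is not installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Gemma's chat format is applied via the tokenizer's chat template,
    # which wraps the user turn in the expected control tokens.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:], skip_special_tokens=True
    )
```

For fine-tuning on custom data, the same checkpoints can be trained with the usual Transformers or Keras 3.0 workflows; the pattern above covers only plain inference.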
The models also support cross-device compatibility, running seamlessly across a variety of devices including laptops, desktops, IoT, mobile devices, and cloud platforms. Google’s collaboration with NVIDIA ensures that Gemma is optimized for industry-leading performance on NVIDIA GPUs, ranging from data centers to local AI PCs.
Google Cloud Integration
For developers looking for further customization, Gemma offers robust integration with Google Cloud’s Vertex AI. This platform provides a comprehensive MLOps toolkit for tuning and deploying models efficiently across a wide range of hardware, including GPUs, TPUs, and CPUs, whether self-managed or fully managed through Google Kubernetes Engine (GKE).
Free Access for Research and Development
Google is committed to supporting the AI research community. Researchers and developers can start working with Gemma for free through Kaggle, a free tier for Colab notebooks, and a $300 credit for first-time Google Cloud users. Google Cloud credits of up to $500,000 are also available to researchers to help accelerate their projects.
Get Started with Gemma
To begin exploring Gemma and to access quickstart guides, developers can visit ai.google.dev/gemma. As Google continues to expand the Gemma model family, more variants and applications will be introduced, offering exciting opportunities for innovation.
We’re eager to see how developers use Gemma to create impactful AI solutions and look forward to sharing future events and opportunities for collaboration.