SANTA CLARA — ServiceNow, Hugging Face, and NVIDIA have unveiled StarCoder2, a family of open-access large language models (LLMs) for code generation in enterprise applications. Developed in partnership with the BigCode Community, StarCoder2 is positioned as a step forward in performance, transparency, and cost-effectiveness for generative AI.
Trained on 619 programming languages, StarCoder2 is a collaborative effort to broaden access to advanced AI capabilities. The project draws on the expertise of ServiceNow, a provider of digital workflow solutions; Hugging Face, an open-source platform for machine learning collaboration; and NVIDIA, which contributes accelerated infrastructure and responsible AI practices.
The StarCoder2 family comprises three model sizes to suit different application requirements: a 3-billion-parameter model trained by ServiceNow, a 7-billion-parameter model trained by Hugging Face, and a 15-billion-parameter model built by NVIDIA using advanced training techniques and infrastructure. The range gives teams flexibility to balance performance against compute cost in code generation tasks.
Harm de Vries, lead of ServiceNow’s StarCoder2 development team, credits open scientific collaboration and ethical AI practices with shaping StarCoder2. He says the models can enhance developer productivity and foster innovation across organizations of all sizes, and he points to the project’s commitment to responsible AI development.
Leandro von Werra, machine learning engineer at Hugging Face, echoes this sentiment, emphasizing the importance of transparency and open governance in driving responsible innovation. By providing full data and training transparency, StarCoder2 empowers the community to build a wide range of applications more efficiently, fostering a culture of collaboration and knowledge sharing.
Jonathan Cohen, vice president of applied research at NVIDIA, underscores the transformative impact of code LLMs on efficiency and innovation across industries. NVIDIA’s collaboration with ServiceNow and Hugging Face aims to introduce secure, responsibly developed models that support broader access to accountable generative AI, benefiting the global community at large.
StarCoder2’s state-of-the-art architecture, coupled with meticulously curated data sources from BigCode, prioritizes transparency and open governance, laying the foundation for responsible innovation at scale. These advancements unlock the potential of AI-driven coding applications, enabling accurate, context-aware predictions and accelerating digital transformation initiatives.
Furthermore, StarCoder2 empowers users to fine-tune models with industry-specific data, leveraging open-source tools such as NVIDIA NeMo and Hugging Face TRL. This capability enables the development of personalized coding assistants, advanced chatbots, and text-to-workflow applications tailored to organizational needs.
StarCoder2 is built using responsibly sourced data under license from Software Heritage, in keeping with the project’s emphasis on ethical AI and transparency throughout development. The models will be made available under the BigCode Open RAIL-M license, which provides royalty-free access and is intended to encourage collaboration within the developer community.
Developers who want to try StarCoder2 will be able to download the models from Hugging Face, and the 15-billion-parameter model will also be accessible via NVIDIA AI Foundation models. This open-access approach encourages experimentation and pushes the boundaries of generative AI for code.
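For readers who want a concrete picture of the download step, the following is a minimal sketch, not an official recipe. It assumes the Hugging Face `transformers` library with PyTorch installed, and the Hub repository IDs and size labels shown are assumptions, since the announcement does not give them. The helper names `checkpoint_for` and `complete_code` are hypothetical and exist only for this illustration.

```python
# Assumed Hub repository IDs; the announcement names model sizes, not repo IDs.
STARCODER2_CHECKPOINTS = {
    "3b": "bigcode/starcoder2-3b",
    "7b": "bigcode/starcoder2-7b",
    "15b": "bigcode/starcoder2-15b",
}


def checkpoint_for(size: str) -> str:
    """Return the assumed Hub repository ID for a size label such as '3b'."""
    key = size.strip().lower()
    if key not in STARCODER2_CHECKPOINTS:
        expected = sorted(STARCODER2_CHECKPOINTS)
        raise ValueError(f"unknown size {size!r}; expected one of {expected}")
    return STARCODER2_CHECKPOINTS[key]


def complete_code(prompt: str, size: str = "3b", max_new_tokens: int = 48) -> str:
    """Download the checkpoint (cached after the first call) and continue `prompt`."""
    # Imported lazily so checkpoint_for() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = checkpoint_for(size)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    # Tokenize the prompt and decode greedily (no sampling).
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete_code("def fibonacci(n):")` would fetch the smallest checkpoint and return the prompt followed by the model's continuation. The first call downloads several gigabytes of weights, and a recent `transformers` release with StarCoder2 support is assumed.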
In a landscape defined by rapid technological advancements, StarCoder2 stands as a beacon of collaboration, transparency, and responsible innovation, poised to shape the future of code generation and enterprise application development.