SANTA CLARA — NVIDIA has revealed that xAI’s Colossus supercomputer, a cluster of 100,000 NVIDIA Hopper GPUs, is networked with the NVIDIA Spectrum-X Ethernet platform, which is designed to sustain performance in large-scale AI deployments. The facility, located in Memphis, Tennessee, uses Spectrum-X features such as congestion control and Remote Direct Memory Access (RDMA) to achieve 95% data throughput with zero packet loss.
Colossus, currently the world’s largest AI supercomputer, is used by xAI to train its Grok family of language models, and the system is set to expand to 200,000 GPUs. It was built in a record-breaking 122 days, with training on Grok models starting just 19 days after installation began. The Spectrum SN5600 Ethernet switch, central to the Spectrum-X platform, supports port speeds of up to 800Gb/s, enabling efficient, scalable AI processing.
NVIDIA’s Gilad Shainer highlighted Spectrum-X’s role in advancing AI with faster, more cost-efficient networking, while Elon Musk praised the achievement, calling Colossus the “most powerful training system in the world.” An xAI spokesperson emphasized how the Spectrum-X platform and NVIDIA Hopper GPUs allow Colossus to redefine the limits of large-scale AI training.