SANTA CLARA — In a significant leap forward for conversational AI, NVIDIA has achieved breakthroughs in language understanding that sharply reduce both training and inference times for AI language models. The announcement, made today, signals a new era in which real-time conversational AI becomes practical at scale, poised to transform industries including retail, customer service, automotive, and banking.
NVIDIA’s advancements in language understanding technology have enabled it to train BERT, one of the most sophisticated AI language models, in a record-setting 53 minutes. The company has also achieved lightning-fast AI inference, completing BERT inference in just over 2 milliseconds, a milestone previously considered unattainable.
This breakthrough paves the way for developers to integrate state-of-the-art language understanding capabilities into large-scale applications, potentially reaching hundreds of millions of consumers worldwide. Early adopters of NVIDIA’s advancements include tech giant Microsoft and several innovative startups, who are leveraging these capabilities to create highly intuitive, responsive language-based services for their customers.
Until now, the challenge of deploying extremely large AI models in real time has hindered the progress of conversational AI. NVIDIA’s optimizations to its AI platform address this obstacle, enabling the deployment of the largest language model of its kind to date.
“Large language models are revolutionizing AI for natural language,” stated Bryan Catanzaro, vice president of Applied Deep Learning Research at NVIDIA. “NVIDIA’s groundbreaking work accelerating these models allows organizations to create new, state-of-the-art services that can assist and delight their customers in ways never before imagined.”
The demand for AI services powered by natural language understanding is projected to surge in the coming years. According to Juniper Research, the number of digital voice assistants in use is expected to jump from 2.5 billion to 8 billion within the next four years. Moreover, Gartner predicts that by 2021, 15% of all customer service interactions will be handled entirely by AI, a 400% increase from 2017.
NVIDIA’s AI platform optimizations have resulted in three significant performance records:
- Fastest Training: Using NVIDIA’s DGX SuperPOD™, NVIDIA trained BERT-Large in an unprecedented 53 minutes, down from the several days such training typically requires. NVIDIA also demonstrated the scalability of its GPUs by training BERT-Large on a single NVIDIA DGX-2 system in 2.8 days.
- Fastest Inference: Using NVIDIA T4 GPUs running NVIDIA TensorRT™, NVIDIA performed BERT-Base inference on the SQuAD question-answering dataset in a remarkable 2.2 milliseconds, well below the 10-millisecond threshold commonly cited for real-time applications (see the illustrative sketch after this list).
- Largest Model: NVIDIA Research built and trained the world’s largest Transformer-based language model, with 8.3 billion parameters, roughly 24 times as many as the 340 million in BERT-Large.
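To give a concrete sense of what a per-query latency figure means, the sketch below times single-query BERT-Base inference in plain PyTorch using the Hugging Face transformers library. This is an illustration only, not NVIDIA’s TensorRT-optimized pipeline, so the latency it prints will typically be well above the 2.2 milliseconds reported here; the model name, sample input, and timing loop are assumptions chosen for the example.

```python
# Illustrative sketch only -- NOT NVIDIA's TensorRT pipeline.
# Measures average per-query latency of a stock BERT-Base model in PyTorch,
# to show how a "milliseconds per query" figure is obtained.
import time

import torch
from transformers import BertModel, BertTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval().to(device)

# Hypothetical single query standing in for one conversational request.
text = "Conversational AI needs to answer within a tight latency budget."
inputs = tokenizer(text, return_tensors="pt").to(device)

# Warm up so one-time costs (CUDA context setup, caching) are excluded.
with torch.no_grad():
    for _ in range(10):
        model(**inputs)

# Time repeated single-query forward passes and average them.
runs = 100
if device == "cuda":
    torch.cuda.synchronize()
start = time.perf_counter()
with torch.no_grad():
    for _ in range(runs):
        model(**inputs)
if device == "cuda":
    torch.cuda.synchronize()
elapsed_ms = (time.perf_counter() - start) / runs * 1000
print(f"Average latency per query: {elapsed_ms:.2f} ms")
```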
NVIDIA’s AI platform is already being widely adopted by developers around the world. Microsoft Bing, for instance, has used NVIDIA GPUs to optimize BERT inference, yielding significant improvements in search quality.
Several startups in NVIDIA’s Inception program, including Clinc, Passage AI, and Recordsure, are using NVIDIA’s AI platform to build cutting-edge conversational AI services for sectors such as finance, healthcare, and automotive.
NVIDIA has made the software optimizations used to achieve these breakthroughs available to developers, ensuring broader accessibility and fostering innovation in the field of conversational AI.
The convergence of NVIDIA’s performance breakthroughs and the growing demand for AI-powered language understanding heralds a future where seamless, natural interactions with technology become the norm, transforming the way businesses engage with their customers.