SAN JOSE — Recogni Inc., a leader in Generative AI inference technology, has introduced Pareto, its revolutionary logarithmic number system that promises to redefine AI chip design. This breakthrough system enables AI models to run more efficiently by converting multiplications into additions, resulting in smaller, faster, and more energy-efficient chips, without compromising the accuracy required for high-performance applications.
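The idea of turning multiplications into additions can be illustrated with a generic logarithmic number system. The sketch below is not Recogni's Pareto format, whose encoding is not described here. It assumes a hypothetical layout in which each nonzero value is stored as a sign plus a fixed-point base-2 logarithm, and the names `FRAC_BITS`, `encode`, `decode`, and `multiply` are illustrative only.

```python
import math

# Assumption: 8 fractional bits for the stored log2 value (hypothetical choice).
FRAC_BITS = 8


def encode(x: float) -> tuple[int, int]:
    """Encode a nonzero real as (sign, fixed-point log2 of its magnitude)."""
    if x == 0:
        raise ValueError("zero needs a dedicated code in a logarithmic system")
    sign = -1 if x < 0 else 1
    log_fixed = round(math.log2(abs(x)) * (1 << FRAC_BITS))
    return sign, log_fixed


def decode(value: tuple[int, int]) -> float:
    """Convert (sign, fixed-point log2) back to an ordinary float."""
    sign, log_fixed = value
    return sign * 2.0 ** (log_fixed / (1 << FRAC_BITS))


def multiply(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Multiply two encoded values: signs combine, logarithms add as integers."""
    return a[0] * b[0], a[1] + b[1]


product = multiply(encode(3.0), encode(-5.0))
print(decode(product))  # close to -15, up to quantization error
```

In this sketch the product costs one integer addition plus a sign combination, and integer adders are much smaller in hardware than multipliers. The trade-off in any logarithmic system is that adding two values is no longer a plain addition and is usually approximated.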
With the growing demands of next-generation AI models, which require calculations in the petaFLOP range, traditional methods of computation have struggled with power consumption and processing speed. Pareto addresses these issues head-on by significantly reducing both power usage and execution time while maintaining accuracy. This unique logarithmic approach outperforms existing quantized number systems used in GenAI inference, positioning Recogni as the first company to bring this technology to market.
The primary benefits of Pareto include:
- Smaller Chip Design: Because additions need less circuitry than multiplications, Pareto enables more compact chips, allowing data centers to fit more compute power into the same space and reduce costs.
- Lower Power Consumption: Pareto’s logarithmic system consumes far less power than traditional FP8 and FP16 formats, making it ideal for sustainable AI computing.
- High Accuracy: Models running on Pareto stay close to the accuracy of their high-precision counterparts without requiring retraining, with a drop of less than 0.1% at 16-bit precision (over 99.9% of accuracy retained) and under 1% at 8-bit precision.
Marc Bolitho, CEO of Recogni, commented, “Pareto is a game-changer for AI. By turning multiplications into additions, it reduces power consumption, latency, and chip size, making it the ideal choice for modern AI chip design. This technology allows organizations to significantly cut operational costs while maintaining the highest levels of accuracy across a range of GenAI inference applications.”
Extensive testing across major AI models, including Mixtral-8x22B, Llama3-70B, Falcon-180B, and Stable Diffusion XL, shows that Pareto outperforms traditional systems in both power efficiency and accuracy. By achieving FP16 accuracy at a fraction of the power consumption of competitors’ FP8 models, it enables faster deployment of trained models with minimal accuracy loss, eliminating the need for time-intensive retraining.
Gilles Backhus, Founder and VP of AI at Recogni, added, “With Pareto, we’ve created a number system that directly addresses the needs of businesses and machine learning developers. It enables companies to deploy their AI models with exceptional power efficiency and accuracy, cutting down on time and costs. While other businesses struggle to convert models to lower precision to reduce operational expenses, Pareto streamlines the process, making it quicker, cheaper, and more effective.”
After seven years of development, Recogni’s Pareto technology has already demonstrated its impact in silicon. The company’s first chip, fabricated on TSMC’s 7nm process, exceeded performance expectations and validated all key hypotheses. Moving forward, Recogni is preparing to announce a strategic partnership that will broaden the accessibility of Pareto, accelerating the pace of AI innovation.
Recogni will showcase its cutting-edge Pareto technology at NeurIPS 2024, offering a deep dive into its capabilities and potential for transforming AI operations.