Paul Barham once said, “You can have a second computer once you’ve shown you know how to use the first one.”
In August 2019, the world was introduced to the Wafer-Scale Engine, the largest commercial chip ever manufactured and the industry’s first wafer-scale processor, built from the ground up to solve the problem of deep-learning compute. This marked the entry of Cerebras Systems into the AI hardware market, and today the company is in the running to challenge even the industry leader, Nvidia.
Cerebras Systems was founded by Andrew Feldman, Gary Lauterbach, Michael James, Sean Lie, and Jean-Philippe Fricker. The founders set out with an ambitious goal: to revolutionize AI computing by creating the largest chip ever made—which ended up 60 times larger than anything before it. Their mission was clear: to dramatically speed up AI workloads, turning tasks that once took months into mere minutes.
They called it reducing the “cost of curiosity,” making breakthroughs in areas like cancer research and clean water testing happen faster. Having previously teamed up at SeaMicro, they brought their expertise to tackle this bold new challenge, aiming to push the limits of technology and scientific discovery.
“We take great pride in solving problems that others can’t solve, that others were afraid to solve, and that others thought couldn’t be solved. Those are the things we’re proud of. We’re proud of our approach to solving previously unsolved problems,” said CEO Andrew Feldman.
Wafer-Scale Engine (WSE)
The WSE-1, introduced in 2019, was a technological marvel featuring 1.2 trillion transistors and 400,000 AI-optimized cores on a single 46,225 mm² silicon wafer.
“Nvidia gets about 30 chips from a wafer. Each chip is put on a circuit board. If they sell it in a DGX, they have to buy two Intel processors and put them together. If they sell it in a DGX-2, they have to put in switches. And to compete with us, they have to put in 20 to 30 switches. I just have one piece of silicon!” said Andrew Feldman.
This was followed by the WSE-2 in 2021, which expanded to 2.6 trillion transistors and 850,000 cores, doubling the AI compute power.
“The market is moving unbelievably quickly,” Feldman said.
The latest iteration, WSE-3, announced in 2024, pushes the boundaries even further with 4 trillion transistors and 900,000 AI-optimized cores, delivering 125 petaflops of AI performance. The WSE-3 has garnered significant attention for its ability to outperform NVIDIA’s H100 GPU by a factor of 210 in carbon capture simulations, showcasing its potential in critical climate change mitigation efforts.
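The generation-over-generation scaling can be seen directly from the figures above. As a quick sketch (using only the transistor and core counts quoted in this article), the growth factors between WSE generations work out as follows:

```python
# Transistor and core counts for each Wafer-Scale Engine generation,
# as quoted in the article above.
WSE = {
    "WSE-1 (2019)": {"transistors": 1.2e12, "cores": 400_000},
    "WSE-2 (2021)": {"transistors": 2.6e12, "cores": 850_000},
    "WSE-3 (2024)": {"transistors": 4.0e12, "cores": 900_000},
}

gens = list(WSE)
for prev, curr in zip(gens, gens[1:]):
    t = WSE[curr]["transistors"] / WSE[prev]["transistors"]
    c = WSE[curr]["cores"] / WSE[prev]["cores"]
    print(f"{prev} -> {curr}: {t:.2f}x transistors, {c:.2f}x cores")
```

Notably, WSE-2 roughly doubled both counts over WSE-1, while WSE-3’s gains come mostly from transistor density rather than core count.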
“Our strategic partnership with Cerebras has been instrumental in propelling innovation at G42, and will contribute to the acceleration of the AI revolution on a global scale,” said Kiril Evtimov, Group CTO of G42.
The WSE’s unique wafer-scale architecture eliminates many of the bottlenecks associated with traditional chip designs, allowing for unprecedented speed and efficiency in AI computations.
Cerebras AI Model Studio
Launched in November 2022, the Cerebras AI Model Studio is a cloud-based platform on Cerebras Cloud @ Cirrascale that allows users to train cutting-edge GPT models using the powerful Cerebras CS-2 accelerator. Optimized for large language models, it offers deterministic performance across clusters of millions of AI cores without the complexity of distributed computing. With a pay-by-the-model service, users can train models from 1.3 billion to 175 billion parameters, boasting speeds up to 8 times faster and at half the cost of traditional cloud providers, making it ideal for efficient large-scale AI development.
Cerebras Inference
Launched in August 2024, Cerebras Inference promises to shake up the AI hardware landscape with what it claims is the world’s fastest AI inference service, delivering speeds up to 20 times faster than NVIDIA’s A100 GPUs at just 1/5th the cost. Powered by Cerebras’ cutting-edge CS-3 system and its revolutionary Wafer Scale Engine 3 (WSE-3) processor, this service handles 1,800 tokens per second for the Llama 3.1 8B model and 450 tokens per second for the Llama 3.1 70B model, all while maintaining 16-bit accuracy. With 7,000 times more memory bandwidth than NVIDIA’s H100 GPU, Cerebras Inference is built for AI applications demanding complex reasoning, high accuracy, and fast response times. Positioned as a direct challenge to NVIDIA’s dominance, Cerebras CEO Andrew Feldman summed it up: “When you compete against a leader like Nvidia, you have to bring a product that is vastly better.”
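To put those throughput figures in perspective, here is a back-of-the-envelope sketch using only the rates quoted above (the 1,000-token response length is an arbitrary example, not a figure from Cerebras):

```python
# Quoted throughput for Cerebras Inference, in tokens per second.
RATES = {
    "Llama 3.1 8B": 1800,
    "Llama 3.1 70B": 450,
}

def generation_time(model: str, num_tokens: int) -> float:
    """Seconds to stream `num_tokens` of output at the quoted rate."""
    return num_tokens / RATES[model]

# Example: time to generate a 1,000-token response.
for model, rate in RATES.items():
    print(f"{model}: {generation_time(model, 1000):.2f} s at {rate} tok/s")
```

At these rates, even the 70B model streams a full 1,000-token answer in just over two seconds, which is the kind of latency the article’s “fast response times” claim refers to.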
Cerebras DocChat
Cerebras Systems launched DocChat on August 16, 2024, a series of models for document-based conversational question answering, including Llama3-DocChat and Dragon-DocChat. Trained in just hours on Cerebras’ hardware, they claim to deliver “GPT-4 level” performance, excelling in tasks like arithmetic, entity extraction, and handling unanswerable questions. Outperforming rivals on ChatRAG benchmarks, Cerebras has open-sourced the models, training recipes, and datasets for developers to build upon. This release showcases Cerebras’ ability to quickly train high-performance models, setting a new standard in AI-driven conversations.
Source: Cerebras Systems
Partnerships
Cerebras Systems has established key partnerships across industries, leveraging its AI computing power to drive innovation. In healthcare, its multi-year collaboration with Mayo Clinic focuses on developing large language models (LLMs) to improve diagnostics, including a Rheumatoid Arthritis model that enhances diagnostic accuracy using patient data and DNA.
“Mayo Clinic selected Cerebras as its first generative AI collaborator for its large-scale, domain-specific AI expertise to accelerate breakthrough insights for the benefit of patients,” said Matthew Callstrom, MD, PhD, of the Mayo Clinic.
In scientific research, Cerebras partnered with Argonne National Laboratory to deploy the world’s fastest AI computer for COVID-19 research and with Lawrence Livermore National Laboratory to accelerate AI initiatives.
“We’ve partnered with Cerebras for more than two years and are extremely pleased to have brought the new AI system to Argonne,” said Rick Stevens, Argonne’s associate laboratory director for computing, environment, and life sciences.
Additionally, its work with AstraZeneca aims to speed up drug discovery using advanced AI models, while a partnership with nference focuses on enhancing natural language processing for biomedical research.
“Testing technologies like Cerebras on real use cases is important to understand what investments we need to make in AI strategically.” – Nick Brown, Head of AI Engineering at AstraZeneca
In the generative AI space, Cerebras teamed up with Jasper to boost the accuracy and efficiency of AI-generated content.
Dave Rogenmoser, CEO of Jasper, said, “Our collaboration with Cerebras accelerates the potential of generative AI, bringing its benefits to our rapidly growing customer base around the globe.”
Funding, Valuation, and Growth
Cerebras Systems has raised more than $720 million across its funding rounds, marking significant progress in the AI chip sector. The company’s Series F round in November 2021 brought in $250 million, lifting its valuation to over $4 billion; earlier, the Series E round in November 2019 contributed $273.35 million. A Series F-1 round in June 2024 reportedly added a further $420 million, though this figure varies across sources.
Cerebras has drawn interest from major investors including Alpha Wave Ventures, Abu Dhabi Growth Fund (ADG), and G42. Achieving unicorn status with a $1.8 billion valuation during its Series D funding in November 2018, the company is now preparing for a highly anticipated IPO. Although market conditions are currently uncertain, particularly with recent fluctuations in the chip sector, the IPO could happen as early as late 2024. As part of its IPO preparations, the company is constructing the third of nine $100 million supercomputers, which will be linked by Emirati AI firm G42 to create “the largest AI training supercomputer globally.” Cerebras aims to surpass its 2021 valuation of $4 billion, with Barclays joining its list of underwriters. The IPO will test the company’s ability to compete with leaders like Nvidia in the evolving AI hardware market.
Cerebras Systems today announced that it has confidentially submitted a draft registration statement on Form S-1 with the U.S. Securities and Exchange Commission (“SEC”) relating to the proposed initial public offering of its common stock.
— Cerebras (@CerebrasSystems) August 1, 2024
In August 2024, Cerebras Systems expanded its board with the appointments of Glenda Dorchak and Paul Auvil, effective August 7. Dorchak, with a distinguished career at IBM, Intel, and Spansion, brings over three decades of operational leadership and will chair the Compensation Committee. Auvil, who previously served as CFO at VMware and Proofpoint, offers over 35 years of finance and technology expertise and will lead the Audit Committee. Additionally, the company welcomed Bob Komin as its new Senior Vice President and Chief Financial Officer (CFO).
Partnering with Docker, Cerebras aims to streamline the deployment of AI-powered applications across various environments. Collaboration with Nasdaq hints at potential innovations in high-frequency trading and financial technology through advanced AI capabilities. The company is also integrating with LangChain and LlamaIndex to boost natural language processing and large language model applications. Additionally, the partnership with Weights & Biases is likely focused on improving MLOps workflows, while collaborations with Weaviate, AgentOps, and Log10 suggest efforts to build a robust ecosystem around Cerebras’ AI hardware. A partnership with DeepLearning.AI is set to empower the next generation of AI developers with the necessary skills to leverage Cerebras’ technology.
On the global expansion front, Cerebras Systems has made significant strides. The Tokyo office, opened in September 2020 and led by Hiromasa Ebi, is focused on penetrating the Japanese AI and high-performance computing market. The Toronto office, also opened in September 2020 and led by Nish Sinnadurai, aims to accelerate R&D efforts and establish an AI center of excellence, with plans to triple its engineering team. In 2023, Cerebras launched its Bangalore office under the leadership of Lakshmi Ramachandran, which is dedicated to boosting R&D and supporting local customers, with a goal of employing over 60 engineers by the end of the year.
“If you imagine the work along the X axis being slower at one end and faster and more complex at the other end, it’s definitely a commodity business running lots and lots of slow jobs at one end. But at the other end, fast, long workloads, that is not at all a commodity, that is very sophisticated. To the extent the industry shifts to those faster, more complex types of work, that’s where we win,” said Andrew Feldman.
Cerebras Systems is challenging NVIDIA’s dominance in the AI GPU market with its cutting-edge Wafer Scale Engine (WSE) chips. The WSE-3, featuring 4 trillion transistors and 900,000 AI-optimized cores, claims to outperform NVIDIA’s H100 GPU, while Cerebras’ pricing for inference services—starting at just 10 cents per million tokens—offers a cost-effective alternative.
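The quoted starting price makes for simple cost math. A minimal sketch, assuming a flat rate of 10 cents per million tokens as stated above (real pricing tiers will differ by model and volume):

```python
# Quoted entry price for Cerebras inference: $0.10 per million tokens.
PRICE_PER_MILLION_TOKENS = 0.10

def inference_cost(num_tokens: int) -> float:
    """Dollar cost of processing `num_tokens` at the quoted starting rate."""
    return num_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# Example: one billion tokens at this rate.
print(f"${inference_cost(1_000_000_000):.2f}")  # $100.00
```

At this rate, a billion tokens costs about $100, which is the kind of economics behind the article’s “cost-effective alternative” framing.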
As the company builds its ambitious $100 million supercomputers and advances its technology, the question remains: Can Cerebras truly disrupt NVIDIA’s established market lead? With the AI industry shifting towards more complex workloads, Cerebras aims to leverage its innovations to gain a competitive edge. Will they succeed in challenging NVIDIA, or will NVIDIA’s entrenched position and ongoing advancements prove too formidable?
As Senior VP of Products and Strategy Andy Hock noted, “We were able to pivot before, and then pivot back again.” With Cerebras’ proven adaptability and bold vision, Nvidia may need to watch out.