Forbes November 1, 2024
A food fight erupted at the AI HW Summit earlier this year, where three companies each claimed to offer the fastest AI processing. All were faster than GPUs. Now Cerebras has claimed insanely fast AI performance with its latest software running on the company's Wafer-Scale Engine.
Why does anyone need incredibly fast inference processing that can generate text far faster than anyone can read? It's because the output of one AI can become the input to another AI, creating scalable thinking applications for search, self-correcting summarization and, very soon, agentic AI.
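To make that chaining idea concrete, here is a minimal sketch in Python. The llm_generate() helper is hypothetical, standing in for a call to a fast inference endpoint (Cerebras or otherwise; no specific vendor API is implied). It shows a self-correcting summarizer in which each model call consumes the previous call's output:

    def llm_generate(prompt: str) -> str:
        """Hypothetical call to a fast inference endpoint (not a real API)."""
        raise NotImplementedError("wire this to an actual inference service")

    def self_correcting_summary(document: str) -> str:
        # Pass 1: draft a summary.
        draft = llm_generate("Summarize this document:\n" + document)
        # Pass 2: a second call critiques the first call's output.
        critique = llm_generate("List any errors in this summary:\n" + draft)
        # Pass 3: a third call revises the draft using the critique.
        return llm_generate(
            "Rewrite the summary, fixing these issues:\n" + critique
            + "\n\nOriginal summary:\n" + draft
        )

Because each call blocks on the one before it, end-to-end latency is a multiple of single-call generation time, which is why inference speed far beyond human reading speed still pays off.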
What Did Cerebras Announce?
Recently, Cerebras updated its inference processing capabilities to an astonishing 2,100 tokens per second running Llama 3.1-70B. That is four pages of text per second...
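The page count is easy to sanity-check with common rules of thumb (these conversion factors are assumptions, not Cerebras-published figures): roughly 0.75 English words per token and about 400 words per page.

    # Back-of-the-envelope check on "four pages per second".
    # Assumed conversions (rules of thumb, not Cerebras figures):
    tokens_per_second = 2100
    words_per_token = 0.75   # typical English ratio (assumed)
    words_per_page = 400     # standard manuscript page (assumed)

    words_per_second = tokens_per_second * words_per_token   # 1,575 words/s
    pages_per_second = words_per_second / words_per_page     # ~3.9 pages/s
    print(round(pages_per_second, 1))                        # -> 3.9

At those assumptions the math lands just under four pages per second, consistent with the article's figure.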