In a bold move, AI startup Anthropic has unveiled a new suite of language models called the Claude 3 family, claiming that its flagship offering can outperform OpenAI’s vaunted GPT-4.
The former OpenAI engineers who founded Anthropic are taking a big swing with this launch. The Claude 3 lineup includes three models tailored for different use cases: the speedy and economical Haiku, the mid-range Sonnet, which balances speed and capability, and the top-of-the-line Opus, which Anthropic says surpasses GPT-4 in language understanding.
“We’ve been rigorously testing and refining Claude 3 Opus for over a year now,” said Jack Clark, an Anthropic co-founder. “Our model demonstrates superior accuracy, recall, and robustness compared to GPT-4 across a wide range of benchmarks.”
That’s a bold claim given the immense resources OpenAI has poured into GPT-4 and the model’s strong performance. But Clark and the Anthropic team seem eager to take on the AI leader head-to-head.
Beyond impressive benchmarks, the Claude 3 models bring some compelling real-world capabilities. They can ingest and understand documents, PDFs, diagrams and other data formats using improved multimodal techniques. The models also have extremely long context windows of up to 1 million tokens for certain use cases.
Perhaps most impressively, Anthropic says Claude 3 Opus delivers a 2x gain in accuracy on open-ended queries versus GPT-4 by reducing unnecessary refusals to answer and better leveraging context.
“Modern AI should be able to engage with messy, real-world prompts in a substantive way,” said Clark. “That’s where our models really shine compared to more cautious, refusal-prone systems.”
While Google recently demurred when asked to compare its new AI language model Gemini to GPT-3, Anthropic is taking the opposite tack by challenging OpenAI’s flagship head-on.
Whether Claude 3 Opus lives up to its billing remains to be seen. But Anthropic’s brazen strategy is sure to grab attention, and it highlights the intensifying battle among AI labs to develop safe, capable and scalable language models.