Welcome to the world of AI, where possibilities are endless and advancements are relentless. AI startup Anthropic, which has raised hundreds of millions in venture capital and may soon secure more, today announced the latest version of its GenAI technology, Claude. Claiming performance that rivals OpenAI’s GPT-4, Anthropic presents Claude 3 as a family of models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus, the most powerful of the three. The company boasts of “increased capabilities” in analysis and forecasting, with stronger results on specific benchmarks than models like GPT-4 and Google’s Gemini 1.0 Ultra.
“Notably, Claude 3 is Anthropic’s first multimodal GenAI, meaning that it can analyze text as well as images.”
Claude 3’s capabilities extend to image analysis, much like its rivals GPT-4 and Gemini. The model can process a range of visual material, including photos, charts, graphs, and technical diagrams drawn from PDFs and slide decks. Anthropic goes a step further, letting Claude 3 analyze multiple images in a single request so it can compare and contrast them.
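For a sense of what a multi-image request might look like in practice, here is a minimal sketch using Anthropic’s Python SDK; the model identifier, file names, and prompt are illustrative placeholders rather than anything Anthropic prescribes.

```python
# Minimal sketch: two images compared in a single Messages API request.
# Model name, file paths, and prompt are placeholders for illustration.
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def to_image_block(path: str, media_type: str = "image/png") -> dict:
    """Encode a local image as the base64 content block the API expects."""
    with open(path, "rb") as f:
        data = base64.standard_b64encode(f.read()).decode("utf-8")
    return {"type": "image",
            "source": {"type": "base64", "media_type": media_type, "data": data}}

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            to_image_block("chart_q1.png"),   # first image
            to_image_block("chart_q2.png"),   # second image
            {"type": "text",
             "text": "Compare these two charts and summarize the differences."},
        ],
    }],
)
print(message.content[0].text)
```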
But like any other AI model, Claude 3 has its limitations. Anthropic has disabled its ability to identify people, likely because of the ethical and legal implications. The company also admits that the model can make mistakes with “low-quality” images and struggles with tasks involving spatial reasoning and object counting. And Claude 3 does not generate artwork; its focus is solely on image analysis.
The company promises that customers can expect better performance from Claude 3 at following multi-step instructions, producing structured output in formats like JSON, conversing in multiple languages, and citing the sources of its answers so they can be verified. Anthropic also says the model generates more expressive and engaging responses and is easier to prompt and steer with shorter, more concise prompts.
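As a rough illustration of the structured-output claim, the sketch below asks the model to reply with JSON and parses the result; the schema, prompt wording, and model name are invented for the example, and the model isn’t guaranteed to return valid JSON.

```python
# Rough sketch: requesting JSON output from Claude 3 and parsing it.
# Schema and prompt are invented; real code should validate and retry.
import json
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=512,
    system="Answer only with a JSON object matching the requested schema.",
    messages=[{
        "role": "user",
        "content": 'Extract {"company": str, "product": str, "models": [str]} '
                   "from: Anthropic announced Claude 3 in Haiku, Sonnet and Opus tiers.",
    }],
)

raw = response.content[0].text
try:
    data = json.loads(raw)      # succeeds when the model returns clean JSON
except json.JSONDecodeError:
    data = None                 # fall back or retry in real code
print(data)
```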
“Claude 3’s improvements stem from its expanded context, giving it a better grasp of the narrative flow of data and the ability to generate more contextually rich responses.”
A model’s context window refers to the input data, such as text, that it considers before generating output. A larger context window lets the model take in more of the conversation, which tends to produce more relevant responses. Anthropic says Claude 3 will initially support a 200,000-token context window, with select customers getting a context window of up to a million tokens, similar to Google’s newest GenAI model, Gemini 1.5 Pro.
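To make that token budget concrete, here is a toy sketch of trimming conversation history to fit a 200,000-token window; the four-characters-per-token estimate is a crude stand-in, not Claude’s actual tokenizer.

```python
# Toy illustration of the context window: once a conversation outgrows the
# model's token budget, older turns have to be dropped.
CONTEXT_WINDOW = 200_000          # tokens Claude 3 accepts at launch
RESERVED_FOR_OUTPUT = 4_096       # leave room for the model's reply

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic, not Claude's tokenizer

def trim_history(turns: list[str]) -> list[str]:
    """Keep the most recent turns that still fit inside the input budget."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT
    kept, used = [], 0
    for turn in reversed(turns):       # walk newest -> oldest
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))        # restore chronological order
```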
While Claude 3 is an upgrade over its predecessors, it still has flaws. In a technical whitepaper, Anthropic acknowledges issues such as bias and hallucination (making up information). The model cannot search the web and is limited to data from before August 2023. And although it’s multilingual, Claude 3 isn’t as fluent in certain “low-resource” languages as it is in English.
In the coming months, Anthropic plans to release enhancements to Claude 3, as the company believes that model intelligence is far from reaching its limits. Opus and Sonnet are currently available online and through Anthropic’s dev console and API, Amazon’s Bedrock platform, and Google’s Vertex AI. Haiku will be released later this year.
Here’s a breakdown of the pricing for Claude 3, with a quick cost calculation after the list:
- Opus: $15 per million input tokens, $75 per million output tokens
- Sonnet: $3 per million input tokens, $15 per million output tokens
- Haiku: $0.25 per million input tokens, $1.25 per million output tokens
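Taking those list prices at face value, a quick back-of-the-envelope calculation shows what a single request might cost; the 10,000-input / 2,000-output token figures are arbitrary example numbers.

```python
# Back-of-the-envelope cost math from the prices listed above.
PRICES = {                      # dollars per million tokens (input, output)
    "opus":   (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku":  (0.25, 1.25),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: 10,000 input tokens and 2,000 output tokens.
print(f"Opus:  ${request_cost('opus', 10_000, 2_000):.4f}")    # $0.3000
print(f"Haiku: ${request_cost('haiku', 10_000, 2_000):.6f}")   # $0.005000
```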
But what does all of this mean in the grand scheme of things? Anthropic’s ultimate goal, as we’ve reported before, is to create an algorithm for “AI self-teaching,” which could be used to build virtual assistants capable of answering emails, performing research, and even generating art and books. This concept has already been introduced through models like GPT-4.
Anthropic hints at this in their blog post, stating that they plan to add features that would allow Claude 3 to interact with other systems, code “interactively,” and deliver advanced agentic capabilities. Similar to OpenAI’s reported ambitions, Anthropic aims to create a software agent that can automate complex tasks, such as transferring data or filling out forms.
Anthropic’s technique for training models, called “constitutional AI,” aims to align AI with human intentions, making models easier to understand, predict, and adjust as needed. Based on crowdsourced feedback, the company has also added a principle to Claude 3’s constitution that instructs the model to be understanding of and accessible to people with disabilities.
Whatever Anthropic’s ultimate goals for AI, it’s clear the company is in it for the long haul. With plans to raise billions in the next year and constant updates promised for Claude 3, Anthropic is determined to stay competitive with its rivals.
As we continue to watch the evolution of AI, it’ll be interesting to see Anthropic’s progress and the potential risks and ethical implications it may bring. For now, the company seems to be dedicated to pushing the boundaries of what is possible with AI and providing innovative solutions for its customers.