Anthropic's New Claude Model Claims to Beat GPT-4

Welcome Artisan,

There is a new model on the block, and its name is Claude.

Well, Claude is obviously not new, but Claude 3 sure is, and it's coming in hot, making some heavy claims about its size and performance.

Let’s dive in.

Hot Off The Press

The One Big Thing

GPT-4 Is No Longer the Top Model…or so its competitors claim

The world we live in is such that when a new foundation model comes out, we do not study its merits on their own but rather compare its performance to GPT-4 and decide whether or not it's worth our time to check it out.

With its Claude 3 release, Anthropic front-ran the conversation and boldly claimed that Opus, its largest and most performant model, is already ahead of GPT-4 and Google Gemini in every single relevant metric (seen above). Here’s everything we know:

The Basics:

  • Launch of three models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus

  • Models vary by intelligence, speed, and cost, catering to different user needs.

  • Long context understanding and near-perfect recall for handling extensive information (similar to Gemini)

  • Better at adhering to brand voice and complex instructions, with big leaps in preventing hallucinations

  • Opus and Sonnet available for use, with Haiku set to launch soon.

The thing about Large Language Models is that they are black boxes. Unless they are open-source models (which we heavily advocate for here), we do not get to see how they work. This is to say that we do not have accurate performance testing for Claude 3 yet, so there is no fully reliable way to verify whether or not Anthropic's claims are true.

That has not stopped people from trying. And the results are… interesting, to say the least. Early reviews seem to show that the Claude 3 models are actually more expensive than advertised, and potentially trained on specific data sets that make them display better performance initially.

It also seems like Anthropic is not quite making an apples-to-apples comparison here:

On the other hand, it’s already solving PhD-level problems which were previously unsolvable:

The main takeaway here is that competition is good. The more options people have to choose from, the more pressure there is on the models to get better, faster, and cheaper. The more tooling there is to test model performance, the more accountable these companies will be in their releases.

While we very much wish these models were all open-sourced from the get-go, the fact that there are new players in town to put pressure on OpenAI and Google (lol) means that the Great AI Arms Race of this century is starting to heat up.

Stability AI Releases a 3D Modeling Tool

Congrats, you can now have a chatbot of a dead loved one

Tools

Must-have tools for every Renaissance creator to add to their toolkit:

  • TripoSR: A new image-to-3D model capable of creating high-quality outputs in less than a second

  • Zero-shot Audio Editing using DDPM Inversion

  • Rehearsal: LLM-empowered virtual partner to help you practice conflict resolution skills with tailored feedback

  • Free course to learn to make games with AI for Unity

  • Course: Vector Databases: from Embeddings to Applications

  • LaVague: Fully open-source AI pipeline to turn natural language into browser actions

Deep Tech

The newest and coolest in the research world that you need to know about:

Closing Thought

Large context models are a really crazy thing that we haven’t wrapped our heads around yet

Thinking about turning movies into TikToks in one click

Work With Us!

The AI Renaissance is coming, and we are building the best community of the people making it happen.

Contact us to sponsor your product or brand and reach the exact audience for your needs across our newsletter and podcast network.