How many AI models is too many? It’s a question that may not have a straightforward answer. However, one thing is certain – 10 a week is probably a bit much. In the last few days, we’ve seen a remarkable proliferation of AI models of varying sizes and scopes, from niche developers to large, well-funded ones. Given the sheer number of models released this week alone, it’s increasingly hard to say whether or how these models compare to one another – if it ever was possible to begin with. So, we may ask – what’s the point?
We’re in a strange moment in the evolution of AI, although the entire journey has been quite surreal. From niche developers to big corporations, there seems to be no shortage of AI models being introduced. Let’s take a look at this week’s list, shall we? I’ve condensed each entry as far as possible to what sets the model apart.
- Model 1 – This model was announced while I was writing this, bringing the total to 11 models.
- Model 2 – This model is an example of combining existing models and fine-tuning them for a specific purpose, like Idefics 2.
- Model 3 – This one is an experimental or niche model that may not have a wide use case.
- Model 4 – And the list goes on, with numerous other releases and previews this week alone.
And let’s be clear – this is not an exhaustive list of all the models released or previewed this week. It’s just a small selection of the ones we’ve seen and discussed so far. If we were to expand our scope, we would have to include dozens of fine-tuned existing models, combinations of models, and experimental or niche projects. We can’t possibly “review” all of them. But the real question is – how can we help you, our readers, understand and keep up with this never-ending influx of AI models?
Now, the truth is, you don’t have to keep up, and neither does almost anyone else. The AI landscape has undergone a shift: some models, like ChatGPT and Gemini, have evolved into entire web platforms with multiple use cases and access points. Other large language models such as LLaMa or OLMo, while sharing a common architecture, do not necessarily serve the same function. These models are intended to be used in the background as a service or component, rather than in the foreground as a name brand.
However, there has been a deliberate merging of these two concepts, as model developers wish to borrow some of the limelight typically associated with major AI platform releases, like GPT-4V or Gemini Ultra. Every developer wants their model to seem important and impactful. But let’s be real – while their models may be important to someone, that someone is likely not you.
Think about it this way – it’s similar to the world of cars. When cars were first invented, you simply bought “a car”. Then, you had the option to choose between a big car, a small car, or a tractor. Nowadays, there are hundreds of car models released every year, but you probably don’t need to know about most of them – because nine out of ten are either not what you need, or aren’t even recognizable as a car. In the same way, we’re moving from the era of big/small/tractor AI models towards an era of unprecedented proliferation, where even AI specialists can’t possibly keep up with and test every model coming out.
But here’s the flip side of the story – we were already in this stage long before models like ChatGPT and the others were released. Back then, fewer people were paying attention, but we covered it nonetheless because it was evident that this technology was on the brink of a breakthrough – which ultimately happened. There were papers, models, and constant research being published, and conferences like SIGGRAPH and NeurIPS were filled with machine learning engineers exchanging notes and building on one another’s work. Here’s a fun fact – I wrote a story in 2011 about understanding AI through visual means!
Every day, this activity continues. But as AI has become the biggest business in tech right now, the latest models have been given extra significance, because people are curious whether one of these AI models will be the definitive leap over ChatGPT, just like ChatGPT was an improvement over its predecessors.
The simple truth is – none of these models are going to revolutionize AI once again. After all, OpenAI’s advancement was based on a fundamental shift in machine learning architecture, and every other company has now adopted it. Since this shift has yet to be superseded, we can only look forward to incremental improvements such as slightly better performance on a synthetic benchmark, or somewhat more convincing language or imagery for the time being.
But does that mean that none of these models matter at all? Not at all. You can’t make it to version 3.0 without first going through 2.1, 2.2, and so on. That’s what researchers and engineers work on tirelessly. And sometimes, these advancements address critical flaws, expose vulnerabilities, or provide useful insights. We do our best to cover only the most interesting ones, but even then, that’s just a fraction of the total. We’re currently working on a piece that will highlight all the models we believe AI enthusiasts should be aware of – and the list already runs to more than a dozen.
But there’s no need to fret – when a truly major model is released, everyone – including TechCrunch – will make sure you know about it. It’ll be as obvious to you as it is to us.