Given the frequency with which their developers toss around the phrase “general purpose humanoids,” more attention ought to be paid to the first bit.
After decades of single purpose systems, the jump to more generalized systems will be a big one.
The use of generative AI in robotics has been a white-hot subject recently, as well.
One of the biggest challenges on the road to general purpose systems is training.
The proliferation of multi-purpose systems would take the industry a step closer to the general purpose dream.
Saining Xie, a computer science professor at NYU, began the research project that spawned the diffusion transformer in June 2022.
Diffusion models typically have a “backbone,” or engine of sorts, called a U-Net.
In other words, larger and larger transformer models can be trained with significant but not unattainable increases in compute.
The current process of training diffusion transformers potentially introduces some inefficiencies and performance loss, but Xie believes this can be addressed over the long horizon.
“I’m interested in integrating the domains of content understanding and creation within the framework of diffusion transformers.”
Stability has announced Stable Diffusion 3, the latest and most powerful version of the company’s image-generating AI model.
Sora, OpenAI’s impressive video generator, apparently works on similar principles (Will Peebles, co-author of the paper, went on to co-lead the Sora project).
(Anthropic, for its part, has not focused on image or video generation publicly, so it isn’t really part of this conversation.)
Stable Diffusion seems to want to be the white label generative AI that you can’t do without, rather than the boutique generative AI you aren’t sure you need.
Interestingly, the company has put safety front and center in its announcement, stating: “We have taken and continue to take reasonable steps to prevent the misuse of Stable Diffusion 3 by bad actors.”
OpenAI’s consistency model is an exciting new development in image generation. It can already do simple tasks an order of magnitude faster than the likes of DALL-E, and may have already…