Stability AI, the startup behind the AI-powered art generator Stable Diffusion, has released an open AI model for generating sounds and songs that it claims was trained exclusively on royalty-free recordings.
Called Stable Audio Open, the generative model takes a text description (e.g.
Stability AI says that it’s not optimized for this, and suggests that users looking for those capabilities opt for the company’s premium Stable Audio service.
Stable Audio Open also can’t be used commercially; its terms of service prohibit it.
And it doesn’t perform equally well across musical styles and cultures or with descriptions in languages other than English — biases Stability AI blames on the training data.
Stability has announced Stable Diffusion 3, the latest and most powerful version of the company’s image-generating AI model.
Sora, OpenAI’s impressive video generator, apparently works on similar principles (Will Peebles, co-author of the paper, went on to co-lead the Sora project).
(Anthropic, for its part, has not focused on image or video generation publicly, so it isn’t really part of this conversation.)
Stable Diffusion seems to want to be the white-label generative AI that you can’t do without, rather than the boutique generative AI you aren’t sure you need.
Interestingly, the company has put safety front and center in its announcement, stating: “We have taken and continue to take reasonable steps to prevent the misuse of Stable Diffusion 3 by bad actors.”
Microsoft’s Edge browser now includes Bing’s AI chatbot in a sidebar, which can be used to interact with the bot and learn more about it. Edge Copilot is a feature…