Model

Unlocking GenAI: How Diffusion Transformers are Revolutionizing OpenAI’s Sora

Saining Xie, a computer science professor at NYU, began the research project that spawned the diffusion transformer in June 2022. Diffusion models typically have a “backbone,” or engine of sorts, called a U-Net. Diffusion transformers replace that backbone with a transformer, so larger and larger models can be trained with significant but not unattainable increases in compute. The current process of training diffusion transformers potentially introduces some inefficiencies and performance loss, but Xie believes this can be addressed over the long horizon. “I’m interested in integrating the domains of content understanding and creation within the framework of diffusion transformers,” Xie said.

SambaNova introduces an all-inclusive collection of generative artificial intelligence models

SambaNova, an AI chip startup that’s raised over $1.1 billion in VC money to date, is gunning for OpenAI, and its rivals, with a new generative AI product geared toward enterprise customers. SambaNova today announced Samba-1, an AI-powered system designed for tasks like text rewriting, coding, language translation and more. The company’s calling the architecture a “composition of experts,” a jargony name for a bundle of generative open source AI models, 56 in total. But is Samba-1 really superior to the many, many other AI systems for business tasks out there, not least OpenAI’s models? Rather than a single model, it’s a set-it-and-forget-it package: a full-stack solution with everything included, including AI chips, to build AI applications.

Efficiently Produce Code with the Powerful StarCoder 2: Utilizing GPUs for Optimal Performance

Like most other code generators, StarCoder 2 can suggest ways to complete unfinished lines of code as well as summarize and retrieve snippets of code when asked in natural language. Trained with 4x more data than the original StarCoder, StarCoder 2 delivers what Hugging Face, ServiceNow and Nvidia characterize as “significantly” improved performance at lower costs to operate. Setting all this aside for a moment, is StarCoder 2 really superior to the other code generators out there — free or paid? As with the original StarCoder, StarCoder 2’s training data is available for developers to fork, reproduce or audit as they please. Hugging Face, which offers model implementation consulting plans, is providing hosted versions of the StarCoder 2 models on its platform.

Robots Provide ‘Trash’ Answers for Voting and Elections Questions

A number of major AI services performed poorly in a test of their ability to address questions and concerns about voting and elections. The testers’ concern was that AI models will, as their proprietors have urged and sometimes forced, replace ordinary searches and references for common questions. The testers submitted these questions via API to five well-known models: Claude, Gemini, GPT-4, Llama 2 and Mixtral. The AI model responses ranged from 1,110 characters (Claude) to 2,015 characters (Mixtral), and all of the AI models provided lengthy responses detailing between four and six steps to register to vote. GPT-4 came out best, with only approximately one in five of its answers having a problem, pulling ahead by punting on “where do I vote” questions.

GitHub’s Enterprise Copilot Reaches General Release

GitHub today announced the general availability of Copilot Enterprise, the $39/month version of its code completion tool and developer-centric chatbot for large businesses. Many teams already keep their documentation in GitHub repositories today, making it relatively easy for Copilot to reason over it. On top of talking about today’s release, I also asked Dohmke about his high-level thinking of where Copilot is going next. “Different use cases require different models. We will continue going down that path of using the best models for the different pieces of the Copilot experience,” Dohmke said.

Cutting-Edge Writer’s Innovations: Turning Images into Text with Chart and Graph Integration

Today, Writer announced a new capability for its Palmyra model, called Palmyra-Vision, that generates text from images, including graphs and charts. May Habib, company co-founder and CEO, says that the company made a strategic decision to concentrate on multimodal content, and being able to generate text from images is part of that strategy. “We are going to be focused on multimodal input, but text output, so text generation and insight that is delivered via text,” Habib told TechCrunch. She reserves the right to create charts and graphs from data at some point, but that’s not something the company is doing at the moment. This particular release is focused on generating text from those kinds of images.

Mistral AI receives $16 million investment from Microsoft

But Microsoft and Mistral AI buried the news, or at least an important part of it. At the time, Mistral AI raised €385 million (around $415 million) with Andreessen Horowitz (a16z) leading the investment round. Unlike previous Mistral AI releases, Mistral Large isn’t open source. With this investment, Microsoft is now an investor in both OpenAI’s capped-profit subsidiary and Mistral AI. As for Mistral AI, the so-called European AI champion looks more and more like its American competitors with a closed-source approach and a long list of American backers.

Inside the World of LLM Building in China: Insights from an Alibaba Employee

Chinese tech companies are gathering all sorts of resources and talent to narrow their gap with OpenAI, and experiences for researchers on both sides of the Pacific Ocean can be surprisingly similar. The parallel glimpse into their typical day reveals striking similarities, with wake-up times at 9 a.m. and bedtime around 1 a.m. Both start the day with meetings, followed by a period of coding, model training and brainstorming with colleagues. Besides building its own LLM in-house, Alibaba has been aggressively investing in startups such as Moonshot AI, Zhipu AI, Baichuan and 01.AI. Facing competition, Alibaba has been trying to carve out a niche, and its multilingual move could become a selling point.

Mistral AI unveils cutting-edge competitor to GPT-4 alongside revolutionary chat assistant

Paris-based AI startup Mistral AI is gradually building an alternative to OpenAI and Anthropic, as its latest announcement shows. Founded by alums from Google’s DeepMind and Meta, Mistral AI originally positioned itself as an AI company with an open-source focus. Mistral AI’s business model looks more and more like OpenAI’s as the company offers Mistral Large through a paid API with usage-based pricing. Mistral AI claims that it ranks second after GPT-4 based on several benchmarks. The first benefit of that partnership is that Mistral AI will likely attract more customers with this new distribution channel.

Google’s Image-Generating AI: A Confession of Loss of Control

Google has apologized (or come very close to apologizing) for another embarrassing AI blunder this week: an image-generating model that injected diversity into pictures with a farcical disregard for historical context. While the underlying issue is perfectly understandable, Google blames the model for “becoming” over-sensitive. But if you ask for 10, and they’re all white guys walking goldens in suburban parks? Where Google’s model went wrong was that it failed to have implicit instructions for situations where historical context was important. These two things led the model to overcompensate in some cases, and be over-conservative in others, leading to images that were embarrassing and wrong.