Enhancing Enterprise Creation with LLMs: Databricks Unleashes Mosaic AI Expansion


A year ago, Databricks acquired MosaicML for $1.3 billion. Now rebranded as Mosaic AI, the platform has become integral to Databricks’ AI solutions. Today, at the company’s Data + AI Summit, it is launching a number of new features for the service. Ahead of the announcements, I spoke to Databricks co-founders CEO Ali Ghodsi and CTO Matei Zaharia.

Databricks is launching five new Mosaic AI tools at its conference: Mosaic AI Agent Framework, Mosaic AI Agent Evaluation, Mosaic AI Tools Catalog, Mosaic AI Model Training, and Mosaic AI Gateway.

“It’s been an awesome year — huge developments in Gen AI. Everybody’s excited about it,” Ghodsi told me. “But the things everybody cares about are still the same three things: how do we make the quality or reliability of these models go up? Number two, how do we make sure that it’s cost-efficient? And there’s a huge variance in cost between models here — a gigantic, orders-of-magnitude difference in price. And third, how do we do that in a way that we keep the privacy of our data?”

Today’s launches aim to cover the majority of these concerns for Databricks’ customers.

Zaharia also noted that the enterprises now deploying large language models (LLMs) into production are using systems with multiple components. That often means they make multiple calls to a model (or to several models), and use a variety of external tools for accessing databases or doing retrieval-augmented generation (RAG). These compound systems speed up LLM-based applications, save money by using cheaper models for specific queries or by caching results and, maybe most importantly, make the results more trustworthy and relevant by augmenting the foundation models with proprietary data.
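To make the cost argument concrete, here is a minimal Python sketch of a compound system that caches answers and routes short queries to a cheaper model. It is an illustration only, not Databricks' implementation: the `cheap_model` and `strong_model` functions, the `CompoundSystem` class and the word-count routing rule are all hypothetical stand-ins.

```python
from typing import Dict


# Hypothetical stand-ins for real model endpoints (assumptions, not real APIs).
def cheap_model(prompt: str) -> str:
    return f"[cheap] {prompt}"


def strong_model(prompt: str) -> str:
    return f"[strong] {prompt}"


class CompoundSystem:
    """Routes each query to a model and caches answers to avoid repeat calls."""

    def __init__(self, max_cheap_words: int = 8) -> None:
        self.max_cheap_words = max_cheap_words
        self.cache: Dict[str, str] = {}
        self.calls = {"cheap": 0, "strong": 0}

    def answer(self, prompt: str) -> str:
        key = prompt.strip().lower()
        if key in self.cache:
            # Cache hit: no model call is made at all.
            return self.cache[key]
        # Toy routing rule (an assumption): short prompts go to the cheap model.
        if len(key.split()) <= self.max_cheap_words:
            self.calls["cheap"] += 1
            result = cheap_model(prompt)
        else:
            self.calls["strong"] += 1
            result = strong_model(prompt)
        self.cache[key] = result
        return result
```

A production router would more likely decide based on task type or a classifier than on prompt length, but the structure (check the cache, pick a model, record the call) stays the same.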

“We think that is the future of really high-impact, mission-critical AI applications,” he explained. “Because if you think about it, if you’re doing something really mission critical, you’ll want engineers to be able to control all aspects of it — and you do that with a modular system. So we’re developing a lot of basic research on what’s the best way to create these [systems] for a specific task so developers can easily work with them and hook up all the bits, trace everything through, and see what’s happening.”

As for actually building these systems, Databricks is launching two services this week: the Mosaic AI Agent Framework and the Mosaic AI Tools Catalog. The AI Agent Framework takes the company’s serverless vector search functionality, which became generally available last month, and provides developers with the tools to build their own RAG-based applications on top of it.

Ghodsi and Zaharia emphasized that the Databricks vector search system uses a hybrid approach, combining classic keyword-based search with embedding search. All of this is integrated deeply with the Databricks data lake and the data on both platforms is always automatically kept in sync. This includes the governance features of the overall Databricks platform — and specifically the Databricks Unity Catalog governance layer — to ensure, for example, that personal information doesn’t leak into the vector search service.
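For readers unfamiliar with hybrid search, the Python sketch below shows one common way to combine a keyword ranking with an embedding ranking: reciprocal rank fusion. Databricks has not said how its system merges the two signals, so the fusion method is an assumption, as are the sample documents and the character-trigram vectors, which stand in for real embeddings.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List

# Hypothetical sample corpus for illustration.
DOCS = {
    "d1": "quarterly revenue report for the sales team",
    "d2": "how to reset your password",
    "d3": "sales forecast and revenue projections",
}


def keyword_score(query: str, doc: str) -> int:
    """Number of distinct query words that appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def trigram_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: counts of character trigrams."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(scores: Dict[str, float]) -> List[str]:
    """Document ids sorted by descending score, ties broken by id."""
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(query: str, docs: Dict[str, str], k: int = 60) -> List[str]:
    """Fuse keyword and vector rankings with reciprocal rank fusion."""
    keyword = {d: float(keyword_score(query, text)) for d, text in docs.items()}
    q_vec = trigram_vector(query)
    vector = {d: cosine(q_vec, trigram_vector(text)) for d, text in docs.items()}
    fused: Dict[str, float] = defaultdict(float)
    for ranking in (rank(keyword), rank(vector)):
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return rank(fused)
```

Calling `hybrid_search("revenue sales", DOCS)` places the two sales documents ahead of the password one, because both rankings agree that it is the least relevant.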

Speaking of Unity Catalog (which the company is now also slowly open-sourcing), it’s worth noting that Databricks is extending this system to let enterprises govern which AI tools and functions these LLMs can call upon when generating answers. This catalog, Databricks says, will also make these services more discoverable across a company.

Ghodsi also highlighted that developers can now use all of these tools to build their own agents by chaining together models and functions using LangChain or LlamaIndex, for example. And indeed, Zaharia tells me that a lot of Databricks customers are already using these tools today.

“There are a lot of companies using these things, even the agent-like workflows. I think people are often surprised by how many there are, but it seems to be the direction things are going. And we’ve also found in our internal AI applications, like the assistant applications for our platform, that this is the way to build them,” he said.

To evaluate these new applications, Databricks is also launching Mosaic AI Agent Evaluation, an AI-assisted evaluation tool that uses LLM-based judges to test how well the AI does in production and also lets enterprises quickly collect feedback from users (and have them label some initial data sets, too). The Quality Lab includes a UI component based on Databricks’ acquisition of Lilac earlier this year, which lets users visualize and search massive text data sets.

“Every customer we have is saying: I do need to do some labeling internally, I’m going to have some employees do it. I just need maybe 100 answers, or maybe 500 answers — and then we can feed that into the LLM judges,” Ghodsi explained.

Another way to improve results is by using fine-tuned models. For this, Databricks now offers the Mosaic AI Model Training service, which — you guessed it — allows its users to fine-tune models with their organization’s private data to help them perform better on specific tasks.

The last new tool is the Mosaic AI Gateway, which the company describes as a “unified interface to query, manage, and deploy any open source or proprietary model.” The idea here is to allow users to query any LLM in a governed way, using a centralized credentials store. No enterprise, after all, wants its engineers to send random data to third-party services.

In times of shrinking budgets, the AI Gateway also allows IT to set rate limits for different vendors to keep costs manageable. Additionally, these enterprises then also get usage tracking and tracing for debugging these systems.
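As a rough idea of what per-vendor rate limiting with usage tracking involves, here is a minimal Python sketch using a sliding time window. It is not the Mosaic AI Gateway API: the `Gateway` class, its parameters and the vendor names are hypothetical, and denying vendors that have no configured limit is a design assumption.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List


class Gateway:
    """Toy gateway: sliding-window rate limits and call counts per vendor."""

    def __init__(
        self,
        limits: Dict[str, int],
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limits = limits            # max calls per window, keyed by vendor
        self.window = window_seconds
        self.clock = clock              # injectable so tests can control time
        self.recent: Dict[str, List[float]] = defaultdict(list)
        self.usage: Dict[str, int] = defaultdict(int)

    def allow(self, vendor: str) -> bool:
        """Return True and record the call if the vendor is under its limit."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        self.recent[vendor] = [
            t for t in self.recent[vendor] if now - t < self.window
        ]
        # Vendors without a configured limit are denied (limit of 0).
        if len(self.recent[vendor]) >= self.limits.get(vendor, 0):
            return False
        self.recent[vendor].append(now)
        self.usage[vendor] += 1
        return True
```

The `usage` counter is the simplest possible form of usage tracking; a real gateway would also log which user made each call and how many tokens it consumed, which is where cost control and debugging traces come from.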

As Ghodsi told me, all of these new features are a reaction to how Databricks’ users are now working with LLMs. “We saw a big shift happen in the market in the last quarter and a half. Beginning of last year, anyone you talk to, they’d say: we’re pro open source, open source is awesome. But when you really pushed people, they were using OpenAI. Everybody, no matter what they said, no matter how much they were touting how open source is awesome, behind the scenes, they were using OpenAI.” Now, these customers have become far more sophisticated and are using open models (very few are really open source, of course), which in turn requires them to adopt an entirely new set of tools to tackle the problems, and opportunities, that come with that.