XetHub, a Seattle-based startup that aims to make data management easy with Git, today announced that it has closed a $7.5 million seed financing round led by Madrona. XetHub, a “collaborative storage platform for data management,” wants developers to be able to work with data as seamlessly as they do with code, bringing to data the kind of collaboration that tools like Git offer for source code.
Yucheng Low (CEO), Ajit Banerjee and Rajat Arya co-founded the company. All three have years of experience working with large data platforms: Low previously co-founded ML startup Turi, where Arya was the first employee, and at Apple the two met Banerjee, who had worked at Inktomi, Amazon and Facebook and founded two startups. After Apple acquired Turi in 2016, Low and Arya worked on various parts of Apple's ML platform stack, with Arya leading its data platform team.
XetHub's repository view makes navigating and visualizing data repositories easy and intuitive, automatically summarizing common file formats (e.g. CSV) with custom visualization options to suit your individual needs. Image Credits: XetHub
The Apple data platform team discovered ample opportunity for improvement in data management.
Low emphasizes the critical importance of data: “Data is far more important than anything else–more than models, even. But the way we manage it today feels like source code management from 30 years ago—copy-and-paste and version control by manual processes.”
XetHub enables developers to use their favorite tools, such as Git, for collaborating on data projects.
Low noted that the team created a tool to enable developers to work on data like they do code, while maintaining an experience akin to Git with all its familiar integrations.
XetHub extends Git to support large files, offering efficient storage and transfer with data deduplication while preserving full Git compatibility. Photo courtesy of XetHub.
XetHub's service can currently handle 1TB of data in repositories, and the company plans to expand this to 100TB. To make things easier for developers, these repositories can be mounted and used as a local file system, whether on a laptop or a GPU cluster. The tool is also format-agnostic.
The team is focusing its marketing efforts on AI/ML, but XetHub can be used to manage any data.
XetHub is now available with a free community edition for managing up to 20GB of deduplicated storage. The company has already started conversations with enterprise customers, but isn’t yet ready to reveal any names.
With over a decade of machine learning experience, Low and the XetHub team have revolutionized how developers at Apple create intelligent and generative applications. According to Matt McIlwain, managing director of Madrona, XetHub enables developers to work with large datasets collaboratively without being hindered by legacy infrastructure or complex data workflows.