When enterprise teams experiment with AI today, they rarely start from scratch. More often, they begin with fragmented data spread across cloud buckets, SaaS tools, legacy databases, and on-prem infrastructure, each governed by different policies, formats, and stakeholders. For most organizations, this patchwork isn't just a nuisance. It's the bottleneck standing between AI pilots and real production use.
That infrastructure problem is where Starburst is placing its bet.
The Boston-based data platform, known for its open-source roots in Trino and distributed SQL analytics, is expanding its scope. This week, the company announced a set of AI-focused upgrades across its two core products, Starburst Enterprise and Starburst Galaxy, including a new built-in agent, vector-native AI search, and what it calls AI Workflows, a framework for orchestrating AI pipelines on top of a governed data lakehouse.
“At the end of the day, your AI is only as powerful as the data it can access,” said Justin Borgman, CEO and co-founder of Starburst. “And right now, most enterprise architectures just aren’t ready.”
The new capabilities reflect a growing recognition inside large enterprises: access to AI models isn't the hard part anymore; access to high-quality, well-governed, context-rich data is. Borgman believes solving that challenge requires more than building connectors. It requires a rethinking of how enterprise data is stored, secured, and made accessible, not just to humans, but to agents.
A Shift in Where AI Happens
Starburst’s pitch is not that it makes AI smarter. It’s that it brings AI tools closer to where enterprise data already lives. Rather than pushing customers to centralize data into a single warehouse or cloud, the company’s architecture allows AI to run “lakeside”—at the edge of existing lakehouses spread across environments.
The core idea isn't new, but the execution is evolving. Starburst's new AI Workflows suite introduces components that allow teams to search unstructured data using vector embeddings, run prompts through large language models using SQL functions, and enforce fine-grained model access policies, all without building external pipelines or moving data out of secure domains.
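Starburst has not published the exact function names behind these previews, so the following is an illustrative sketch only: the hypothetical functions `embed()` and `prompt()` stand in for whatever the platform actually exposes. It shows the general shape of the workflow described above, embeddings stored in an Iceberg table and an LLM invoked from SQL:

```sql
-- Hypothetical sketch; function names (embed, prompt, cosine_distance)
-- are illustrative placeholders, not Starburst's actual API.

-- 1. Store vector embeddings alongside source text in an Iceberg table.
CREATE TABLE lakehouse.docs_embedded AS
SELECT
    doc_id,
    body,
    embed(body) AS embedding      -- vector embedding of the document text
FROM lakehouse.raw_docs;

-- 2. Retrieve the closest documents to a question (the retrieval step of
--    RAG), then pass that context to an LLM through a SQL function.
SELECT prompt(
    'Answer using only this context: ' || body,
    'What is our refund policy?'
) AS answer
FROM lakehouse.docs_embedded
ORDER BY cosine_distance(embedding, embed('What is our refund policy?'))
LIMIT 3;
```

The design point the article emphasizes is that both steps run inside the governed lakehouse, so the same access policies that apply to ordinary analytics queries also apply to the AI calls.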
The vector search functionality stores embeddings directly inside Apache Iceberg, the open table format Starburst has adopted as its default for scalable lakehouse storage. According to Borgman, that integration was driven by customer demand to keep AI development inside the same governed data architecture used for analytics.
“We’re really doubling down on Iceberg as the open format of choice,” he said. “That means the full end-to-end RAG workflow can now happen inside Starburst—no movement, no silos, no external orchestration.”
Agents Without the Hype
While much of the industry has focused on building proprietary AI copilots, Starburst is taking a more foundational route. The company introduced a prebuilt AI agent, designed to allow analysts and business teams to query data in natural language, generate documentation, and produce governed data products. The agent is not intended to replace BI tools; it’s meant to serve as a secure interface that understands both business context and enterprise data policies.
In early usage, customers are deploying the agent as a query layer for internal teams or embedding it into broader agent-based workflows. In both cases, the focus is on reducing the friction that usually slows AI from experimentation to deployment.
“Some are using it like a copilot for analysts,” said Borgman. “Others are using it to define the business metadata that makes the data more usable for agents—less hallucination, more signal.”
The company’s approach leans heavily on governance. Its auto-tagging feature, now generally available in Galaxy, uses LLMs to classify sensitive data like PII at the column level, allowing teams to scale attribute-based access controls without constant manual policy creation. These safeguards are central to Starburst’s broader push to make AI infrastructure compliant by default, especially in industries like finance and healthcare.
From Analytics Platform to Strategic Infrastructure
Starburst's new direction didn't emerge from a vacuum. According to Borgman, many of the company's enterprise customers, particularly in banking, were already building their own internal agents on top of Starburst's data layer. The platform's ability to support Trino-based SQL access across hybrid environments made it a natural fit.
One of those customers is Citibank, which recently expanded its relationship with Starburst and made a strategic investment in the company. The move wasn’t about financial return, Borgman says. It was about securing a stake in a platform the bank had grown dependent on.
“We weren’t raising,” he said. “Citi came to us and said, ‘You’re now critical to our infrastructure. We want to be invested in this.’”
That validation comes at a moment when enterprise interest in on-premises AI is resurging, not as a rejection of cloud, but as a response to regulatory constraints and data residency requirements. Starburst's support for secure, air-gapped environments, combined with its federated query engine and governance stack, makes it one of the few vendors able to operate across both cloud and on-prem environments.
“The death of on-prem has been greatly exaggerated,” Borgman said. “These environments aren’t legacy—they’re necessary. Especially when AI enters the picture.”
A Fragmented Landscape
Starburst’s move into AI infrastructure places it in a growing field of companies trying to meet enterprise AI demand without forcing customers into proprietary ecosystems. Vendors like Databricks and Snowflake have pushed to become full-stack AI platforms. Starburst, by contrast, is building around open formats like Iceberg and pushing composability over consolidation.
Whether that approach can scale into mainstream adoption remains to be seen. Much of the AI tooling Starburst announced is still in private preview. And while the platform is widely deployed in industries like finance, its footprint in less regulated verticals is still growing.
But Borgman is betting on a simple dynamic: as enterprises get more serious about AI, they’ll need fewer dashboards and more reliable access to governed data.
“We believe the future of software is composable, agentic, and data-centric,” he said. “The monolithic SaaS application? That era’s ending. The next wave is all about the data layer.”