
The Models are Smart Enough. Your Data Strategy Might Not Be.

Model capabilities are no longer the bottleneck. The gap holding enterprises back is context, and proprietary data is the key to closing it.

At HumanX in San Francisco, Databricks co-founder and CEO Ali Ghodsi opened his keynote session with a statement that reframes the entire enterprise AI conversation. The models, he argued, have already met the bar for artificial general intelligence (AGI) as the field originally defined it years ago. So why hasn’t enterprise transformation followed?

“There is a lot of talk about super intelligence. The models are already pretty smart. But there is a reliability gap in the enterprise. The models are missing the context needed for reliable outputs. We have to give them the enterprise context they need,” said Ghodsi.

That “reliability gap” is the real frontier, and at its core it is a data problem.

The Context Bottleneck

Enterprises have spent the past few years deploying AI pilots and, more recently, exploring agentic use cases. Results have been uneven, and teams cycle through prompting strategies trying to coax consistent performance.

Ghodsi’s framing points to the core of the issue. The models are capable, but they are missing the organizational context that makes outputs accurate and actionable in a specific enterprise.

This is the context bottleneck. An AI system operating on general knowledge produces general results. The outputs that actually drive enterprise decisions come from models working with the specific history and context of that organization — its “corporate memory.”

Feeding models that context means solving a harder problem: what data do you have, where does it live, and how do you surface the right pieces with confidence?

Proprietary Data Is the Differentiator

Prem Natarajan, Chief Scientist and Head of Enterprise AI at Capital One, put it directly at HumanX: your proprietary data is your AI advantage.

Capital One’s AI trajectory illustrates the point. The company began its cloud migration over eight years ago, driven by a founding belief that data-driven decisions were a competitive differentiator; data and analytics are in the organization’s DNA. That infrastructure commitment positioned them to move quickly when generative AI arrived. Deploying generative AI in car dealer-facing applications improved outcomes by 55 percent. In customer call centers, the ability to surface accurate answers rose from 84 percent to 93 percent.

Every enterprise today can access the same foundational models. Capital One’s advantage came from years of accessible, well-governed proprietary data ready to be put to work.

Natarajan also emphasized that Capital One’s most important AI decisions involve determining what to exclude from production after rigorous sandbox testing. Proprietary data becomes an advantage when you control what enters AI pipelines and what stays out.

Governance Makes Data Usable

Enterprises today are sitting on petabytes of unstructured data across email, file shares, collaboration platforms, and document repositories. Indiscriminately feeding everything to AI degrades output quality and creates legal and compliance exposure. Sensitive content, stale records, and redundant, outdated, or trivial (ROT) files need to be governed across all enterprise repositories, then excluded from AI pipelines.

Turning a data moat into a data foundation for AI requires:

  • Enterprise-wide discovery of unstructured data across all repositories
  • Automated classification and curation so models ingest only appropriate information for their defined purpose
  • Policy enforcement at the data layer aligned with regulatory requirements and internal governance rules
  • Full auditability and traceability for every AI retrieval and access event
  • Lifecycle governance to identify, manage, and dispose of ROT and high-risk content before it reaches AI

These governance capabilities determine whether proprietary data closes the reliability gap Ghodsi described.
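
To make the pattern behind that list concrete, here is a minimal Python sketch of the admission check a curation step might apply before a document reaches an AI pipeline. The labels, staleness cutoff, and the admit_to_ai_pipeline helper are all hypothetical and shown for illustration only; this is not a description of any particular product's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical record produced by an upstream classification pass.
@dataclass
class Document:
    doc_id: str
    source: str             # e.g. "email", "file_share", "wiki"
    labels: set[str]        # e.g. {"pii"}, {"rot"}, {"contract"}
    last_modified: datetime

BLOCKED_LABELS = {"pii", "phi", "legal_hold", "rot"}   # illustrative policy
STALENESS_LIMIT = timedelta(days=5 * 365)              # illustrative cutoff

def admit_to_ai_pipeline(doc: Document, audit_log: list[dict]) -> bool:
    """Return True only if the document passes policy checks; log every decision."""
    now = datetime.now(timezone.utc)
    blocked = doc.labels & BLOCKED_LABELS
    stale = (now - doc.last_modified) > STALENESS_LIMIT
    admitted = not blocked and not stale

    # Auditability: record what was evaluated, the outcome, and why.
    audit_log.append({
        "doc_id": doc.doc_id,
        "source": doc.source,
        "admitted": admitted,
        "blocked_labels": sorted(blocked),
        "stale": stale,
        "checked_at": now.isoformat(),
    })
    return admitted
```

In practice these decisions run inside a governance platform rather than application code, but the shape of the check stays the same: classification in, a policy decision out, and an audit record for every document evaluated.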

Data Readiness Is AI Readiness

The enterprises pulling ahead in AI adoption share common traits. They invested in understanding what’s in their data estate before they tried to use it. They built the infrastructure to govern it, curate it, and surface the right data for the right use cases.

Organizations still cycling through AI pilots without consistent results are often working with the same model capabilities as their competitors. The gap now is context, and context starts with data governance.

See how ZL Tech helps enterprises govern and curate unstructured data to close the AI reliability gap.

Valerian received his Bachelor's in Economics from UC Santa Barbara, where he managed a handful of marketing projects for both local organizations and large enterprises. Valerian also worked as a freelance copywriter, creating content for hundreds of brands. He now serves as a Content Writer for the Marketing Department at ZL Tech.