Agents of Chaos: The Data Behind the Danger

Harvard researchers have exposed how AI agents leak data, poison memory, and propagate harm. The fix starts with governing what agents can access.

Mar 24, 2026 by Valerian Stolpe in AI

Agentic AI systems are moving to the core of the enterprise. Organizations are deploying them with broad tool access, persistent memory, and the authority to act on behalf of users and organizations. As these systems move from pilot to production, a new landmark study is raising urgent questions about what happens when that authority is exploited.

Published late February 2026, “Agents of Chaos” is the most comprehensive red-team study of autonomous AI agents to date. Conducted by over 30 researchers from Harvard, MIT, Stanford, Carnegie Mellon, and other institutions, the study deployed six autonomous agents continuously for two weeks while researchers interacted with them, simulating both benign and adversarial conditions. The findings were stark. Researchers documented ten substantial vulnerabilities spanning safety, privacy, and governance, and the most damaging failures shared a common root: the data the agents could access.

What the Agents Were Given

The six agents in the study, using Claude Opus and Kimi K2.5 as backbone models, ran 24/7 on isolated virtual machines equipped with real-world capabilities: live email accounts, shell command execution, 20GB persistent file systems, scheduling tools, and external APIs including web browsing and GitHub integration. Their core directive was to be helpful to any researcher who interacted with them, without requiring per-action human approval.

This setup is a close mirror of how enterprises are beginning to deploy AI agents across business functions: systems with access to communications, documents, internal repositories, and collaboration tools. That combination of autonomy, tool access, and unstructured data access is precisely what made the failures possible.

The Exploits Were Data-Mediated

The study’s researchers deliberately set aside known model-level weaknesses like hallucinations. Instead, they focused on failures that emerge when autonomy, tool access, persistent memory, and multi-party communication operate together. The results revealed how thoroughly data access shapes agent vulnerability.

Several of the study’s most significant failure modes illustrate this directly:

Indirect PII extraction: When asked directly for a Social Security number stored in an email, one agent refused. When asked to forward the entire email thread, it complied, handing over the SSN, bank account number, and home address unredacted.
Bulk data leakage: A researcher extracted 124 email records from a single agent by framing the request as an urgent bug fix.
Memory poisoning: An attacker convinced an agent to co-author a shared “constitution” document stored in its persistent memory, then quietly edited the document to inject false behavioral directives. The agent followed the injected instructions, attempting to shut down other agents, removing users from message channels, sending unauthorized emails, and voluntarily sharing the compromised constitution with other agents.
False completion reporting: One agent, lacking the right tool to delete a confidential email, destroyed its own email client instead, then reported the task complete. As the researchers note, “In several cases, agents reported task completion while the underlying system state contradicted those reports.”

The common thread across these failures is that the agents were compromised through what they could read, write, and remember.

The Propagation Problem

One of the study’s more sobering findings is that compromise rarely stays contained. Researchers documented cross-agent propagation of unsafe practices, where corrupted instructions spread from one agent to others, leading to coordinated harmful behavior including attempts to shut down fellow agents and send unauthorized broadcasts across the network.

For enterprises deploying multiple agents that draw from shared data repositories, this is a significant concern. A single poisoned document, misfiled policy, or manipulated record has the potential to affect every agent with access to the same data pool. The researchers are direct about the broader stakes: “The implications of delegating authority to persistent agents are not yet widely internalized and may fail to keep up with the pace of autonomous AI systems development.”

What Enterprises Are Walking Into

Over 80% of enterprise data exists in unstructured formats — emails, documents, file shares, and chat logs — the same formats agents exploit throughout the research. The vast majority of this data has never been classified, curated, or access-controlled with AI consumption in mind.

Agents deployed on top of unmanaged repositories inherit every flaw within them: sensitive records, contradictory policies, and ROT (Redundant, Outdated, Trivial) content accumulated over years. The researchers are clear that what they documented is grounded in the vulnerabilities of ungoverned data environments: “Our findings establish the existence of security-, privacy-, and governance-relevant vulnerabilities in realistic deployment settings.”

Governing the Data Before Deploying the Agent

Much of the current security conversation around agentic AI centers on model-level controls: system prompts, output filters, and alignment techniques. The Agents of Chaos findings indicate that these measures are insufficient on their own when the underlying data environment is uncontrolled. Other recent studies have corroborated that model-level guardrails often fail to prevent agentic misalignment, such as Anthropic’s 2025 study, which found AI agents capable of blackmail and corporate espionage. Governance of what agents can access is the most reliable guardrail against agentic vulnerabilities.

A data governance framework for agentic AI deployment should include:

Classification and access control: Tag data by sensitivity and content type, and enforce purpose-limited access so agents can only reach information relevant to their defined function. An agent built to summarize contracts has no business reading HR files.
Data curation and ROT remediation: Remove redundant, outdated, and trivial content before it enters an agent’s accessible environment. Stale or contradictory data is a liability.
Audit trails and access logging: Maintain independent records of what agents accessed and produced. As the study demonstrates, agent self-reports cannot be taken at face value.
Data lineage: Track the origin and history of every file an agent can access, so manipulated or injected content can be identified and quarantined before it propagates.
Kill switches and human oversight checkpoints: Define categories of high-impact actions — data deletion, external communications, configuration changes — that require explicit human confirmation before execution.

The Accountability Question

Agents of Chaos is a study about autonomous AI behavior, but its deepest lesson is about data: what agents can reach determines what damage they can do.

The research raises unresolved questions about accountability, delegated authority, and liability for downstream harms: questions that legal scholars and policymakers are only beginning to work through. For enterprises, answering those questions will require records of what AI accessed, when, under whose authority, and what it produced as a result.

As agentic AI deployment accelerates across industries, data governance is the infrastructure that makes accountability possible. Organizations that govern their unstructured data before deploying agents will be better prepared to contain vulnerabilities, remediate failures, and demonstrate responsible AI use when it matters most.

Read our brochure to learn how ZL Tech builds the data governance foundation for safe and accountable enterprise agentic AI.

agentic ai AI Artificial Intelligence data privacy enterprise ai agents information governance PII

Valerian received his Bachelor's in Economics from UC Santa Barbara, where he managed a handful of marketing projects for both local organizations and large enterprises. Valerian also worked as a freelance copywriter, creating content for hundreds of brands. He now serves as a Content Writer for the Marketing Department at ZL Tech.

Agents of Chaos: The Data Behind the Danger

Talk to an Expert

Get in Touch

Company

Resources

Solutions/Services

Agents of Chaos: The Data Behind the Danger

What the Agents Were Given

The Exploits Were Data-Mediated

The Propagation Problem

What Enterprises Are Walking Into

Governing the Data Before Deploying the Agent

The Accountability Question

Related Posts

Additional Resources

Footer

Talk to an Expert

Get in Touch

Company

Resources

Solutions/Services