White Paper

The AI agent trust layer — runtime verification for every AI agent

Endpoint-native runtime verification for Cursor, Claude Desktop, Claude Code, Codex, and OpenClaw: EDAMAME observes each agent from outside, at the endpoint boundary, and proves trust from both security posture and runtime behavior. Part 1 covers Agentic Posture Visibility — the deterministic, no-LLM inventory of every agent, its tools, and its reach. Part 2 covers Agentic Security — divergence scoring, attack-pattern findings, and automatic isolation of a compromised agent host.

Executive Summary

Software is increasingly written, refactored, and shipped by autonomous AI agents — and increasingly by self-improving ones. Across software delivery, the principal taking action is no longer only a human at a keyboard: it is an agent that reads code, runs shell commands, installs packages, reaches for credentials, and calls other agents — often on workstations, runners, and self-hosted hosts nobody is actively watching. Security has always had to follow the principal to wherever it acts; this is the moment it must follow software delivery’s newest one.

This paper argues for runtime verification of AI agents: an endpoint-native trust layer for the AI agent. The instinct is not new. EDR brought detection and response to the endpoint, and XDR correlated it across endpoint, network, and identity. EDAMAME carries that same instinct to software delivery’s newest principal, the autonomous and increasingly self-improving agent, as a complement to EDR/XDR and the SOC, never a replacement.

Its thesis is a vantage point. The endpoint boundary, where an agent’s intentions become real processes, files, sockets, and secret access, is the one place an agent can be observed from outside itself, on an evidence trail it cannot rewrite, with no plugin or SDK inside the agent. Hardening stays the precondition. From there EDAMAME runs a three-step loop: see every agent and what it touched, prove what happened from host-observable evidence, and enforce when behavior diverges from declared intent or attack patterns appear. Trust is established from both the host’s security posture and the agent’s runtime behavior.

The paper is organized around the two pillars of that loop. Part 1, Agentic Posture Visibility, is the deterministic, no-LLM inventory of every agent, its tools, and its reach. Part 2, Agentic Security, is the verdict layer that scores divergence, detects attack patterns, and automatically isolates a compromised agent host.

The AI-Agent Risk Landscape

Modern agent workflows collapse several sensitive capabilities into one loop: read repositories, call tools, modify files, use network access, install packages, and touch credentials. Common risk patterns include:

- Hidden instructions in issues, docs, or chat that change future tool calls

- Malicious plugins, MCP servers, or package dependencies that reshape agent behavior

- Exfiltration of tokens, SSH keys, CI secrets, source code, or developer wallet material

- Posture drift on the host while the agent keeps working

Because agents operate with valid local context and legitimate tools, many failures do not immediately look like classic malware. The system may appear to be working normally while risk quietly increases.

This is exactly why AI-agent security needs better runtime visibility, not just stronger setup checklists.

Why Traditional Controls Fall Short

Supply-Chain and Setup Controls

Signed artifacts, dependency review, sandbox posture, guarded tool integrations, and conservative tool scopes still matter. They reduce accidental exposure before the agent loop starts running at machine speed.

But once an agent is running, those controls do not fully explain whether its observed behavior still matches the declared task. Setup quality reduces risk. It does not eliminate runtime drift.

Identity and Login-Time Trust

Identity systems do a good job of authenticating the user and establishing a trusted session. Device checks at sign-in can also improve the starting point.

The weakness is persistence. A workstation or server can change after login, and agent work can continue through tokens, SSH keys, or already-approved tooling. Login-time trust is not the same as continuous runtime trust.

Sandboxes, Tool Scopes, and Policy Hardening

Restricting tools and limiting privileges are essential for safe deployment. Narrow scopes reduce blast radius and make abuse harder.

Yet even a well-scoped agent can still perform undeclared behavior inside its allowed surface: unexpected egress, suspicious file access, risky package installation, or posture degradation on the host. Policy hardening helps, but it does not create runtime truth on its own.

Network and Perimeter Controls

VPNs, overlays, and allow-lists still matter for managing access to internal services and build infrastructure. They can make the environment harder to reach from the outside.

What they do not provide is process-aware runtime understanding. A network control can tell you that traffic exists. It usually cannot tell you whether the traffic matches the agent's declared task, or whether it came from a trusted agent process on a still-compliant host.

The Runtime Security Gap

Two gaps remain even when baseline protections are working. The divergence gap: the workstation may remain patched and permissions may stay scoped, yet observable behavior during a session can drift away from the declared task. The attack-pattern gap is similar: a package can arrive through a legitimate publish path while its payload still opens sensitive files or sends credentials out at runtime. Typical signals include:

- The task is read-only, but the process tree starts making undeclared outbound connections

- The agent claims to edit one file, but the host posture changes at the same time

- A plugin or tool introduces behavior that was not part of the original intent

- A workstation or runner loses compliance while the agent keeps operating

- The setup still looks correct, but runtime behavior no longer matches the declared task
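The mismatch signals listed above can be pictured as a comparison between a declared intent and host-observed events. The following is a minimal sketch under stated assumptions, not EDAMAME's implementation: `DeclaredIntent`, `ObservedEvent`, `find_divergences`, and the hostnames are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeclaredIntent:
    """What the agent said it would do (hypothetical schema)."""
    task: str
    writable_paths: frozenset = frozenset()
    allowed_hosts: frozenset = frozenset()


@dataclass(frozen=True)
class ObservedEvent:
    """One host-observed action attributed to the agent's process tree."""
    kind: str    # "connect" or "write"
    target: str  # hostname for "connect", file path for "write"


def find_divergences(intent, events):
    """Return the observed events that the declared intent does not cover."""
    mismatches = []
    for event in events:
        if event.kind == "connect" and event.target not in intent.allowed_hosts:
            mismatches.append(event)
        elif event.kind == "write" and event.target not in intent.writable_paths:
            mismatches.append(event)
    return mismatches


# A read-only task: no writable paths, one expected network destination.
intent = DeclaredIntent(
    task="summarize README (read-only)",
    allowed_hosts=frozenset({"api.example-llm.test"}),
)
events = [
    ObservedEvent("connect", "api.example-llm.test"),   # expected
    ObservedEvent("connect", "paste.example.test"),     # undeclared egress
    ObservedEvent("write", "/home/dev/.ssh/config"),    # undeclared write
]
divergences = find_divergences(intent, events)
for event in divergences:
    print(f"divergence: {event.kind} -> {event.target}")
```

The point of the sketch is the shape of the check: the events come from outside the agent, so the comparison does not rely on the agent's own account of what it did.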

EDAMAME Architecture Overview

EDAMAME applies this model across workstations, CI/CD, and self-hosted agent hosts with a product split that stays simple for users:

- EDAMAME Security: workstation trust anchor for developers and local devices

- EDAMAME Posture: CLI and host control surface for runners, servers, and agent hosts

- Integration packages: Cursor, Claude Desktop, Claude Code, Codex, and OpenClaw as named runtime surfaces

- Divergence engine: joins captured AI-agent intent with process, filesystem, network, and posture telemetry on the host

- Attack-pattern detection engine: runs deterministic checks on live telemetry for credential harvest, token exfiltration, sandbox exploitation, and supply-chain behavior

- EDAMAME Hub surfaces unsecured AI-agent installs across the fleet; EDAMAME Portal manages identity, entitlement, and shared subscription context for rollout

This is not another interface bolted onto your code and pipelines. It is a way to bring runtime verification and attack detection into places where developers and agents already work. MCP remains a separate path for the narrow case where an agent needs to read EDAMAME core security signals inside its own workflow.

Part 1 — Agentic Posture Visibility

Visibility comes first, and it is deterministic: no LLM, no configuration, no plugin inside the agent. On every anchored workstation, runner, or agent host, a structural pass inventories the agentic footprint and converts what it sees into ordinary posture checks that ride the existing score and security-checks pipeline up to EDAMAME Hub.

The visibility pass surfaces:

- Agent inventory: every AI agent installed or running — Cursor, Claude Desktop, Claude Code, Codex, OpenClaw — with observed-coverage validation, so a discovered agent whose observer is paused is itself a finding

- MCP servers and tools: the full capability surface each agent can call, organized as a capability graph with trust zones to spot over-provisioned reach

- Blast radius and harness coverage: unconfined agents sorted by danger with confinement remediation, and agents running without a governance harness such as AgentField or Rippletide flagged fleet-wide

- Agent SBOM drift: a CycloneDX inventory of tools, MCP servers, skills, and models, diffed against the approved baseline

- Flight Recorder: a structural history of system, network, tool, file, and communication activity per agent — the evidence trail the security verdicts later build on
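As an illustration of the SBOM drift item above, the sketch below diffs two simplified CycloneDX-style documents by their top-level `components` array. It is an assumption-laden example rather than the product's logic: the helper names `component_index` and `sbom_drift` and every component name and version are hypothetical.

```python
def component_index(bom):
    """Map (type, name) -> version for each component in a CycloneDX-style BOM."""
    return {
        (c["type"], c["name"]): c.get("version", "")
        for c in bom.get("components", [])
    }


def sbom_drift(baseline, current):
    """Report components added, removed, or changed in version versus the baseline."""
    old, new = component_index(baseline), component_index(current)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Hypothetical approved baseline and current inventory for one agent.
baseline = {
    "bomFormat": "CycloneDX",
    "components": [
        {"type": "library", "name": "filesystem-mcp", "version": "1.2.0"},
        {"type": "machine-learning-model", "name": "example-model", "version": "2025-01"},
    ],
}
current = {
    "bomFormat": "CycloneDX",
    "components": [
        {"type": "library", "name": "filesystem-mcp", "version": "1.3.0"},
        {"type": "machine-learning-model", "name": "example-model", "version": "2025-01"},
        {"type": "library", "name": "shell-mcp", "version": "0.1.0"},
    ],
}
drift = sbom_drift(baseline, current)
print(drift)
```

Keying on type and name keeps a version bump distinct from a newly introduced tool, which is usually the more interesting finding.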

Each observation lands in EDAMAME Hub as a posture check a CISO already knows how to operate. Define policy on these checks and enforce zero trust — for example, only authorize agents wrapped in a governance harness to connect — and Hub’s posture-gated conditional access keeps non-compliant hosts off your IdP and provider allow-lists.
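A policy of this kind can be sketched as a fail-closed gate over posture checks. This is a hypothetical illustration, not the Hub's policy engine or API: the check names, `PostureCheck`, and `access_decision` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostureCheck:
    """One posture check result reported for a host (hypothetical schema)."""
    name: str
    passed: bool


# Assumed policy: the agent must be observed and wrapped in a governance harness.
REQUIRED_CHECKS = {"agent_observer_active", "agent_in_governance_harness"}


def access_decision(checks):
    """Return (allowed, failing_checks). A missing check counts as failing."""
    results = {c.name: c.passed for c in checks}
    failing = sorted(n for n in REQUIRED_CHECKS if not results.get(n, False))
    return (len(failing) == 0, failing)


host_checks = [
    PostureCheck("agent_observer_active", True),
    PostureCheck("agent_in_governance_harness", False),
]
allowed, failing = access_decision(host_checks)
print(allowed, failing)
```

Treating an absent check as a failure mirrors the idea that a discovered agent without active observation is itself a finding.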

Part 2 — Agentic Security: Divergence, Attack Patterns, and Automatic Isolation

Scenario 1: Hidden Instruction or Prompt Injection

Without runtime correlation: the agent keeps using allowed tools, but future actions drift away from the original task.

With EDAMAME: unexpected traffic, file access, or process activity can be compared against the declared workflow and surfaced as a mismatch.

Scenario 2: Compromised Workstation or Agent Host

Without continuous posture checks: the agent keeps operating on a device or server whose security state has degraded.

With EDAMAME: posture changes become part of the runtime picture, so risky drift can trigger escalation or access changes before trust silently persists.

Scenario 3: Tool, Plugin, or Package Poisoning

Without an independent observer: a malicious tool, plugin, or package can stay inside an allowed workflow while opening credentials, wallet files, or source material it has no business touching.

With EDAMAME: runtime verification checks whether observed behavior remained consistent with the task, while attack-pattern checks look for credential harvest, token exfiltration, or other compromised runtime behavior.
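One way to picture a deterministic attack-pattern check is a rule that flags a process which opens a sensitive file and then connects out shortly afterwards. The sketch below is illustrative only: the path markers, the 30-second window, the `Event` schema, and `credential_harvest_findings` are all assumptions, not EDAMAME's detection rules.

```python
from dataclasses import dataclass

# Assumed markers for credential material; a real rule set would be broader.
SENSITIVE_MARKERS = (".ssh/", ".aws/credentials", ".npmrc")


@dataclass(frozen=True)
class Event:
    """One host-observed event (hypothetical schema)."""
    pid: int
    kind: str    # "open" or "connect"
    target: str  # file path for "open", address for "connect"
    ts: float    # seconds since session start


def credential_harvest_findings(events, window_s=30.0):
    """Flag processes that open a sensitive file, then connect out within window_s."""
    findings = []
    last_sensitive_open = {}  # pid -> (path, timestamp)
    for event in sorted(events, key=lambda e: e.ts):
        if event.kind == "open" and any(m in event.target for m in SENSITIVE_MARKERS):
            last_sensitive_open[event.pid] = (event.target, event.ts)
        elif event.kind == "connect" and event.pid in last_sensitive_open:
            path, opened_at = last_sensitive_open[event.pid]
            if event.ts - opened_at <= window_s:
                findings.append((event.pid, path, event.target))
    return findings


events = [
    Event(100, "open", "/home/dev/.ssh/id_ed25519", 1.0),
    Event(200, "open", "/repo/README.md", 2.0),
    Event(100, "connect", "203.0.113.7:443", 3.0),
    Event(200, "connect", "registry.example.test:443", 4.0),
]
findings = credential_harvest_findings(events)
print(findings)
```

No model is involved in a rule like this, which is what makes the finding reproducible from the same telemetry.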

These scenarios do not promise perfect prevention. They show why runtime verification and attack detection create a stronger early-warning layer than setup controls alone, and why the response can be automatic. The moment an attack pattern or a divergence verdict lands, the device score degrades and EDAMAME Hub conditional access isolates the compromised agent host from your IdP and provider allow-lists, with human-in-the-loop escalation to inspect the evidence.

A Shared Language for Agentic Risk: OWASP GenAI

As autonomous agents proliferate, the industry needs a common language for what can go wrong with them — and it is converging on one. The OWASP GenAI Security Project now publishes two flagship taxonomies: the Top 10 for Agentic Applications (2026, ASI01–ASI10), for systems that plan, use tools, persist memory, and coordinate with other agents, and the Top 10 for LLM Applications (2025, LLM01–LLM10). EDAMAME is built for the first list. Because it observes the agent as a principal — its goals, tools, memory, and inter-agent calls — on the host and the network, rather than inspecting model internals or filtering prompts inline, its evidence maps onto these risks structurally, not by coincidence. What follows is an honest engineering crosswalk graded against what ships today, not a certification or endorsement.

- OWASP Top 10 for Agentic Applications (2026): strong coverage on 6 of 10 risks and partial on the rest — the list EDAMAME fits best, because watching a principal's real behavior on the host is exactly what these risks demand.

- OWASP Top 10 for LLM Applications (2025): 4 strong, 4 partial, 1 indirect, and 1 honestly out of scope — output factuality is a model-evaluation problem, not a runtime observer's job, and we say so rather than pad the score.

- Agent Goal Hijack (ASI01) → the two-plane divergence engine: a hijacked objective surfaces as actions that diverge from declared intent, and hidden-prompt exfiltration lands as token exfiltration on the host.

- Unexpected Code Execution and RCE (ASI05) → deterministic sandbox-exploitation detection on host process lineage — EDAMAME's home turf, because executed code has to land on the host, where EDAMAME already watches.

- Agentic and LLM Supply Chain (ASI04, LLM03) → a CycloneDX agent SBOM of tools, MCP servers, skills, and models, with runtime drift diffs and skill-supply-chain checks that catch a poisoned dependency after it ships.

- Sensitive Information Disclosure (LLM02) → deterministic credential-harvest and token-exfiltration findings drawn from the same host telemetry.

- Excessive Agency (LLM06) → capability-graph and governance-harness coverage show what an agent can reach and whether it is fenced; external nono and Anthropic srt policy runtimes can gate tool use, while EDAMAME independently verifies what the host actually did.

- Rogue Agents (ASI10) → divergence plus reversible-first response: pause, revoke, quarantine. This rests on the invariant that makes everything above trustworthy — observer-independence. Because the observer sits outside the agent, a rogue or hijacked agent cannot silence the findings about itself.

This is an honest internal crosswalk, not an OWASP certification. Where coverage depends on a connector still rolling out — external vector-store introspection for memory poisoning, for instance — we grade it partial rather than overclaiming, and a few risks (training-time poisoning, embedding-math weaknesses, output factuality) sit outside a runtime observer's scope by design. The evidence is also portable: EDAMAME reuses CycloneDX for the agent SBOM, OCSF for case export, and MCP and OpenTelemetry on the wire, so its findings travel into the OWASP, MITRE ATLAS, and NIST AI RMF tooling a security team already runs — instead of becoming one more island of agent telemetry.

A Detection Map for Adversarial AI: MITRE ATLAS

MITRE ATLAS is a separate detection view from the OWASP risk crosswalk. EDAMAME's shipped map covers 39 runtime-observable parent techniques across 12 tactics, each with independent AML.T tags, live findings, a strong / partial / indirect grade, and the telemetry rationale behind that grade.

- Denominator: 39 parent techniques a runtime host and network observer can witness — not the complete 170+ published ATLAS matrix.

- Coverage: 16 strong, 20 partial, and 3 indirect, graded row by row rather than collapsed into a certification claim.

- Evidence: independent ATLAS technique tags, tactic grouping, live findings, and the host/network telemetry rationale for every row.

- Unknown stays Unknown: when a divergence-backed technique has no configured LLM, EDAMAME does not guess.

This is an engineering coverage map, not a MITRE certification or endorsement.
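The coverage figures quoted in this section can be checked with a few lines of arithmetic. The counts below are taken from the text; the variable names are only illustrative.

```python
# Grade counts as stated for the 39 runtime-observable parent techniques.
atlas_grades = {"strong": 16, "partial": 20, "indirect": 3}

total = sum(atlas_grades.values())
shares = {grade: count / total for grade, count in atlas_grades.items()}

for grade, count in atlas_grades.items():
    print(f"{grade}: {count}/{total} ({shares[grade]:.0%})")
```

The three grades sum to the stated denominator of 39, so roughly two in five rows are graded strong and about half are graded partial.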

Governance, Audit Readiness, and Operational Confidence

Leaders need more than a promise that an agent was configured correctly. They need evidence about what was allowed, what was observed, and what happened when trust changed. That is useful for security operations, internal reviews, and formal governance work alike.

- Continuous visibility into posture and runtime behavior

- Clear evidence trail for alerts, policy actions, and operator review

- Dynamic enforcement when trust degrades, not just periodic checks

This matters for audit readiness, but it matters even more for day-to-day decision-making. Better evidence reduces both blind trust and unnecessary panic.

Implementation and Deployment Overview

- Start with EDAMAME Security on developer workstations, aligning deployment waves with fleet visibility from EDAMAME Hub

- Use EDAMAME Posture on CI/CD runners, servers, and self-hosted agent hosts

- Connect Cursor, Claude Desktop, Claude Code, Codex, or OpenClaw through the packaged integration path once the EDAMAME trust anchor is present

- Define posture and runtime policies that match the environment

- Use EDAMAME Hub to discover unmanaged AI-agent stacks, correlate hosts, review divergence evidence and attack-pattern findings, and keep rollout tied to identities and entitlements in EDAMAME Portal

- Expand from one surface to another without changing the underlying model

The important point is simplicity: the runtime story should stay understandable even as the deployment surface grows. Compare intent with behavior, then detect attack patterns from the same host telemetry.

The Strategic Advantage: Control + Speed + Trust

Organizations do not want to choose between developer speed and stronger security. They want a model that respects modern workflows while giving security teams better runtime evidence.

- Productivity from developer-first and agent-first workflows

- More trust because divergence scoring and attack detection continue after initial setup

- Lower friction than heavy admin-down or proxy-heavy approaches

- One security language across devices, runners, AI agents, and attack-pattern findings

In short: runtime verification lets teams delegate more safely because trust is checked against reality, not assumed. Attack detection makes the same host evidence useful when the problem is not agent drift, but compromised code.

Conclusion and Call to Action

AI agents have become part of the software delivery surface, and their share of the work will only grow. They deserve a security model that reflects what they actually do — read, execute, modify, connect, and decide at runtime — and one that scales to a world where most software changes pass through an agent’s hands.

The path forward is not to discard hardening, identity, or policy controls; it is to add the missing runtime layer: observe each agent from outside itself, compare declared intent with observed host behavior, act when they diverge, and detect compromised behavior when code starts harvesting credentials or exfiltrating tokens. EDR and XDR are how detection and response caught up with the endpoint and the SOC; the AI agent trust layer is how they catch up with the agent.

With EDAMAME, teams use Hub to inventory unmanaged AI-agent installs, anchor the workstation or host with EDAMAME Security and EDAMAME Posture, and keep divergence evidence plus attack-pattern findings aligned across Cursor, Claude Desktop, Claude Code, Codex, and OpenClaw — on an evidence trail the agent cannot erase.

Explore EDAMAME for AI Agents

Bring EDAMAME Security, EDAMAME Posture, and EDAMAME Hub together: map unsecured agent footprints, anchor developer laptops and coding hosts, then score divergence and detect attack patterns from the same host telemetry.
