ARCHITECTURE
Seven layers, one stack, every action mediated.
The 7-Layer Defence Architecture is the part of Vantage Workspace that does the structural work most “AI for the enterprise” platforms skip.
Each layer addresses a specific failure mode — and each failure mode shows up in the OWASP Top 10 for LLM Applications, in the NIST AI RMF function categories, in the EU AI Act high-risk technical requirements, or in all three.

THE PREMISE
Why seven, and why structural rather than configurable.
The number isn't sacred — it's the point at which you can't decompose the architecture further without losing the audit story. The seven layers correspond to seven distinct categories of failure that every agentic AI platform has to answer for. Some platforms answer for a subset (e.g. just prompt defence and tool guardrails). Some answer for none, with the implicit assumption that the customer will assemble the missing layers themselves.
A configurable architecture would let the operator turn off layers. We considered this. We rejected it. The reason is the audit story: if layer 4 (Memory Safety) can be turned off, then the audit log can be challenged on the grounds that layer 4 may or may not have been running at any given time. A regulator's response to that is reasonable: “show us the evidence that layer 4 was active for every event in the audit window.” That evidence is much harder to produce when the answer is “we have a configuration history” than when the answer is “the layer is structural — it cannot be off.”
So the architecture is structural. The seven layers are present in every deployment. Their order is fixed. Their evidence outputs are continuous. Customers can configure the policies inside a layer (which prompts to flag, which tools to allow, which trust boundaries to enforce), but they cannot reorder layers, remove a layer, or run them in parallel.
The trade-off is operational complexity. A structural seven-layer stack is more containers, more inter-service communication, more configuration surface than a “single firewall” model. The trade-off is worth it when the alternative is “show us the evidence” — and the evidence has to span everything from policy enforcement to supply chain provenance.
The seven layers
Audited like a kernel. Sequenced like a pipeline.
Each layer is a discrete service. Each layer emits a signed event for every decision it makes.
The events chain into a tamper-evident log that exports directly to your SIEM and aggregates into a Trust Report at request completion.
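What “tamper-evident” means in practice: a common construction is for each signed event to carry a hash of its predecessor, so deleting or altering any event breaks every hash that follows it. A minimal sketch, with field names that are illustrative rather than the platform's documented schema:
event_id: evt-000341
prev_event_hash: sha256:9f2c...    # hash of the previous event; deletion breaks the chain
event_hash: sha256:4b7e...         # hash over this event's payload plus prev_event_hash
payload: { event_type: policy.resolved, ... }
signature: rfc3161:...             # RFC 3161 timestamp signature over event_hash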
- 01
Policy Engine
Versioned, signed YAML policies that gate every action before it reaches a tool.
- 02
Prompt Defence
Injection detection, jailbreak heuristics, and prompt-shape validation against a pinned corpus.
- 03
Tool Guardrails
Per-agent tool catalogues, scope-checked at invocation, with runtime parameter validation.
- 04
Memory Safety
Embedding-inversion probes, retrieval scope checks, and tenant-isolated vector spaces.
- 05
Trust Boundaries
Identity attestation between agents, humans, and tools. No shared service accounts.
- 06
Inter-Service Auth
mTLS with short-lived, certificate-rooted credentials issued per session.
- 07
Supply Chain
OCI-signed images, SBOMs on every release, signature verification at deploy.
EVIDENCE THE LAYERS RUN
Every skill, every scope, every invocation — visible to the auditor.
Skill Analytics is the operator's view of the runtime: which skills the agents actually used in the audit window, the scope each skill was granted, and the invocation count against the policy ceiling.

LAYER REFERENCE
Layer-by-layer.
Policy Engine
Maps to: OWASP LLM01 (Prompt Injection), LLM06 (Sensitive Information Disclosure)
NIST AI RMF: GOVERN-1, MAP-2, MEASURE-2
What it does
The Policy Engine is the first layer every prompt encounters. It resolves which policy applies to this prompt, in this context, for this user — based on the user's identity (from the SSO layer), the agent that's processing the prompt (from the agent registry), the tools the agent might invoke (from the tool registry), and the data classifications in scope (from the file pillar's classification metadata).
Policy resolution happens before the prompt reaches the model. The output of the Policy Engine is a structured policy document (JSON Schema validated) that downstream layers consume — Layer 2 reads it to decide which firewall rules to apply, Layer 3 reads it to decide which tools the agent is permitted to invoke, Layer 5 reads it to decide which inter-agent boundaries are enforceable.
Policies are written in a YAML DSL. A policy can reference: identity attributes (department, clearance, role), data classifications (PII, PHI, financial, regulated), tool categories (read, write, external, communication), agent roles (Pilot, Hunter, Sentry, Concierge, Analyst, Custom), and time windows (business hours, after-hours, locked-down).
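For illustration, a policy in this DSL might look like the sketch below. The policy ID and version are taken from the worked example later on this page; the field names themselves are assumptions, not the DSL's documented keywords.
policy_id: finance-team-default-v3
version: 3.2.1
applies_to:
  identity:
    department: Finance
    roles: [ Senior Analyst, Analyst ]
  agent_roles: [ Pilot, Hunter, Analyst, Concierge ]
permit:
  data_classifications: [ financial ]
  tool_categories: [ read, write ]
restrict:
  tool_categories: [ external, communication ]   # surfaces as no-external-tools
  pii_egress: deny                                # surfaces as no-pii-egress
time_windows:
  business_hours: permit
  after_hours: deny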
What it produces
event_type: policy.resolved
identity: josh@example.com
agent: analyst-1
prompt_hash: sha256:...
policy_id: finance-team-default-v3
policy_version: 3.2.1
policy_decision: permit | restrict | deny
restrictions: [ no-external-tools, no-pii-egress ]
timestamp: 2026-05-05T14:23:11.124Z (RFC 3161 signed)
Failure mode if missing
No prompt is contextualised. The model receives every request with the same ambient policy — typically the most permissive — which means a finance team member's request and an HR team member's request go to the model with the same scope. The audit log has no record of what policy was in effect at the time of any given event.
Prompt Defence (NemoClaw)
Maps to: OWASP LLM01, LLM02, LLM05
NIST AI RMF: MEASURE-2, MEASURE-3, MANAGE-2
What it does
Prompt Defence is the inline firewall. Every prompt that has passed Layer 1 (Policy resolved) reaches Layer 2 next. The firewall — internal codename NemoClaw — applies the active rule set against the prompt before it reaches the model.
The rule set is the NemoClaw ATLAS rule library — currently 28 active rules covering: direct prompt injection (instruction override attempts, role-play escapes, payload smuggling), indirect prompt injection (poisoned RAG content, malicious tool output, file-borne injection vectors), data exfiltration patterns (training-data extraction probes, embedding inversion, system-prompt elicitation), and the canary-token detection patterns (used to detect injection attempts that try to mask themselves as legitimate completions).
Canary tokens are the structural defence against the “invisible” injection class. The platform inserts deterministic canary tokens into the prompt at well-known positions; the model's response is then scanned for whether the canary was preserved, modified, or substituted. A modified canary indicates the model's response was steered by injected content rather than by the user's intent.
NemoClaw runs in two modes: block (the prompt is refused, an audit event is logged, the user sees a structured error) and flag (the prompt proceeds, but is annotated for downstream layers, particularly Layer 4's memory safety check). The choice between block and flag per rule is policy-driven (Layer 1).
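As a sketch of how a per-rule mode could be declared (the two rule IDs appear in the event example below; the definition format itself is an assumption):
rule_id: atlas-12-instruction-override
class: direct-injection
mode: block               # refuse the prompt, log an audit event, return a structured error
---
rule_id: atlas-19-canary-substitution
class: canary-detection
mode: flag                # proceed, but annotate for downstream layers (notably Layer 4)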
What it produces
event_type: prompt.firewall_evaluated
prompt_hash: sha256:...
rule_set_version: atlas-v2.6.4
rules_evaluated: 28
rules_triggered: [ atlas-12-instruction-override, atlas-19-canary-substitution ]
decision: block
canaries_intact: true | false | n/a
timestamp: 2026-05-05T14:23:11.245Z (RFC 3161 signed)
Failure mode if missing
Direct prompt injection is undetected. The model receives whatever the user (or the user's tool output, or the user's RAG context) contains, and the model's response is influenced by injection content without any record. The most common downstream symptom is data exfiltration via crafted prompts.
Tool Guardrails
Maps to: OWASP LLM02, LLM07
NIST AI RMF: MAP-3, MEASURE-2, MANAGE-3
What it does
When the model produces a response that includes a tool call (e.g. “send an email to alice@example.com with subject X”), the tool call does not execute immediately. It enters Tool Guardrails first.
Tool Guardrails performs three checks before execution:
- Permission scope check. Does the agent that produced the tool call have permission to invoke this tool with these parameters? The permission scope is enforced from the policy resolved at Layer 1; the tool registry knows which agent roles can invoke which tools and under what restrictions.
- Parameter validation. Are the tool's parameters well-formed? Does the email address pass syntactic validation? Does the file path resolve within the agent's permitted scope? Does the API endpoint match the registered tool's allowed endpoint pattern?
- Side-effect classification. Is this tool call read-only, modifying, or external? Read-only calls execute. Modifying calls require user confirmation if the agent's permission scope demands it. External calls (anything that leaves the platform's trust boundary) require explicit user consent and are flagged in the audit log with the destination domain.
If any of the three checks fails, the tool call is rejected, an audit event is logged, and the model is given the rejection back as part of its conversation context (so the model can either retry with corrected parameters or report the failure to the user). The model never bypasses the guardrails.
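A hypothetical tool-registry entry shows how all three checks can be driven from declared metadata. Every field name below is an assumption for illustration:
tool: email.send
category: communication
side_effect: external               # leaves the trust boundary: explicit consent required
allowed_agent_roles: [ Concierge ]
parameters:
  to:      { type: string, validation: email-address }
  subject: { type: string, max_length: 200 }
  body:    { type: string, audit_as: sha256 }   # body is hashed, not stored, in the event
consent:
  required: true
  channel: user-chat-client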
What it produces
event_type: tool.call_evaluated
agent: concierge-1
tool: email.send
parameters: { to: "alice@example.com", subject: "...", body_hash: "sha256:..." }
permission_check: pass | fail
parameter_check: pass | fail
side_effect: modifying | external | read-only
user_consent: required | granted | n/a
decision: execute | reject | hold-for-consent
timestamp: 2026-05-05T14:23:11.512Z (RFC 3161 signed)
Failure mode if missing
Tool calls execute without permission checks. The agent has effectively god-rights — it can invoke any tool with any parameters at any time. The audit log shows that a tool was invoked, but not whether it should have been allowed.
Memory Safety
Maps to: OWASP LLM03, LLM06, LLM10
NIST AI RMF: MAP-4, MEASURE-3, MANAGE-2
What it does
Memory Safety governs what the agent remembers, across what conversational scope, for how long. It addresses the failure mode where information from one session leaks into another — typically because a long-running conversation context inadvertently carries forward content that should have been scope-isolated.
The layer enforces conversation isolation: each user-agent session has its own memory scope, with a strict TTL (configurable by policy, default 24 hours), and content from one session is not surfaced as context for another session unless explicitly authorised. Agents cannot read each other's memory. The Pilot orchestrator does not retain content from specialist agents' working memory beyond the orchestration window.
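Rendered as a policy fragment, the isolation rules above might read like this sketch (field names assumed; the values restate the defaults in the text):
memory:
  scope: per-session                 # one memory scope per user-agent session
  ttl: 24h                           # policy-configurable default
  cross_session_reads: deny          # unless explicitly authorised
  cross_agent_reads: deny            # agents cannot read each other's memory
  orchestrator_retention: orchestration-window-only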
The layer also enforces cross-tenant isolation — though because Vantage Workspace is single-tenant by deployment model (each customer gets their own deployment), this operates as defence-in-depth rather than as the primary privacy control: even within a single deployment, agents serving different organisational units cannot access each other's working memory.
The layer also runs embedding inversion detection — periodically sampling the embedding store for patterns that match known model-extraction probe shapes, flagging the responsible agent for review.
What it produces
event_type: memory.access_evaluated
agent: analyst-1
memory_scope: session-id-...
access_type: read | write | persist
isolation_check: pass | fail
ttl_remaining: 86400s
embedding_inversion_flag: false
timestamp: 2026-05-05T14:23:11.671Z (RFC 3161 signed)
Failure mode if missing
A conversation context carries forward across sessions, agents, or users. A finance team member's query last week influences an HR team member's query this week, because both went to the same Analyst agent and the agent's memory was unbounded. The compliance failure is information disclosure across permission boundaries.
Trust Boundaries
Maps to: OWASP LLM04, LLM08
NIST AI RMF: MAP-5, MANAGE-3
What it does
Trust Boundaries enforces the rules around inter-agent communication. When the Pilot delegates a task to a specialist (Hunter, Sentry, Concierge, Analyst), the delegation message itself crosses a trust boundary. Trust Boundaries validates the message on four checks (a sketch of the delegation envelope follows the list):
- The Pilot is authorised to delegate to this specialist.
- The specialist is authorised to receive this delegation.
- The delegation payload is well-formed (matches the expected schema for this specialist).
- The delegation does not exceed rate limits (preventing the “agent storm” denial-of-service pattern).
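A sketch of the delegation envelope those four checks run against. The field names are assumptions; the values mirror the event example below:
delegation:
  source_agent: pilot-1
  target_agent: sentry-1
  delegation_type: security-review
  payload_schema: sentry/security-review/v1    # the schema the payload must match
  rate_limit:
    window: 60s
    max_delegations: 20                        # guards against the agent-storm pattern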
The layer also enforces role isolation: the agent that proposed an action is not the same agent that validates it. The Pilot proposes; a specialist executes; where policy calls for one, a separate review path validates. The “excessive agency” failure mode — where one agent decides, executes, and audits its own decision — is structurally prevented.
For human-in-the-loop scenarios, the Trust Boundaries layer enforces the consent surface. A modifying tool call (Layer 3, side-effect: modifying) that policy says requires user consent triggers a Trust Boundaries check — the consent message is delivered through a separate, identity-rooted channel (typically the user's chat client), and the consent response is verified before the tool call executes.
What it produces
event_type: trust.boundary_evaluated
source_agent: pilot-1
target_agent: sentry-1
delegation_type: security-review
schema_validation: pass
rate_limit_check: pass
role_isolation: enforced
user_consent: n/a | requested | granted | denied
timestamp: 2026-05-05T14:23:11.840Z (RFC 3161 signed)
Failure mode if missing
Agents trust each other implicitly. A compromised specialist agent can elevate by impersonating the Pilot, or by delegating to other specialists outside its scope. The audit log shows a delegation occurred, but not whether the trust relationship was valid.
Inter-Service Auth
Maps to: OWASP LLM04, LLM09
NIST AI RMF: GOVERN-3, MAP-2
What it does
The twenty platform containers communicate over the network. Layer 6 is the rule that says: every service-to-service communication is authenticated using mutual TLS (mTLS), with certificates rooted in a per-deployment certificate authority operated by the identity service.
There are no shared service-account API keys. There are no INTERNAL_API_TOKEN environment variables. A service that has not presented a valid mTLS certificate cannot make a request to another service. A service whose certificate has been revoked (because the operator rotated it, because the certificate's expiry has passed, or because the identity service detected anomalous behaviour from that service) cannot continue to make requests.
Certificate rotation is automatic — every certificate has a 24-hour TTL by default, with renewal handled by the identity service. The operational impact: a compromised service container's blast radius is bounded by the rotation interval, because the compromised certificate stops working within 24 hours.
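Expressed as configuration, the issuance rules might look like the following sketch (the identity service's actual configuration format is not reproduced here):
certificate_authority:
  root: per-deployment                # operated by the identity service
  issuance:
    ttl: 24h                          # default; bounds a compromised container's blast radius
    renewal: automatic
    bind_to:
      service_identity: agent-orchestrator
      image_hash: sha256:...          # attested at issuance, verified by Layer 7
  revocation_triggers: [ operator-rotation, expiry, anomalous-behaviour ]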
The layer also produces evidence for Layer 7 (Supply Chain) — every certificate issuance includes a cryptographic attestation of the requesting service's container image hash, which Layer 7 verifies against the supply-chain manifest.
What it produces
event_type: service.auth_evaluated
source_service: agent-orchestrator
target_service: file-store
certificate_id: cert-abc123
certificate_ttl: 23h47m
attestation_check: pass
timestamp: 2026-05-05T14:23:12.001Z (RFC 3161 signed)
Failure mode if missing
A compromised container can make requests to any other container in the deployment using a shared API key that never rotates. The audit log shows requests are being made, but not whether the requesting service is the one it claims to be.
Supply Chain
Maps to: OWASP LLM05, LLM09
NIST AI RMF: GOVERN-2, MAP-1, MEASURE-1
What it does
Supply Chain is the layer that verifies the platform itself — the container images, the dependency manifests, the model artefacts, the policy bundles — at load time and continuously thereafter.
Every container image shipped by Handvantage has a cryptographic manifest signed with our publishing key (root of trust published at /.well-known/handvantage-publishing-key, also rotated quarterly with overlap). The manifest declares: the image's content hash, the SBOM (software bill of materials) for every dependency, the model artefact hashes for any pinned model versions, and the build-provenance attestation (SLSA Level 3 minimum).
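An illustrative rendering of such a manifest. The structure is assumed; the declared fields are the ones the paragraph above lists:
component_id: file-store-v2.6.4
image_digest: sha256:...
sbom:
  format: spdx                        # format assumed; an SBOM ships with every release
  location: ./sbom.spdx.json
model_artefacts:
  - name: pinned-model
    hash: sha256:...
provenance:
  slsa_level: 3                       # SLSA Level 3 minimum, per the text
signature:
  key: handvantage-publishing-key     # root of trust at /.well-known/handvantage-publishing-key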
At deployment time, the operator's install pipeline verifies every image's manifest against the published key. At runtime, the platform's supply-chain service periodically re-verifies — mostly to detect accidental drift (e.g. an image that was retagged in the customer's local registry without re-attestation) and to detect any publishing-key revocation.
For customer-supplied components (custom agents, custom tools, custom policy bundles), the same manifest discipline applies at install time. A custom agent without a signed manifest cannot be loaded into the agent registry. The customer signs their own manifests with their own key; the verification is local.
What it produces
event_type: supply_chain.verified
component: container | model | policy | custom-agent
component_id: file-store-v2.6.4
manifest_hash: sha256:...
signature_valid: true
sbom_present: true
slsa_level: 3
timestamp: 2026-05-05T14:23:12.184Z (RFC 3161 signed)
Failure mode if missing
A swapped container image — accidentally or maliciously — runs in production without the operator noticing. A customer-supplied custom agent runs without provenance. The audit log shows the platform is operating, but not whether the platform is the platform that was reviewed and approved.
OWASP CROSS-WALK
OWASP Top 10 for LLM Applications — coverage matrix.
The platform claims 10/10 coverage of the OWASP Top 10 for LLM Applications. The table below shows which layer (or layers) close each category, and which evidence event proves it.
| OWASP # | Category | Layer(s) | Evidence event |
|---|---|---|---|
| LLM01 | Prompt Injection | Layer 1, Layer 2 | policy.resolved · prompt.firewall_evaluated |
| LLM02 | Insecure Output Handling | Layer 2, Layer 3 | prompt.firewall_evaluated · tool.call_evaluated |
| LLM03 | Training Data Poisoning | Layer 4, Layer 7 | memory.access_evaluated · supply_chain.verified |
| LLM04 | Model Denial of Service | Layer 5, Layer 6 | trust.boundary_evaluated · service.auth_evaluated |
| LLM05 | Supply Chain Vulnerabilities | Layer 2, Layer 7 | prompt.firewall_evaluated (atlas-RAG) · supply_chain.verified |
| LLM06 | Sensitive Information Disclosure | Layer 1, Layer 4 | policy.resolved · memory.access_evaluated |
| LLM07 | Insecure Plugin Design | Layer 3 | tool.call_evaluated |
| LLM08 | Excessive Agency | Layer 5 | trust.boundary_evaluated |
| LLM09 | Overreliance | Layer 6, Layer 7 | service.auth_evaluated · supply_chain.verified |
| LLM10 | Model Theft | Layer 4 | memory.access_evaluated (embedding inversion) |
The corresponding NIST AI RMF function mapping and the EU AI Act Annex IV technical-documentation cross-walk are on the compliance page.
WORKED EXAMPLE
What an end-to-end request looks like.
A finance team member named Alice asks the platform: “Summarise the Q1 vendor invoices in /finance/2026-Q1/, draft a memo to the CFO, and schedule a review meeting next Tuesday.”
The request enters the platform. Here's what happens, layer by layer, in roughly 4-7 seconds:
- Identity (SSO). Alice's session token is validated against the OIDC provider. Her identity attributes (department=Finance, role=Senior Analyst, clearance=Standard) are loaded.
- Layer 1 — Policy Engine. The request is mapped to the finance-team-default-v3.2.1 policy. The policy permits: read on /finance/*, document draft, calendar write. The policy restricts: external tool calls (no), PII egress (no), after-hours actions (no — it's 14:23, business hours).
- Layer 2 — Prompt Defence. The prompt is scanned against 28 ATLAS rules. None trigger. Canary tokens are inserted at well-known positions in the prompt's context.
- Pilot decomposition. The Pilot receives the prompt + resolved policy. It decomposes into three sub-tasks: (a) read and summarise vendor invoices, (b) draft memo, (c) schedule meeting.
- Layer 5 — Trust Boundaries. The Pilot proposes delegation to: Hunter (for sub-task a), Analyst (for sub-task b), Concierge (for sub-task c). Each delegation is validated — schemas pass, rate limits pass, role isolation enforced.
- Hunter executes. Hunter reads the invoices in /finance/2026-Q1/. Layer 3 (Tool Guardrails) validates every file read against Alice's policy scope — passes. Layer 4 (Memory Safety) confirms Hunter's working memory is scope-isolated to this session. Hunter returns a structured summary to the Pilot.
- Analyst executes. Analyst receives the summary, drafts a memo. document.draft is invoked; Layer 3 validates — passes (modifying tool, but draft mode requires no consent). The memo is saved as a suggestion in the document pillar (not committed).
- Concierge executes. Concierge invokes calendar.create_event for next Tuesday. Layer 3 flags this as modifying. Policy says calendar.create requires consent. Layer 5 issues a consent prompt to Alice through her chat client. Alice approves. Concierge proceeds.
- Layer 2 (response check). Each agent's response is scanned against ATLAS rules before being aggregated. Canary tokens are verified intact (they are).
- Pilot aggregates. Pilot returns the result to Alice: summary, draft memo (link), meeting created (link).
- Audit trail emitted. Every layer has produced events. The events are signed, sequenced into the tamper-evident log, and exported in real time to the customer's SIEM in CEF format. A Trust Report can now be generated for this session and would include all 30+ events with their full metadata.
Total elapsed time: 5.2 seconds. Total audit events generated: 32. Total tool calls: 7. Total user-facing consent prompts: 1.
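For a sense of what the SIEM ingests, one of those 32 events might render as a CEF line like the following. The field mapping is illustrative; the platform's actual CEF schema is not reproduced here:
CEF:0|Handvantage|Vantage Workspace|2.6.4|policy.resolved|Policy resolved|3|suser=alice@example.com act=permit cs1=finance-team-default-v3.2.1 cs1Label=policyId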
TRADE-OFFS
The architectural decisions we made — and the alternatives we rejected.
Every architecture is a set of trade-offs. The ones below are the structural decisions that shape the platform; we name them here so an evaluating architect can compare like-for-like with alternative platforms.
We did not build single-pass inference.
The 7-layer architecture means every request goes through seven processing steps before reaching the model and several more on the response. The single-pass alternative is faster (sub-second response times instead of 4-7s) but produces no audit evidence. We chose the slower, evidenced path. For latency-sensitive use cases (e.g. real-time customer chat), the platform is currently the wrong fit; we say so on the contact page.
We did not build a custom model.
Several “AI for the enterprise” platforms ship a fine-tuned in-house model. We integrate with the customer’s choice. The trade-off is that we cannot promise model-level differentiation; what we promise is the architecture around the model. If your differentiation is model behaviour, you should evaluate model-shipping vendors. If your differentiation is governable, auditable use of any good model, the architecture matters more than the model.
We did not build multi-tenant SaaS.
Every customer gets their own deployment. We do not operate a shared production environment. The trade-off is operational complexity (each customer needs a deployment) and our inability to rapidly ship product improvements to all customers simultaneously. The benefit is data sovereignty and a much cleaner audit story — every event in every customer’s audit log is unambiguously theirs.
We did not build a chat-first UI.
The platform’s primary surface is the integrated workspace (email, files, chat, meetings, documents). The chat is one pillar, not the front door. If your use case is a single chat application that subsumes the rest, we are not the right shape; lighter chat-first vendors fit better.
We did not build a low-code agent builder.
Custom agents are configured in YAML, with a permission-scope schema enforced by Layer 7 at registration time. The low-code alternative (drag-and-drop visual builder) ships faster custom agents but produces less auditable agent definitions. We will revisit this trade-off if customer demand for visual builders becomes evident; we have not seen it yet from regulated-industry buyers.
CONTINUE THE CONVERSATION
If your team is evaluating architecture, schedule an architect session.
The most useful conversation about the architecture is the one where we sit with your engineering and security leaders for forty-five minutes and walk through your existing AI deployment, your audit posture, and where the seven layers would fit. We bring the architecture diagram, the ATLAS rule set, and a worked-example walkthrough.
Talk to us about your stack →
Or write to hello@handvantage.com directly.
