When a CISO says "we're working on EU AI Act compliance," the next question to ask is which document they're working on. The regulation refers to several. The one that matters for any system in the high-risk classification — and the one most teams underestimate — is the Annex IV technical documentation. Article 11 makes it the basis of conformity assessment. Article 47 makes it the basis of the declaration of conformity. Article 71 makes it the basis of the EU database registration. If the Annex IV dossier is incomplete or unverifiable, the system cannot legally be placed on the market in the EU. Everything else in the high-risk regime is downstream of this document.
The interesting thing about Annex IV is that most teams reading it for the first time treat it as a paper exercise. The text reads like a procurement template: "a general description," "a detailed description," "a list of harmonised standards." An organisation can plausibly produce all of that in a week of focused work, ship it to a Notified Body, and receive a conformity assessment. That read of the document misses what an auditor actually does with it. The dossier is not the assessment. The dossier is the index that the assessment uses to walk through the system's behaviour during the audit window. Five of the nine elements require contemporaneous, machine-verifiable evidence — not narrative — to survive that walk.
Annex IV has nine numbered sections. Sections 1, 6, 7, and 8 are largely descriptive. Section 1 is a general description: intended purpose, identification of the provider, version of the system, hardware on which it runs, photographs or illustrations of the user interface, instructions for use. Section 6 is a description of relevant changes made through the system's lifecycle. Section 7 is the list of harmonised standards applied (ISO/IEC 42001, ISO/IEC 23894, and so on, where relevant). Section 8 is a copy of the EU declaration of conformity. These four sections are documents. They can be drafted, reviewed, and signed by a competent author in a focused effort. They are the parts most teams have under control.
Sections 2, 3, 4, 5, and 9 are different. They are descriptions of system behaviour over time, and an auditor reads them with the audit log open. Section 2 — the detailed description of the system's elements and the development process — references the data sets used, the data labelling and curation practices, the methodology for training and validation, the system's architecture, the design specifications, and the rationale for design choices. Section 3 — monitoring, functioning, and control — describes how the system is monitored in operation and what mechanisms exist to detect malfunction. Section 4 — performance metrics — describes the appropriateness and validation of the metrics used. Section 5 — risk management — describes the system that implements Article 9's continuous risk management requirements, with the residual risks accepted by the provider. Section 9 — post-market monitoring — describes the system in place to evaluate AI performance after deployment, with the plan for collecting feedback and reporting incidents.
Each of these five sections has a paper component (the description) and an evidence component (the proof that what's described is actually happening). The paper component is straightforward. The evidence component is where most platforms break. An auditor reading Section 3 doesn't only want to know "what monitoring exists"; they want to see the monitoring records for the audit window — the events the monitor produced, in sequence, with timestamps that survive third-party verification. An auditor reading Section 5 doesn't only want the risk register; they want to see, for each control declared in the risk register, the specific events in the audit log that demonstrate the control was operating during the window. An auditor reading Section 9 doesn't only want the post-market monitoring plan; they want the monitoring outputs for the period since deployment.
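To make the evidence component concrete, here is a minimal sketch of what a "slice of the log scoped to a section and a window" could look like. The event-type names and the Event shape are hypothetical, not a claim about any particular platform's schema; the point is structural: each event carries its Annex IV mapping and a timezone-aware timestamp at write time, so the evidence component of a section is a query, not a reconstruction.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable

# Hypothetical event-type names per evidence-bearing Annex IV section;
# real names are whatever the platform actually emits.
ANNEX_IV_EVENT_TYPES = {
    "IV.2": ["dataset_version_pinned", "training_run_recorded", "design_decision_logged"],
    "IV.3": ["monitor_heartbeat", "anomaly_detected", "operator_intervention"],
    "IV.4": ["metric_computed", "metric_threshold_breach", "validation_run_completed"],
    "IV.5": ["control_executed", "risk_review_recorded", "residual_risk_accepted"],
    "IV.9": ["post_market_sample_collected", "user_feedback_ingested", "incident_reported"],
}

@dataclass(frozen=True)
class Event:
    event_type: str
    annex_section: str     # set when the event is produced, not retrospectively
    occurred_at: datetime  # timezone-aware timestamp
    payload: dict

def dossier_slice(log: Iterable[Event], section: str,
                  window_start: datetime, window_end: datetime) -> list[Event]:
    """Return the evidence component for one Annex IV section over the audit window."""
    return [
        e for e in log
        if e.annex_section == section and window_start <= e.occurred_at < window_end
    ]
```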
The audit window is the interval during which the system has been placed on the market or put into service. For a system going live on 2 August 2026 — the high-risk obligations deadline — the audit window opens that day. For an existing system that crosses into high-risk classification at the deadline, it opens at the moment the classification changes. In neither case does the audit window begin at some later, more convenient date. The Annex IV dossier needs to be complete on the day the system goes into service, with the evidence components covering the operational history the system has at that point.
This is why retrofitted monitoring is structurally inferior. A monitoring tool installed two months before the deadline generates evidence covering those two months — and only those two months. Section 3 of the dossier reads as if the system had no contemporaneous monitoring before that. Sections 5 and 9 inherit the same gap. The Notified Body assessment may still pass on a narrow technical reading, but the dossier is fragile to challenge: any subsequent investigation, any post-market incident, any market-surveillance authority request will return to the same period and find an evidence gap.
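The gap is easy to demonstrate mechanically. Continuing the hypothetical Event record from the sketch above, a deployer (or an auditor) can ask whether the evidence for a section actually spans the audit window, or only the months since a monitoring tool was bolted on:

```python
from datetime import datetime, timedelta

def coverage_gaps(events: list[Event], window_start: datetime, window_end: datetime,
                  max_silence: timedelta = timedelta(days=1)) -> list[tuple[datetime, datetime]]:
    """Return the sub-intervals of the audit window that contain no evidence at all.

    max_silence is a policy choice: how long the log may stay silent before
    the silence counts as a gap an auditor would ask about.
    """
    gaps = []
    cursor = window_start
    for event in sorted(events, key=lambda e: e.occurred_at):
        if event.occurred_at - cursor > max_silence:
            gaps.append((cursor, event.occurred_at))
        cursor = max(cursor, event.occurred_at)
    if window_end - cursor > max_silence:
        gaps.append((cursor, window_end))
    return gaps
```

Run against a retrofitted deployment, the first gap it reports is exactly the one described above: from the day the system went into service to the day the monitoring tool was installed.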
The architectural alternative is to produce the evidence at the moment of each system action, by the system itself, with cryptographic integrity. Each event is timestamped against an RFC 3161 trusted time source. Each event is signed and sequenced into a tamper-evident log (Merkle-tree-rooted, periodically anchored). Each event is mapped — at the moment it is produced, not retrospectively — to the specific Annex IV section and Article 9 control category it satisfies. The evidence components of Sections 2, 3, 4, 5, and 9 then become slices of that log scoped to the time window and the system in question. This is what Vantage Workspace produces; the broader claim is that this is what any platform shipping into the EU AI Act regime needs to produce, regardless of vendor. A platform that does not produce the evidence forces the deployer to assemble it after the fact, which is the same after-the-fact assembly project that has led organisations to cancel agentic AI proposals for the last twelve months.
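What the write path might look like, in outline. This is a sketch under stated simplifications, not Vantage Workspace's implementation: a plain hash chain stands in for the Merkle tree, the RFC 3161 call is a placeholder (a real deployment would submit the digest to a timestamping authority and store the signed token it returns), external anchoring of the log head is omitted, and the event-type and control identifiers are made up for illustration. The structural point is that the Annex IV section and the Article 9 control category are written into the event at the moment it is produced.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

def tsa_timestamp(digest: bytes) -> dict:
    """Placeholder for an RFC 3161 request to a trusted timestamping authority.

    A real implementation submits `digest` to the TSA and stores the signed
    timestamp token it returns; here we only record the local wall clock.
    """
    return {"tsa_token": None, "local_time": datetime.now(timezone.utc).isoformat()}

class EvidenceLog:
    """Hash-chained, append-only evidence log.

    Each entry commits to the previous entry's hash, so deleting or reordering
    an event breaks every later hash. Periodically the head hash would be
    anchored externally (published or cross-signed); that step is omitted here.
    """

    def __init__(self, signing_key: bytes):
        self._key = signing_key
        self._entries: list[dict] = []
        self._head = b"\x00" * 32  # genesis value for the chain

    def append(self, event_type: str, annex_section: str, article9_control: str,
               payload: dict) -> dict:
        # The payload must be JSON-serializable so the digest is reproducible.
        body = {
            "seq": len(self._entries),
            "event_type": event_type,
            "annex_section": annex_section,        # mapped at write time
            "article9_control": article9_control,  # mapped at write time
            "payload": payload,
            "prev_hash": self._head.hex(),
        }
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).digest()
        entry = {
            **body,
            "hash": digest.hex(),
            "signature": hmac.new(self._key, digest, hashlib.sha256).hexdigest(),
            "timestamp": tsa_timestamp(digest),
        }
        self._head = digest
        self._entries.append(entry)
        return entry

# Example: one monitoring event, mapped to Annex IV Section 3 as it is written.
log = EvidenceLog(signing_key=b"demo-key-not-for-production")
log.append("anomaly_detected", "IV.3", "RM-07",
           {"system_id": "loan-triage-prod", "detector": "output_drift", "score": 0.91})
```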
Even with a platform that produces complete Annex IV evidence, the deployer still owns specific responsibilities. The declaration of conformity (Section 8) is signed by whichever party acts as provider of the deployed system, and under Article 25 a deployer that puts the system into service under its own name or intended purpose takes on that role; the platform vendor cannot sign it on the organisation's behalf. The intended-purpose description (Section 1) reflects the deployer's chosen use case, which the provider does not know in advance. The post-market monitoring plan (Section 9) needs to reflect the deployer's actual deployment context — the user population, the data inputs, the operational environment. The harmonised-standards list (Section 7) reflects the deployer's compliance choices. A platform's evidence record provides the substrate; the deployer's compliance team is what assembles the dossier from the substrate.
Three questions to ask any AI vendor this week. First: "Show me the Annex IV evidence record for your platform's last 30 days of operation in your reference deployment." If the answer is "we generate logs, you can build the mapping," the platform has not been architected for the regulation. Second: "For each of Annex IV sections 2, 3, 4, 5, and 9, name the specific log event types that satisfy the section and show me an example." Vendors who can answer this will name three or four event types per section, with sample payloads. Vendors who cannot answer will redirect to a generic monitoring dashboard. Third: "What does the platform do that an auditor cannot independently verify against the customer's SIEM?" The right answer is "nothing." Every event the platform produces should be mirrored in real time to the customer's logging infrastructure, where the record the auditor reviews cannot be silently filtered or replayed.
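For the second question, the artefact a prepared vendor can produce on the spot looks roughly like the record below: a hypothetical Section 3 monitoring event as it would appear mirrored in the customer's SIEM. Every field name and value here is illustrative; what matters is that the Annex IV mapping, the sequence number, the previous-entry hash, and the timestamp token travel with the event, so the auditor can recompute the chain from the SIEM copy without trusting the vendor's console.

```python
# Hypothetical Annex IV Section 3 event (monitoring, functioning and control),
# as mirrored to the customer's SIEM. Hashes and tokens are truncated for display.
sample_event = {
    "seq": 48213,
    "event_type": "anomaly_detected",
    "annex_section": "IV.3",
    "article9_control": "RM-07",
    "payload": {
        "system_id": "loan-triage-prod",
        "detector": "output_drift",
        "score": 0.91,
        "action_taken": "routed_to_human_review",
    },
    "prev_hash": "9f2c81…",
    "hash": "4b07ae…",
    "signature": "hmac-sha256:1d44…",
    "timestamp": {"tsa_token": "base64-DER…", "local_time": "2026-05-02T09:41:07+00:00"},
}
```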
The 2 August 2026 deadline is fourteen weeks away as of this writing. The audit window for systems that need to comply is open today. The Annex IV dossier is what the high-risk regime is built around. If the technical documentation requirement still feels like a paper exercise, the conversation with the platform vendor — and with the compliance team — is about to get more difficult. The gap between platforms that produce contemporaneous evidence and those that don't will define which AI deployments survive the next twelve months in the EU and which will be retroactively reclassified, recalled, or quietly cancelled.
