If your AI system is classified as high-risk, Article 11 of the EU AI Act requires you to prepare technical documentation meeting the requirements of Annex IV — before you place the system on the market or put it into service. This documentation must be kept up to date throughout the system's lifecycle.
Annex IV is not a suggestion. It is the backbone of the conformity assessment. Whether you self-assess under Annex VI or go through a notified body, this is the document package that proves your system complies. National market surveillance authorities can request it at any time, and they have the power to access your documentation, datasets, and source code.
Here is what Annex IV actually requires, section by section.
The nine sections of Annex IV
1. General description of the AI system
This is your executive summary. It covers:
- The system's intended purpose and provider identity
- How the system interacts with external hardware or software
- Software and firmware version numbers and update requirements
- Distribution format — whether it is embedded in hardware, delivered as a downloadable package, accessed via API, or deployed as a service
- Hardware requirements for the system to run
- The user interface provided to deployers
- Instructions for use for the deployer
If your AI system is a component of a larger product, include photographs or illustrations showing external features, markings, and internal layout.
This section is straightforward but often underestimated. The instructions for use must be detailed enough for deployers to fulfill their own obligations under Article 26 — including understanding the system's capabilities, limitations, and human oversight requirements.
2. Development process and methodology
This is the most detailed and technically demanding section. It covers:
Design specifications — the general logic of the system and its algorithms, key design choices with their rationale, what the system is designed to optimize for, the relevance of different parameters, classification choices, expected output format, and output quality expectations.
System architecture — how software components build on or feed into each other and integrate into the overall processing pipeline.
Computational resources — what hardware and compute was used for development, training, testing, and validation.
Data requirements — this is where most companies struggle. You must document:
- Training methodologies and techniques used
- Training datasets including their provenance, scope, and main characteristics
- How data was obtained and selected
- Labelling procedures (who labelled, what guidelines, what quality controls)
- Data cleaning methodologies applied
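Annex IV does not mandate a specific format, but the data items above lend themselves to one structured record per dataset. A minimal sketch in Python — the class and field names here are illustrative assumptions, not terms taken from the Act:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class DatasetRecord:
    """Illustrative provenance record for one training dataset.
    Field names are assumptions, not prescribed by Annex IV."""
    name: str
    source: str                # where the data was obtained
    selection_criteria: str    # how records were chosen
    labelling_guidelines: str  # who labelled, under what rules and QC
    cleaning_steps: list = field(default_factory=list)

record = DatasetRecord(
    name="loan_applications_v3",
    source="internal CRM export, 2021-2023",
    selection_criteria="completed applications only",
    labelling_guidelines="two annotators, adjudicated disagreements",
    cleaning_steps=["deduplication", "PII removal"],
)
print(asdict(record)["name"])  # serializable for a docs-as-code repository
```

Keeping these records machine-readable (dataclasses, YAML, JSON) makes it far easier to keep Section 6's change log in sync later.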
Human oversight measures — the technical measures put in place to help deployers interpret the system's outputs, in accordance with Article 14.
Pre-determined changes — any changes to the system and its performance that were planned or anticipated at the time of the initial conformity assessment.
Validation and testing — the procedures used, the validation and test datasets and their characteristics, the metrics used to measure accuracy, robustness, compliance, and potentially discriminatory impacts. All test logs and reports must be dated and signed by the responsible persons.
Cybersecurity measures — what protections are in place against adversarial attacks, data poisoning, model extraction, and other AI-specific threats.
3. Monitoring, functioning, and control
This section addresses system behavior in production:
- Performance capabilities and limitations — including degrees of accuracy for specific persons or groups the system is intended to be used on, and overall expected accuracy relative to intended purpose
- Foreseeable unintended outcomes — sources of risk to health, safety, fundamental rights, and potential for discrimination
- Human oversight specifications — the technical measures enabling human operators to understand and intervene in the system's operation
- Input data specifications — what data the system expects and in what format
The emphasis on accuracy broken down by demographic groups is intentional. If your system performs differently across populations (which most do), you must document this explicitly.
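One way to produce that breakdown is simply to slice evaluation results by the relevant attribute. A minimal sketch with synthetic data — the group labels and values are invented for illustration:

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}

# Synthetic evaluation results: the system is less accurate for group "B"
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(accuracy_by_group(y_true, y_pred, groups))  # → {'A': 0.75, 'B': 0.5}
```

A gap like this is exactly what Section 3 expects you to document explicitly, together with its foreseeable impact.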
4. Performance metrics justification
A short but important section: you must explain why the performance metrics you chose are appropriate for this specific AI system and its intended purpose.
This is not just about reporting accuracy. If you use F1 scores, AUC-ROC, precision-recall trade-offs, or domain-specific metrics, explain why those metrics meaningfully capture the system's real-world performance and compliance with the Act's requirements.
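A toy example of why the choice needs justifying: on an imbalanced dataset, a model that only predicts the majority class scores high on accuracy while its F1 score exposes the failure. A plain-Python sketch with synthetic labels:

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 95 negatives, 5 positives; a model that always predicts negative
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
print(accuracy(y_true, y_pred))  # 0.95 — looks excellent
print(f1(y_true, y_pred))        # 0.0 — the model never finds a positive
```

If the positive class is the one that matters for your intended purpose, this is the kind of reasoning Section 4 expects in prose.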
5. Risk management system
A detailed description of the risk management system established under Article 9. This is not a standalone risk register — it must document:
- How risks were identified and analyzed throughout the lifecycle
- Risk estimation and evaluation methods
- Risk mitigation measures and their effectiveness
- Residual risks and whether they are acceptable
- How the risk management system integrates with the overall development process
Article 9 requires that the risk management system be a continuous iterative process planned and run throughout the entire lifecycle. Your documentation must reflect this — not just a point-in-time assessment.
6. Lifecycle changes
A record of all relevant changes made to the system throughout its lifecycle. This includes changes to:
- Training data or data processing
- Model architecture or parameters
- System behavior or performance characteristics
- Intended purpose or deployment context
This section turns Annex IV from a one-time deliverable into a living document. You need a process for maintaining change records, not just a document you write once and file away.
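The change record itself can be lightweight: structured entries plus a flag for changes that may require reassessment. An illustrative sketch — the fields and the flagging rule are assumptions, not prescribed by the Act:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ChangeRecord:
    """Illustrative entry for an Annex IV lifecycle change log."""
    changed_on: date
    component: str           # e.g. "training data", "model architecture"
    description: str
    performance_impact: str
    reassessment_needed: bool

log = [
    ChangeRecord(date(2026, 3, 1), "training data",
                 "added 2025 Q4 records", "accuracy +0.8pp on holdout", False),
    ChangeRecord(date(2026, 5, 12), "model architecture",
                 "switched to gradient-boosted trees", "substantial", True),
]

# Changes flagged here should feed into a conformity reassessment review
flagged = [c for c in log if c.reassessment_needed]
print(len(flagged))  # 1
```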
7. Applied harmonised standards
Here you must list the harmonised standards you applied, referencing their publication in the Official Journal of the EU.
The current problem: as of February 2026, no harmonised standards have been formally cited in the Official Journal. Seven primary standards are in development under CEN-CENELEC JTC 21, but none provide a presumption of conformity yet. The most advanced is prEN 18286 (Quality Management Systems for AI Act compliance), which entered public enquiry in October 2025. Full delivery of harmonised standards is targeted for Q4 2026 — after the August deadline.
Where no harmonised standards have been applied (which is everyone, right now), you must provide detailed descriptions of the alternative solutions you adopted to meet each requirement, plus a list of other relevant standards and technical specifications you followed.
In practice, this means referencing standards like ISO/IEC 42001 (AI Management Systems), ISO/IEC 23894 (AI Risk Management), or the AESIA guidance documents — while acknowledging these do not provide formal presumption of conformity under the AI Act.
8. EU Declaration of Conformity
A copy of the declaration of conformity issued under Article 47. This is a formal document where the provider declares that the AI system meets all applicable requirements of the regulation.
9. Post-market monitoring system
A detailed description of how you will evaluate the system's performance after deployment, including the post-market monitoring plan required by Article 72(3). This covers:
- How you will collect and analyze data on the system's real-world performance
- What metrics you will track post-deployment
- How you will detect and respond to performance degradation
- Your incident reporting procedures
- How post-market findings feed back into the risk management system
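The degradation check, at minimum, means comparing rolling production accuracy against the baseline documented at conformity assessment. A minimal sketch — the 0.05 tolerance and the weekly granularity are illustrative assumptions, not regulatory values:

```python
def degradation_alerts(baseline, weekly_accuracy, tolerance=0.05):
    """Flag periods where production accuracy drops more than
    `tolerance` below the documented validation baseline."""
    return [week for week, acc in weekly_accuracy.items()
            if baseline - acc > tolerance]

baseline = 0.91  # accuracy documented at conformity assessment
weekly_accuracy = {"2026-W30": 0.90, "2026-W31": 0.88, "2026-W32": 0.84}
print(degradation_alerts(baseline, weekly_accuracy))  # → ['2026-W32']
```

In a real monitoring plan, each alert would trigger the incident procedures and feed back into the Section 5 risk management records.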
Where standards stand today
The standards situation creates real uncertainty for companies preparing documentation now.
CEN-CENELEC JTC 21 is developing seven primary harmonised standards covering risk management, data governance, transparency, human oversight, accuracy, robustness, cybersecurity, bias management, and quality management systems. These were originally expected by late 2025 but are now targeted for Q4 2026. In October 2025, CEN-CENELEC adopted exceptional acceleration measures — allowing standards to skip the formal vote stage after a positive enquiry vote — to speed delivery.
prEN 18286 is the most mature standard. Its Annex ZA directly maps QMS requirements to Articles 11, 17, and 72 of the AI Act, making it the closest thing to a practical blueprint for Annex IV compliance.
ISO/IEC 42001, while internationally recognized, is not part of the EU harmonisation process. The EU AI Office indicated in 2024 that it does not fully align with the final AI Act text. The JRC found it provides limited coverage of logging and recordkeeping. It is complementary, not sufficient.
The practical implication: you cannot rely on any single standard for presumption of conformity today. You must document your own approach to meeting each requirement and be prepared to explain it to regulators.
The Digital Omnibus factor
The Digital Omnibus proposal (November 2025) could push the high-risk compliance deadline to December 2, 2027 for Annex III systems, contingent on harmonised standards not being ready. It would also extend simplified documentation requirements from microenterprises to all SMEs and introduce proportionate expectations for small mid-cap companies.
But the Omnibus is still a proposal. It must pass through Parliament and Council. And even if adopted, companies would still need to demonstrate "good faith" compliance effort. Documentation requirements under Annex IV do not change — only when they must be ready.
Best available guidance: AESIA
Spain's AI supervisory agency (AESIA) published 16 practical guidance documents in December 2025, developed through its AI regulatory sandbox where 12 real AI systems were tested across six sectors.
Guide 15 is a 62-page document focused specifically on technical documentation. It covers required content, preferred format, and best practices for structuring and storing Annex IV documentation. These are the most detailed publicly available compliance resources for Annex IV, developed by the EU's first operational national AI supervisory body.
The guides are non-binding and described as "living resources," but they offer the closest thing to a practical reference for Annex IV preparation. You can access them at aesia.digital.gob.es.
Common mistakes
Treating it as a one-time document. Sections 5, 6, and 9 explicitly require ongoing maintenance. If your documentation is a static PDF from six months ago, it is already non-compliant.
Generating documents without evidence. Regulators look for verifiable artifacts — actual test results that influenced design decisions, genuine risk assessments, real audit logs. Assertions without evidence will not pass conformity assessment.
Ignoring data provenance. Section 2 requires detailed documentation of training data origin, selection criteria, labelling procedures, and cleaning methods. This is nearly impossible to reconstruct after the fact if you did not track it during development.
Skipping the metrics justification. Section 4 is short but important. "We used accuracy" is not a justification. You need to explain why your chosen metrics meaningfully capture real-world performance for your specific use case.
Waiting for harmonised standards. The standards are late. The deadline may or may not move. Starting now with the available guidance (AESIA, ISO frameworks, the regulation text itself) is far better than waiting for perfect clarity.
What to do now
Audit what you already have. Most teams have scattered fragments of Annex IV documentation across design docs, Jupyter notebooks, model cards, and internal wikis. Inventory what exists and identify gaps systematically.
Start with the hardest sections. Sections 2 (development process) and 5 (risk management) are the most demanding and the most likely to have gaps. Start there while institutional knowledge is still fresh.
Implement documentation processes. Annex IV compliance is not a document — it is a process. Set up version-controlled documentation, data lineage tracking, and change logs now. Every week you delay makes retrospective documentation harder.
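One concrete lineage primitive is content-hashing each dataset snapshot, so that change records and test logs can reference the exact data version used at every step. A minimal sketch — the function and field names are illustrative:

```python
import hashlib
from datetime import datetime, timezone

def lineage_entry(name: str, content: bytes, step: str) -> dict:
    """Record a content hash for a dataset snapshot so documentation
    can point to the exact data version used at each pipeline step."""
    return {
        "dataset": name,
        "sha256": hashlib.sha256(content).hexdigest(),
        "step": step,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

# Illustrative: hash a tiny in-memory snapshot after cleaning
entry = lineage_entry("train_snapshot", b"id,label\n1,0\n", "post-cleaning")
print(entry["sha256"][:12])
```

Appending entries like this to a version-controlled log gives you the audit trail that Sections 2 and 6 assume exists.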
Use the AESIA guides as a reference. Spain's 16 practical guidance documents are the most detailed publicly available resource. Guide 15 on technical documentation is a practical starting point.
Generate a first draft. Annexa can analyze your codebase (Python, YAML, JSON) and generate a first-draft Annex IV dossier, identifying gaps and flagging sections that need legal review. It will not replace your engineering team's judgment, but it gives you a concrete starting point instead of a blank page.
The August 2026 deadline may or may not shift. The documentation requirements will not. Start now.