Gregory Griffin
Compliance as an engineering problem: building an open-source Information Security, Privacy and AI Governance Platform

There are two kinds of compliance tooling on the market. The first is a spreadsheet dressed in good intentions. The second is a contract with a golf invitation attached. I spent a few years watching organisations pay handsomely for the second while quietly operating the first underneath it, and eventually decided to find out what it would take to build something that was neither.

This post is the architectural write-up. Not a pitch — I'm not selling anything. Just the decisions, the trade-offs, and the parts that were actually hard. If you've ever looked at the GRC market and wondered whether the prices were commensurate with the engineering, you're the intended reader.

The thesis

Treat compliance implementation as an engineering problem, not a consulting exercise. Every artefact — the policy, the implementation guide, the self-assessment scorecard, the crosswalk to other frameworks — should derive from the same structured source. Change a control, everything downstream updates consistently. Most commercial tools fail this test, which is why their "single source of truth" tends to drift within a quarter.
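To make the single-source idea concrete, here is a minimal sketch of what "everything derives from one structured record" might look like. The `Control` schema, field names, and render functions are all hypothetical, invented for illustration, not the platform's actual data model.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: one structured control record from which each
# downstream artefact (policy text, scorecard row, crosswalk entry) is
# derived. Change the record, and every renderer picks it up.
@dataclass
class Control:
    ref: str                  # e.g. "A.5.1"
    title: str
    objective: str
    crosswalks: dict = field(default_factory=dict)  # framework -> clause

def render_policy(c: Control) -> str:
    return f"Policy {c.ref} - {c.title}\n\nObjective: {c.objective}\n"

def render_scorecard_row(c: Control) -> dict:
    return {"control": c.ref, "title": c.title, "status": "not_assessed"}

c = Control(
    ref="A.5.1",
    title="Policies for information security",
    objective="Management direction for information security is defined.",
    crosswalks={"NIST CSF 2.0": "GV.PO-01"},
)
print(render_policy(c))
print(render_scorecard_row(c))
```

The point is not the rendering itself but the dependency direction: artefacts are projections of the record, never hand-edited copies, so they cannot drift apart.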

Every generated script and policy in the repo carries a QA_VERIFIED marker. Nothing ships until it has passed the four-layer validation pipeline: existence, keyword coverage, semantic similarity against a 45-standard normative reference corpus (ISO 27001/27002/27701/27018/42001, NIST 800-series, GDPR, DORA, NIS2, PCI DSS, ENISA, NSA, CSA, OWASP, MITRE ATT&CK, and more, about 5,000 indexed chunks), and finally a Claude-driven gap analysis with reasoning. Feynman's "the first principle is that you must not fool yourself, and you are the easiest person to fool" is pinned to the README for a reason.

The stack

  • Backend: Python / FastAPI
  • Frontend: React 19 / TypeScript / Vite / Tailwind / MUI 6
  • DB: PostgreSQL with Alembic migrations (currently at revision 041)
  • Search: OpenSearch for full-text across the document corpus
  • Queue: Celery / Redis for evidence connector jobs and daily KPI snapshots
  • Proxy: Nginx with TLS
  • Deploy: single docker-compose.yml, 10-service stack, runs on anything with Docker 24+

What it covers

Five content products, one Platform, four ISO standards:

  • Framework — the full ISO 27001:2022 engineering product: 53 control packs covering all 93 Annex A controls, 188 Python generators for this product alone, 504 implementation guides (user and technical), multi-sheet assessment workbooks with automated scoring
  • Operational — a lightweight ISO 27001 foundation ISMS for SMEs: 53 operational policies with single-sheet compliance checklists
  • Privacy — ISO 27701:2025 extension: 21 control groups split across controller, processor, and shared, with 23 privacy policies and full GDPR Article 30 / RoPA traceability
  • Cloud — ISO 27018:2025 extension: 12 Annex A control groups for PII in public cloud, with SCCs, IDTA, and adequacy monitoring built in
  • AI — ISO 42001:2023 extension: 12 control groups for AI management system governance, with ISO 42005:2025 impact-assessment methodology structured in, plus EU AI Act + NIST AI RMF + OECD AI Principles crosswalks

Across all five: 317 Python generators, 590 implementation documents, ~377K lines of code, 99 control groups. All policy content renders in EN / FR / DE / IT across seven jurisdictions (CH / FR / BE / LU / DE / AT / IT / GB), with country-specific regulatory tokens applied at request time — no policy clones.
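The "regulatory tokens at request time" approach can be sketched with a plain template substitution. The token names, values, and jurisdictions below are illustrative only, not the platform's real token set.

```python
import string

# Hypothetical sketch of request-time token substitution: one policy
# template, jurisdiction-specific regulatory tokens applied on render,
# so no per-country policy forks exist.
TOKENS = {
    "CH": {"dp_law": "the Swiss nDSG", "regulator": "the FDPIC"},
    "DE": {"dp_law": "the GDPR and the BDSG", "regulator": "the BfDI"},
}

TEMPLATE = string.Template(
    "Personal data is processed in accordance with $dp_law and, where "
    "required, reported to $regulator."
)

def render(template: string.Template, country: str) -> str:
    return template.substitute(TOKENS[country])

print(render(TEMPLATE, "CH"))
print(render(TEMPLATE, "DE"))
```

One template, N jurisdictions: fixing a sentence fixes it everywhere, which is exactly what per-country clones cannot guarantee.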

On the Platform side, 23 compliance assessment modules — NIS2, DORA, CIS Controls v8, BSI IT-Grundschutz, TISAX, NIST CSF 2.0, NIST AI RMF 1.0, EU AI Act, EU Cyber Resilience Act, EU Cloud Sovereignty Framework, BaFin BAIT, CSSF Circulaire 20-750, ACN Guidelines, Swiss nDSG, Swiss ISG (SR 128), Swiss CSRM (NCSC), UK NIS 2018, UK Operational Resilience (FCA/PRA), COBIT 2019, CyberFundamentals (CCB/BE), and others — connected by over 3,400 cross-framework mappings.

44 automated evidence connectors pull from the usual suspects and then some: Microsoft (Entra ID, Defender, Sentinel, Intune, M365, Azure CSPM), network (FortiGate, Cisco, Zscaler, PAN-OS), identity (AD, LDAP, CyberArk, Vault), vulnerability (Qualys, Tenable, CrowdStrike, SentinelOne, Wazuh), ITSM (ServiceNow, Jira), monitoring (PRTG, Graylog, Zabbix), cloud posture (AWS Security Hub, Azure CSPM, GCP SCC), and threat intel (OpenCTI, OpenAEV).
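A connector fleet like this is easiest to reason about as a registry of uniform callables. The sketch below is hypothetical: in the real stack each connector would presumably run as a Celery task on its own schedule, and the connector bodies would call the vendor APIs rather than return canned evidence.

```python
from typing import Callable

# Hypothetical registry pattern for an evidence-connector fleet.
# A plain dict stands in for the Celery task routing.
CONNECTORS: dict[str, Callable[[], dict]] = {}

def connector(name: str):
    def register(fn):
        CONNECTORS[name] = fn
        return fn
    return register

@connector("entra_id")
def pull_entra_id() -> dict:
    # Real code would query Microsoft Graph; this returns canned evidence.
    return {"source": "entra_id", "mfa_enforced": True}

@connector("qualys")
def pull_qualys() -> dict:
    # Real code would hit the Qualys API.
    return {"source": "qualys", "open_criticals": 0}

def run_all() -> list[dict]:
    return [fn() for fn in CONNECTORS.values()]

print(run_all())
```

The payoff of the uniform interface is that adding connector number 45 is a decorator and a function, with no changes to the scheduler or the evidence store.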

Threat intelligence is wired in through a dedicated feeds container pulling six sources on schedule: MITRE ATT&CK v18 (641 techniques, 135 threat actor groups, 640 software entries, 25 campaigns), MITRE ATLAS (AI/ML adversarial), CISA KEV (daily), FIRST EPSS (daily), NVD CVE (~342K entries), and NVD CPE. The ATT&CK heatmap supports Navigator-style filtering by actor, sub-technique, and software attribution.
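Having KEV and EPSS in the same container invites a simple prioritisation rule. This is a hypothetical sketch with invented CVE IDs and sample scores, not the platform's actual ranking logic: KEV membership always outranks probability alone, and EPSS orders the rest.

```python
# Hypothetical sketch: rank CVEs by KEV membership first, EPSS second.
# CVE IDs, scores, and the weighting are illustrative only.
KEV = {"CVE-2024-0001"}                       # known exploited
EPSS = {"CVE-2024-0001": 0.94,
        "CVE-2024-0002": 0.02,
        "CVE-2024-0003": 0.41}

def priority(cve: str) -> float:
    # A KEV entry gets a +1.0 bump, so it always sorts above non-KEV;
    # EPSS (exploitation probability in the next 30 days) breaks ties.
    return (1.0 if cve in KEV else 0.0) + EPSS.get(cve, 0.0)

ranked = sorted(EPSS, key=priority, reverse=True)
print(ranked)  # CVE-2024-0001 first: in KEV and highest EPSS
```

Even this toy rule beats raw CVSS sorting for remediation queues, which is the practical reason to pull both feeds daily.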

The odds and ends that took longer than expected: EBIOS RM (full 5-workshop ANSSI methodology), BIA with RTO/RPO/MTPD, TPRM with DORA ICT fields, 5×5 risk register with BSI 200-3 scoring, Projects Workspace with document-variable substitution and in-platform WYSIWYG editing, a Cytoscape.js control dependency graph (229 intra-ISO-27001 relationships), six-role RBAC with multi-org support, TOTP MFA.
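For the risk register, a 5×5 matrix reduces to a score and a band. The sketch below is illustrative: the platform scores per BSI 200-3, and the band thresholds here are invented for the example.

```python
# Hypothetical 5x5 risk-register scoring. Band thresholds are
# illustrative, not the platform's BSI 200-3 calibration.
def risk_score(likelihood: int, impact: int) -> tuple[int, str]:
    assert 1 <= likelihood <= 5 and 1 <= impact <= 5
    score = likelihood * impact          # 1..25
    if score >= 15:
        band = "high"
    elif score >= 8:
        band = "medium"
    else:
        band = "low"
    return score, band

print(risk_score(4, 4))  # (16, 'high')
print(risk_score(2, 3))  # (6, 'low')
```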

The parts that were actually hard

The hardest part, by a considerable margin, was the crosswalk methodology. Mapping over 3,400 relationships between twenty-three frameworks with defensible confidence scores is not a one-afternoon job, and the internet is full of crosswalks that collapse on inspection. Structured domain tagging plus human review on every mapping was the only approach that produced something I'd stake my name on.
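The "defensible confidence score plus human review" discipline can be made structural rather than procedural. This is a hypothetical data shape, with invented field names and an invented confidence floor, not the platform's actual schema:

```python
from dataclasses import dataclass

# Hypothetical crosswalk entry: a typed mapping with a relationship
# kind, a confidence score, and the reviewer sign-off the text argues
# is non-negotiable. Field names and the floor are illustrative.
@dataclass(frozen=True)
class Mapping:
    source: str          # e.g. "ISO 27001:2022 A.8.7"
    target: str          # e.g. "NIS2 Art. 21(2)(e)"
    relation: str        # "equivalent" | "partial" | "related"
    confidence: float    # 0.0 - 1.0, set during human review
    reviewed_by: str

def defensible(m: Mapping, floor: float = 0.7) -> bool:
    # A mapping ships only if a named human reviewed it and the
    # confidence clears the floor.
    return bool(m.reviewed_by) and m.confidence >= floor

m = Mapping("ISO 27001:2022 A.8.7", "NIS2 Art. 21(2)(e)",
            "partial", 0.85, "gg")
print(defensible(m))  # True
```

Making review status a required field means an unreviewed mapping cannot exist in the dataset, which is what separates 3,400 defensible relationships from 3,400 guesses.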

Multilingual policy generation was the second-hardest. Legal and regulatory tone doesn't translate cleanly — each jurisdiction has its own conventions, its own preferred phrasings, its own idea of what "reasonable" means in a control objective. The architecture settled on runtime rendering with country-specific regulatory tokens rather than maintaining separate policy forks per jurisdiction. There is no shortcut, and machine translation alone will embarrass you in front of an auditor.

The QA engine was the third. A one-shot LLM "does this policy cover the control" is not serious engineering — the model will cheerfully hallucinate coverage it cannot substantiate. The four-layer pipeline exists because each layer catches a different class of failure: existence catches missing artefacts, keyword coverage catches shallow drafts, semantic similarity against the 45-standard reference corpus catches policies that sound right but say the wrong thing, and the Claude gap analysis with reasoning catches everything the other three miss. Removing any of them visibly degrades the output.
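The layering itself can be shown in miniature. In this toy sketch the real embedding index over ~5,000 reference chunks is replaced by a bag-of-words cosine, and the Claude gap analysis by a stub; everything else (names, thresholds) is invented for illustration.

```python
import math
from collections import Counter

# Toy version of the four-layer idea: an artefact must clear every
# layer, because each catches a different failure class.
def exists(text) -> bool:
    return bool(text and text.strip())

def keyword_coverage(text: str, keywords: set, floor: float = 0.6) -> bool:
    hits = sum(1 for k in keywords if k.lower() in text.lower())
    return hits / len(keywords) >= floor

def cosine(a: str, b: str) -> float:
    # Bag-of-words stand-in for embedding similarity.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def gap_analysis(text: str) -> bool:
    return True  # stand-in for the LLM reasoning layer

def qa_verified(text, keywords, reference, sim_floor=0.3) -> bool:
    return (exists(text)
            and keyword_coverage(text, keywords)
            and cosine(text, reference) >= sim_floor
            and gap_analysis(text))

policy = "Access control: access to systems is granted on least privilege."
print(qa_verified(policy, {"access", "least privilege"},
                  "access to information shall follow least privilege"))
```

Note that the layers are conjunctive, not averaged: a draft that aces three layers and fails one does not ship, which is the property the prose is describing.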

The stack choices — OpenSearch for the document corpus, Celery/Redis for the connector fleet and KPI snapshots — were deliberate and I'd make them again. Full-text search across ~590 implementation documents plus 5,000+ reference chunks is not a job for PostgreSQL's tsvector, and 44 evidence connectors running on their own schedules need a real queue. At this scope, those are table stakes, not luxuries.

If you want to look under the hood

Repository: https://github.com/isms-core-project/isms-core-platform.

Site with the full tour: https://isms-core.com.

It's Docker Compose, self-hosted, and will happily run without ever phoning home.

Happy to discuss any of the architectural decisions, the crosswalk methodology, the QA pipeline, or the runtime multilingual rendering in the comments — particularly the crosswalk and the QA pipeline, which I suspect are the parts most likely to interest anyone who has tried to do either themselves.
