Delafosse Olivier

Posted on May 21 • Originally published at coreprose.com

Designing Secure Agentic AI: How Cisco’s Foundry Specification Can Standardize Open-Source Defenses

#ai #machinelearning #llm #programming

Originally published on CoreProse KB-incidents

Agentic AI is moving from chat windows into pipelines that touch code, infrastructure, and production data. These systems can perceive, plan, act with tools, and learn from memory with limited human supervision, which fundamentally raises the blast radius of any failure or compromise.[1]

Vendors are reacting:

Databricks added Agentic AI as the 13th component in its AI Security Framework (DASF v3.0), with 35 new risks and 6 mitigation controls dedicated to agents.[9]
OpenAI’s Daybreak and CrowdStrike’s AgentWorks both treat agents as first-class security subjects, not just UX layers on top of LLMs.[4][7]

Cisco’s open-source Foundry specification is emerging as a potential common language for defining how agents are structured, governed, and defended. Foundry-style specifications can standardize how we describe agent capabilities, tool surfaces, memory behavior, and observability—much like DASF’s Agentic AI Extension standardizes risk taxonomy.[9]

💡 Orientation: This article treats Foundry as a reference design. We position it among existing frameworks, infer core principles, walk through a reference architecture, and outline how to test and govern “Foundry-compliant” agent stacks in practice.

1. The security problem: why agentic AI needs a formal specification

Agentic AI goes beyond classic prompt–response models. Instead of “answer this question,” we now have systems that:[1]

Decompose goals into sub-tasks
Call external tools and APIs
Coordinate with other agents
Persist and update long-term memory

They operate autonomously across a perception → reasoning/planning → action → learning loop using LLMs and other ML components.[1]

⚠️ Risk shift: The concern moves from “Did the model hallucinate?” to “What can the agent do if it’s wrong—or if it’s compromised?”[9]

Agentic AI in formal security frameworks

Databricks formalized this shift by extending DASF v3.0 with a dedicated Agentic AI component:[9]

13th system component: Agentic AI
35 new agent-specific risks: memory, planning, tool use, and communication[9]
6 new mitigation controls: least privilege, sandboxing, human supervision, MCP hardening, and more[9]

This treats autonomy as a fundamentally new risk surface that needs its own control plane, not just optional checks on chatbots.

High-privilege SOC workflows already depend on agents

Security operations centers are wiring agents into SIEM and SOAR workflows.[3] Typical pattern:

Ingest SIEM alerts
Enrich with asset, identity, and threat-intel data
Propose classifications and next steps
Trigger SOAR playbooks for low-risk actions

In one prototype, a triage agent silently downgraded noisy alerts. A parsing error on a rare alert type suppressed critical notifications from one EDR source; only a human noticing fewer cases revealed the problem.[3]

➡️ Lesson: Silent, automated failures in high-privilege environments require a formal spec with built-in guardrails and monitoring.

Dedicated agentic security platforms and early-by-design moves

Security vendors are building agent-centric platforms:

CrowdStrike AgentWorks – lets SOC teams design, test, and deploy agents into Falcon, with governance, controls, and multi-model support (Claude, GPT, Nemotron).[4]
OpenAI Daybreak – wraps GPT‑5.5 variants and Codex Security into a stack that embeds security analysis into SDLC: code review, threat modeling, patch generation, sandbox testing.[7][8]

Daybreak’s stance: security must be built in from the first lines of code, not bolted on later.[7]

💼 Implication for Foundry: Cisco’s open-source Foundry spec can be the vendor-neutral counterpart to these platforms: a shared schema for agent capabilities, tool wiring, permissions, and governance that any cloud, on-prem, or hybrid implementation can adopt.[9]

2. Situating Cisco’s Foundry among agentic AI and security frameworks

What “agentic AI” means for Foundry

In the Foundry context, “agentic AI” refers to autonomous systems that combine LLMs, other ML models, and external tools to complete complex tasks with minimal supervision.[1][2] They:

Maintain internal state and memory
Plan multi-step workflows
Call APIs, databases, and runtimes
Learn from past interactions

This aligns with open-source views of agents as chaining perception, decision, and action rather than single-shot inferences.[2]

Open-source agents vs proprietary platforms

Open-source stacks and specs (like Foundry) imply:[2]

Inspectable code/configs: audit, extend, or fork flows as needed
Deployment freedom: on-prem, private cloud, hybrid; easier sovereignty/compliance
Governance responsibility: you own permissions, logging, and incident response

Trade-off: deep control and alignment with your infra, but you inherit operational burden and must enforce controls yourself.

By contrast, AgentWorks and Daybreak provide managed environments with strong embedded guardrails but limited portability.[4][7]

Mapping Foundry to DASF’s Agentic AI Extension

Databricks’ Agentic AI Extension is a useful checklist for what any spec should cover:[9]

Risk catalog for planning, memory, tool use, multi-agent communication
Controls: least privilege, sandboxing, human-in-the-loop review, MCP hardening[9]
Guidance for mapping every agent, tool, and integration to these risks[9]

A Foundry spec can mirror this: define components, enumerate threats per component, and attach control families (authz, isolation, monitoring, fail-safes).

Foundry vs Daybreak and heterogeneous ecosystems

Daybreak is a vertically integrated, managed stack combining GPT‑5.5 and Codex Security for general, defensive, and offensive cyber workflows.[7][8] Suitable for organizations that want one vendor to control models, tools, and sandboxing.

Foundry targets portability: a blueprint that can run across clouds and stacks, including on-prem and hybrid scenarios like the OpenAI–Dell Codex partnership.[5] That partnership emphasizes deploying “where the enterprise data already resides,” with sovereignty and residency as core design constraints.[5]

Security vendors are also embedding agent design studios (AgentWorks) inside SOC platforms, with multi-model support and SOAR interoperability.[4] Any Foundry-style spec must therefore assume:[4][5][9]

Multiple LLM providers
Mixed on-prem/cloud tools
Cross-vendor trust and data boundaries

📊 Mini-conclusion: Against DASF, Daybreak, AgentWorks, and Codex-on-Dell, Foundry’s distinctive role is to define a vendor-neutral schema for secure agent behavior that can move across heterogeneous environments.[4][5][9]

3. Core principles likely defined in Cisco’s Foundry agentic AI security spec

3.1 Formalizing the agent lifecycle

A mature spec should model the full lifecycle explicitly:[1][2]

Perception – ingest text, events, logs, metrics
Reasoning / planning – decide on a course of action
Action via tools – call APIs, execute code, update systems
Memory – read/write long-term state

Open-source guidance stresses that production agents must make these phases inspectable and auditable, not hide them inside opaque prompts.[2]

⚡ Spec expectation: Foundry should require explicit interfaces, policies, and logs for each phase, rather than “magic” inside a single LLM call.

3.2 Tool and memory risks as first-class citizens

DASF’s Agentic AI Extension treats tool ecosystems and MCP-based integrations as major attack surfaces.[9] Risks include:[9]

Prompt injection leading to dangerous API calls
Escalation via overly broad tool scopes
Cross-tenant or cross-environment data exfiltration

A Foundry spec should therefore:[9]

Define per-tool scopes and least-privilege defaults
Mandate sandboxing for high-risk tools (code execution, infra changes)
Set policies for memory retention, redaction, and access control

3.3 SOC-embedded agent principles

For SOC use cases (SIEM triage, enrichment, SOAR orchestration), the spec should define tiers of authority:[3]

Read-only analysis: log/asset queries, scoring, explanations
Low-risk automation: ticket enrichment, tagging, suggested playbooks
High-impact response: firewall changes, account lockdowns, EDR actions

Some SOCs already prohibit agents from closing incidents or performing containment without explicit, logged human approval—even at high confidence—to prevent large-scale, wrongheaded actions.[3]

3.4 Security-by-design in development workflows

Daybreak’s Codex Security analyzes codebases, builds editable threat models, and tests patches in sandboxed environments before recommending merges.[7][8] A Foundry-aligned spec should:[7][8]

Treat code and infra analysis as standard agent patterns
Require sandboxed test environments for any code-modifying agent
Encourage machine-readable evidence (test logs, exploit traces) for audits

3.5 Guardrails, ethics, and limitations

Guardrail frameworks (Weights & Biases Guardrails, NVIDIA NeMo Guardrails, nexos.ai, and others) enforce behavioral constraints, detect PII leakage, moderate tool calls, and manage compliance.[6]

A Foundry spec should codify:[3][6]

Mandatory risk assessment and monitoring hooks
Policy-based blocking of specific tool calls or data flows
Documentation of model biases and limitations, especially in SOC contexts

💡 Mini-conclusion: Across these principles, agents are treated like critical software components—with explicit lifecycle, threat models, and guardrails—not “smart prompts.”[2][6][9]

4. Reference architecture: implementing a Foundry-compliant secure agent stack

4.1 Layered architecture

A practical mental model:[1][2][3][4][5][9]

LLM/Model layer – GPT, Claude, Nemotron, open-source LLMs[4]
Agent core – planner, memory manager, tool orchestrator[1][2]
Security substrate (Foundry layer) – policies, authorization, logging, guardrails[9]
Integration layer – SIEM, SOAR, ticketing, CI/CD, data platforms[3][5]

CrowdStrike’s AgentWorks already shows pluggable model layers under a common governance umbrella; Foundry should formalize this separation so LLMs can be swapped without redoing security controls.[4]

4.2 Control plane and governance APIs

For an open-source agent stack, the control plane should expose:[2]

Policy definitions per agent and tool
Audit logs for every tool call and memory write
Versioning for prompts, workflows, and policies

Example YAML-style Foundry policy:

agent: soc-triage
permissions:
  tools:
    - name: query_siem
      scope: read_only
      max_rows: 5000
    - name: trigger_soar_playbook
      scope: restricted
      require_human_approval: true
memory:
  retention_days: 30
  pii_redaction: enabled
guardrails:
  provider: nemo
  blocks:
    - data_exfiltration
    - prompt_injection

This is the level of explicitness a spec should encourage for secure operations.[2][6][9]

4.3 On‑prem and hybrid patterns

The OpenAI–Dell Codex integration illustrates how agents can run close to enterprise data on Dell AI Data Platform and Dell AI Factory, keeping inference and context within the customer perimeter.[5] Foundry-aligned deployments can:[2][5]

Self-host the agent runtime in a data center or VPC
Connect to local data platforms and observability stacks
Route only minimal signals to external LLM APIs under strict policies

📊 Compliance angle: This topology directly supports sovereignty and residency requirements common in European and regulated sectors.[2][5]

4.4 SOC-centric deployment example

A Foundry-compliant SOC agent stack might:[3][6]

Ingest SIEM alerts via streaming
Enrich alerts with CMDB, IAM, and threat-intel data
Apply guardrails to prevent direct destructive actions
Surface suggested triage decisions in the analyst console
Trigger only low-risk SOAR tasks (tagging, ticket creation) automatically

SOC agent guides emphasize including architecture diagrams, performance metrics, and operational guardrails—elements a Foundry spec should require in deployment documentation.[3]

4.5 DevSecOps workflow with sandboxed agents

Inspired by Daybreak, a secure development workflow can:[7][8]

Scan codebases and build an editable threat model
Identify realistic attack paths and risky dependencies
Propose code/config changes
Test changes in a sandbox that attempts known exploits
Export evidence and results for human review before merges

⚠️ Non-negotiable: Agents never write directly to production; changes flow through CI/CD with standard approvals.[7][8]

4.6 Guardrail layer integration

Guardrail products (W&B Guardrails, NeMo Guardrails, nexos.ai) can act as a pre-execution filter for prompts, tool outputs, and planned actions.[6]

Example orchestration:

plan = agent.plan(observation)
validated_plan = guardrails.validate(plan)
if not validated_plan.approved:
    log_and_alert(validated_plan.issues)
else:
    agent.execute(validated_plan)

This pattern helps catch risky tool invocations before they hit critical systems.[6][9]

5. Testing, guardrails, and evaluation for Foundry-style agentic AI

5.1 Testing the full perception–reason–act loop

Open-source reliability work stresses that single-prompt unit tests are insufficient.[2] You must test the complete loop:

Input parsing under noisy, real-world data
Planning correctness and robustness to adversarial inputs
Tool invocation safety and idempotence
Memory consistency and leakage risks

💡 Practice: Treat scenarios like integration tests: replay end-to-end flows and assert on both outcomes and side effects.

5.2 Using DASF’s 35 agentic risks as test generators

Each of DASF’s 35 agent-specific risks can seed red-team scenarios and simulation tests:[9]

Prompt injection causing unauthorized tool calls
Memory poisoning leading to misclassification
Multi-agent collusion or cross-boundary confusion[9]

Teams can build a test matrix: risk ID → scenario → expected behavior → guardrail checks.

5.3 SOC-specific replay testbeds

SOC guides recommend replaying historical alerts and incidents to evaluate agents under realistic load and distribution.[3] Track:

Precision/recall vs human triage
Time-to-triage per alert
False-positive amplification or suppression
Latency overhead per event[3]

📊 Tip: Start in “shadow mode”: agents score and enrich alerts but do not affect live workflows until performance is validated.

5.4 Guardrails from day one

Guardrail frameworks (W&B Guardrails, NeMo Guardrails) are meant to be embedded early.[6] For a Foundry-style deployment:

Define clear policies for unacceptable behaviors and tool calls
Instrument all agent actions with logging and alerts
Regularly review incidents to refine policies and update the spec

6. Where Foundry fits next

Cisco’s Foundry specification can give enterprises a shared, open-source vocabulary for describing agent behavior, tool access, memory, and guardrails—independent of any single vendor stack.[1][2][9] Positioned alongside DASF, Daybreak, and AgentWorks, it helps shift the industry from opaque “clever bots” to documented, testable, and governable agentic systems.[4][7][9]

As agentic AI moves deeper into SOCs, CI/CD pipelines, and data platforms, the organizations that succeed will be those that treat their agents like critical infrastructure: specified, tested, monitored, and continuously hardened—not just prompted.[3][6][9]

About CoreProse: Research-first AI content generation with verified citations. Zero hallucinations.

🔗 Try CoreProse | 📚 More KB Incidents

DEV Community