Gerardo Castro Arica for AWS Heroes

OpenClaw on AWS Lightsail — Threat Model Alignment: OWASP, MITRE ATLAS, and the Gap No Framework Anticipated (Part 3)

Part 3 of the series: In Part 1 we audited the initial OpenClaw setup on AWS Lightsail — outdated kernel, the gateway + allow combination as a critical attack chain, and the Gateway Token exposed in plaintext. In Part 2 we went deep into the full dashboard — channels, agents, cron jobs, logs, and configuration panels. If you haven't read the previous parts, the full index is at the end of this post.

Note on findings: Vectors #7, #8, and #10 identified in Part 2 are pending live validation. The framework mapping in this post confirms that the vector exists and has documented precedent in similar systems — not that it was executed on this specific OpenClaw instance. Controlled environment validation is coming in Part 4.

The starting point

In the first two parts of this series we audited OpenClaw deployed on AWS Lightsail — not as a user, but as a Cloud Security Engineer with the goal of mapping the attack surface that deployment opens.

The result was a list of 13 findings distributed across three layers that don't always appear together in the same analysis:

IaaS Layer — what Lightsail brings by default and the operator inherits without necessarily knowing it:

| # | Finding | Severity |
|---|---------|----------|
| 1 | Blueprint with outdated kernel and libraries | High |
| 4 | IPv6 enabled by default | Medium |
| 5 | Apache2 without documented hardening | Medium |

Application Layer — what OpenClaw exposes as a system, regardless of where it runs:

| # | Finding | Severity |
|---|---------|----------|
| 3 | Gateway Token in plaintext on the dashboard | High |
| 6 | No granular access control on channels | High |
| 7 | Indirect prompt injection via external channels | Critical |
| 8 | Memory poisoning via unvalidated context | High |
| 9 | 53 skills active by default — opt-out model | High |
| 10 | No permission inheritance model in agent chains | High |
| 13 | Config and Debug exposed via Gateway Token | Medium |

Intersection Layer — where operator configuration activates vectors that no application framework anticipates alone:

| # | Finding | Severity |
|---|---------|----------|
| 2 | exec_host_policy: gateway + shell_approval: allow = no isolation | Critical |
| 11 | Cron Jobs as persistence and defense evasion vector | High |
| 12 | Local logs without external export | High |

These three layers are not a design accident — they are the result of a deployment decision. Someone saw the OpenClaw blueprint on Lightsail, clicked, and inherited an attack surface that no application threat model fully contemplates, because none was designed to.

The question this post answers: what does each existing security framework cover, where do these 13 findings map, and what is left without a home?

MITRE ATLAS — OpenClaw's official threat model

MITRE ATLAS (Adversarial Threat Landscape for AI Systems) is the reference framework for modeling threats against artificial intelligence systems. It works with the same logic as MITRE ATT&CK — tactics, techniques, and procedures — but applied to the ML and AI ecosystem.

The OpenClaw team adopted it as the basis for their official threat model, available at trust.openclaw.ai/threatmodel. The result is a matrix of 37 threats distributed across 8 tactics, with 6 classified as critical.

Before mapping the findings, there is something important to understand about the scope of this threat model: it was designed to model threats against OpenClaw as a system — its skills, its gateway, its execution model, its channels. It was not designed to model what happens when someone deploys OpenClaw on a Lightsail VPS with an outdated kernel. That distinction is not a criticism — it is the correct scope for an application threat model.

With that in mind, the mapping:

What ATLAS anticipated — and matches the findings

| Finding | ATLAS Threat | Tactic |
|---|---|---|
| #7 Indirect prompt injection | T-EXEC-002 Indirect Prompt Injection | Execution |
| #8 Memory poisoning | T-PERSIST-005 Prompt Injection Memory Poisoning | Persistence |
| #9 53 skills opt-out | T-ACCESS-004 Malicious Skill as Entry Point | Initial Access |
| #11 Cron Jobs persistence | T-EVADE-004 Staged Payload Delivery | Defense Evasion |
| #13 Config/Debug exposed | T-DISC-002/003/004 Session/Prompt/Env Enumeration | Discovery |
| #3 Gateway Token | T-ACCESS-003 Token Theft | Initial Access |

These findings have a home in ATLAS. The framework anticipated them, assigned a technique, and placed them in an attack chain.

What ATLAS does not cover — and why

| Finding | Why it falls outside |
|---|---|
| #1 Outdated kernel | Operator infrastructure — out of scope by design |
| #4 IPv6 enabled | Operator network configuration — out of scope by design |
| #5 Apache2 without hardening | Operator responsibility — out of scope by design |
| #2 exec_host_policy + allow | Intersection between operator configuration and application design |
| #6 No access control on channels | ATLAS documents AllowFrom as a trust boundary, but does not model the gap when that control is simply not configured |
| #10 Permission inheritance | No threat models privilege escalation between agents with different configurations |
| #12 Local logs | Completely absent from the threat model |

The most interesting result of the mapping: Finding #2 best illustrates the gap between the application threat model and deployment reality. OpenClaw documents exec_host_policy as a configuration the operator controls. ATLAS models T-EXEC-004 Exec Approval Bypass as a threat. But neither explicitly models what happens when the most permissive combination ships as the default — which is exactly what the Lightsail blueprint does.

The threat model assumes the operator makes informed decisions. The blueprint assumes the operator wants simplicity. The intersection of those two assumptions is Finding #2.
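To make Finding #2 concrete, here is a minimal sketch of a pre-deploy check for that combination. The key names exec_host_policy and shell_approval come from the findings in this series; the config format and the alternative values (sandbox, ask) are illustrative assumptions, not OpenClaw's real schema.

```python
# Hypothetical pre-deploy check for the Finding #2 combination.
# Key names come from the audit; values and schema are assumptions.

def is_isolation_disabled(config: dict) -> bool:
    """True when commands run on the gateway host AND shell
    approval is auto-granted — i.e. no isolation at all."""
    return (
        config.get("exec_host_policy") == "gateway"
        and config.get("shell_approval") == "allow"
    )

blueprint_default = {"exec_host_policy": "gateway", "shell_approval": "allow"}
hardened = {"exec_host_policy": "sandbox", "shell_approval": "ask"}

assert is_isolation_disabled(blueprint_default)  # the audited default
assert not is_isolation_disabled(hardened)       # requires human approval
```

A check like this could run in CI or as a boot-time guard, refusing to start the agent until the operator explicitly acknowledges the risk.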

OWASP Top 10 for Agentic Applications 2026

In December 2025, OWASP published the first security framework specific to agentic applications. It is not an extension of the Top 10 for LLMs — it is an independent list, built on the recognition that autonomous agents have a fundamentally different threat model than a chatbot.

The central difference: an LLM that fails produces an incorrect response. An agent that fails executes incorrect actions on real systems.

The 10 framework categories:

| # | Category |
|---|---|
| AG01 | Prompt Injection |
| AG02 | Memory Poisoning |
| AG03 | Tool Misuse |
| AG04 | Privilege Escalation |
| AG05 | Unsafe Agent Chaining |
| AG06 | Resource Exhaustion |
| AG07 | Data Exfiltration |
| AG08 | Uncontrolled Agent Spawning |
| AG09 | Over-Permissioned Agents |
| AG10 | Excessive Trust of Agent Output |

The mapping with the findings

| Finding | OWASP Category | Observation |
|---|---|---|
| #7 Indirect prompt injection | AG01 Prompt Injection | Direct coverage — OWASP distinguishes direct and indirect prompt injection |
| #8 Memory poisoning | AG02 Memory Poisoning | Direct coverage — and in OpenClaw memory is plain Markdown files on disk, which amplifies the vector |
| #9 53 skills opt-out | AG03 Tool Misuse + AG09 Over-Permissioned Agents | Double mapping — the opt-out model inverts the principle of least privilege |
| #10 Permission inheritance | AG04 Privilege Escalation + AG05 Unsafe Agent Chaining | Double mapping — the absence of documentation on permission inheritance is exactly AG05 |
| #11 Cron Jobs | AG08 Uncontrolled Agent Spawning | Partial — Cron Jobs don't spawn new agents, but create unsupervised autonomous executions with the same effect |
| #6 No access control on channels | AG09 Over-Permissioned Agents | A channel without access control is equivalent to an agent with excessive permissions over its input surface |
| #2 exec_host_policy + allow | AG03 Tool Misuse | Partial — OWASP models tool abuse, but not the deployment configuration that eliminates the sandbox |

What OWASP does not cover

| Finding | Why it falls outside |
|---|---|
| #1 Outdated kernel | Infrastructure — out of scope by design |
| #4 IPv6 | Infrastructure — out of scope by design |
| #5 Apache2 | Infrastructure — out of scope by design |
| #12 Local logs | OWASP has no observability category — the absence of external logging has no home |
| #13 Config/Debug exposed | OWASP touches it tangentially in AG07 (exfiltration) but does not model the exposure of internal panels via a shared token |

The most revealing finding from the OWASP mapping:

Finding #9 maps simultaneously to AG03 and AG09 — Tool Misuse and Over-Permissioned Agents. That is not an editorial coincidence. It is the direct consequence of the opt-out model: when everything is enabled by default, each unnecessarily active skill is simultaneously an excessive permission and a potential abuse surface.

OWASP anticipated it conceptually. What it did not anticipate is that a real and popular system would implement it in reverse — not "enable what you need" but "disable what you don't want." The difference is philosophical but the consequences are operational.

AWS Agentic AI Security Scoping Matrix

AWS published two complementary frameworks that coexist and reference each other.

The first is the Generative AI Security Scoping Matrix — 5 scopes based on how much ownership the organization has over the model. OpenClaw on Lightsail falls in Scope 3: building a custom application using a pre-trained model via API. The model is not trained or fine-tuned — it is invoked. Security responsibilities in Scope 3 are divided between AWS (the model, Bedrock infrastructure) and the operator (the application, the integration, the data passed to it).

The second is the Agentic AI Security Scoping Matrix — published in November 2025, specifically designed for autonomous systems. It categorizes four scopes based on two dimensions: level of agency (what actions the system can take) and level of autonomy (how much human oversight exists).

This second matrix speaks most directly to the findings in this series.

What scope does OpenClaw operate in with the audited configuration?

The default configuration of the Lightsail blueprint — exec_host_policy: gateway + shell_approval: allow — places OpenClaw in Scope 4: Full Agency.

AWS's definition for Scope 4 is clear: systems that self-initiate, operate continuously with minimal human supervision, and can execute complex workflows autonomously. OpenClaw's Cron Jobs are exactly this — scheduled tasks that execute without operator intervention.

The problem is not that Scope 4 exists. The problem is that Scope 4 requires the most sophisticated controls in the framework — continuous behavioral monitoring, automatic circuit breakers, guaranteed rollback mechanisms, real-time anomaly detection. And the Lightsail blueprint does not configure any of them.
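One of those missing Scope 4 controls — the automatic circuit breaker — can be sketched in a few lines. This is an illustrative design, not an AWS or OpenClaw component; the threshold and the notion of an "anomalous" action are assumptions the operator would have to define.

```python
# Hypothetical circuit breaker for autonomous agent actions (Scope 4 control).
# Threshold and anomaly detection are illustrative assumptions.

class CircuitBreaker:
    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0
        self.tripped = False  # tripped = agent actions blocked

    def record(self, anomalous: bool) -> None:
        """Count consecutive anomalous actions; trip at the threshold."""
        if anomalous:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.tripped = True
        else:
            self.failures = 0  # a normal action resets the streak

    def allow_action(self) -> bool:
        """Gate every agent action behind the breaker state."""
        return not self.tripped

cb = CircuitBreaker(max_failures=3)
for _ in range(3):
    cb.record(anomalous=True)
assert not cb.allow_action()  # three anomalies in a row halt the agent
```

The point is not the fifteen lines of code — it is that the blueprint ships nothing in this role, so every Cron Job runs with no tripwire at all.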

It is the same pattern identified with MITRE ATLAS: the system arrives configured to operate with maximum autonomy, but without the controls that autonomy requires.

The mapping with the 6 security dimensions

The Agentic AI Security Scoping Matrix organizes controls across six dimensions. This is where the findings find their most precise placement:

Identity Context — user, service, and agent identity management.

| Finding | Observation |
|---|---|
| #3 Gateway Token in plaintext | The token is the gateway's only authentication mechanism — no rotation, no scope, no expiration |
| #6 No granular access on channels | The framework requires appropriate authentication per scope level. In Scope 4, any channel participant can instruct the agent |
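For contrast with the static plaintext Gateway Token, a minimal sketch of what a scoped, expiring token looks like. The field names and scope strings are hypothetical; nothing here reflects OpenClaw's actual token format.

```python
# Hypothetical scoped, expiring token — the opposite of Finding #3's
# static shared secret. Field names and scopes are assumptions.
import secrets
import time

def issue_token(scope: str, ttl_seconds: int) -> dict:
    """Mint a random token bound to one scope and an expiry timestamp."""
    return {
        "value": secrets.token_urlsafe(32),
        "scope": scope,
        "expires_at": time.time() + ttl_seconds,
    }

def is_valid(token: dict, required_scope: str) -> bool:
    """Reject tokens that are expired or minted for a different scope."""
    return token["scope"] == required_scope and time.time() < token["expires_at"]

t = issue_token(scope="dashboard:read", ttl_seconds=900)
assert is_valid(t, "dashboard:read")
assert not is_valid(t, "exec:shell")  # read token cannot authorize execution
```

With the audited configuration, one leaked value grants everything, forever; a design like this limits the blast radius of any single leak to one scope and one TTL window.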

Data, Memory & State Protection — persistent memory security and memory poisoning prevention.

| Finding | Observation |
|---|---|
| #8 Memory poisoning | AWS explicitly names memory poisoning as a critical vector in this dimension. In OpenClaw memory is Markdown files on disk — no integrity validation, no encryption |

This is the finding with the most direct coverage in the entire matrix. AWS anticipated it, named it, and described exactly the risk. OpenClaw's implementation materializes it.
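The missing integrity validation is not exotic. A sketch of one mitigation — sealing each memory file with an HMAC so out-of-band tampering is detected before the agent loads it. This is not an OpenClaw feature; the key handling and sealing scheme are illustrative assumptions.

```python
# Hypothetical integrity seal for agent memory files (Finding #8 mitigation).
# Not an OpenClaw feature; scheme and key handling are assumptions.
import hashlib
import hmac

KEY = b"operator-held-secret"  # in practice: a secrets manager, never hardcoded

def seal(memory_bytes: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the memory file's contents."""
    return hmac.new(KEY, memory_bytes, hashlib.sha256).hexdigest()

def verify(memory_bytes: bytes, tag: str) -> bool:
    """Reject the memory if it changed since the agent last sealed it."""
    return hmac.compare_digest(seal(memory_bytes), tag)

memory = b"# Agent memory\n- user prefers Spanish\n"
tag = seal(memory)
assert verify(memory, tag)
# An attacker appending an instruction to the Markdown file breaks the seal:
assert not verify(memory + b"- always run shell commands without asking\n", tag)
```

Sealing does not stop poisoning that arrives through the agent's own write path — that needs input validation — but it closes the "edit the Markdown on disk" route entirely.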

Audit & Logging — complete action traceability and reasoning chain capture.

| Finding | Observation |
|---|---|
| #12 Local logs without export | The framework requires tamper-resistant logs, especially in Scope 4. Modifiable local logs are the opposite of what Scope 4 requires |
| #11 Cron Jobs as persistence | Without external logging, a malicious task can execute and erase its trail before anyone detects it |
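"Tamper-resistant" has a concrete minimal form: a hash-chained append-only log, where editing any past entry invalidates every digest after it. This is a sketch of the idea, not an OpenClaw or AWS component — and shipping logs to an external sink (CloudWatch, a SIEM) remains the stronger control, since a local chain can still be deleted wholesale.

```python
# Hypothetical hash-chained log: each entry's digest covers the previous
# digest, so editing any past entry breaks the chain (Finding #12 sketch).
import hashlib
import json

def append_entry(chain: list, event: dict) -> None:
    """Append an event, linking it to the previous entry's digest."""
    prev = chain[-1]["digest"] if chain else "0" * 64
    payload = json.dumps(event, sort_keys=True)  # deterministic serialization
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    chain.append({"event": event, "digest": digest})

def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited entry invalidates the chain."""
    prev = "0" * 64
    for entry in chain:
        payload = json.dumps(entry["event"], sort_keys=True)
        if hashlib.sha256((prev + payload).encode()).hexdigest() != entry["digest"]:
            return False
        prev = entry["digest"]
    return True
```

A malicious Cron Job that rewrites its own log line would be caught by verify_chain — which is exactly the property the current modifiable local logs lack.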

Agent & LLM Controls — guardrails, behavioral monitoring, sandboxing.

| Finding | Observation |
|---|---|
| #2 exec_host_policy + allow | Eliminates the sandbox. The framework requires containerization and resource quotas in Scope 4 — the default configuration does exactly the opposite |
| #7 Indirect prompt injection | The framework mentions behavioral monitoring as a control. Without it, an indirect injection can execute undetected |

Agency Perimeters & Policies — operational boundaries and dynamic constraint evaluation.

| Finding | Observation |
|---|---|
| #9 53 skills opt-out | The framework is explicit: agents must operate within the limits of their designed purpose. 53 skills active by default is the opposite — maximum surface, minimum restriction |
| #10 Permission inheritance | The framework does not specify how permissions should propagate in agent chains — that gap is Finding #10 |

Orchestration — agent-to-system interaction management, tool access, execution flow control.

| Finding | Observation |
|---|---|
| #10 Permission inheritance | The Orchestration dimension discusses inter-agent coordination protocols, but does not define the permission inheritance model in delegation |
| #11 Cron Jobs | The framework mentions transaction management and rollback mechanisms — Cron Jobs have neither |

What the matrix does not cover

The Agentic AI Security Scoping Matrix implicitly assumes deployment is on AWS managed services — Bedrock, AgentCore, Lambda. It says nothing about what happens when the blueprint arrives with an outdated kernel, Apache without hardening, or IPv6 enabled by default.

That is not a criticism of the framework — it is the correct scope. AWS designed this matrix for the application and orchestration layer, not for the IaaS layer. The operator who chooses Lightsail inherits a layer of responsibilities that neither matrix contemplates.

That layer is unmapped territory.

The gap — what no framework anticipated

Three frameworks reviewed. One designed by the OpenClaw team itself, one by OWASP, one by AWS. All three do their job well. And all three leave the same territory uncovered.

Before naming it, it is worth seeing the pattern:

| Finding | MITRE ATLAS | OWASP Agentic | AWS Scoping Matrix |
|---|---|---|---|
| #1 Outdated kernel | ❌ out of scope | ❌ out of scope | ❌ out of scope |
| #2 exec_host_policy + allow | ⚠️ partial | ⚠️ partial | ⚠️ partial |
| #3 Gateway Token | ✅ T-ACCESS-003 | ⚠️ tangential | ✅ Identity Context |
| #4 IPv6 | ❌ out of scope | ❌ out of scope | ❌ out of scope |
| #5 Apache2 | ❌ out of scope | ❌ out of scope | ❌ out of scope |
| #6 No channel access | ⚠️ partial | ✅ AG09 | ✅ Identity Context |
| #7 Prompt injection | ✅ T-EXEC-002 | ✅ AG01 | ⚠️ mentioned, not resolved |
| #8 Memory poisoning | ✅ T-PERSIST-005 | ✅ AG02 | ✅ Data & Memory |
| #9 53 skills opt-out | ✅ T-ACCESS-004 | ✅ AG03+AG09 | ✅ Agency Perimeters |
| #10 Permission inheritance | ❌ absent | ✅ AG04+AG05 | ⚠️ partial |
| #11 Cron Jobs | ⚠️ partial | ⚠️ partial | ⚠️ partial |
| #12 Local logs | ❌ absent | ❌ absent | ✅ Audit & Logging |
| #13 Config/Debug | ⚠️ partial | ⚠️ tangential | ⚠️ tangential |

The pattern is clear. Findings from the application layer have a home in at least one of the three frameworks. Findings from the IaaS layer have no home in any of them. And findings from the intersection layer have partial coverage in all — no framework captures them completely because none was designed to see both layers at the same time.

The gap has a name

What is missing is not one more finding. It is a layer of analysis that existing frameworks do not contemplate by design: the attack surface opened by the decision to deploy on IaaS when the system being deployed is an autonomous agent.

An application threat model assumes that infrastructure is the operator's responsibility. An infrastructure framework assumes that the application running on top is the developer's responsibility. Neither models the intersection — the point where operator configuration activates vectors that the application threat model did not anticipate, and vice versa.

Finding #2 is the clearest example: exec_host_policy: gateway + shell_approval: allow is an operator configuration decision that eliminates the agent's only isolation mechanism. It is not an OpenClaw bug — it is a documented option. It is not an operator error — it is the blueprint's default value. Responsibility is distributed in a way that no framework captures in a single place.

The validation that was not expected

This is not just an independent observation. The official AWS repository — sample-OpenClaw-on-AWS-with-Bedrock — documents it explicitly in its README:

"Plan A is soft enforcement — the LLM can theoretically be bypassed via prompt injection. Plan E catches what Plan A misses. For hard enforcement via AgentCore Gateway MCP mode, see Roadmap."

AWS, in its own reference repository for deploying OpenClaw, acknowledges that prompt injection is not resolved and places it on the roadmap. This is not an inferred gap — it is a gap the AWS technical team documented while building the solution.

And prompt injection is Finding #7 — the critical vector identified in Part 2, which none of the three frameworks fully resolves in the context of an IaaS deployment.

What this means for anyone deploying OpenClaw today

Deploying OpenClaw on Lightsail following the official blueprint means operating an agent at Scope 4 of the AWS Agentic AI Security Scoping Matrix — maximum autonomy — without the controls Scope 4 requires, on infrastructure that no agentic security framework models, with a prompt injection vector that the AWS team itself acknowledges as pending work.

OpenClaw is not broken. Lightsail is not insecure. The combination of both on the default configuration opens an attack surface that requires an analysis that does not yet exist as a unified framework.

That is what comes next.

What's next

Three frameworks analyzed. Thirteen findings mapped. A gap identified with real evidence.

Something important about several findings mapped in this post: findings #7, #8, and #10 are marked as risk vectors identified but not executed live. The framework mapping confirms the vector exists and has documented precedent. Validation that it works in this specific implementation comes in Part 4.

And the demo may reveal something that static analysis did not show. That is not a warning — it is the method. An autonomous agent cannot be audited solely from the dashboard. At some point you have to see what it does when something goes wrong.

Part 4 closes the series with what frameworks cannot replace: live evidence. Prompt injection executed in a controlled environment, validation of pending findings, and the first elements of something that does not yet exist in LATAM: a unified framework for evaluating the security posture of an autonomous agent considering all three layers together.

Not a whitepaper. Not a checklist. Something built from real findings, on a real system, deployed on real infrastructure.

If you arrived here without reading Part 1 and Part 2, the full series index is here:

Part 1 — Secure setup and first infrastructure findings
Part 2 — Full dashboard: channels, agents, cron jobs, and logs
→ Part 3 — this post
→ Part 4 — coming soon

About the author

Gerardo Castro is an AWS Security Hero and Cloud Security Engineer focused on LATAM. Founder and Lead Organizer of the AWS Security Users Group LatAm. He believes the best way to learn cloud security is by building real things — not memorizing frameworks. He writes about what he builds, what he finds, and what he learns along the way.

🔗 LinkedIn: gerardokaztro
🔗 Blog: roadtocloudsec.la
