<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Marko Arnauto</title>
    <description>The latest articles on DEV Community by Marko Arnauto (@markus_tretzmller_1d02bf).</description>
    <link>https://dev.to/markus_tretzmller_1d02bf</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1534067%2F9e407eaf-e2ca-4547-8e15-5e69ed59c328.jpg</url>
      <title>DEV Community: Marko Arnauto</title>
      <link>https://dev.to/markus_tretzmller_1d02bf</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/markus_tretzmller_1d02bf"/>
    <language>en</language>
    <item>
      <title>OpenClaw and GDPR</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Thu, 19 Feb 2026 07:13:26 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/openclaw-and-gdpr-5e40</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/openclaw-and-gdpr-5e40</guid>
      <description>&lt;p&gt;Europe has a new tech-celebrity. When Austrian developer Peter Steinberger published OpenClaw at the end of November 2025, neither he nor the tech world could have predicted the fallout. Both he and his software became enormously popular, breaking records across the open-source community.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkiixquq22149k2hlq6v8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkiixquq22149k2hlq6v8.png" alt=" " width="800" height="578"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As the charts show, OpenClaw slashed n8n's momentum within just a few weeks of its release. The core idea is as simple as it is brilliant: give an LLM actual access to your PC, turning it from an isolated chatbot into an autonomous agent that can execute shell commands, read files, and handle complex real-world workflows. However, giving an LLM that much control comes with severe security risks and a massive compliance headache.&lt;/p&gt;

&lt;h2&gt;
  
  
  What about GDPR?
&lt;/h2&gt;

&lt;p&gt;Every European developer is, to some extent, already familiar with our strict data protection laws. Devs are therefore naturally wondering whether they are even allowed to use OpenClaw in a professional environment. If you use it for business purposes and not just as a toy project, there is a high likelihood that you are the data controller, a role that carries significant legal responsibility.&lt;/p&gt;

&lt;p&gt;Fortunately, OpenClaw is open-source software, giving you the flexibility to run and configure it entirely on your own terms. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foocwn955f7zlqtl66u0o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foocwn955f7zlqtl66u0o.png" alt=" " width="800" height="955"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Deployment is just one piece of that puzzle, but it is the critical foundation. This article focuses strictly on that foundational step. Let's concentrate on how to build your infrastructure using the European Stack: which LLMs, servers, and messengers will give you the best baseline?&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual private servers
&lt;/h3&gt;

&lt;p&gt;Because OpenClaw requires a persistent environment to act as your agent's gateway, you'll need a reliable host. You can use any provider with enough RAM, but to keep your data safely within the EU, consider these major European hosts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hetzner&lt;/li&gt;
&lt;li&gt;Hostinger&lt;/li&gt;
&lt;li&gt;netcup&lt;/li&gt;
&lt;li&gt;UpCloud&lt;/li&gt;
&lt;li&gt;OVHcloud&lt;/li&gt;
&lt;li&gt;IONOS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  LLMs
&lt;/h3&gt;

&lt;p&gt;Running LLMs on a dedicated machine is, security-wise, a fantastic option. However, it drastically impairs the agent's capabilities because your local models are generally not top-notch. It's incredibly hard to run a massive model like Kimi 2.5 (with its 1000B parameters) locally without enterprise-grade hardware.&lt;/p&gt;

&lt;p&gt;Because of this limitation, most people actually choose LLM cloud endpoints to power OpenClaw's "brain."&lt;/p&gt;

&lt;h4&gt;
  
  
  Why ZDR is not enough
&lt;/h4&gt;

&lt;p&gt;There are some endpoints providing ZDR (Zero Data Retention). While this is a great feature from a security standpoint, you still need to have a Data Processing Agreement (DPA) in place if you process personal data.&lt;/p&gt;

&lt;p&gt;A good compromise is to use GDPR-compliant LLM cloud endpoints hosted by European companies. Based on the European Stack, your best options are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mistral AI&lt;/li&gt;
&lt;li&gt;cortecs&lt;/li&gt;
&lt;li&gt;OVHcloud&lt;/li&gt;
&lt;li&gt;IONOS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Messenger
&lt;/h3&gt;

&lt;p&gt;OpenClaw's primary user interface is the messaging app you already use. While many users default to Discord, WhatsApp, or Slack, these are not ideal for strict GDPR compliance. To keep your communication layer secure and European-based, you should look at decentralized or self-hosted platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Matrix&lt;/li&gt;
&lt;li&gt;Nextcloud&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;Even if your deployment is done perfectly, this is where it really gets tricky. Setting up European-hosted infrastructure is just the foundation; operating an autonomous agent introduces severe, structural security risks that you must actively manage. The AI agent landscape is currently a security minefield. Some of the major known vulnerabilities include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt Injections:&lt;/strong&gt; Because OpenClaw reads untrusted content (like incoming emails or webpages) while having system-level privileges and external communication abilities, an attacker can embed hidden instructions in a document. If the agent reads it, it can be hijacked and silently exfiltrate your data or execute malicious commands without your knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Leaks and Exposed Interfaces:&lt;/strong&gt; Misconfigurations are rampant. Early on, tens of thousands of OpenClaw instances were left wide open on the internet due to improper port bindings or reverse proxy setups. Attackers can bypass authentication entirely to steal API keys, gateway tokens and your plaintext credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Malicious Skills:&lt;/strong&gt; The ClawHub marketplace has been heavily targeted by threat actors. They upload scripts disguised as legitimate tools that actually operate as info-stealers, silently grabbing your passwords, browser data, and session tokens.&lt;/p&gt;

&lt;p&gt;Securing this setup requires strict network isolation (such as running it strictly over a VPN like Tailscale rather than exposing it to the public internet) and rigorous, manual auditing of any skills you install.&lt;/p&gt;
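&lt;p&gt;As a rough sketch of that isolation (the &lt;code&gt;--host&lt;/code&gt;-style flag is hypothetical, so check your gateway version's docs, and &lt;code&gt;tailscale serve&lt;/code&gt; assumes a recent Tailscale client), the idea is to bind the service to loopback only and let the VPN handle remote access:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# keep the gateway off the public internet: listen on loopback only
openclaw serve --host 127.0.0.1 --port 8080

# reach it from your own devices over the tailnet instead of a public port
tailscale serve --bg 8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;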

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Autonomous agents like OpenClaw offer immense potential to revolutionize workflows, but they carry significant and well-documented security risks. Achieving a safe and GDPR-compliant setup is a complex puzzle. While selecting a solid European Stack provides the necessary data-privacy foundation, the real challenge lies in mitigating the ongoing operational threats like prompt injections and malicious plugins.&lt;/p&gt;

</description>
      <category>openclaw</category>
      <category>security</category>
      <category>gdpr</category>
    </item>
    <item>
      <title>How Opencode Just Dethroned Claude</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Fri, 30 Jan 2026 13:01:06 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/how-opencode-just-dethroned-claude-401k</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/how-opencode-just-dethroned-claude-401k</guid>
      <description>&lt;p&gt;When it comes to agentic coding, &lt;code&gt;Cline&lt;/code&gt; was one of the first movers. With a brilliant idea to provide integrations into VS Code, they eliminated the need to switch from your favourite coding IDE. Then, around mid-2025, Anthropic flexed its muscles with &lt;code&gt;Claude Code&lt;/code&gt;, leveraging their massive models to take the dominant position. For a moment, it looked like Claude had won the race.&lt;/p&gt;

&lt;p&gt;But look at the red line! 📈&lt;br&gt;
That vertical trajectory is &lt;code&gt;Opencode&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18clb8oq4755argclevn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18clb8oq4755argclevn.png" alt=" " width="800" height="585"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In early 2026, &lt;a href="https://github.com/anomalyco/opencode" rel="noopener noreferrer"&gt;Opencode&lt;/a&gt; didn't just pass Cline, it surpassed the reigning champion. It has become the fastest-growing coding assistant in history, proving that when it comes to dev tools, the community still leans towards open ecosystems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why are Devs ditching Claude?
&lt;/h2&gt;

&lt;p&gt;Why leave a polished tool like Claude Code for a new open-source alternative?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Independence&lt;/strong&gt;: Claude locks you into the Anthropic ecosystem. Opencode is different: it isn't tied to one industry giant. You aren't building your workflow on a foundation that could change its terms of service or pricing overnight. You own the stack.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native OS Model Support&lt;/strong&gt;: With Claude Code, trying to run other models often requires fragile workarounds to bridge with tools like Ollama. Opencode supports a massive array of models natively. Whether you want to test the new Kimi 2.5 or swap between GPT-5 and Gemini, it works out of the box.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Enterprise Tax&lt;/strong&gt;: To get enterprise-grade security from vendors like Anthropic, you often have to sign contracts with significant markups. Opencode lets you dodge this premium and leverage radically cheaper models (Qwen, GLM, Kimi, ...).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, dodging the enterprise tax and choosing freely between providers means losing some guardrails. When you move to Opencode, you essentially inherit the role of security officer. A single developer piping code to a non-compliant API can leak your IP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use Opencode without Getting Fired (European perspective 🇪🇺)
&lt;/h2&gt;

&lt;p&gt;So, how do you unlock the power of the most popular assistant without violating Data Residency laws or GDPR requirements? You generally have two paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Local Route (BYOM)&lt;/strong&gt;&lt;br&gt;
The "Bring Your Own Model" approach involves running local models. This offers the ultimate form of data sovereignty because your code never leaves your physical machine. However, since your laptop isn't a B200 cluster, running a 1-trillion-parameter Kimi might be a bit of a challenge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The Privacy Gateway&lt;/strong&gt;&lt;br&gt;
For professionals, especially in Europe where digital sovereignty is a priority, the solution is a Privacy Gateway such as &lt;a href="https://cortecs.ai/" rel="noopener noreferrer"&gt;cortecs&lt;/a&gt;. This middleware layer enables strict data residency, ensures no-training guarantees on your code, and enables Zero-Data-Retention policies.&lt;/p&gt;

&lt;p&gt;(Full disclosure: I’m part of the cortecs team.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Opencode surpassing Claude is a historic moment for vibe-coding enthusiasts.&lt;/p&gt;

&lt;p&gt;But as you make the switch, remember that with open tools, security is no longer a "feature" but a configuration you must manage. Whether you choose a local stack or a privacy gateway, make sure your security posture grows as fast as your star count.&lt;/p&gt;

</description>
      <category>vibecoding</category>
      <category>security</category>
      <category>ai</category>
    </item>
    <item>
      <title>Open Source &gt; Closed Source</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Wed, 29 Oct 2025 14:44:10 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/open-source-closed-source-11c6</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/open-source-closed-source-11c6</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/cortecs" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__org__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F1e7ba1da-bc26-4910-95a9-2d5a30e47b55.png" alt="cortecs" width="512" height="512"&gt;
      &lt;div class="ltag__link__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" alt="" width="388" height="436"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/cortecs/opencode-claude-code-1f0g" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;OpenCode &amp;gt; Claude Code&lt;/h2&gt;
      &lt;h3&gt;Asmae Elazrak for cortecs ・ Oct 29&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#cortecs&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#llm&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#terminal&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#claudecode&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>cortecs</category>
      <category>llm</category>
      <category>terminal</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>Best LLM? Fastest endpoints? Let a router decide.</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Thu, 17 Jul 2025 11:47:49 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/best-llm-fastest-endpoints-let-a-router-decide-2155</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/best-llm-fastest-endpoints-let-a-router-decide-2155</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/cortecs" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__org__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F654ac5a0-7d5d-458d-9465-539f72465f6e.png" alt="cortecs" width="512" height="512"&gt;
      &lt;div class="ltag__link__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" alt="" width="388" height="436"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/cortecs/comparing-llm-routers-54dl" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Comparing LLM Routers&lt;/h2&gt;
      &lt;h3&gt;Asmae Elazrak for cortecs ・ Jul 16&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#cortecs&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#llm&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#routers&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#eu&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>cortecs</category>
      <category>llm</category>
      <category>routers</category>
      <category>eu</category>
    </item>
    <item>
      <title>European devs, pay attention!</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Fri, 20 Jun 2025 13:25:56 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/european-devs-pay-attention-6gp</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/european-devs-pay-attention-6gp</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" class="crayons-story__hidden-navigation-link"&gt;Choosing the Right AI Provider in Europe 🇪🇺&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;
          &lt;a class="crayons-logo crayons-logo--l" href="/cortecs"&gt;
            &lt;img alt="cortecs logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F1e7ba1da-bc26-4910-95a9-2d5a30e47b55.png" class="crayons-logo__image"&gt;
          &lt;/a&gt;

          &lt;a href="/asmae_elazrak" class="crayons-avatar  crayons-avatar--s absolute -right-2 -bottom-2 border-solid border-2 border-base-inverted  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" alt="asmae_elazrak profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/asmae_elazrak" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Asmae Elazrak
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Asmae Elazrak
                
              
              &lt;div id="story-author-preview-content-2609359" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/asmae_elazrak" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Asmae Elazrak&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

            &lt;span&gt;
              &lt;span class="crayons-story__tertiary fw-normal"&gt; for &lt;/span&gt;&lt;a href="/cortecs" class="crayons-story__secondary fw-medium"&gt;cortecs&lt;/a&gt;
            &lt;/span&gt;
          &lt;/div&gt;
          &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Jun 20 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" id="article-link-2609359"&gt;
          Choosing the Right AI Provider in Europe 🇪🇺
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/cortecs"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;cortecs&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/europe"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;europe&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/llm"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;llm&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/fire-f60e7a582391810302117f987b22a8ef04a2fe0df7e3258a5f49332df1cec71e.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;10&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            5 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>cortecs</category>
      <category>ai</category>
      <category>europe</category>
      <category>llm</category>
    </item>
    <item>
      <title>CAG &gt; RAG</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Mon, 10 Feb 2025 15:56:42 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/cag-rag-26i2</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/cag-rag-26i2</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/abhinowww" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1962925%2Fa021449f-f75d-491c-b662-9ac1f4ede10e.jpg" alt="abhinowww"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/abhinowww/context-caching-is-it-the-end-of-retrieval-augmented-generation-rag-55kp" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Context Caching: Is It the End of Retrieval-Augmented Generation (RAG)? 🤔&lt;/h2&gt;
      &lt;h3&gt;Abhinav Anand ・ Sep 19 '24&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#gpt3&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#rag&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#deeplearning&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>ai</category>
      <category>gpt3</category>
      <category>rag</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>High Workloads -&gt; Dedicated LLMs</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Tue, 04 Feb 2025 14:43:48 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/high-workloads-dedicated-llms-5e3h</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/high-workloads-dedicated-llms-5e3h</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/cortecs" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__org__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F0c946637-32dc-4415-aff6-20ab4c3e6f09.png" alt="cortecs" width="512" height="512"&gt;
      &lt;div class="ltag__link__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2F46167a73-cfff-4562-b381-aca6aa84f402.png" alt="" width="96" height="96"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/cortecs/streamline-your-batch-jobs-the-power-of-cortecs-ai-inference-2jjl" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Streamline Your Batch Jobs: The Power of LLM Workers 🤖&lt;/h2&gt;
      &lt;h3&gt;Asmae Elazrak for cortecs ・ Jan 17&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#llm&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#nlp&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#cortecs&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>ai</category>
      <category>llm</category>
      <category>nlp</category>
      <category>cortecs</category>
    </item>
    <item>
      <title>llm workers</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Fri, 17 Jan 2025 15:13:23 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/llm-workers-jdb</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/llm-workers-jdb</guid>
      <description></description>
      <category>llm</category>
    </item>
    <item>
      <title>LLMs for Big Data</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Mon, 13 Jan 2025 12:00:19 +0000</pubDate>
      <link>https://dev.to/cortecs/llms-for-big-data-1hfb</link>
      <guid>https://dev.to/cortecs/llms-for-big-data-1hfb</guid>
      <description>&lt;p&gt;We all love our chatbots, but when it comes to heavy-loads, they just don’t cut it. If you need to analyze thousands of documents at once, serverless inference — the go-to for chat applications — quickly shows its (rate) limits. &lt;/p&gt;

&lt;h2&gt;
  
  
  One Model — Many Users 
&lt;/h2&gt;

&lt;p&gt;Imagine working in a shared co-working space: it’s convenient, but your productivity depends on how crowded the space is. Similarly, &lt;strong&gt;serverless endpoints&lt;/strong&gt; from OpenAI, Anthropic, or Groq rely on shared infrastructure, where performance fluctuates based on how many users are competing for resources. Strict rate limits, like Groq’s 7,000 tokens per minute, can grind progress to a halt. &lt;/p&gt;
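&lt;p&gt;If you stay on shared endpoints, the standard client-side mitigation is exponential backoff. Here is a minimal sketch in Python; the exception type is a stand-in for whatever rate-limit error your client library raises:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import random
import time

def with_backoff(call, max_retries=5):
    """Retry a rate-limited call with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # stand-in for an HTTP 429 surfaced by your client
            time.sleep(min(2 ** attempt + random.random(), 30.0))
    raise RuntimeError("rate limited: retries exhausted")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;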

&lt;h2&gt;
  
  
  Dedicated Compute — One Model per User
&lt;/h2&gt;

&lt;p&gt;In contrast, &lt;strong&gt;dedicated inference allocates compute resources exclusively to a single user&lt;/strong&gt; or application. This ensures predictable and consistent performance, as the only limiting factor is the computational capacity of the allocated GPUs. According to &lt;a href="https://fireworks.ai" rel="noopener noreferrer"&gt;Fireworks.ai&lt;/a&gt;, a leading inference provider,&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Graduating from serverless to on-demand deployments starts to make sense economically when you are running ~100k+ tokens per minute.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are typically no rate limits on throughput. Billing for dedicated inference is time-based, calculated per hour or minute depending on the platform. While dedicated inference is well-suited for high-throughput, it involves a tedious setup process as well as the risk of overpaying due to idle times.&lt;/p&gt;
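&lt;p&gt;A quick back-of-the-envelope calculation makes that threshold concrete. The prices below are purely illustrative, not any provider’s actual rates:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# illustrative prices: serverless at $0.50 per 1M tokens, dedicated at $2.50/hour
serverless_per_token = 0.50 / 1_000_000
dedicated_per_hour = 2.50

tokens_per_minute = 100_000  # sustained workload
serverless_per_hour = tokens_per_minute * 60 * serverless_per_token  # $3.00

# at a sustained ~100k tokens/minute, the dedicated deployment is already cheaper
print(serverless_per_hour &amp;gt; dedicated_per_hour)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;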

&lt;h3&gt;
  
  
  Tedious Setup
&lt;/h3&gt;

&lt;p&gt;Deploying dedicated inference requires careful preparation. First, you need to rent suitable hardware to support your chosen model. Next, an inference engine such as vLLM must be configured to match the model’s requirements. Finally, secure access must be established via a TLS-encrypted connection to ensure encrypted communication. According to Philipp Schmid of Hugging Face, &lt;a href="https://www.philschmid.de/cost-generative-ai" rel="noopener noreferrer"&gt;you need one full-time developer&lt;/a&gt; to set up and maintain such a system. &lt;/p&gt;
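&lt;p&gt;The inference-engine step, for example, typically boils down to launching vLLM’s OpenAI-compatible server; the model name and flags here are illustrative, so adapt them to your hardware:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m vllm.entrypoints.openai.api_server \
    --model microsoft/phi-4 \
    --port 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;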

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18v3tpy9iric55w7h52z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18v3tpy9iric55w7h52z.png" alt="Dedicated deployments require a tedious setup." width="800" height="409"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Idle Times
&lt;/h3&gt;

&lt;p&gt;Time-based billing makes cost projections easier, but idle resources can quickly become a cost overhead. Dedicated inference is cost-effective only when the GPUs are busy. To avoid unnecessary expenses, the system should be shut down when not in use, and managing this manually is tedious and error-prone.&lt;/p&gt;

&lt;h2&gt;
  
  
  LLM Workers to the Rescue
&lt;/h2&gt;

&lt;p&gt;To address the downsides of dedicated inference, providers like Google and Cortecs offer dedicated LLM workers. Without any additional configuration, these workers are started and stopped on demand, avoiding setup overhead and idle time. The required hardware is allocated, the inference engine is configured, and API connections are established, all in the background. Once the workload completes, the worker shuts down automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;

&lt;p&gt;As I’m involved in the cortecs project, I’ll showcase it using our &lt;a href="https://github.com/cortecs-ai/cortecs-py" rel="noopener noreferrer"&gt;library&lt;/a&gt;. It can be installed with pip.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install cortecs-py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will use the OpenAI Python library to access the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install openai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, register at &lt;a href="https://cortecs.ai" rel="noopener noreferrer"&gt;cortecs.ai&lt;/a&gt; and create your access credentials on the profile page. Then set them as environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export OPENAI_API_KEY="Your cortecs api key"
export CORTECS_CLIENT_ID="Your cortecs id"
export CORTECS_CLIENT_SECRET="Your cortecs secret"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It’s time to choose a model. We selected &lt;em&gt;phi-4-FP8-Dynamic&lt;/em&gt;, a model that supports 🔵 instant provisioning. Models with instant provisioning enable a warm start, eliminating provisioning latency, which is perfect for this demonstration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;cortecs_py&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Cortecs&lt;/span&gt;

&lt;span class="n"&gt;cortecs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Cortecs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;my_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cortecs/phi-4-FP8-Dynamic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="c1"&gt;# Start a new instance
&lt;/span&gt;&lt;span class="n"&gt;my_instance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cortecs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ensure_instance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a joke about LLMs.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Stop the instance
&lt;/span&gt;&lt;span class="n"&gt;cortecs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;instance_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All provisioning complexity is abstracted by &lt;code&gt;cortecs.ensure_instance(my_model)&lt;/code&gt; and &lt;code&gt;cortecs.stop(my_instance.instance_id)&lt;/code&gt;. Between these two lines, you can execute arbitrary inference tasks, whether that's generating a simple joke about LLMs or producing billions of words.&lt;/p&gt;
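&lt;p&gt;One practical caveat: if the workload between those two lines raises an exception, the &lt;code&gt;stop&lt;/code&gt; call never runs and the instance keeps billing. A small context manager guards against this. This is my own sketch, not part of cortecs-py; it only assumes the &lt;code&gt;ensure_instance&lt;/code&gt;, &lt;code&gt;stop&lt;/code&gt;, and &lt;code&gt;instance_id&lt;/code&gt; calls shown above.&lt;br&gt;
&lt;/p&gt;

```python
# A wrapper that guarantees the worker is stopped even if the workload
# raises. Hypothetical helper, not part of cortecs-py: any client exposing
# ensure_instance(model) and stop(instance_id) works.
from contextlib import contextmanager

@contextmanager
def managed_instance(cortecs, model_name):
    instance = cortecs.ensure_instance(model_name)
    try:
        yield instance
    finally:
        # Runs on success and on error, so no forgotten instance keeps billing.
        cortecs.stop(instance.instance_id)
```

&lt;p&gt;Usage mirrors the example above: &lt;code&gt;with managed_instance(Cortecs(), my_model) as inst:&lt;/code&gt;, then create the &lt;code&gt;OpenAI&lt;/code&gt; client from &lt;code&gt;inst.base_url&lt;/code&gt; inside the block.&lt;br&gt;
&lt;/p&gt;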

&lt;p&gt;&lt;strong&gt;LLM Workers are a game-changer&lt;/strong&gt; for large-scale data analysis. With no need to manage complex compute clusters, they enable seamless big-data analysis and generation without the typical concerns of rate limits or exploding inference costs.&lt;/p&gt;

&lt;p&gt;Imagine a future where LLM Workers handle highly complex tasks, such as proving mathematical theorems or executing reasoning-intensive operations. You could launch a worker, let it run at full GPU utilization to tackle the problem, and have it shut itself down automatically upon completion. The potential is enormous, and this tutorial has demonstrated how to dynamically provision LLM Workers for high-performance AI tasks.&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>llm</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
