<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Marko Arnauto</title>
    <description>The latest articles on DEV Community by Marko Arnauto (@markus_tretzmller_1d02bf).</description>
    <link>https://dev.to/markus_tretzmller_1d02bf</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1534067%2F9e407eaf-e2ca-4547-8e15-5e69ed59c328.jpg</url>
      <title>DEV Community: Marko Arnauto</title>
      <link>https://dev.to/markus_tretzmller_1d02bf</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/markus_tretzmller_1d02bf"/>
    <language>en</language>
    <item>
      <title>OpenClaw and GDPR</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Thu, 19 Feb 2026 07:13:26 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/openclaw-and-gdpr-5e40</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/openclaw-and-gdpr-5e40</guid>
      <description>&lt;p&gt;Europe has a new tech-celebrity. When Austrian developer Peter Steinberger published OpenClaw at the end of November 2025, neither he nor the tech world could have predicted the fallout. Both he and his software became enormously popular, breaking records across the open-source community.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkiixquq22149k2hlq6v8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkiixquq22149k2hlq6v8.png" alt=" " width="800" height="578"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As the charts show, OpenClaw slashed n8n's momentum within just a few weeks of its release. The core idea is as simple as it is brilliant: give an LLM actual access to your PC, turning it from an isolated chatbot into an autonomous agent that can execute shell commands, read files, and handle complex real-world workflows. However, giving an LLM that much control comes with severe security risks and a massive compliance headache.&lt;/p&gt;

&lt;h2&gt;
  
  
  What about GDPR?
&lt;/h2&gt;

&lt;p&gt;Every European developer is, to some extent, already familiar with our strict data protection laws. Devs are therefore naturally wondering whether they are even allowed to use OpenClaw in a professional environment. If you use it for business purposes and not just as a toy project, there is a high likelihood that you are the data controller, a role that carries significant legal responsibility.&lt;/p&gt;

&lt;p&gt;Fortunately, OpenClaw is open-source software, giving you the flexibility to run and configure it entirely on your own terms. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foocwn955f7zlqtl66u0o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foocwn955f7zlqtl66u0o.png" alt=" " width="800" height="955"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Deployment is just one piece of that puzzle, but it is the critical foundation. This article focuses strictly on that foundational step. Let's concentrate on how to build your infrastructure using the European Stack: which LLMs, servers, and messengers will give you the best baseline?&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual private servers
&lt;/h3&gt;

&lt;p&gt;Because OpenClaw requires a persistent environment to act as your agent's gateway, you'll need a reliable host. You can use any provider with enough RAM, but to keep your data safely within the EU, consider these major European hosts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hetzner&lt;/li&gt;
&lt;li&gt;Hostinger&lt;/li&gt;
&lt;li&gt;netcup&lt;/li&gt;
&lt;li&gt;UpCloud&lt;/li&gt;
&lt;li&gt;OVHcloud&lt;/li&gt;
&lt;li&gt;IONOS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  LLMs
&lt;/h3&gt;

&lt;p&gt;Running LLMs on a dedicated machine is, security-wise, a fantastic option. However, it drastically impairs the agent's capabilities because your local models are generally not top-notch. It's incredibly hard to run a massive model like Kimi 2.5 (with its 1000B parameters) locally without enterprise-grade hardware.&lt;/p&gt;

&lt;p&gt;Because of this limitation, most people actually choose LLM cloud endpoints to power OpenClaw's "brain."&lt;/p&gt;

&lt;h4&gt;
  
  
  Why ZDR is not enough
&lt;/h4&gt;

&lt;p&gt;There are some endpoints providing ZDR (Zero Data Retention). While this is a great feature from a security standpoint, you still need to have a Data Processing Agreement (DPA) in place if you process personal data.&lt;/p&gt;

&lt;p&gt;A good compromise is to use GDPR-compliant LLM cloud endpoints hosted by European companies. Based on the European Stack, your best options are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mistral AI&lt;/li&gt;
&lt;li&gt;cortecs&lt;/li&gt;
&lt;li&gt;OVHcloud&lt;/li&gt;
&lt;li&gt;IONOS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Messenger
&lt;/h3&gt;

&lt;p&gt;OpenClaw's primary user interface is the messaging app you already use. While many users default to Discord, WhatsApp, or Slack, these are not ideal for strict GDPR compliance. To keep your communication layer secure and European-based, you should look at decentralized or self-hosted platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Matrix&lt;/li&gt;
&lt;li&gt;Nextcloud&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;Even if your deployment is done perfectly, this is where it really gets tricky. Setting up European-hosted infrastructure is just the foundation; operating an autonomous agent introduces severe, structural security risks that you must actively manage. The AI agent landscape is currently a security minefield. Some of the major known vulnerabilities include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt Injections:&lt;/strong&gt; Because OpenClaw reads untrusted content (like incoming emails or webpages) while having system-level privileges and external communication abilities, an attacker can embed hidden instructions in a document. If the agent reads it, it can be hijacked and silently exfiltrate your data or execute malicious commands without your knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Leaks and Exposed Interfaces:&lt;/strong&gt; Misconfigurations are rampant. Early on, tens of thousands of OpenClaw instances were left wide open on the internet due to improper port bindings or reverse proxy setups. Attackers can bypass authentication entirely to steal API keys, gateway tokens and your plaintext credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Malicious Skills:&lt;/strong&gt; The ClawHub marketplace has been heavily targeted by threat actors. They upload scripts disguised as legitimate tools that actually operate as info-stealers, silently grabbing your passwords, browser data, and session tokens.&lt;/p&gt;

&lt;p&gt;Securing this setup requires strict network isolation (such as running it strictly over a VPN like Tailscale rather than exposing it to the public internet) and rigorous, manual auditing of any skills you install.&lt;/p&gt;
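&lt;p&gt;As a rough sketch of that isolation (the &lt;code&gt;--host&lt;/code&gt;-style flag is hypothetical, so check your gateway version's docs, and &lt;code&gt;tailscale serve&lt;/code&gt; assumes a recent Tailscale client), the idea is to bind the service to loopback only and let the VPN handle remote access:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# keep the gateway off the public internet: listen on loopback only
openclaw serve --host 127.0.0.1 --port 8080

# reach it from your own devices over the tailnet instead of a public port
tailscale serve --bg 8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;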

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Autonomous agents like OpenClaw offer immense potential to revolutionize workflows, but they carry significant and well-documented security risks. Achieving a safe and GDPR-compliant setup is a complex puzzle. While selecting a solid European Stack provides the necessary data-privacy foundation, the real challenge lies in mitigating the ongoing operational threats like prompt injections and malicious plugins.&lt;/p&gt;

</description>
      <category>openclaw</category>
      <category>security</category>
      <category>gdpr</category>
    </item>
    <item>
      <title>How Opencode Just Dethroned Claude</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Fri, 30 Jan 2026 13:01:06 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/how-opencode-just-dethroned-claude-401k</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/how-opencode-just-dethroned-claude-401k</guid>
      <description>&lt;p&gt;When it comes to agentic coding, &lt;code&gt;Cline&lt;/code&gt; was one of the first movers. With a brilliant idea to provide integrations into VS Code, they eliminated the need to switch from your favourite coding IDE. Then, around mid-2025, Anthropic flexed its muscles with &lt;code&gt;Claude Code&lt;/code&gt;, leveraging their massive models to take the dominant position. For a moment, it looked like Claude had won the race.&lt;/p&gt;

&lt;p&gt;But look at the red line! 📈&lt;br&gt;
That vertical trajectory is &lt;code&gt;Opencode&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18clb8oq4755argclevn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18clb8oq4755argclevn.png" alt=" " width="800" height="585"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In early 2026, &lt;a href="https://github.com/anomalyco/opencode" rel="noopener noreferrer"&gt;Opencode&lt;/a&gt; didn't just pass Cline, it surpassed the reigning champion. It has become the fastest-growing coding assistant in history, proving that when it comes to dev tools, the community still leans towards open ecosystems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why are Devs ditching Claude?
&lt;/h2&gt;

&lt;p&gt;Why leave a polished tool like Claude Code for a new open-source alternative?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Independence&lt;/strong&gt;: Claude locks you into the Anthropic ecosystem. Opencode is different: it isn't tied to one industry giant. You aren't building your workflow on a foundation that could change its terms of service or pricing overnight. You own the stack.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native OS Model Support&lt;/strong&gt;: With Claude Code, trying to run other models often requires fragile workarounds to bridge with tools like Ollama. Opencode supports a massive array of models natively. Whether you want to test the new Kimi 2.5 or swap between GPT-5 and Gemini, it works out of the box.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Enterprise Tax&lt;/strong&gt;: To get enterprise-grade security from vendors like Anthropic, you often have to sign contracts with significant markups. Opencode lets you dodge this premium and leverage radically cheaper models (Qwen, GLM, Kimi, ...).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, dodging the enterprise tax and choosing freely between providers means losing some guardrails. When you move to Opencode, you essentially inherit the role of security officer. A single developer piping code to a non-compliant API can leak your IP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use Opencode without Getting Fired (European perspective 🇪🇺)
&lt;/h2&gt;

&lt;p&gt;So, how do you unlock the power of the most popular assistant without violating Data Residency laws or GDPR requirements? You generally have two paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Local Route (BYOM)&lt;/strong&gt;&lt;br&gt;
The "Bring Your Own Model" approach involves running local models. This offers the ultimate form of data sovereignty because your code never leaves your physical machine. However, since your laptop isn't a B200 cluster, running a 1-trillion-parameter Kimi might be a bit of a challenge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The Privacy Gateway&lt;/strong&gt;&lt;br&gt;
For professionals, especially in Europe where digital sovereignty is a priority, the solution is a Privacy Gateway such as &lt;a href="https://cortecs.ai/" rel="noopener noreferrer"&gt;cortecs&lt;/a&gt;. This middleware layer enables strict data residency, ensures no-training guarantees on your code, and enables Zero-Data-Retention policies.&lt;/p&gt;

&lt;p&gt;(Full disclosure: I’m part of the cortecs team.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Opencode surpassing Claude is a historic moment for vibe-coding enthusiasts.&lt;/p&gt;

&lt;p&gt;But as you make the switch, remember that with open tools, security is no longer a "feature" but a configuration you must manage. Whether you choose a local stack or a privacy gateway, make sure your security posture grows as fast as your star count.&lt;/p&gt;

</description>
      <category>vibecoding</category>
      <category>security</category>
      <category>ai</category>
    </item>
    <item>
      <title>Open Source &gt; Closed Source</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Wed, 29 Oct 2025 14:44:10 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/open-source-closed-source-11c6</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/open-source-closed-source-11c6</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/cortecs" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__org__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F1e7ba1da-bc26-4910-95a9-2d5a30e47b55.png" alt="cortecs" width="512" height="512"&gt;
      &lt;div class="ltag__link__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" alt="" width="388" height="436"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/cortecs/opencode-claude-code-1f0g" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;OpenCode &amp;gt; Claude Code&lt;/h2&gt;
      &lt;h3&gt;Asmae Elazrak for cortecs ・ Oct 29&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#cortecs&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#llm&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#terminal&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#claudecode&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>cortecs</category>
      <category>llm</category>
      <category>terminal</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>Best LLM? Fastest endpoints? Let a router decide.</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Thu, 17 Jul 2025 11:47:49 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/best-llm-fastest-endpoints-let-a-router-decide-2155</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/best-llm-fastest-endpoints-let-a-router-decide-2155</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/cortecs" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__org__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F654ac5a0-7d5d-458d-9465-539f72465f6e.png" alt="cortecs" width="512" height="512"&gt;
      &lt;div class="ltag__link__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" alt="" width="388" height="436"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/cortecs/comparing-llm-routers-54dl" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Comparing LLM Routers&lt;/h2&gt;
      &lt;h3&gt;Asmae Elazrak for cortecs ・ Jul 16&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#cortecs&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#llm&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#routers&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#eu&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>cortecs</category>
      <category>llm</category>
      <category>routers</category>
      <category>eu</category>
    </item>
    <item>
      <title>European devs, pay attention!</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Fri, 20 Jun 2025 13:25:56 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/european-devs-pay-attention-6gp</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/european-devs-pay-attention-6gp</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" class="crayons-story__hidden-navigation-link"&gt;Choosing the Right AI Provider in Europe 🇪🇺&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;
          &lt;a class="crayons-logo crayons-logo--l" href="/cortecs"&gt;
            &lt;img alt="cortecs logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F1e7ba1da-bc26-4910-95a9-2d5a30e47b55.png" class="crayons-logo__image"&gt;
          &lt;/a&gt;

          &lt;a href="/asmae_elazrak" class="crayons-avatar  crayons-avatar--s absolute -right-2 -bottom-2 border-solid border-2 border-base-inverted  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" alt="asmae_elazrak profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/asmae_elazrak" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Asmae Elazrak
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Asmae Elazrak
                
              
              &lt;div id="story-author-preview-content-2609359" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/asmae_elazrak" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2Fef864c33-b788-4115-b749-96ab666eb9e4.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Asmae Elazrak&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

            &lt;span&gt;
              &lt;span class="crayons-story__tertiary fw-normal"&gt; for &lt;/span&gt;&lt;a href="/cortecs" class="crayons-story__secondary fw-medium"&gt;cortecs&lt;/a&gt;
            &lt;/span&gt;
          &lt;/div&gt;
          &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Jun 20 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" id="article-link-2609359"&gt;
          Choosing the Right AI Provider in Europe 🇪🇺
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/cortecs"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;cortecs&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/europe"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;europe&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/llm"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;llm&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/fire-f60e7a582391810302117f987b22a8ef04a2fe0df7e3258a5f49332df1cec71e.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;10&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/cortecs/choosing-the-right-ai-provider-in-europe-1lo1#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            5 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>cortecs</category>
      <category>ai</category>
      <category>europe</category>
      <category>llm</category>
    </item>
    <item>
      <title>CAG &gt; RAG</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Mon, 10 Feb 2025 15:56:42 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/cag-rag-26i2</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/cag-rag-26i2</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/abhinowww" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1962925%2Fa021449f-f75d-491c-b662-9ac1f4ede10e.jpg" alt="abhinowww"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/abhinowww/context-caching-is-it-the-end-of-retrieval-augmented-generation-rag-55kp" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Context Caching: Is It the End of Retrieval-Augmented Generation (RAG)? 🤔&lt;/h2&gt;
      &lt;h3&gt;Abhinav Anand ・ Sep 19 '24&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#gpt3&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#rag&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#deeplearning&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>ai</category>
      <category>gpt3</category>
      <category>rag</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>High Workloads -&gt; Dedicated LLMs</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Tue, 04 Feb 2025 14:43:48 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/high-workloads-dedicated-llms-5e3h</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/high-workloads-dedicated-llms-5e3h</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/cortecs" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__org__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F10180%2F0c946637-32dc-4415-aff6-20ab4c3e6f09.png" alt="cortecs" width="512" height="512"&gt;
      &lt;div class="ltag__link__user__pic"&gt;
        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2699951%2F46167a73-cfff-4562-b381-aca6aa84f402.png" alt="" width="96" height="96"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/cortecs/streamline-your-batch-jobs-the-power-of-cortecs-ai-inference-2jjl" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Streamline Your Batch Jobs: The Power of LLM Workers 🤖&lt;/h2&gt;
      &lt;h3&gt;Asmae Elazrak for cortecs ・ Jan 17&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#llm&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#nlp&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#cortecs&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>ai</category>
      <category>llm</category>
      <category>nlp</category>
      <category>cortecs</category>
    </item>
    <item>
      <title>llm workers</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Fri, 17 Jan 2025 15:13:23 +0000</pubDate>
      <link>https://dev.to/markus_tretzmller_1d02bf/llm-workers-jdb</link>
      <guid>https://dev.to/markus_tretzmller_1d02bf/llm-workers-jdb</guid>
      <description></description>
      <category>llm</category>
    </item>
    <item>
      <title>LLMs for Big Data</title>
      <dc:creator>Marko Arnauto</dc:creator>
      <pubDate>Mon, 13 Jan 2025 12:00:19 +0000</pubDate>
      <link>https://dev.to/cortecs/llms-for-big-data-1hfb</link>
      <guid>https://dev.to/cortecs/llms-for-big-data-1hfb</guid>
      <description>&lt;p&gt;We all love our chatbots, but when it comes to heavy-loads, they just don’t cut it. If you need to analyze thousands of documents at once, serverless inference — the go-to for chat applications — quickly shows its (rate) limits. &lt;/p&gt;

&lt;h2&gt;
  
  
  One Model — Many Users 
&lt;/h2&gt;

&lt;p&gt;Imagine working in a shared co-working space: it’s convenient, but your productivity depends on how crowded the space is. Similarly, &lt;strong&gt;serverless endpoints&lt;/strong&gt; from OpenAI, Anthropic, or Groq rely on shared infrastructure, where performance fluctuates based on how many users are competing for resources. Strict rate limits, like Groq’s 7,000 tokens per minute, can grind progress to a halt. &lt;/p&gt;
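&lt;p&gt;If you stay on shared endpoints, the standard client-side mitigation is exponential backoff. Here is a minimal sketch in Python; the exception type is a stand-in for whatever rate-limit error your client library raises:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import random
import time

def with_backoff(call, max_retries=5):
    """Retry a rate-limited call with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # stand-in for an HTTP 429 surfaced by your client
            time.sleep(min(2 ** attempt + random.random(), 30.0))
    raise RuntimeError("rate limited: retries exhausted")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;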

&lt;h2&gt;
  
  
  Dedicated Compute — One Model per User
&lt;/h2&gt;

&lt;p&gt;In contrast, &lt;strong&gt;dedicated inference allocates compute resources exclusively to a single user&lt;/strong&gt; or application. This ensures predictable and consistent performance, as the only limiting factor is the computational capacity of the allocated GPUs. According to &lt;a href="https://fireworks.ai" rel="noopener noreferrer"&gt;Fireworks.ai&lt;/a&gt;, a leading inference provider,&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Graduating from serverless to on-demand deployments starts to make sense economically when you are running ~100k+ tokens per minute.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are typically no rate limits on throughput. Billing for dedicated inference is time-based, calculated per hour or minute depending on the platform. While dedicated inference is well-suited for high-throughput, it involves a tedious setup process as well as the risk of overpaying due to idle times.&lt;/p&gt;
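&lt;p&gt;A quick back-of-the-envelope calculation makes that threshold concrete. The prices below are purely illustrative, not any provider’s actual rates:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# illustrative prices: serverless at $0.50 per 1M tokens, dedicated at $2.50/hour
serverless_per_token = 0.50 / 1_000_000
dedicated_per_hour = 2.50

tokens_per_minute = 100_000  # sustained workload
serverless_per_hour = tokens_per_minute * 60 * serverless_per_token  # $3.00

# at a sustained ~100k tokens/minute, the dedicated deployment is already cheaper
print(serverless_per_hour &amp;gt; dedicated_per_hour)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;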

&lt;h3&gt;
  
  
  Tedious Setup
&lt;/h3&gt;

&lt;p&gt;Deploying dedicated inference requires careful preparation. First, you need to rent suitable hardware to support your chosen model. Next, an inference engine such as vLLM must be configured to match the model’s requirements. Finally, secure access must be established via a TLS-encrypted connection to ensure encrypted communication. According to Philipp Schmid of Hugging Face, &lt;a href="https://www.philschmid.de/cost-generative-ai" rel="noopener noreferrer"&gt;you need one full-time developer&lt;/a&gt; to set up and maintain such a system. &lt;/p&gt;
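&lt;p&gt;The inference-engine step, for example, typically boils down to launching vLLM’s OpenAI-compatible server; the model name and flags here are illustrative, so adapt them to your hardware:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m vllm.entrypoints.openai.api_server \
    --model microsoft/phi-4 \
    --port 8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;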

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18v3tpy9iric55w7h52z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18v3tpy9iric55w7h52z.png" alt="Dedicated deployments require a tedious setup." width="800" height="409"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Idle Times
&lt;/h3&gt;

&lt;p&gt;Time-based billing makes cost projections easier, but idle resources can quickly become a cost overhead. Dedicated inference is cost-effective only when the GPUs are busy. To avoid unnecessary expenses, the system should be shut down when not in use, and managing this manually is tedious and error-prone.&lt;/p&gt;

&lt;h2&gt;
  
  
  LLM Workers to the Rescue
&lt;/h2&gt;

&lt;p&gt;To address the downsides of dedicated inference, providers like Google and Cortecs offer dedicated LLM workers. Without any additional configuration, these workers are started and stopped on demand, avoiding setup overhead and idle time. The required hardware is allocated, the inference engine is configured, and API connections are established, all in the background. Once the workload completes, the worker shuts down automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example
&lt;/h3&gt;

&lt;p&gt;As I’m involved in the cortecs project, I’ll showcase it using our &lt;a href="https://github.com/cortecs-ai/cortecs-py" rel="noopener noreferrer"&gt;library&lt;/a&gt;. It can be installed with pip.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install cortecs-py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will use the OpenAI Python library to access the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install openai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, register at &lt;a href="https://cortecs.ai" rel="noopener noreferrer"&gt;cortecs.ai&lt;/a&gt; and create your access credentials on the profile page. Then set them as environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export OPENAI_API_KEY="Your cortecs api key"
export CORTECS_CLIENT_ID="Your cortecs id"
export CORTECS_CLIENT_SECRET="Your cortecs secret"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It’s time to choose a model. We selected &lt;em&gt;phi-4-FP8-Dynamic&lt;/em&gt;, a model that supports 🔵 instant provisioning. Models with instant provisioning enable a warm start, eliminating provisioning latency, which is perfect for this demonstration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;cortecs_py&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Cortecs&lt;/span&gt;

&lt;span class="n"&gt;cortecs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Cortecs&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;my_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cortecs/phi-4-FP8-Dynamic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="c1"&gt;# Start a new instance
&lt;/span&gt;&lt;span class="n"&gt;my_instance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cortecs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ensure_instance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a joke about LLMs.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Stop the instance
&lt;/span&gt;&lt;span class="n"&gt;cortecs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;instance_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All provisioning complexity is abstracted by &lt;code&gt;cortecs.ensure_instance(my_model)&lt;/code&gt; and &lt;code&gt;cortecs.stop(my_instance.instance_id)&lt;/code&gt;. Between these two lines, you can execute arbitrary inference tasks, whether that's generating a simple joke about LLMs or producing billions of words.&lt;/p&gt;
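&lt;p&gt;One practical caveat: if the workload between those two lines raises an exception, the &lt;code&gt;stop&lt;/code&gt; call never runs and the instance keeps billing. A small context manager guards against this. This is my own sketch, not part of cortecs-py; it only assumes the &lt;code&gt;ensure_instance&lt;/code&gt;, &lt;code&gt;stop&lt;/code&gt;, and &lt;code&gt;instance_id&lt;/code&gt; calls shown above.&lt;br&gt;
&lt;/p&gt;

```python
# A wrapper that guarantees the worker is stopped even if the workload
# raises. Hypothetical helper, not part of cortecs-py: any client exposing
# ensure_instance(model) and stop(instance_id) works.
from contextlib import contextmanager

@contextmanager
def managed_instance(cortecs, model_name):
    instance = cortecs.ensure_instance(model_name)
    try:
        yield instance
    finally:
        # Runs on success and on error, so no forgotten instance keeps billing.
        cortecs.stop(instance.instance_id)
```

&lt;p&gt;Usage mirrors the example above: &lt;code&gt;with managed_instance(Cortecs(), my_model) as inst:&lt;/code&gt;, then create the &lt;code&gt;OpenAI&lt;/code&gt; client from &lt;code&gt;inst.base_url&lt;/code&gt; inside the block.&lt;br&gt;
&lt;/p&gt;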

&lt;p&gt;&lt;strong&gt;LLM Workers are a game-changer&lt;/strong&gt; for large-scale data analysis. With no need to manage complex compute clusters, they enable seamless big-data analysis and generation without the typical concerns of rate limits or exploding inference costs.&lt;/p&gt;

&lt;p&gt;Imagine a future where LLM Workers handle highly complex tasks, such as proving mathematical theorems or executing reasoning-intensive operations. You could launch a worker, let it run at full GPU utilization to tackle the problem, and have it shut itself down automatically upon completion. The potential is enormous, and this tutorial has demonstrated how to dynamically provision LLM Workers for high-performance AI tasks.&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>llm</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
