<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: DigitalOcean</title>
    <description>The latest articles on DEV Community by DigitalOcean (@digitalocean_staff).</description>
    <link>https://dev.to/digitalocean_staff</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F64516%2Fa0c9989b-6d18-46c7-bc66-4c2c1580534e.jpg</url>
      <title>DEV Community: DigitalOcean</title>
      <link>https://dev.to/digitalocean_staff</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/digitalocean_staff"/>
    <language>en</language>
    <item>
      <title>Tutorial: Build an AI-Powered GPU Fleet Optimizer</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Fri, 17 Apr 2026 19:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/tutorial-build-an-ai-powered-gpu-fleet-optimizer-8bl</link>
      <guid>https://dev.to/digitalocean/tutorial-build-an-ai-powered-gpu-fleet-optimizer-8bl</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally written by Shamim Raashid (Senior Solutions Architect) and Anish Singh Walla (Senior Technical Content Strategist and Team Lead)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deploy a serverless LangGraph agent&lt;/strong&gt; on the DigitalOcean Gradient AI Platform that monitors your GPU fleet using natural language queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scrape real-time NVIDIA DCGM metrics&lt;/strong&gt; (temperature, power, VRAM, engine utilization) from GPU Droplets over Prometheus-style endpoints on port 9400.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detect idle and underutilized GPUs automatically&lt;/strong&gt; by defining configurable threshold dictionaries that compare live metrics against your baseline workload patterns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customize the blueprint to your needs:&lt;/strong&gt; Change target Droplet types, adjust idle detection thresholds, enrich the data payload with additional metrics, and add actionable tools like automated power-off commands.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduce GPU cloud costs&lt;/strong&gt; by replacing reactive dashboard monitoring with a proactive AI agent that identifies waste the moment it starts.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Managing a GPU fleet in the cloud is a constant balancing act between performance and cost. A single idle GPU Droplet left running overnight can add hundreds of dollars to your monthly bill. Traditional monitoring dashboards surface raw metrics, but they still require a human to interpret whether a machine is “working” or “wasting money.”&lt;/p&gt;

&lt;p&gt;This tutorial walks you through building an AI-powered GPU fleet optimizer using the DigitalOcean Gradient AI Platform and the Agent Development Kit (ADK). You will deploy a serverless, natural-language AI agent that audits your GPU infrastructure in real time, scrapes NVIDIA DCGM (Data Center GPU Manager) metrics like temperature, power draw, VRAM usage, and engine utilization, and flags idle resources before they inflate your cloud bill.&lt;/p&gt;

&lt;p&gt;This blueprint is designed to be forked and customized. By the end of this guide, you will know how to tune the agent's personality and efficiency thresholds, add new monitoring tools, and deploy the agent as a production-ready serverless endpoint.&lt;/p&gt;

&lt;h4&gt;
  
  
  Reference repository
&lt;/h4&gt;

&lt;p&gt;You can view the complete blueprint code here: &lt;a href="https://github.com/dosraashid/do-adk-gpu-monitor" rel="noopener noreferrer"&gt;dosraashid/do-adk-gpu-monitor&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DigitalOcean Account:&lt;/strong&gt; With at least one active GPU Droplet running.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DigitalOcean API Token:&lt;/strong&gt; A Personal Access Token with read permissions and GenAI scopes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gradient Model Access Key:&lt;/strong&gt; Generated from the Gradient AI Dashboard.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python 3.12:&lt;/strong&gt; Recommended for the latest LangGraph and asyncio features.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Familiarity with Python, REST APIs, and Linux command-line basics.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The challenge: “Invisible” cloud waste
&lt;/h2&gt;

&lt;p&gt;When scaling AI workloads, engineering teams often spin up expensive, specialized GPU Droplets (like NVIDIA H100s or H200s) for training or inference tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem: Hidden costs and wasted resources
&lt;/h3&gt;

&lt;p&gt;Once a training script finishes or a model endpoint stops receiving traffic, the Droplet itself remains online and continues to bill by the hour. This creates two compounding issues:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Generic monitoring falls short:&lt;/strong&gt; Standard cloud dashboards typically show host-level metrics like CPU and RAM. A machine learning node might report 1% CPU utilization, but those monitors do not reveal whether the GPU's VRAM is empty or whether the compute engine is completely idle.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Dashboard fatigue:&lt;/strong&gt; Even if you install specialized tools like Grafana to track NVIDIA DCGM metrics, an engineer still has to remember to log in, interpret the charts, and manually map the IP address of an idle node back to a specific cloud resource to shut it down.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiwytf0raeao1je60mni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiwytf0raeao1je60mni.png" alt="A a weary developer looking at a screen while money flies out of the data center server" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: A proactive AI fleet analyst
&lt;/h3&gt;

&lt;p&gt;Instead of waiting for an engineer to check a dashboard, you can build an AI agent that acts as an autonomous infrastructure analyst. &lt;/p&gt;

&lt;p&gt;Using the DigitalOcean Gradient ADK, you will deploy a Large Language Model (LLM) equipped with custom Python tools. When you ask the agent a question like, “Are any of my GPUs wasting money right now?”, it executes a multi-step reasoning loop:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Discovery:&lt;/strong&gt; Calls the DigitalOcean API to get a live inventory of your Droplets.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Interrogation:&lt;/strong&gt; Pings the NVIDIA DCGM exporter on each node's public IP to read VRAM, temperature, and engine load.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Analysis:&lt;/strong&gt; Runs those raw metrics against a threshold dictionary you define (e.g., “If VRAM usage is below 5% and engine utilization is below 2%, mark this GPU as IDLE”).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Actionable Output:&lt;/strong&gt; Replies in plain English, naming the specific node, its current hourly cost, and the exact metrics proving it is idle.&lt;/li&gt;
&lt;/ol&gt;
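&lt;p&gt;The analysis step above can be sketched in a few lines of Python. This is an illustrative helper, not the blueprint’s exact code; the key names mirror the threshold conventions used in this tutorial.&lt;/p&gt;

```python
# Illustrative sketch of the analysis step: compare live metrics against
# a threshold dictionary to label a GPU. Key names follow the tutorial's
# conventions; this is not the blueprint's actual implementation.
def classify_gpu(metrics, thresholds):
    """Return IDLE, UNDERUTILIZED, or BUSY for one GPU's live metrics."""
    util = metrics["gpu_util_percent"]
    vram = metrics["vram_percent"]
    if thresholds["idle_util_percent"] > util and thresholds["idle_vram_percent"] > vram:
        return "IDLE"
    if thresholds["optimized_util_percent"] > util:
        return "UNDERUTILIZED"
    return "BUSY"

thresholds = {"idle_util_percent": 2.0, "idle_vram_percent": 5.0,
              "optimized_util_percent": 40.0}
print(classify_gpu({"gpu_util_percent": 0.0, "vram_percent": 1.2}, thresholds))  # prints IDLE
```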

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiy0rs2lojv908252rar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiy0rs2lojv908252rar.png" alt="Stressed developer on the left, image of a chatbot providing the solution on the right" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding NVIDIA DCGM metrics for GPU monitoring
&lt;/h2&gt;

&lt;p&gt;NVIDIA Data Center GPU Manager (DCGM) exposes hardware telemetry through a Prometheus-compatible exporter that runs on port 9400. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;What It Measures&lt;/th&gt;
&lt;th&gt;Why It Matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DCGM_FI_DEV_GPU_TEMP&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;GPU die temperature in Celsius&lt;/td&gt;
&lt;td&gt;High temperatures indicate active computation.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DCGM_FI_DEV_POWER_USAGE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Current power draw in watts&lt;/td&gt;
&lt;td&gt;Idle GPUs draw significantly less power than busy ones.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DCGM_FI_DEV_FB_USED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Framebuffer (VRAM) memory in use&lt;/td&gt;
&lt;td&gt;Empty VRAM means no models are loaded.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DCGM_FI_DEV_GPU_UTIL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;GPU engine utilization percentage&lt;/td&gt;
&lt;td&gt;The most direct indicator of compute work.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You can query these metrics directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://&amp;lt;DROPLET_PUBLIC_IP&amp;gt;:9400/metrics | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s2"&gt;"DCGM_FI_DEV_GPU_TEMP|DCGM_FI_DEV_POWER_USAGE|DCGM_FI_DEV_FB_USED|DCGM_FI_DEV_GPU_UTIL"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://www.digitalocean.com/resources/articles/ai-agents" rel="noopener noreferrer"&gt;AI agent&lt;/a&gt; in this blueprint automates this scraping across your entire fleet, parses the Prometheus text format, and feeds the structured data into the LLM for analysis. If DCGM is not available on a particular node (for example, because the exporter is not installed or port &lt;code&gt;9400&lt;/code&gt; is blocked by a firewall), the agent falls back to standard CPU and RAM metrics and reports “DCGM Missing” for that node.&lt;/p&gt;
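&lt;p&gt;If you want to see what the agent’s scraper does under the hood, the Prometheus text format is straightforward to parse. The sketch below is illustrative rather than the blueprint’s exact code; the metric names are real DCGM fields:&lt;/p&gt;

```python
# Illustrative sketch: fetch one node's DCGM exporter output and pull
# out the four metrics used in this tutorial. Not the blueprint's code.
import urllib.request

WANTED = ("DCGM_FI_DEV_GPU_TEMP", "DCGM_FI_DEV_POWER_USAGE",
          "DCGM_FI_DEV_FB_USED", "DCGM_FI_DEV_GPU_UTIL")

def parse_dcgm(text):
    """Keep the last sample seen for each wanted metric name."""
    out = {}
    for line in text.splitlines():
        if line.startswith(WANTED):
            # "NAME{label="..."} VALUE" or "NAME VALUE"
            name = line.split("{")[0].split(" ")[0]
            out[name] = float(line.rsplit(" ", 1)[1])
    return out

def scrape_node(ip, timeout=3.0):
    """Fetch and parse one node's DCGM exporter on port 9400."""
    url = f"http://{ip}:9400/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_dcgm(resp.read().decode())
```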

&lt;p&gt;For production deployments, consider pairing DCGM data collection with a full Prometheus and Grafana monitoring stack for historical trend analysis alongside the AI agent’s real-time assessments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Clone the blueprint and set up your environment
&lt;/h2&gt;

&lt;p&gt;Start with the foundational repository rather than writing everything from scratch.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clone the repo and set up your &lt;a href="https://www.digitalocean.com/community/tutorials/how-to-install-python-3-and-set-up-a-programming-environment-on-an-ubuntu-22-04-server" rel="noopener noreferrer"&gt;Python environment&lt;/a&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/dosraashid/do-adk-gpu-monitor
&lt;span class="nb"&gt;cd &lt;/span&gt;&lt;span class="k"&gt;do&lt;/span&gt;&lt;span class="nt"&gt;-adk-gpu-monitor&lt;/span&gt;
python3.12 &lt;span class="nt"&gt;-m&lt;/span&gt; venv venv
&lt;span class="nb"&gt;source &lt;/span&gt;venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Configure your secrets by creating a &lt;code&gt;.env&lt;/code&gt; file in the root directory:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;DIGITALOCEAN_API_TOKEN&lt;/span&gt;=&lt;span class="s2"&gt;"your_do_token"&lt;/span&gt;
&lt;span class="n"&gt;GRADIENT_MODEL_ACCESS_KEY&lt;/span&gt;=&lt;span class="s2"&gt;"your_gradient_key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Security note: Never commit &lt;code&gt;.env&lt;/code&gt; files to version control. The repository’s &lt;code&gt;.gitignore&lt;/code&gt; already excludes this file.&lt;/p&gt;
&lt;/blockquote&gt;
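&lt;p&gt;Before starting the agent, you can verify that both required keys are present with a small stdlib-only check. This helper is illustrative and not part of the blueprint:&lt;/p&gt;

```python
# Illustrative stdlib-only sanity check: report any required secret
# missing from the contents of a .env file. Not part of the blueprint.
def missing_secrets(env_text):
    """Return the set of required keys absent from the given .env contents."""
    required = {"DIGITALOCEAN_API_TOKEN", "GRADIENT_MODEL_ACCESS_KEY"}
    present = set()
    for line in env_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in required:
            present.add(key)
    return required - present

print(missing_secrets('DIGITALOCEAN_API_TOKEN="x"'))  # prints {'GRADIENT_MODEL_ACCESS_KEY'}
```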

&lt;h2&gt;
  
  
  Step 2: How it works (the architecture)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fds5p9hftjariheuwthdg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fds5p9hftjariheuwthdg.png" alt="AI Agent LangGraph architecture diagram" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Before you customize the blueprint, it helps to understand the data flow inside the code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;User Prompt&lt;/strong&gt;: You ask the agent a question via the &lt;code&gt;/run&lt;/code&gt; endpoint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph State&lt;/strong&gt;: The agent checks its conversation memory (&lt;code&gt;thread_id&lt;/code&gt;) via &lt;code&gt;MemorySaver&lt;/code&gt;, which enables multi-turn follow-up questions within the same session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool Execution&lt;/strong&gt;: The LLM decides to call &lt;code&gt;@tool def analyze_gpu_fleet()&lt;/code&gt; defined in &lt;code&gt;main.py&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel Scraping&lt;/strong&gt;: &lt;code&gt;analyzer.py&lt;/code&gt; uses Python’s &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; to query the DigitalOcean API and each Droplet’s DCGM endpoint (&lt;code&gt;metrics.py&lt;/code&gt;) concurrently. This parallel approach prevents network bottlenecks when monitoring dozens of nodes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Omniscient Payload&lt;/strong&gt;: The analyzer packages all raw data (temperature, power, VRAM, RAM, CPU, cost) into a structured JSON dictionary that the LLM can reason about.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synthesis&lt;/strong&gt;: The LLM reads the JSON payload and responds in natural language with specific node names, costs, and actionable recommendations.&lt;/li&gt;
&lt;/ul&gt;
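&lt;p&gt;The parallel-scraping step above can be sketched with &lt;code&gt;ThreadPoolExecutor&lt;/code&gt;. This is an illustrative pattern, not the blueprint’s exact code; &lt;code&gt;scrape_one&lt;/code&gt; is a stand-in for the per-Droplet logic in &lt;code&gt;metrics.py&lt;/code&gt;:&lt;/p&gt;

```python
# Illustrative sketch of the parallel-scraping pattern: query every node
# concurrently instead of one at a time. scrape_one is a stand-in for
# the blueprint's per-Droplet DCGM logic.
from concurrent.futures import ThreadPoolExecutor

def scrape_one(ip):
    # Stand-in: the real version fetches port 9400 and parses the metrics.
    return {"ip": ip, "gpu_util": 0.0}

def scrape_fleet(ips, max_workers=8):
    """Scrape all nodes concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape_one, ips))

print(len(scrape_fleet(["203.0.113.10", "203.0.113.11"])))  # prints 2
```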

&lt;p&gt;If you want to learn more about building stateful AI agents with LangGraph, follow the &lt;a href="https://www.digitalocean.com/community/tutorials/getting-started-agentic-ai-langgraph" rel="noopener noreferrer"&gt;Getting Started with Agentic AI Using LangGraph tutorial&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Customizing the blueprint to your needs
&lt;/h2&gt;

&lt;p&gt;This repository is built to be forked and modified. Here are the four main areas you should adjust to match your organization’s requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Customization 1: Tuning the logic (config.py)
&lt;/h3&gt;

&lt;p&gt;Open &lt;code&gt;config.py&lt;/code&gt;. This is the control center for your agent’s behavior.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Persona&lt;/strong&gt;: Edit &lt;code&gt;AGENT_SYSTEM_PROMPT&lt;/code&gt; to change how the AI communicates. For a highly technical DevOps assistant, remove the emojis and instruct it to output raw bullet points. For a management-facing report, tell it to summarize in cost terms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Thresholds&lt;/strong&gt;: The blueprint considers a GPU “Idle” when utilization falls below 2% by default. If your baseline workloads idle at a higher percentage, adjust the &lt;code&gt;THRESHOLDS&lt;/code&gt; dictionary:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;THRESHOLDS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_temp_c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;82.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_util_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;95.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_vram_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;95.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle_util_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle_vram_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;optimized_util_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;40.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;optimized_vram_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle_cpu_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle_ram_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;15.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle_load_15&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;starved_cpu_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;85.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;starved_ram_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;90.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;optimized_cpu_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;40.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;optimized_ram_percent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, if your inference servers typically idle at 8% GPU utilization between request bursts, set &lt;code&gt;idle_util_percent&lt;/code&gt; to &lt;code&gt;10.0&lt;/code&gt; to avoid false positives.&lt;/p&gt;

&lt;h3&gt;
  
  
  Customization 2: Changing the target infrastructure (analyzer.py)
&lt;/h3&gt;

&lt;p&gt;By default, the blueprint only scans Droplets with &lt;code&gt;"gpu"&lt;/code&gt; in the &lt;code&gt;size_slug&lt;/code&gt; to reduce unnecessary API calls. Open &lt;code&gt;analyzer.py&lt;/code&gt; and locate the slug filter. If you want the agent to monitor CPU-optimized or standard Droplets, modify this line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Change "gpu" to "c-" for CPU-Optimized, or remove the filter entirely to scan all Droplets.
&lt;/span&gt;&lt;span class="n"&gt;target_droplets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_droplets&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;size_slug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
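&lt;p&gt;As an alternative to slug matching, you could scope the scan by Droplet tag so that only nodes you explicitly opt in are audited. This variant is illustrative, and the &lt;code&gt;fleet-monitor&lt;/code&gt; tag name is hypothetical:&lt;/p&gt;

```python
# Illustrative variant: filter by a hypothetical opt-in Droplet tag
# ("fleet-monitor") instead of the size slug.
all_droplets = [
    {"id": 1, "tags": ["fleet-monitor"]},  # sample data for demonstration
    {"id": 2, "tags": []},
]

target_droplets = [
    d for d in all_droplets
    if "fleet-monitor" in d.get("tags", [])
]

print([d["id"] for d in target_droplets])  # prints [1]
```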



&lt;h3&gt;
  
  
  Customization 3: Enriching the omniscient payload (analyzer.py and metrics.py)
&lt;/h3&gt;

&lt;p&gt;The LLM only knows what you explicitly pass to it. The default payload includes temperature, power, and VRAM data. If you install &lt;a href="https://prometheus.io/docs/guides/node-exporter/" rel="noopener noreferrer"&gt;Prometheus Node Exporter&lt;/a&gt; on your instances and want the AI to also analyze disk space, you would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Update &lt;code&gt;metrics.py&lt;/code&gt; to scrape disk metrics from Node Exporter on port &lt;code&gt;9100&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Update the return dictionary at the bottom of &lt;code&gt;process_single_droplet&lt;/code&gt; in &lt;code&gt;analyzer.py&lt;/code&gt; to include the new field:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;droplet_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;droplet_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpu_temp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;temp_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpu_power&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;power_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vram_used&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;vram_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;disk_space_free_gb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;disk_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# New metric
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Customization 4: Adding actionable tools (main.py)
&lt;/h3&gt;

&lt;p&gt;The default blueprint is read-only. The most powerful upgrade is giving the AI permission to act on your infrastructure. In &lt;code&gt;main.py&lt;/code&gt;, you can add a new function with the &lt;code&gt;@tool&lt;/code&gt; decorator that uses the DigitalOcean API to power off a specific Droplet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;power_off_droplet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;droplet_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Power off a Droplet by ID. Use only when the user explicitly asks to stop an idle node.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DIGITALOCEAN_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.digitalocean.com/v2/droplets/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;droplet_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/actions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;power_off&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully sent power-off command to Droplet &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;droplet_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to power off Droplet &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;droplet_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After adding any new tools, bind them to the LLM so the agent can invoke them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;llm_with_tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bind_tools&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;analyze_gpu_fleet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;power_off_droplet&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Warning&lt;/strong&gt;: Giving an AI agent write access to your infrastructure requires careful guardrails. Consider adding confirmation prompts, restricting which Droplet tags the agent can act on, and logging all actions for audit purposes.&lt;/p&gt;
&lt;/blockquote&gt;
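&lt;p&gt;One way to implement such a guardrail is a tag-based allow list combined with action logging. The sketch below is illustrative, and the &lt;code&gt;agent-managed&lt;/code&gt; tag name is hypothetical:&lt;/p&gt;

```python
# Illustrative guardrail: refuse to act on any Droplet that lacks a
# hypothetical opt-in tag, and log every decision for later audit.
import logging

ALLOWED_TAG = "agent-managed"  # hypothetical opt-in tag
logging.basicConfig(level=logging.INFO)

def guarded_power_off(droplet):
    """Gate a power-off request behind the opt-in tag."""
    if ALLOWED_TAG not in droplet.get("tags", []):
        logging.warning("Refused power-off for Droplet %s", droplet["id"])
        return f"Refused: Droplet {droplet['id']} lacks the '{ALLOWED_TAG}' tag."
    logging.info("Approved power-off for Droplet %s", droplet["id"])
    # Here you would call the DigitalOcean API, as power_off_droplet does.
    return f"Approved: powering off Droplet {droplet['id']}."
```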

&lt;h2&gt;
  
  
  Step 4: Testing your custom agent
&lt;/h2&gt;

&lt;p&gt;Once you have tailored the code, test it locally before deploying. Start the local development server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gradient agent run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a separate terminal, simulate user requests using &lt;code&gt;curl&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5jpmyvirmaagjrtig3kd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5jpmyvirmaagjrtig3kd.png" alt="Agent testing workflow diagram" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 1: Deep diagnostic
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/run &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
           "prompt": "Give me a full diagnostic on my GPU nodes including temperature and power.",
           "thread_id": "audit-session-1"
         }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected Output&lt;/strong&gt;: The AI uses the Omniscient Payload to report exact temperatures, wattage, and VRAM utilization for each GPU Droplet, alongside cost-saving recommendations for any idle nodes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 2: Contextual memory
&lt;/h3&gt;

&lt;p&gt;Because you are passing &lt;code&gt;thread_id: "audit-session-1"&lt;/code&gt;, the agent retains conversation context. You can ask follow-up questions without triggering a full re-scan of your infrastructure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/run &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
           "prompt": "Which of those nodes was the most expensive?",
           "thread_id": "audit-session-1"
         }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example 3: Thread isolation
&lt;/h3&gt;

&lt;p&gt;The memory is strictly scoped by &lt;code&gt;thread_id&lt;/code&gt;. A request with a different thread ID sees no prior history and starts a fresh conversation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8080/run &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
     &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
           "prompt": "What was the second question I asked you?",
           "thread_id": "audit-session-2"
         }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected Output&lt;/strong&gt;: The agent responds that it has no record of previous questions in this session, confirming that thread isolation is working correctly.&lt;/p&gt;
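&lt;p&gt;Conceptually, this isolation works because all conversation state is keyed by &lt;code&gt;thread_id&lt;/code&gt;. The sketch below illustrates the idea only; it is a simplification, not LangGraph's actual &lt;code&gt;MemorySaver&lt;/code&gt; implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;class ThreadScopedMemory:
    """Toy illustration of per-thread conversation memory."""

    def __init__(self):
        self._store = {}  # maps each thread_id to its message list

    def append(self, thread_id, message):
        self._store.setdefault(thread_id, []).append(message)

    def history(self, thread_id):
        # A new thread_id sees an empty history: isolation by construction.
        return list(self._store.get(thread_id, []))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;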

&lt;h2&gt;
  
  
  Step 5: Cloud deployment
&lt;/h2&gt;

&lt;p&gt;Once you are satisfied with your customizations, deploy the agent as a serverless endpoint on the DigitalOcean Gradient AI Platform:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gradient agent deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will receive a public endpoint URL that you can integrate into Slack bots, internal dashboards, &lt;a href="https://www.digitalocean.com/solutions/cicd-pipelines" rel="noopener noreferrer"&gt;CI/CD pipelines&lt;/a&gt;, or any HTTP client. The Gradient platform handles scaling, so your agent can serve multiple concurrent users without manual infrastructure management.&lt;/p&gt;

&lt;p&gt;For more details on building and deploying agents with the ADK, see &lt;a href="https://docs.digitalocean.com/products/gradient-ai-platform/how-to/build-agents-using-adk/" rel="noopener noreferrer"&gt;How to Build Agents Using ADK&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  GPU fleet cost optimization: When to use an AI agent vs. static dashboards
&lt;/h3&gt;

&lt;p&gt;One of the most common questions teams face when setting up &lt;a href="https://www.digitalocean.com/community/tutorials/monitoring-gpu-utilization-in-real-time" rel="noopener noreferrer"&gt;GPU monitoring&lt;/a&gt; is whether to build a custom AI agent or rely on traditional dashboard tooling. The right choice depends on your fleet size, the complexity of your workloads, and how quickly you need to act on idle resources.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Factor&lt;/th&gt;
&lt;th&gt;Static Dashboards (Grafana + Prometheus)&lt;/th&gt;
&lt;th&gt;AI Agent (This Blueprint)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Moderate: requires Prometheus server, Grafana, and DCGM exporter configuration&lt;/td&gt;
&lt;td&gt;Low: clone the repo, set env vars, deploy with &lt;code&gt;gradient agent deploy&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-time alerting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Rule-based alerts with fixed thresholds&lt;/td&gt;
&lt;td&gt;Natural language queries with adaptive reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-metric correlation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manual: you visually compare multiple charts&lt;/td&gt;
&lt;td&gt;Automatic: the LLM correlates temperature, power, VRAM, and cost in a single response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Actionability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read-only dashboards; separate automation needed&lt;/td&gt;
&lt;td&gt;Extensible with &lt;code&gt;@tool&lt;/code&gt; decorator for direct API actions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Conversational follow-ups&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;td&gt;Built-in via LangGraph &lt;code&gt;MemorySaver&lt;/code&gt; and &lt;code&gt;thread_id&lt;/code&gt; scoping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Large teams with dedicated SRE/DevOps staff and historical trend analysis&lt;/td&gt;
&lt;td&gt;Small-to-mid teams that need fast, conversational GPU auditing without building dashboard infrastructure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For teams running fewer than 20 GPU Droplets, the AI agent approach eliminates the overhead of maintaining a full monitoring stack while still providing actionable insights. For larger fleets, consider running both: use &lt;a href="https://www.digitalocean.com/community/developer-center/setting-up-monitoring-for-digitalocean-managed-databases-with-prometheus-and-grafana" rel="noopener noreferrer"&gt;Prometheus and Grafana&lt;/a&gt; for long-term trend storage and the AI agent for on-demand, conversational diagnostics.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advantages and tradeoffs
&lt;/h2&gt;

&lt;p&gt;When adapting this blueprint for production, keep these architectural considerations in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Contextual intelligence&lt;/strong&gt;: LangGraph’s &lt;code&gt;MemorySaver&lt;/code&gt; gives the agent conversation history, allowing natural drill-down investigations. You can ask “Which node is idle?” followed by “How much is it costing me per hour?” without repeating context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel processing&lt;/strong&gt;: The analyzer uses Python’s &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; to scan dozens of Droplets concurrently, preventing the LLM from timing out while waiting for sequential network calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost justification&lt;/strong&gt;: If the AI agent spots a single idle $500/month GPU instance, it pays for itself many times over. The inference cost of running a single diagnostic query on the Gradient platform is negligible compared to the savings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful degradation&lt;/strong&gt;: If the DCGM metric scraper cannot reach port &lt;code&gt;9400&lt;/code&gt; (for example, because of firewall rules or the exporter not being installed), the agent reports “DCGM Missing” for that node and falls back to standard CPU and RAM metrics rather than failing entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security considerations&lt;/strong&gt;: The agent requires a DigitalOcean API token with read permissions. If you add write tools (like the &lt;code&gt;power_off_droplet&lt;/code&gt; example), scope the token’s permissions carefully and implement audit logging.&lt;/li&gt;
&lt;/ul&gt;
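&lt;p&gt;The parallel-processing and graceful-degradation points combine into a single pattern. The sketch below is a simplification: the &lt;code&gt;fetch&lt;/code&gt; callable stands in for the real HTTP scrape of port &lt;code&gt;9400&lt;/code&gt;, and the host names are hypothetical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from concurrent.futures import ThreadPoolExecutor

def scrape_dcgm(host, fetch):
    """Wrap one scrape so a failure degrades to a marker, not an exception."""
    try:
        return {"host": host, "dcgm": fetch(host)}
    except OSError:
        return {"host": host, "dcgm": "DCGM Missing"}

def scan_fleet(hosts, fetch, max_workers=8):
    """Scan all hosts concurrently so slow nodes do not serialize the audit."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda h: scrape_dcgm(h, fetch), hosts))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;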

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You have successfully deployed a multi-tool AI agent using the DigitalOcean Gradient AI Platform that transforms raw infrastructure metrics into conversational, actionable intelligence. By combining DigitalOcean API data with real-time NVIDIA DCGM telemetry and an LLM reasoning engine, you have built a system that addresses three major operational challenges:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Stopping the silent budget drain
&lt;/h3&gt;

&lt;p&gt;The most immediate value this agent delivers is catching “forgotten resources.” When engineers spin up GPU Droplets for experiments or temporary training runs, those instances often continue billing long after the work is done. Standard CPU monitors can be misleading here: background processes keep the CPU ticking along at 1%, so the instance looks active even while the GPU sits idle.&lt;/p&gt;

&lt;p&gt;By querying the NVIDIA DCGM exporter directly for engine and VRAM utilization, the AI agent cuts through that noise. It identifies premium GPU nodes that are doing no meaningful compute work, letting you stop the financial drain before it compounds.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Eliminating dashboard fatigue
&lt;/h3&gt;

&lt;p&gt;In a traditional workflow, diagnosing a cloud infrastructure issue means opening the DigitalOcean Control Panel to check Droplet status, switching to Grafana to review DCGM metrics, and consulting an architecture diagram to remember what each node is responsible for.&lt;/p&gt;

&lt;p&gt;This agent consolidates that entire workflow. Using LangGraph’s conversational memory and the Omniscient Payload, you ask a single question and receive a complete summary of host details, GPU temperature, power usage, and cost impact in one response.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Bridging observability and action
&lt;/h3&gt;

&lt;p&gt;Traditional dashboards are read-only. They can alert you that a resource is idle, but they do not provide the tools to act on that information.&lt;/p&gt;

&lt;p&gt;Because this blueprint is built on the Gradient ADK, the agent is inherently extensible. By adding a few lines of Python using the &lt;code&gt;@tool&lt;/code&gt; decorator, you can upgrade this agent from a passive monitor into an active operator that executes API commands to power off idle nodes, resize underutilized instances, or trigger scaling events automatically.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/dosraashid/do-adk-gpu-monitor" rel="noopener noreferrer"&gt;do-adk-gpu-monitor&lt;/a&gt; repository is your starting point. Clone the code, adjust the efficiency thresholds to match your specific workloads, and start having conversations with your infrastructure today.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reference and resources
&lt;/h2&gt;

&lt;p&gt;Ready to take your GPU fleet management and AI agent development further? Explore these resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://docs.digitalocean.com/products/gradient-ai-platform/" rel="noopener noreferrer"&gt;DigitalOcean Gradient AI Platform Documentation&lt;/a&gt;&lt;/strong&gt;: Full reference for deploying and managing AI agents, models, and inference endpoints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://docs.digitalocean.com/products/gradient-ai-platform/how-to/build-agents-using-adk/" rel="noopener noreferrer"&gt;How to Build Agents Using ADK&lt;/a&gt;&lt;/strong&gt;: Step-by-step guide to creating custom agents with the Agent Development Kit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://dev.tourl"&gt;Getting Started with Agentic AI Using LangGraph&lt;/a&gt;&lt;/strong&gt;: Learn the fundamentals of building stateful, multi-step AI agents with LangGraph.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/stable-diffusion-gpu-droplet" rel="noopener noreferrer"&gt;Stable Diffusion on DigitalOcean GPU Droplets&lt;/a&gt;&lt;/strong&gt;: Run GPU-accelerated AI workloads on DigitalOcean GPU Droplets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/harnessing-gpus-glb-vpc-for-genai-products" rel="noopener noreferrer"&gt;Scaling Gradient with GPU Droplets and Networking&lt;/a&gt;&lt;/strong&gt;: Architect production GenAI deployments with GPU Droplets, global load balancers, and VPC networking.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>gpu</category>
      <category>nvidia</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>March 2026 DigitalOcean Tutorials: GPT-5.4 and Nemotron 3</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Mon, 06 Apr 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/march-2026-digitalocean-tutorials-gpt-54-and-nemotron-3-npc</link>
      <guid>https://dev.to/digitalocean/march-2026-digitalocean-tutorials-gpt-54-and-nemotron-3-npc</guid>
      <description>&lt;p&gt;AI development continues to change with the consistent release of new models, standards, and system architectures. It can often be a lot to keep track of and learn. But &lt;a href="https://www.digitalocean.com/community/tutorials" rel="noopener noreferrer"&gt;DigitalOcean&lt;/a&gt; has you covered with our community tutorials and resources.  &lt;/p&gt;

&lt;p&gt;These 10 tutorials from last month cover both practical, hands-on topics (such as building a game with GPT-5.4) and explanatory concepts (like migrating to multi-agent systems). Take a look and try them out—or bookmark them for some weekend coding! &lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/qwen35" rel="noopener noreferrer"&gt;Getting Started with Qwen3.5 Vision-Language Models&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;This tutorial walks through how to run and experiment with Qwen 3.5, an open-source multimodal model family that handles text, images, and even video. It breaks down the model’s architecture and demonstrates how to deploy it on GPU infrastructure so you can build apps like coding assistants or document analyzers on your own stack. You’ll see how high-performing multimodal AI is becoming accessible without relying on proprietary APIs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7c2zpcamded53ldofxej.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7c2zpcamded53ldofxej.jpg" alt="Qwen 3.5 Overview" width="800" height="517"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/a2a-vs-mcp-ai-agent-protocols" rel="noopener noreferrer"&gt;A2A vs MCP: How These AI Agent Protocols Actually Differ&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;Read about the difference between two emerging standards for agent-based systems: agent-to-agent communication (A2A) and model context protocol (MCP). You’ll learn when to use each—A2A for coordinating multiple agents and MCP for structured tool integration—and why most production systems combine both. It’s a practical breakdown of the protocols shaping how agentic AI systems are actually built.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/nemotron-3-nemofinder" rel="noopener noreferrer"&gt;Nemotron 3 Helped Me Find the Perfect Dish Rack?&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;Get insight into how NVIDIA’s Nemotron 3 model pairs with NemoFinder to improve retrieval and reasoning workflows. This tutorial demonstrates how combining LLMs with optimized search and ranking pipelines can yield more accurate results, especially in enterprise or knowledge-intensive applications. You’ll also learn more about how retrieval-augmented generation (RAG) systems are evolving with tighter model–tool integration.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/train-yolo26-retail-object-detection-digitalocean-gpu" rel="noopener noreferrer"&gt;Train YOLO26 for Retail Object Detection on DigitalOcean GPUs&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;This hands-on guide shows how to train a YOLO26 model for retail use cases such as shelf monitoring and product detection on GPU infrastructure. It walks through dataset prep, training, and deployment so you can build real-world computer vision pipelines. You’ll gain a better understanding of how to move from raw image data to a production-ready detection model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99bfmsgafo7i185adx42.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99bfmsgafo7i185adx42.png" alt="YOLO26 Benchmarks" width="800" height="355"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/langgraph-mem0-integration-long-term-ai-memory" rel="noopener noreferrer"&gt;Building Long-Term Memory in AI Agents with LangGraph and Mem0&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;If you’re curious about how to add persistent memory to agent workflows using LangGraph and Mem0, check out this tutorial. It shows how agents can retain context across sessions, enabling more personalized and stateful interactions over time. Its key takeaway is how long-term memory transforms agents from stateless responders into systems that can learn and adapt.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/gpt-54" rel="noopener noreferrer"&gt;Crafting a Game from Scratch with GPT-5.4&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;This article breaks down GPT-5.4’s capabilities, improvements, and practical use cases. It highlights advancements in reasoning, efficiency, and multimodal performance, and shows how developers can integrate the model into real applications. You’ll see how this frontier model integrates into modern AI stacks and the steps involved in creating a 3D badminton game from the ground up. &lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/text-diffusion-models" rel="noopener noreferrer"&gt;What are Text Diffusion Models? An Overview&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;This guide introduces diffusion models for text generation and explains how they differ from traditional autoregressive LLMs. It walks through how diffusion-based approaches iteratively refine outputs and where they may outperform standard models. You’ll get a conceptual and practical understanding of an emerging alternative to transformers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fowwx6wo0zblyx0474l8m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fowwx6wo0zblyx0474l8m.png" alt="Overview of LLaDa" width="800" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/llm-tool-calling-managed-database-gradient-ai-platform" rel="noopener noreferrer"&gt;LLM Tool Calling with Gradient™ AI Platform and Databases&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;Discover how to connect LLMs to external tools—like databases—using structured tool calling. It walks through building workflows in which models query, retrieve, and act on real data rather than relying solely on prompts. You’ll get to see that tool integration makes LLMs more reliable and production-ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/generate-videos-ltx-23" rel="noopener noreferrer"&gt;How to Generate Videos with LTX-2.3 on DigitalOcean GPU Droplets&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;This tutorial explores how to generate videos using LTX 2.3, covering setup, prompts, and rendering workflows. It demonstrates how generative AI is expanding beyond text and images into video creation. After this article, you’ll know how to experiment with video generation pipelines and integrate them into creative or product workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.digitalocean.com/community/tutorials/single-to-multi-agent-infrastructure" rel="noopener noreferrer"&gt;From Single to Multi-Agent Systems: Key Infrastructure Needs&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;Get an overview of what changes when you move from a single AI agent to a multi-agent system. This tutorial goes through the full infrastructure stack—covering orchestration patterns, communication protocols, memory, and observability—so you can design systems where multiple agents collaborate reliably. Ultimately, multi-agent setups unlock scalability and specialization but require significantly more coordination, state management, and fault tolerance to work in production.&lt;/p&gt;

</description>
      <category>openai</category>
      <category>nvidia</category>
      <category>tutorial</category>
      <category>learning</category>
    </item>
    <item>
      <title>Build an End-to-End RAG Pipeline for LLM Applications</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Wed, 01 Apr 2026 01:06:34 +0000</pubDate>
      <link>https://dev.to/digitalocean/build-an-end-to-end-rag-pipeline-for-llm-applications-1330</link>
      <guid>https://dev.to/digitalocean/build-an-end-to-end-rag-pipeline-for-llm-applications-1330</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally written by Shaoni Mukherjee (Technical Writer)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.digitalocean.com/resources/articles/large-language-models" rel="noopener noreferrer"&gt;Large language models&lt;/a&gt; have transformed the way we build intelligent applications. &lt;a href="https://www.digitalocean.com/products/gradient/platform" rel="noopener noreferrer"&gt;Generative AI Models&lt;/a&gt; can summarize documents, generate code, and answer complex questions. However, they still face a major limitation: they cannot access private or continuously changing knowledge unless that information is incorporated into their training data.&lt;/p&gt;

&lt;p&gt;Retrieval-Augmented Generation (RAG) addresses this limitation by combining information retrieval systems with generative AI models. Instead of relying entirely on the knowledge embedded in model weights, a RAG system retrieves relevant information from external sources and provides it to the language model during inference. The model then generates a response grounded in this retrieved context.&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;end-to-end RAG pipeline&lt;/strong&gt; refers to the full system that manages this process from beginning to end. It includes ingesting documents, transforming them into embeddings, storing them in a vector database, retrieving relevant information for a user query, and generating an answer using a large language model.&lt;/p&gt;

&lt;p&gt;This architecture is increasingly used in modern AI systems such as enterprise knowledge assistants, internal documentation search engines, developer copilots, and AI customer support tools. Organizations adopt RAG because it allows models to remain lightweight while still accessing large knowledge bases that change frequently.&lt;/p&gt;

&lt;p&gt;In this tutorial, we will walk through how to design and build a complete RAG pipeline. Along the way, we will explore architectural considerations, optimization strategies, and production challenges developers encounter when deploying retrieval-based AI systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmeku3hdzligtrv0nf06.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmeku3hdzligtrv0nf06.png" alt="Knowledge and Vector Storage for RAG pipeline" width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RAG combines retrieval and generation for more accurate AI systems&lt;/strong&gt;: Retrieval-Augmented Generation (RAG) bridges the gap between static language models and dynamic, real-world data. Instead of relying only on pre-trained knowledge, it fetches relevant information at runtime and uses it to generate answers. This makes responses more accurate, up-to-date, and context-aware. It is especially useful for applications like chatbots, internal knowledge assistants, and search systems. Overall, RAG helps reduce hallucinations and improves trust in AI-generated outputs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector embeddings are the foundation of semantic search in RAG&lt;/strong&gt;: Embeddings convert text into numerical vectors that capture meaning rather than exact wording. This allows the system to understand similarity between queries and documents even if they use different phrasing. As a result, retrieval becomes more intelligent and context-driven instead of keyword-based. High-quality embedding models like &lt;code&gt;text-embedding-3-large&lt;/code&gt; or &lt;code&gt;bge-large-en&lt;/code&gt; can significantly improve retrieval performance. Choosing the right embedding model directly impacts the overall quality of your RAG system.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Each component of the pipeline plays a critical role&lt;/strong&gt;: A RAG system is made up of multiple steps, including ingestion, chunking, embedding, storage, retrieval, and generation. If any one component is poorly optimized, it can affect the entire pipeline’s performance. For example, bad chunking can lead to irrelevant retrieval, even if your embedding model is strong. Similarly, weak retrieval will result in poor answers, no matter how powerful the language model is. This is why building an end-to-end RAG system requires careful design and tuning at every stage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evaluation is essential for building reliable RAG applications&lt;/strong&gt;: Building a RAG pipeline is not enough; you must also evaluate how well it performs. This includes checking whether the system retrieves the correct documents and whether the generated answers are accurate and grounded. Metrics like precision and recall help measure retrieval quality, while human evaluation helps assess answer correctness. Creating benchmark datasets with known questions and answers makes it easier to track improvements over time. Continuous evaluation ensures your system remains reliable in production.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Understanding the RAG System Architecture
&lt;/h2&gt;

&lt;p&gt;Before implementing the pipeline, it is important to understand how the different components interact. A typical &lt;strong&gt;RAG system architecture&lt;/strong&gt; can be divided into two major workflows: the indexing pipeline and the retrieval pipeline.&lt;/p&gt;

&lt;p&gt;The indexing pipeline prepares the knowledge base so that it can be searched efficiently. During this stage, documents are ingested, cleaned, split into chunks, converted into embeddings, and stored in a &lt;a href="https://www.digitalocean.com/community/tutorials/beyond-vector-databases-rag-without-embeddings" rel="noopener noreferrer"&gt;vector database&lt;/a&gt;. This process is usually executed offline or periodically when new data becomes available.&lt;/p&gt;

&lt;p&gt;The retrieval pipeline operates during inference. When a user asks a question, the system converts that query into an &lt;a href="https://www.digitalocean.com/community/tutorials/beyond-vector-databases-rag-without-embeddings" rel="noopener noreferrer"&gt;embedding&lt;/a&gt;, searches the vector database for semantically similar chunks, and provides those retrieved passages to the language model. The model then generates a response using both the query and the contextual information.&lt;/p&gt;

&lt;p&gt;A simplified representation of the pipeline looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Document Sources
       (PDFs, Docs, APIs, Knowledge Base)
                        |
                        v
               Document Processing
                        |
                        v
                  Text Chunking
                        |
                        v
               Embedding Generation
                        |
                        v
               Vector Database Index
                        |
                        v
User Query → Query Embedding → Similarity Search
                        |
                        v
             Retrieved Context Chunks
                        |
                        v
                  LLM Generation
                        |
                        v
                  Final Response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This architecture enables the system to retrieve information dynamically rather than relying solely on model training.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy49fm6102laxs8huvmqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy49fm6102laxs8huvmqn.png" alt="RAG System Architecture" width="750" height="676"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Ingestion in a RAG Pipeline
&lt;/h2&gt;

&lt;p&gt;The first stage of the pipeline involves gathering the data that the AI system will use as its knowledge source. In many real-world applications, this information is distributed across multiple systems. Organizations may store documentation in internal knowledge bases, PDFs, wikis, product manuals, or database records.&lt;/p&gt;

&lt;p&gt;The ingestion stage extracts textual information from these sources and prepares it for processing. Depending on the data format, ingestion may involve parsing HTML pages, converting PDFs to text, or querying APIs to retrieve structured records.&lt;/p&gt;

&lt;p&gt;At this stage, developers often implement preprocessing steps such as removing redundant formatting, normalizing whitespace, and filtering irrelevant sections. These steps are important because retrieval performance strongly depends on the quality of the text data stored in the system.&lt;/p&gt;

&lt;p&gt;For enterprise knowledge retrieval systems, ingestion pipelines are usually automated and scheduled. For example, an internal documentation chatbot might update its &lt;a href="https://docs.digitalocean.com/products/gradient-ai-platform/how-to/create-manage-agent-knowledge-bases/" rel="noopener noreferrer"&gt;knowledge base&lt;/a&gt; daily by ingesting the latest documentation changes from a repository.&lt;/p&gt;
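&lt;p&gt;As a rough sketch, a preprocessing pass might collapse whitespace and drop boilerplate lines before chunking (the marker strings here are illustrative; real pipelines tailor these filters to their own sources):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

# Hypothetical boilerplate markers; tune these to your own documents.
BOILERPLATE_MARKERS = ("Copyright", "All rights reserved")

def clean_text(raw):
    """Normalize whitespace and filter out empty or boilerplate lines."""
    lines = []
    for line in raw.splitlines():
        line = re.sub(r"\s+", " ", line).strip()
        if not line or any(m in line for m in BOILERPLATE_MARKERS):
            continue
        lines.append(line)
    return "\n".join(lines)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;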

&lt;h2&gt;
  
  
  Text Chunking: Preparing Documents for Retrieval
&lt;/h2&gt;

&lt;p&gt;After ingestion, documents must be divided into smaller pieces before they can be embedded. This step, known as &lt;a href="https://docs.digitalocean.com/products/gradient-ai-platform/concepts/chunking-strategies/" rel="noopener noreferrer"&gt;text chunking&lt;/a&gt;, plays a critical role in the overall performance of the RAG pipeline.&lt;/p&gt;

&lt;p&gt;Large documents cannot be embedded effectively because embedding models have token limits and because large chunks reduce retrieval precision. Instead, documents are broken into manageable segments that capture a coherent piece of information.&lt;/p&gt;

&lt;p&gt;Chunk size is typically chosen between 200 and 500 tokens. Smaller chunks provide more precise retrieval results, while larger chunks preserve more contextual information. Many production pipelines use overlapping chunks to prevent important sentences from being split across boundaries.&lt;/p&gt;

&lt;p&gt;The following diagram illustrates how a long document is transformed into multiple overlapping chunks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Original Document
-------------------------------------------------------
| Paragraph 1 | Paragraph 2 | Paragraph 3 | Paragraph 4 |
-------------------------------------------------------

After Chunking
-------------------------------------------------------
| Chunk 1 | Chunk 2 | Chunk 3 | Chunk 4 | Chunk 5 |
-------------------------------------------------------

Chunk Example
Chunk 1: Paragraph 1 + part of Paragraph 2
Chunk 2: Paragraph 2 + part of Paragraph 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Choosing an effective chunking strategy significantly improves retrieval accuracy because each chunk represents a focused semantic concept.&lt;/p&gt;
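&lt;p&gt;The overlapping-chunk idea above can be sketched in a few lines. This toy version splits on words rather than tokens (real splitters count model tokens), and the sizes are arbitrary:&lt;/p&gt;

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into fixed-size word chunks, each overlapping the previous one."""
    words = text.split()
    step = chunk_size - overlap          # slide the window by size minus overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break                        # last window reached the end of the text
    return chunks

doc = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(doc, chunk_size=50, overlap=10)
```

&lt;p&gt;With 120 words, a chunk size of 50, and an overlap of 10, this produces three chunks, and the last 10 words of each chunk reappear at the start of the next, so no sentence is lost at a boundary.&lt;/p&gt;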

&lt;h2&gt;
  
  
  Embedding Generation
&lt;/h2&gt;

&lt;p&gt;Once documents are divided into chunks, each chunk must be converted into a numerical representation called an embedding. Embeddings transform text into high-dimensional vectors that capture semantic meaning.&lt;/p&gt;

&lt;p&gt;For example, two sentences that express similar ideas will produce vectors that are close to each other in vector space. This property allows vector databases to retrieve semantically related text even when the wording differs.&lt;/p&gt;

&lt;p&gt;Embedding models are trained using large datasets and &lt;a href="https://www.digitalocean.com/community/tutorials/transformers-attention-is-all-you-need" rel="noopener noreferrer"&gt;transformer architectures&lt;/a&gt;. When a chunk is processed, the model generates a vector with hundreds or thousands of dimensions. These vectors serve as the foundation for similarity search.&lt;/p&gt;

&lt;p&gt;Embedding generation occurs during both indexing and retrieval. During indexing, embeddings are generated for each document chunk. During retrieval, the user’s query is also converted into an embedding so that it can be compared against stored vectors.&lt;/p&gt;

&lt;p&gt;This mechanism allows the RAG system to perform &lt;strong&gt;semantic search&lt;/strong&gt;, which is far more powerful than traditional keyword matching.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vector Embedding
&lt;/h2&gt;

&lt;p&gt;Vector embeddings are dense numerical representations of data such as text, images, or audio, capturing semantic meaning in a high-dimensional vector space. In an end-to-end RAG pipeline, embeddings convert both documents and user queries into vectors so that similarity between them can be measured using metrics like cosine similarity. This allows the system to retrieve context based on meaning rather than exact keyword matches, making responses more accurate and relevant.&lt;/p&gt;
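&lt;p&gt;Cosine similarity itself is simple to compute. The toy 3-dimensional vectors below stand in for real embeddings, which have hundreds or thousands of dimensions:&lt;/p&gt;

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query     = [0.9, 0.1, 0.0]   # toy embedding of the user's question
doc_close = [0.8, 0.2, 0.1]   # toy embedding of a semantically similar chunk
doc_far   = [0.0, 0.2, 0.9]   # toy embedding of an unrelated chunk

sim_close = cosine_similarity(query, doc_close)
sim_far   = cosine_similarity(query, doc_far)
```

&lt;p&gt;The chunk whose vector points in nearly the same direction as the query scores much higher, which is exactly the property similarity search exploits.&lt;/p&gt;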

&lt;p&gt;For example, even if a query doesn’t contain the same words as a document, embeddings can still identify it as relevant if the underlying intent is similar. Popular embedding models used in RAG systems include &lt;a href="https://developers.openai.com/api/docs/models/text-embedding-3-large" rel="noopener noreferrer"&gt;text-embedding-3-large&lt;/a&gt;, &lt;a href="https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2" rel="noopener noreferrer"&gt;all-MiniLM-L6-v2&lt;/a&gt;, &lt;a href="https://huggingface.co/BAAI/bge-large-en" rel="noopener noreferrer"&gt;bge-large-en&lt;/a&gt;, and &lt;a href="https://huggingface.co/intfloat/e5-large-v2" rel="noopener noreferrer"&gt;e5-large-v2&lt;/a&gt;, each offering different trade-offs in performance, cost, and deployment flexibility.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixgailx5konq18wkv1ev.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixgailx5konq18wkv1ev.png" alt="Vector Embedding Workflow" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Storing Vectors in a Database
&lt;/h2&gt;

&lt;p&gt;After embeddings are created, they must be stored in a specialized database capable of performing fast similarity searches. These systems are known as &lt;strong&gt;vector databases&lt;/strong&gt; and form the core of the RAG retrieval infrastructure.&lt;/p&gt;

&lt;p&gt;Unlike traditional databases that index numeric or textual fields, vector databases are optimized to search across high-dimensional vectors. They use approximate nearest neighbor algorithms to identify vectors that are closest to a query embedding.&lt;/p&gt;

&lt;p&gt;The structure of a stored vector typically includes the embedding itself, the original text chunk, and metadata describing the source of the information. Metadata can include document identifiers, timestamps, or categories that allow filtering during retrieval.&lt;/p&gt;

&lt;p&gt;A simplified representation of vector storage looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Vector Database

ID     Vector Embedding        Text Chunk
---------------------------------------------------------
1   [0.12, -0.44, 0.92...]   "RAG combines retrieval..."
2   [0.55, 0.33, -0.14...]   "Vector databases enable..."
3   [-0.77, 0.08, 0.62...]   "Embeddings represent..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Popular vector database technologies include managed services and open-source platforms designed specifically for AI workloads. The choice often depends on scale, infrastructure preferences, and latency requirements.&lt;/p&gt;
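&lt;p&gt;To make the storage structure concrete, here is a deliberately naive in-memory store with brute-force cosine search. Production vector databases replace the linear scan with approximate nearest neighbor indexes; the class and method names here are purely illustrative:&lt;/p&gt;

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class ToyVectorStore:
    """Brute-force stand-in for a real vector database (illustration only)."""
    def __init__(self):
        self.records = []  # each record: (embedding, text chunk, metadata)

    def add(self, embedding, text, metadata=None):
        self.records.append((embedding, text, metadata or {}))

    def search(self, query_embedding, top_k=2):
        # Score every stored vector against the query, highest similarity first.
        scored = [(cosine(query_embedding, emb), text, meta)
                  for emb, text, meta in self.records]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]

store = ToyVectorStore()
store.add([0.9, 0.1], "RAG combines retrieval...", {"doc_id": 1})
store.add([0.1, 0.9], "Vector databases enable...", {"doc_id": 2})
results = store.search([0.8, 0.2], top_k=1)
```

&lt;p&gt;Note that each record carries its metadata along with the vector and text, which is what makes filtering by source, timestamp, or category possible at query time.&lt;/p&gt;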

&lt;h2&gt;
  
  
  Retrieval in a RAG Pipeline
&lt;/h2&gt;

&lt;p&gt;When a user submits a question, the system begins the retrieval stage. The query is first converted into an embedding using the same embedding model used during indexing. Maintaining the same embedding model is important because similarity comparisons rely on consistent vector representations.&lt;/p&gt;

&lt;p&gt;The query embedding is then sent to the vector database. The database performs a similarity search to find document chunks whose embeddings are closest to the query vector. These chunks represent the pieces of information most relevant to the user’s question.&lt;/p&gt;

&lt;p&gt;The retrieved chunks are then combined and passed to the language model as contextual input. The model uses this context to generate a response grounded in actual documents rather than relying solely on its training data.&lt;/p&gt;

&lt;p&gt;This process ensures that answers are based on real knowledge sources and can be updated whenever the underlying documents change.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generation with a Large Language Model
&lt;/h2&gt;

&lt;p&gt;The final stage of the pipeline involves generating a response using a language model. At this point, the system already has two pieces of information: the user’s question and the retrieved context.&lt;/p&gt;

&lt;p&gt;These elements are combined into a prompt that instructs the model to answer the question using the provided information. Because the context is derived from authoritative documents, the model’s output becomes significantly more reliable and factual.&lt;/p&gt;

&lt;p&gt;This stage also allows developers to control how responses are generated. Prompts may instruct the model to summarize information, provide citations, or answer in a specific format. Some systems also include guardrails that prevent hallucinations or restrict responses to retrieved information.&lt;/p&gt;

&lt;p&gt;For example, if a user asks a question, the system first pulls the most relevant text from your knowledge base, then the LLM rewrites that content into a helpful answer, making it more conversational, structured, and easy to understand. This step is what makes RAG powerful, because it combines &lt;strong&gt;accurate, up-to-date information&lt;/strong&gt; with &lt;strong&gt;fluent natural language generation&lt;/strong&gt;, reducing hallucinations and improving answer quality.&lt;/p&gt;
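&lt;p&gt;Combining the question and retrieved context into a prompt is plain string assembly. The instruction wording below is one reasonable choice, not a prescribed format:&lt;/p&gt;

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble a grounded prompt: numbered context chunks plus the question."""
    context = "\n\n".join(f"[{i + 1}] {chunk}"
                          for i, chunk in enumerate(retrieved_chunks))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What is RAG?",
    ["RAG combines retrieval with generation.",
     "Embeddings enable semantic search."],
)
```

&lt;p&gt;Numbering the chunks also makes it easy to ask the model for citations, since it can refer back to &lt;code&gt;[1]&lt;/code&gt; or &lt;code&gt;[2]&lt;/code&gt; in its answer.&lt;/p&gt;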

&lt;h2&gt;
  
  
  Code Demo: Building a Simple End-to-End RAG Pipeline
&lt;/h2&gt;

&lt;p&gt;The following example demonstrates how a basic &lt;strong&gt;RAG pipeline for LLM applications&lt;/strong&gt; can be implemented in Python. The example uses document loading, chunking, embeddings, and a vector database to create a minimal working pipeline.&lt;/p&gt;

&lt;h4&gt;
  
  
  Install dependencies
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install langchain chromadb sentence-transformers openai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Load documents
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.document_loaders import TextLoader

loader = TextLoader("knowledge_base.txt")
documents = loader.load()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Split documents into chunks
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
   chunk_size=500,
   chunk_overlap=100
)

chunks = splitter.split_documents(documents)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Generate embeddings
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
   model_name="sentence-transformers/all-MiniLM-L6-v2"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Store vectors
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.vectorstores import Chroma

vector_db = Chroma.from_documents(
   documents=chunks,
   embedding=embeddings
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Retrieval and generation
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI()

qa_chain = RetrievalQA.from_chain_type(
   llm=llm,
   retriever=vector_db.as_retriever()
)

response = qa_chain.run(
   "What is retrieval augmented generation?"
)

print(response)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This simple implementation demonstrates how document retrieval and language models can be combined into a working RAG system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Evaluating RAG System Performance
&lt;/h2&gt;

&lt;p&gt;Evaluating a RAG system is important because you need to be sure that it is not only retrieving the right information but also generating correct and useful answers from it. In simple terms, a good RAG pipeline should &lt;strong&gt;find the right content&lt;/strong&gt; and then &lt;strong&gt;explain it correctly&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;First, let’s look at &lt;strong&gt;retrieval evaluation&lt;/strong&gt;. This checks whether the system is pulling the right documents from your database. Imagine you have a knowledge base about cloud services, and a user asks, &lt;em&gt;“How can I run AI models on GPUs?”&lt;/em&gt;. If your system retrieves documents about &lt;a href="https://www.digitalocean.com/products/gradient/gpu-droplets" rel="noopener noreferrer"&gt;GPU Droplets&lt;/a&gt; or AI infrastructure, that’s a good sign. But if it returns unrelated content like pricing pages or networking docs, retrieval quality is poor. Metrics like &lt;em&gt;recall&lt;/em&gt; (did we find all relevant documents?) and &lt;em&gt;precision&lt;/em&gt; (were the retrieved documents actually relevant?) help measure this. For example, if 5 documents are relevant but your system only retrieves 2, recall is low.&lt;/p&gt;
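&lt;p&gt;Precision and recall for a single query reduce to simple set arithmetic. The document identifiers below are made up for illustration:&lt;/p&gt;

```python
def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    """Precision: fraction of retrieved docs that are relevant.
    Recall: fraction of relevant docs that were retrieved."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# 5 documents are actually relevant, but the system retrieved only 2 of them
# plus 1 off-topic document.
relevant  = {"gpu-droplets", "ai-infra", "cuda-guide", "model-serving", "gpu-pricing"}
retrieved = {"gpu-droplets", "ai-infra", "networking-docs"}
p, r = precision_recall(retrieved, relevant)
```

&lt;p&gt;Here precision is 2/3 (one retrieved document was irrelevant) and recall is 2/5 (three relevant documents were missed), matching the low-recall scenario described above.&lt;/p&gt;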

&lt;p&gt;Next is &lt;strong&gt;generation evaluation&lt;/strong&gt;, which focuses on the answer produced by the language model. Even if retrieval is correct, the model (like GPT-4 or Llama 3) might still generate incomplete or incorrect responses. For instance, if the retrieved document clearly says &lt;em&gt;“GPU droplets support CUDA workloads”&lt;/em&gt;, but the model responds with &lt;em&gt;“GPU support is limited”&lt;/em&gt;, that’s a problem. This is why human evaluation is often needed to check if the answer is &lt;strong&gt;factually correct, complete, and grounded in the provided context&lt;/strong&gt;. Automated metrics struggle to detect hallucinations or subtle inaccuracies.&lt;/p&gt;

&lt;p&gt;To make evaluation consistent, teams usually create an &lt;strong&gt;evaluation dataset&lt;/strong&gt;. This is a collection of sample questions along with their correct answers and sometimes the expected source documents. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Question: &lt;em&gt;“What are GPU droplets used for?”&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Expected answer: &lt;em&gt;“They are used for AI/ML workloads, training models, and high-performance computing.”&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can then run your RAG system on this dataset and compare its answers against the expected ones. Over time, this helps you track improvements, catch errors, and tune your system (for example, by improving chunking, choosing a better embedding model, or adjusting prompts).&lt;/p&gt;
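&lt;p&gt;A minimal evaluation harness can loop over such a dataset and score each answer. The keyword-match check below is deliberately naive (real setups use exact-match, semantic similarity, or LLM-as-judge scoring), and the field names are our own:&lt;/p&gt;

```python
def evaluate(rag_fn, dataset: list[dict]) -> float:
    """Run a RAG callable over an eval set; score 1 if the expected keyword
    appears in the answer, then return the fraction of questions passed."""
    correct = 0
    for item in dataset:
        answer = rag_fn(item["question"])
        if item["expected_keyword"].lower() in answer.lower():
            correct += 1
    return correct / len(dataset)

dataset = [
    {"question": "What are GPU droplets used for?",
     "expected_keyword": "AI/ML workloads"},
]
# A stand-in for a real RAG chain call.
score = evaluate(lambda q: "They are used for AI/ML workloads and training.", dataset)
```

&lt;p&gt;Rerunning this harness after each pipeline change gives you a single number to track, so regressions in chunking, embeddings, or prompts show up immediately.&lt;/p&gt;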

&lt;p&gt;In practice, strong RAG evaluation combines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval checks&lt;/strong&gt;: Did we fetch the right information?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Answer checks&lt;/strong&gt;: Did we explain it correctly?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Continuous testing&lt;/strong&gt;: Are we improving over time?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures your RAG pipeline is reliable, accurate, and ready for real-world use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scaling and Production Considerations
&lt;/h2&gt;

&lt;p&gt;Prototype RAG pipelines often work well with small datasets, but production deployments introduce additional challenges. Large organizations may store millions of document chunks, requiring scalable infrastructure for indexing and retrieval.&lt;/p&gt;

&lt;p&gt;Latency also becomes an important concern. Vector searches, embedding generation, and LLM inference all contribute to response time. Developers must carefully optimize these components to ensure interactive performance.&lt;/p&gt;

&lt;p&gt;Production systems frequently incorporate caching layers, query batching, and efficient indexing strategies. Monitoring tools are also used to track retrieval accuracy, system latency, and cost per query.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost and Latency Optimization
&lt;/h2&gt;

&lt;p&gt;Operating a &lt;a href="https://www.digitalocean.com/community/conceptual-articles/rag-ai-agents-agentic-rag-comparative-analysis" rel="noopener noreferrer"&gt;RAG pipeline&lt;/a&gt; at scale can become expensive if not carefully optimized. Each query may require embedding generation, vector search, and language model inference.&lt;/p&gt;

&lt;p&gt;Several strategies help reduce these costs. Caching responses for frequently asked questions prevents repeated model inference. Limiting the number of retrieved chunks also reduces token usage and speeds up generation.&lt;/p&gt;
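&lt;p&gt;Response caching, for instance, can be as simple as keying answers by a normalized form of the query. The class below is a hypothetical sketch (names and normalization rules are ours); production systems often use Redis or a similar store instead of an in-process dict:&lt;/p&gt;

```python
import hashlib

class ResponseCache:
    """Cache answers for repeated queries to avoid redundant LLM calls."""
    def __init__(self):
        self._store = {}
        self.hits = 0

    def _key(self, query: str) -> str:
        # Normalize trivially different queries to the same cache key.
        return hashlib.sha256(query.strip().lower().encode()).hexdigest()

    def get_or_compute(self, query: str, compute_fn):
        key = self._key(query)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        answer = compute_fn(query)   # only called on a cache miss
        self._store[key] = answer
        return answer

calls = []
def fake_llm(q):                     # stand-in for a real RAG chain call
    calls.append(q)
    return f"answer to: {q}"

cache = ResponseCache()
cache.get_or_compute("What is RAG?", fake_llm)
cache.get_or_compute("what is rag? ", fake_llm)  # normalizes to the same key
```

&lt;p&gt;Here the second query hits the cache, so the expensive model call runs only once for two user requests.&lt;/p&gt;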

&lt;p&gt;Another important technique is &lt;strong&gt;re-ranking&lt;/strong&gt;. Instead of sending many retrieved documents to the language model, a re-ranking model selects the most relevant passages before generation. This improves response quality while reducing computational overhead.&lt;/p&gt;
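&lt;p&gt;To show the shape of re-ranking, here is a toy lexical version that keeps only the passages sharing the most terms with the query. Real re-rankers use a cross-encoder model for scoring; this substitute only illustrates the narrowing step:&lt;/p&gt;

```python
def rerank(query: str, passages: list[str], keep: int = 2) -> list[str]:
    """Keep the `keep` passages with the most query-term overlap (toy scorer)."""
    q_terms = set(query.lower().split())
    scored = sorted(passages,
                    key=lambda p: len(q_terms & set(p.lower().split())),
                    reverse=True)
    return scored[:keep]

passages = [
    "Vector databases enable similarity search.",
    "RAG combines retrieval augmented generation with an LLM.",
    "Pricing pages describe plans.",
]
top = rerank("what is retrieval augmented generation", passages, keep=1)
```

&lt;p&gt;Only the surviving passages are sent to the language model, cutting token usage while keeping the most relevant context.&lt;/p&gt;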

&lt;h2&gt;
  
  
  RAG vs Fine-Tuning
&lt;/h2&gt;

&lt;p&gt;A common question among developers is whether to use retrieval-augmented generation or fine-tuning.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/fine-tuning-llms-on-budget-digitalocean-gpu" rel="noopener noreferrer"&gt;Fine-tuning&lt;/a&gt; changes a model’s internal weights by training it on additional datasets. This approach works well for teaching models specific styles or behaviors. However, it is less effective for continuously changing knowledge because retraining the model is expensive and time-consuming.&lt;/p&gt;

&lt;p&gt;RAG systems take a different approach by keeping the model unchanged while retrieving knowledge dynamically. This makes them ideal for applications where information changes frequently, such as product documentation or customer support knowledge bases.&lt;/p&gt;

&lt;p&gt;For most knowledge-intensive applications, RAG provides a more flexible and maintainable solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building an end-to-end RAG pipeline is about combining the strengths of retrieval systems and large language models to create applications that are both accurate and context-aware. Instead of relying only on pre-trained knowledge, a RAG system can fetch relevant information in real time and use models like GPT-4 or Llama 3 to generate clear, human-like responses grounded in that data. This article walked through each step used to create the RAG pipeline, from data ingestion and chunking to vector embeddings, retrieval, and response generation. Each component plays a critical role, and even small improvements (like better chunking strategies or choosing the right embedding model) can significantly impact overall performance. As organizations continue to build AI-powered applications, RAG stands out as a practical and scalable approach for use cases like chatbots, knowledge assistants, and document search. By continuously evaluating and refining your pipeline, you can create systems that are not only intelligent but also reliable and production-ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.digitalocean.com/resources/articles/rag" rel="noopener noreferrer"&gt;What is Retrieval Augmented Generation (RAG)? The Key to Smarter, More Accurate AI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.digitalocean.com/community/conceptual-articles/rag-ai-agents-agentic-rag-comparative-analysis" rel="noopener noreferrer"&gt;RAG, AI Agents, and Agentic RAG: An In-Depth Review and Comparative Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/beyond-vectors-knowledge-graphs-and-rag" rel="noopener noreferrer"&gt;Beyond Vectors - Knowledge Graphs &amp;amp; RAG Using Gradient&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.langchain.com/" rel="noopener noreferrer"&gt;Langchain docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>rag</category>
      <category>tutorial</category>
      <category>ai</category>
      <category>llm</category>
    </item>
    <item>
      <title>Tutorial: Deploy NVIDIA's NemoClaw in One Click</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Mon, 23 Mar 2026 18:28:14 +0000</pubDate>
      <link>https://dev.to/digitalocean/how-to-set-up-nemoclaw-on-a-digitalocean-droplet-with-1-click-1lo4</link>
      <guid>https://dev.to/digitalocean/how-to-set-up-nemoclaw-on-a-digitalocean-droplet-with-1-click-1lo4</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally written by Amit Jotwani (Staff Developer Advocate at DigitalOcean)&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Key Takeaways
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;NemoClaw is an open-source stack from NVIDIA designed to help developers run OpenClaw securely. &lt;/li&gt;
&lt;li&gt;DigitalOcean offers NemoClaw 1-Click Droplets that enable you to set up this stack on a CPU-optimized virtual machine and run NemoClaw. &lt;/li&gt;
&lt;li&gt;This tutorial illustrates how to SSH into your Droplet, configure inference settings and policies, connect to NemoClaw, and effectively reconnect after the initial setup.
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;At GTC 2026, NVIDIA announced &lt;a href="https://nvidianews.nvidia.com/news/nvidia-announces-nemoclaw" rel="noopener noreferrer"&gt;NemoClaw&lt;/a&gt;, an open-source stack that makes it easy to run &lt;a href="https://openclaw.com/" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; autonomous agents securely. OpenClaw is an open-source agent platform that Jensen Huang called “the operating system for personal AI.” We covered &lt;a href="https://www.digitalocean.com/community/tutorials/how-to-run-openclaw" rel="noopener noreferrer"&gt;how to run OpenClaw on a Droplet&lt;/a&gt; in an earlier tutorial. NemoClaw takes a different approach — it wraps OpenClaw with sandboxing, security policies, and inference routing through NVIDIA’s cloud.&lt;/p&gt;

&lt;p&gt;NemoClaw is still in alpha, so expect rough edges. Interfaces may change, features might be incomplete, and things could break. But if you’re curious to try it out or just want to see what NVIDIA’s vision for agents looks like, this tutorial will get you up and running on a DigitalOcean Droplet in under 10 minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you begin, you’ll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A DigitalOcean account (&lt;a href="https://cloud.digitalocean.com/registrations/new" rel="noopener noreferrer"&gt;sign up here&lt;/a&gt; if you don’t have one)&lt;/li&gt;
&lt;li&gt;An NVIDIA account to generate an API key at &lt;a href="https://build.nvidia.com/settings/api-keys" rel="noopener noreferrer"&gt;build.nvidia.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Step 1 - Create a Droplet from the Marketplace
&lt;/h2&gt;

&lt;p&gt;Head to the NemoClaw 1-Click Droplet on the DigitalOcean Marketplace. Click &lt;strong&gt;Create NemoClaw Droplet&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When configuring the Droplet, select the &lt;strong&gt;CPU-Optimized&lt;/strong&gt; plan with &lt;strong&gt;Premium Intel&lt;/strong&gt;. You’ll want the option with &lt;strong&gt;32 GB of RAM and 16 CPUs&lt;/strong&gt;. NemoClaw runs Docker containers, a Kubernetes cluster (k3s), and the OpenShell gateway, so it needs the headroom.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkf3xcfukamdj8d0kidh1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkf3xcfukamdj8d0kidh1.png" alt="Droplet Configuration Settings" width="800" height="691"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pick a data center region near you, add your SSH key, and hit &lt;strong&gt;Create Droplet&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Heads up: This Droplet costs $336/mo, so make sure to destroy it when you’re done experimenting. It adds up fast if you forget about it.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 2 - SSH into the Droplet
&lt;/h2&gt;

&lt;p&gt;Once your Droplet is ready, SSH in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ssh"&gt;&lt;code&gt;&lt;span class="k"&gt;ssh&lt;/span&gt; root@your_server_ip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’ll see the usual Ubuntu login banner, and then the NemoClaw onboarding wizard will kick off automatically. It runs through a series of preflight checks, making sure Docker is running, installing the OpenShell CLI, and spinning up the gateway. You’ll see checkmarks fly by as each step completes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy9zq2u6f7fiedqcrj91w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy9zq2u6f7fiedqcrj91w.png" alt="Onboarding checks" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3 - Walk Through the Onboarding Wizard
&lt;/h2&gt;

&lt;p&gt;The onboarding wizard will ask you a few things. Here’s what to do at each prompt:&lt;/p&gt;

&lt;h3&gt;
  
  
  Sandbox Name
&lt;/h3&gt;

&lt;p&gt;The first prompt asks for a sandbox name. Just press &lt;strong&gt;Enter&lt;/strong&gt; to accept the default (&lt;code&gt;my-assistant&lt;/code&gt;). The wizard will then create the sandbox, build the container image, and push it to the gateway. This takes a couple of minutes, and you’ll see it run through about 20 steps as it builds and uploads everything.&lt;/p&gt;

&lt;h3&gt;
  
  
  NVIDIA API Key
&lt;/h3&gt;

&lt;p&gt;Once the sandbox is ready, the wizard asks for your NVIDIA API key. In this setup, inference is routed through NVIDIA’s cloud using the &lt;code&gt;nvidia/nemotron-3-super-120b-a12b&lt;/code&gt; model, so it needs a key to authenticate.&lt;/p&gt;

&lt;p&gt;To get your key, head to &lt;a href="https://build.nvidia.com/settings/api-keys" rel="noopener noreferrer"&gt;build.nvidia.com/settings/api-keys&lt;/a&gt;, sign in, and click &lt;strong&gt;Generate API Key&lt;/strong&gt;. Give it a name, pick an expiration, and hit &lt;strong&gt;Generate Key&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffkfetz0bbqstz3ea9a3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffkfetz0bbqstz3ea9a3.png" alt="NVIDIA API Key generation" width="800" height="569"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy the key (it starts with &lt;code&gt;nvapi-&lt;/code&gt;), paste it into the terminal prompt, and press &lt;strong&gt;Enter&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcisdgrdv3g5qk78pn0ti.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcisdgrdv3g5qk78pn0ti.png" alt="NVIDIA API key integration" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The wizard saves the key to &lt;code&gt;~/.nemoclaw/credentials.json&lt;/code&gt; and sets up the inference provider. You’ll see it confirm the model and create an inference route.&lt;/p&gt;

&lt;h3&gt;
  
  
  Policy Presets
&lt;/h3&gt;

&lt;p&gt;After the inference setup, NemoClaw sets up OpenClaw inside the sandbox and then asks about policy presets. You’ll see a list of available presets including Discord, Docker Hub, Hugging Face, Jira, npm, PyPI, Slack, and more. These control what external services the agent is allowed to reach.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyzr3abqzhmec2dawimv2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyzr3abqzhmec2dawimv2.png" alt="Onboarding policy presets" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the bottom, the wizard asks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Apply suggested presets (pypi, npm)? [Y/n/list]:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Type &lt;code&gt;n&lt;/code&gt; and press &lt;strong&gt;Enter&lt;/strong&gt;. These presets grant the sandbox network access to package registries, which you don’t need for a basic setup. You can always add them later if your agent needs to install packages.&lt;/p&gt;

&lt;p&gt;Once onboarding finishes, you’ll see a clean summary with your sandbox details and the commands you’ll need going forward:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxv3xi2k87w2wyolgqfku.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxv3xi2k87w2wyolgqfku.png" alt="Onboarding complete" width="800" height="530"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sandbox    my-assistant (Landlock + seccomp + netns)
Model      nvidia/nemotron-3-super-120b-a12b (NVIDIA Cloud API)
NIM        not running

Run:       nemoclaw my-assistant connect
Status:    nemoclaw my-assistant status
Logs:      nemoclaw my-assistant logs --follow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4 - Connect to NemoClaw
&lt;/h2&gt;

&lt;p&gt;Now for the fun part. Connect to your sandbox.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nemoclaw my-assistant connect
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This drops you into a shell inside the sandboxed environment. From here, launch the OpenClaw TUI (terminal user interface):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw tui
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it. You should see the OpenClaw chat interface come up. The agent will greet you and introduce itself, ready to chat.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsc2n1gyftn9k6eibpy34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsc2n1gyftn9k6eibpy34.png" alt="OpenClaw TUI" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Type a message and hit &lt;strong&gt;Enter&lt;/strong&gt;. You’re now talking to an AI agent running inside a secure, sandboxed environment on your own Droplet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reconnecting After a New SSH Session
&lt;/h2&gt;

&lt;p&gt;If you close your terminal and SSH back into the Droplet later, you’ll find that &lt;code&gt;nemoclaw&lt;/code&gt; and related commands aren’t available. That’s because the onboarding script installed everything through nvm in a separate shell, and that doesn’t carry over to new sessions.&lt;/p&gt;

&lt;p&gt;Run this once to fix it permanently. It adds nvm to your &lt;code&gt;.bashrc&lt;/code&gt; so it loads automatically on every login:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'export NVM_DIR="$HOME/.nvm"'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'[ -s "$NVM_DIR/nvm.sh" ] &amp;amp;&amp;amp; \. "$NVM_DIR/nvm.sh"'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'[ -s "$NVM_DIR/bash_completion" ] &amp;amp;&amp;amp; \. "$NVM_DIR/bash_completion"'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then reconnect to your sandbox and launch the TUI the same way as before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nemoclaw my-assistant connect
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw tui
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7v53w5esybr80ypsbwtt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7v53w5esybr80ypsbwtt.png" alt="Sandbox reload" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everything picks up right where you left off. Your sandbox and agent are still running.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;By default, the sandbox has limited network access, so the agent can’t reach external services out of the box. To unlock more capabilities - like connecting to Slack, GitHub, or pulling packages from PyPI - you’ll want to configure policy presets. Check the NemoClaw documentation for the full list of available integrations and how to set them up.&lt;/p&gt;

&lt;p&gt;NemoClaw is still very early, so expect things to be rough around the edges. But if you want to get a feel for where always-on agents are headed, this is a good way to start poking around.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://marketplace.digitalocean.com/apps/nemoclaw-alpha" rel="noopener noreferrer"&gt;NemoClaw 1-Click Droplet on DigitalOcean Marketplace&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/NVIDIA/NemoClaw/" rel="noopener noreferrer"&gt;NemoClaw GitHub Repo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.nvidia.com/nemoclaw/latest/" rel="noopener noreferrer"&gt;NemoClaw Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://nvidianews.nvidia.com/news/nvidia-announces-nemoclaw" rel="noopener noreferrer"&gt;NVIDIA NemoClaw Announcement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openclaw.com/" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/how-to-run-openclaw" rel="noopener noreferrer"&gt;How to Run OpenClaw on a DigitalOcean Droplet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://build.nvidia.com/settings/api-keys" rel="noopener noreferrer"&gt;NVIDIA API Keys&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>tutorial</category>
      <category>nemoclaw</category>
      <category>ai</category>
      <category>nvidia</category>
    </item>
    <item>
      <title>GPT 5.3 Codex is the Next Level for Agentic Coding</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Thu, 19 Mar 2026 20:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/gpt-53-codex-is-the-next-level-for-agentic-coding-52kl</link>
      <guid>https://dev.to/digitalocean/gpt-53-codex-is-the-next-level-for-agentic-coding-52kl</guid>
      <description>&lt;p&gt;Agentic Coding models are one of the obvious and most impressive applications of LLM technologies, and their development has gone hand in hand with massive impacts to markets and job growth. There are numerous players vying to create the best new LLM for all sorts of applications, and many would argue no company and their products in this space have more of a significant impact than OpenAI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://openai.com/index/introducing-gpt-5-3-codex/" rel="noopener noreferrer"&gt;GPT‑5.3‑Codex&lt;/a&gt; is a truly impressive installment in this quest to create the best model. &lt;a href="https://openai.com" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt; promises that GPT-5.3-Codex is their most &lt;a href="https://openai.com/index/introducing-gpt-5-3-codex/" rel="noopener noreferrer"&gt;capable Codex model&lt;/a&gt; yet, advancing both coding performance and professional reasoning beyond GPT-5.2-Codex. Benchmark results show state-of-the-art performance on coding and agentic benchmarks like SWE-Bench Pro and Terminal-Bench, reflecting stronger multi-language and real-world task ability. Furthermore, the model is ~25% faster than &lt;a href="https://openai.com/index/introducing-gpt-5-2-codex/" rel="noopener noreferrer"&gt;GPT-5.2-Codex&lt;/a&gt; for &lt;a href="https://openai.com/codex/" rel="noopener noreferrer"&gt;Codex&lt;/a&gt; users thanks to infrastructure and inference improvements. Overall, GPT‑5.3‑Codex might be the most powerful agentic coding model ever released (&lt;a href="https://openai.com/index/introducing-gpt-5-3-codex/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;So let’s see what it can do. The model is now available on the &lt;a href="https://www.digitalocean.com/products/gradient/platform" rel="noopener noreferrer"&gt;DigitalOcean GradientTM AI Platform&lt;/a&gt; and across all OpenAI ChatGPT and Codex products, so we can put it to the test. In this tutorial, we will show how to use Codex to write a completely new project from scratch: a &lt;a href="https://huggingface.co/Tongyi-MAI/Z-Image-Turbo" rel="noopener noreferrer"&gt;Z-Image-Turbo&lt;/a&gt; real-time image-to-image application built with GPT‑5.3‑Codex, without any user coding! Follow along to learn what GPT‑5.3‑Codex has to offer, how to use it yourself, and how to vibe code new web applications from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;State-of-the-art agentic performance:&lt;/strong&gt; GPT-5.3-Codex delivers impressive results across software engineering and agentic tasks, outperforming GPT-5.2-Codex in reasoning, multi-language capability, and real-world coding evaluations like SWE-Bench Pro and Terminal-Bench 2.0.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Getting started with GPT-5.3-Codex on the GradientTM AI Platform is easy:&lt;/strong&gt; All you need is access to the DigitalOcean Platform to begin integrating your LLM’s calls seamlessly into your workflows at scale.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;From prototype to production in record time:&lt;/strong&gt; With roughly 25% improved speed and real-time interactive steering, GPT-5.3-Codex feels less like a static generator and more like a responsive engineering partner capable of iterating, debugging, and refining projects alongside you. By handling scaffolding, architecture decisions, edge cases, and deployment-ready details, GPT-5.3-Codex can dramatically compress development timelines, making it possible to ship fully functional applications from scratch more quickly than ever (&lt;a href="https://openai.com/index/introducing-gpt-5-3-codex/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  GPT‑5.3‑Codex Overview
&lt;/h2&gt;

&lt;p&gt;GPT-5.3-Codex is a major agentic coding model upgrade that combines stronger reasoning and professional knowledge with enhanced coding performance, runs about 25% faster than GPT-5.2-Codex, and excels on real-world and multi-language benchmarks like &lt;a href="https://scale.com/leaderboard/swe_bench_pro_public" rel="noopener noreferrer"&gt;SWE-Bench Pro&lt;/a&gt; and &lt;a href="https://www.tbench.ai/" rel="noopener noreferrer"&gt;Terminal-Bench&lt;/a&gt;. It’s designed to go beyond simple code generation to support full software lifecycle tasks (e.g., debugging, deployment, documentation) and lets you interact with and steer it in real time while it’s working, making it feel more like a collaborative partner than a generator. It also has expanded capabilities for long-running work and improved responsiveness, with broader availability across IDEs, CLI, and apps for paid plans. (&lt;a href="https://openai.com/index/introducing-gpt-5-3-codex/" rel="noopener noreferrer"&gt;Source&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6s3njnozmwe93mtdvfg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6s3njnozmwe93mtdvfg.png" alt="image" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see from the table above, GPT‑5.3‑Codex is a major step forward over GPT‑5.2‑Codex across software engineering, agentic, and computer use benchmarks. Paired with the marked improvement in efficiency, these results are a strong indicator of the model’s quality. We think this is a significant upgrade for previous GPT Codex users, as well as for new users looking for a powerful agentic coding tool to aid their process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started with GPT-5.3-Codex
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh22frckrami4z84ep59l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh22frckrami4z84ep59l.png" alt="image" width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are two ways we recommend developers get started with GPT-5.3-Codex. The first is accessing the model with Serverless Inference through the &lt;a href="https://www.digitalocean.com/products/gradient/platform" rel="noopener noreferrer"&gt;GradientTM AI Platform&lt;/a&gt;. With Serverless Inference, you can integrate LLM generations into any Python pipeline: just create a model access key and begin generating. For more information on getting started, check out the official &lt;a href="https://docs.digitalocean.com/products/gradient-ai-platform/how-to/use-serverless-inference/" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/p&gt;
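&lt;p&gt;As an illustrative sketch of what that integration can look like, the snippet below assumes an OpenAI-compatible chat-completions endpoint and uses only the Python standard library. The URL, model slug, and key are placeholders of our own, not confirmed values; take the real ones from the Gradient documentation and your control panel.&lt;/p&gt;

```python
# Illustrative sketch: calling a model through an OpenAI-compatible
# chat-completions endpoint with only the standard library. The URL and
# model slug below are placeholders -- take the real values from the
# Gradient AI Platform documentation and your model access key.
import json
import urllib.request

API_URL = "https://inference.do-ai.run/v1/chat/completions"  # placeholder
MODEL = "openai-gpt-5.3-codex"                               # placeholder slug

def build_request(prompt: str, model: str = MODEL) -> dict:
    """Assemble an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def complete(prompt: str, access_key: str) -> str:
    """POST the payload and return the first choice's text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires a valid key):
#   print(complete("Write a function that reverses a string.", "YOUR_KEY"))
```

Because the endpoint is OpenAI-compatible, the same payload shape works with the official `openai` client if you prefer it over raw HTTP.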

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffurv5tcadtlwz8jloy21.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffurv5tcadtlwz8jloy21.png" alt="image" width="800" height="511"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The other way to get started quickly is the official OpenAI Codex application. Download the application onto your local machine and launch it; you will then be prompted to log in to your account. From there, simply choose which project you wish to work in, and you’re ready to go!&lt;/p&gt;

&lt;h2&gt;
  
  
  Vibe Coding a Z-Image-Turbo Web Application with GPT‑5.3‑Codex
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevd2jw8py8w20fzi25x1.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevd2jw8py8w20fzi25x1.gif" alt="image" width="560" height="315"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So now that we have heard about how GPT‑5.3‑Codex performs, let’s see it in action. For this experiment, we sought to see how the model performed on a relatively novel assignment that has a basis in past applications. In this case, we asked it to create a real-time image-to-image pipeline for Z-Image-Turbo that uses webcam footage as image input.&lt;/p&gt;

&lt;p&gt;To do this, we created a blank new directory/project space to work in. We then asked the model to create a skeleton of the project to begin, and then iteratively added in the missing features on subsequent queries. Overall, we were able to create a full working version of the application with just 5 prompts and 30 minutes of testing. This extreme speed made it possible to ship the project in less than a day, from inspiration to completion. Now let’s take a closer look at the application project itself.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau60yz6xtsq15q936e6e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau60yz6xtsq15q936e6e.png" alt="image" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This project, which can be found &lt;a href="https://github.com/Jameshskelton/z-image-turbo-realtime" rel="noopener noreferrer"&gt;here&lt;/a&gt;, is a real-time, webcam-driven image-to-image generation application built in Python around a &lt;a href="https://www.gradio.app/" rel="noopener noreferrer"&gt;Gradio&lt;/a&gt; interface and a dedicated Z-Image-Turbo inference engine. The UI in app.py presents side-by-side live input and generated output panes, parameter controls, and explicit Start/Stop gating so inference only runs when requested. The backend in inference.py loads Tongyi-MAI/Z-Image-Turbo via ZImageImg2ImgPipeline, introspects the pipeline signature to bind the correct image-conditioning argument, enforces true img2img semantics instead of prompt-only generation, and executes inference in torch.inference_mode() with dynamic argument wiring so behavior adapts to the installed diffusers API. Critically, it computes a per-frame target resolution from the webcam aspect ratio, snapping dimensions to a model-friendly multiple (default 16) and capping both sides below 1024. It then applies post-generation safeguards that made the app stable in practice: a dtype strategy (auto, preferring bf16 then fp32, avoiding fp16 black-frame failure modes), degenerate-output detection with automatic float32 recovery, robust PIL/NumPy/Tensor output decoding and normalization, effective-strength clamping to preserve source structure, frame-hash seed mixing so scene changes influence results, and configurable structure-preserving input blending. All of this is parameterized in config.py and documented in the &lt;a href="https://github.com/Jameshskelton/z-image-turbo-realtime?tab=readme-ov-file#readme" rel="noopener noreferrer"&gt;README.md&lt;/a&gt;, with runtime status reporting latency plus internal diagnostics (pipe, dtype, size, effective strength, blend, seed, warnings) so you can observe exactly how each frame is being processed.&lt;/p&gt;
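&lt;p&gt;The per-frame resolution handling is a good example of the kind of detail the model wired up for us. Here is a minimal sketch of that logic as described, assuming snap-to-multiple-of-16 and a sub-1024 cap; the function name and exact rounding are our own illustration, not the repository’s actual code.&lt;/p&gt;

```python
def target_resolution(src_w: int, src_h: int,
                      multiple: int = 16, cap: int = 1024) -> tuple[int, int]:
    """Illustrative sketch (not the repo's code): scale a webcam frame so
    both sides stay below `cap`, then snap each side down to the nearest
    `multiple` so dimensions are model-friendly."""
    # Shrink only if needed; (cap - 1) keeps both sides strictly below cap.
    scale = min(1.0, (cap - 1) / max(src_w, src_h))

    def snap(v: int) -> int:
        return max(multiple, int(v * scale) // multiple * multiple)

    return snap(src_w), snap(src_h)

# A 1080p webcam frame becomes a 16-aligned size capped below 1024:
print(target_resolution(1920, 1080))  # (1008, 560)
```

Snapping down (rather than rounding) guarantees the cap is never exceeded, at the cost of a slight aspect-ratio drift of up to one `multiple` per side.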

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;GPT-5.3-Codex feels less like an incremental update and more like a meaningful shift in how developers interact with code. The combination of stronger reasoning, benchmark gains seen in testing, and a noticeable speed improvement makes it clear that agentic coding is maturing into something even more production-ready. What once required hours of boilerplate, debugging, and manual wiring can now be orchestrated through iterative prompts and high-level direction. As we demonstrated with the Z-Image-Turbo real-time application, a fully functional project can move from blank directory to working prototype in far less time than traditionally required. While the actual results and performance benefits you experience will vary with project requirements, complexity, and individual workflows, we are confident that GPT-5.3-Codex is a substantial upgrade and a meaningful step forward in agentic coding capability, as evidenced by its stronger reasoning and measurable benchmark gains.&lt;/p&gt;

&lt;p&gt;We recommend trying out GPT-5.3-Codex in all contexts, especially with &lt;a href="https://www.digitalocean.com/products/gradient/platform" rel="noopener noreferrer"&gt;DigitalOcean’s GradientTM AI Platform&lt;/a&gt;!&lt;/p&gt;

</description>
      <category>chatgpt</category>
      <category>coding</category>
      <category>tutorial</category>
      <category>codex</category>
    </item>
    <item>
      <title>Getting Started with Qwen3.5 Vision-Language Models</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Tue, 17 Mar 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/getting-started-with-qwen35-vision-language-models-3ej3</link>
      <guid>https://dev.to/digitalocean/getting-started-with-qwen35-vision-language-models-3ej3</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally written by James Skelton (Senior AI/ML Technical Content Strategist II)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.digitalocean.com/community/tutorials/visualizing-vision-language-models-multimodal-reasoning" rel="noopener noreferrer"&gt;Vision Language models&lt;/a&gt; are one of the most powerful and highest potential applications of deep learning technologies. The reasoning behind such a strong assertion lies in the versatility of VL modeling: from document understanding to object tracking to image captioning, vision language models are likely going to be the building blocks of the incipient, physical AI future. This is because everything that we can interact with that will be powered by AI - from robots to driverless vehicles to medical assistants - will likely have a VL model in its pipeline.&lt;/p&gt;

&lt;p&gt;This is why the power of open-source development is so important to all of these disciplines and applications of AI, and why we are so excited about the release of &lt;a href="https://qwen.ai/blog?id=qwen3.5" rel="noopener noreferrer"&gt;Qwen3.5&lt;/a&gt; from Qwen Team. This &lt;a href="https://huggingface.co/collections/Qwen/qwen35" rel="noopener noreferrer"&gt;suite of completely open source VL models&lt;/a&gt;, ranging in size from 0.8B to 397B parameters (with 17B activated), is the clear next step forward for VL modeling. They excel at benchmarks for everything from agentic coding to computer use to document understanding, and nearly match closed-source rivals in terms of capabilities.&lt;/p&gt;

&lt;p&gt;In this tutorial, we will examine and show how to make the best use of Qwen3.5 using a &lt;a href="https://www.digitalocean.com/products/gradient/gpu-droplets" rel="noopener noreferrer"&gt;GradientTM GPU Droplet&lt;/a&gt;. Follow along for explicit instructions on how to set up and run your GPU Droplet so Qwen3.5 can power applications like Claude Code and Codex using your own resources.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Qwen3.5 VL demonstrates the growing power of open &lt;a href="https://www.digitalocean.com/solutions/multimodal-ai" rel="noopener noreferrer"&gt;multimodal AI&lt;/a&gt;. The fully open-source model suite spans from 0.8B to 397B parameters and achieves strong benchmark performance across tasks like coding, document understanding, and computer interaction, approaching the capabilities of leading proprietary models.&lt;/li&gt;
&lt;li&gt;Its architecture enables efficient large-scale multimodal training. By decoupling vision and language parallelism strategies, using sparse activations, and employing an FP8 training pipeline, Qwen3.5 improves hardware utilization, reduces memory usage, and maintains high throughput even when training on mixed text, image, and video data.&lt;/li&gt;
&lt;li&gt;Developers can deploy Qwen3.5 on their own infrastructure. With tools like Ollama and GPU Droplets, it is possible to run large Qwen3.5 models locally or in the cloud to power applications such as coding assistants, computer-use agents, and custom AI tools without relying on proprietary APIs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Qwen3.5: Overview
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3v5lob56ux6d9h1yzny.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3v5lob56ux6d9h1yzny.jpg" alt="image" width="800" height="516"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Qwen3.5 is a fascinating model suite with a unique architecture. It “enables efficient native multimodal training via a heterogeneous infrastructure that decouples parallelism strategies across vision and language components” (&lt;a href="https://qwen.ai/blog?id=qwen3.5" rel="noopener noreferrer"&gt;Source&lt;/a&gt;). This design helps it avoid the inefficiencies of uniform approaches, such as over-allocating compute to lighter modalities, synchronization bottlenecks between vision and language towers, memory imbalance across devices, and reduced scaling efficiency when both modalities are forced into the same parallelism strategy.&lt;/p&gt;

&lt;p&gt;By leveraging sparse activations to enable overlapping computation across model components, the system reaches nearly the same training throughput as pure text-only baselines even when trained on mixed text, image, and video datasets. Alongside this, a native FP8 training pipeline applies low-precision computation to activations, Mixture-of-Experts (MoE) routing, and GEMM operations. Runtime monitoring dynamically preserves BF16 precision in numerically sensitive layers, reducing activation memory usage by roughly 50% and delivering more than a 10% training speed improvement while maintaining stable scaling to tens of trillions of tokens.&lt;/p&gt;
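&lt;p&gt;The memory figure is easy to sanity-check: FP8 stores each activation in one byte versus two for BF16, which is exactly where a roughly 50% activation-memory reduction comes from (layers kept in BF16 for stability pull the real savings slightly below that). A quick back-of-the-envelope illustration, with a made-up activation count:&lt;/p&gt;

```python
# Back-of-the-envelope check of the ~50% activation-memory claim:
# FP8 stores one byte per value, BF16 stores two.
BYTES_BF16, BYTES_FP8 = 2, 1

n_activations = 1_000_000_000  # one billion values, purely illustrative

bf16_bytes = n_activations * BYTES_BF16  # 2.0 GB held in BF16
fp8_bytes = n_activations * BYTES_FP8    # 1.0 GB held in FP8
savings = 1 - fp8_bytes / bf16_bytes

print(f"FP8 cuts activation memory by {savings:.0%}")  # prints "... by 50%"
```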

&lt;p&gt;To further leverage reinforcement learning at scale, the team developed an asynchronous RL framework capable of training Qwen3.5 models across all sizes, supporting text-only, multimodal, and multi-turn interaction settings. The system uses a fully disaggregated &lt;a href="https://www.digitalocean.com/community/tutorials/llm-inference-optimization" rel="noopener noreferrer"&gt;training–inference architecture&lt;/a&gt;, allowing training and rollout generation to run independently while improving hardware utilization, enabling dynamic load balancing, and supporting fine-grained fault recovery. Through techniques such as end-to-end FP8 training, rollout router replay, speculative decoding, and multi-turn rollout locking, the framework increases throughput while maintaining strong consistency between training and inference behavior.&lt;/p&gt;

&lt;p&gt;This system–algorithm co-design also constrains gradient staleness and reduces data skew during asynchronous updates, preserving both training stability and model performance. In addition, the framework is built to support agentic workflows natively, enabling uninterrupted multi-turn interactions within complex environments. Its decoupled architecture can scale to millions of concurrent agent scaffolds and environments, which helps improve generalization during training. Together, these optimizations produce a 3×–5× improvement in end-to-end training speed while maintaining strong stability, efficiency, and scalability (&lt;a href="https://qwen.ai/blog?id=qwen3.5" rel="noopener noreferrer"&gt;Source&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  Qwen3.5 Demo
&lt;/h2&gt;

&lt;p&gt;Getting started with Qwen3.5 is very simple. Thanks to the foresight of Qwen Team &amp;amp; their collaborators, there are numerous ways to access and run the models in the Qwen3.5 suite from your own machine. Of course, running the larger models will require significantly more computational resources. We recommend at least an 8x &lt;a href="https://www.digitalocean.com/community/tutorials/nvidia-h200-gpu-droplet" rel="noopener noreferrer"&gt;NVIDIA H200&lt;/a&gt; setup for the larger models in particular, though a single H200 is sufficient for this tutorial. We are going to use Ollama to power &lt;a href="https://huggingface.co/Qwen/Qwen3.5-122B-A10B" rel="noopener noreferrer"&gt;Qwen3.5-122B-A10B&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To get started, simply start up a GPU Droplet with an NVIDIA H200 with your &lt;a href="https://www.digitalocean.com/community/tutorials/how-to-configure-ssh-key-based-authentication-on-a-linux-server" rel="noopener noreferrer"&gt;SSH key&lt;/a&gt; attached, and SSH in using the terminal on your local machine. From there, navigate to the base directory of your choice. Create a new directory with &lt;code&gt;mkdir&lt;/code&gt; to represent your new workspace, and change into the directory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a custom game with Qwen3.5 running on Ollama and Claude Code
&lt;/h3&gt;

&lt;p&gt;For this demo, we are going to do something simple: create a Python-based video game for one of the most popular Winter Olympics sports, curling. To get started, paste the following code into the remote terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh
ollama launch claude &lt;span class="nt"&gt;--model&lt;/span&gt; qwen3.5:122b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fop1la5cjyv0riseeoleb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fop1la5cjyv0riseeoleb.png" alt="image" width="800" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This will launch Claude Code. If everything worked, it should look like above. From here, we can begin giving instructions to our model to begin generating code!&lt;/p&gt;

&lt;p&gt;For this demo, provide it with a base set of instructions. Try customizing the following input:&lt;/p&gt;

&lt;p&gt;“I want to create a simple game of curling in python code. i want it to be playable on my computer. Please create a sample Python program.&lt;/p&gt;

&lt;p&gt;Packages: pygame”&lt;/p&gt;

&lt;p&gt;If your model ran predictably, this will give you a Python file named something like “curling_game.py” with a full game’s code inside. Simply download this file onto your local computer, open a terminal, and run it with &lt;code&gt;python3.11 curling_game.py&lt;/code&gt;. Our game looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5yrbeeqys9timusj8qd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5yrbeeqys9timusj8qd.png" alt="image" width="800" height="598"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But looks are deceiving: the game is far from playable in this one-shot state. It requires serious work to amend the code to make the game playable, especially for two players. We can either use Claude Code with Qwen3.5 to make those adjustments, switch to an Anthropic model like &lt;a href="https://www.digitalocean.com/community/tutorials/claude-sonnet" rel="noopener noreferrer"&gt;Sonnet 4.6&lt;/a&gt; or &lt;a href="https://www.digitalocean.com/community/tutorials/claude-opus" rel="noopener noreferrer"&gt;Opus 4.6&lt;/a&gt;, or make the changes manually. From this base state, it took Qwen3.5 over an hour and at least 10 requests to make the game playable. Iteration speed was notably constrained by the single-H200 deployment we used for this demo, but the code output still leaves significant room for improvement. We expect that Opus 4.6 could accomplish the same task much more quickly, given its optimization for &lt;a href="https://www.digitalocean.com/community/tutorials/claude-code-gpu-droplets-vscode" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;, relatively superior benchmark scores, and more optimized inference infrastructure.&lt;/p&gt;

&lt;p&gt;If you want to try it out, the file is available as a GitHub &lt;a href="https://gist.github.com/Jameshskelton/02be269e8d50f724cc910b35f6296e9c" rel="noopener noreferrer"&gt;Gist&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Qwen3.5 VL represents an important step forward for open-source multimodal AI, demonstrating that publicly available models can increasingly rival proprietary systems in capability while offering far greater flexibility for developers. With its scalable architecture, efficient training infrastructure, and strong performance across tasks like coding, document understanding, and computer use, the Qwen3.5 suite highlights the growing maturity of the open AI ecosystem. As tools like GPU Droplets and frameworks such as Ollama make deploying large models easier than ever, vision-language systems like Qwen3.5 are poised to become foundational components in the next generation of AI-powered applications and physical AI systems.&lt;/p&gt;

</description>
      <category>qwen</category>
      <category>learning</category>
      <category>aimodels</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>7 OpenClaw Security Challenges to Watch for in 2026</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Thu, 12 Mar 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/7-openclaw-security-challenges-to-watch-for-in-2026-46b1</link>
      <guid>https://dev.to/digitalocean/7-openclaw-security-challenges-to-watch-for-in-2026-46b1</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally written by Fadeke Adegbuyi (Manager, Content Marketing)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;OpenClaw isn’t just another chatbot wrapper. It executes shell commands, controls your browser, manages your calendar, reads and writes files, and remembers everything across sessions. The &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;project&lt;/a&gt; runs locally on your machine and connects to WhatsApp, Telegram, iMessage, Discord, Slack, and over a dozen other platforms via &lt;a href="https://openclaw.ai/integrations" rel="noopener noreferrer"&gt;pre-built integrations&lt;/a&gt;. It functions as a truly connected personal assistant. As a result, the use cases people have dreamed up for OpenClaw are wild.&lt;/p&gt;

&lt;p&gt;One user showed an OpenClaw agent &lt;a href="https://x.com/xmayeth/status/2020883912734425389" rel="noopener noreferrer"&gt;making money on Polymarket&lt;/a&gt; by monitoring news feeds and executing trades automatically. Another gave their bot access to &lt;a href="https://x.com/MatznerJon/status/2019044317621567811" rel="noopener noreferrer"&gt;home surveillance cameras&lt;/a&gt;. Someone else unleashed subagents to apply for &lt;a href="https://x.com/nickvasiles/status/2021391007800328683" rel="noopener noreferrer"&gt;UpWork freelancing jobs&lt;/a&gt; on their behalf.&lt;/p&gt;

&lt;p&gt;

&lt;iframe class="tweet-embed" id="tweet-2019044317621567811-81" src="https://platform.twitter.com/embed/Tweet.html?id=2019044317621567811"&gt;
&lt;/iframe&gt;

&lt;/p&gt;

&lt;p&gt;But this kind of access to your digital life comes with real consequences when things go wrong. And things have gone wrong. Security researchers found that the agent shipped with &lt;a href="https://www.404media.co/silicon-valleys-favorite-new-ai-agent-has-serious-security-flaws/" rel="noopener noreferrer"&gt;serious flaws&lt;/a&gt; that made it possible for attackers to hijack machines with a single malicious link. Meanwhile, &lt;a href="https://www.digitalocean.com/resources/articles/what-is-moltbook" rel="noopener noreferrer"&gt;Moltbook&lt;/a&gt;, a Reddit-style platform with over 2.8 million AI agents, had its database completely &lt;a href="https://www.404media.co/exposed-moltbook-database-let-anyone-take-control-of-any-ai-agent-on-the-site/" rel="noopener noreferrer"&gt;exposed&lt;/a&gt;, so anyone could take control of any AI agent on the platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;None of this means you should avoid OpenClaw entirely&lt;/strong&gt;. It means you should understand OpenClaw security challenges and take precautions before spinning up an agent with root access to your laptop. Running OpenClaw in an isolated cloud environment can help neutralize some of these risks—DigitalOcean's &lt;a href="https://www.digitalocean.com/blog/moltbot-on-digitalocean" rel="noopener noreferrer"&gt;1-Click Deploy for OpenClaw&lt;/a&gt;, for example, handles authentication, firewall rules, and container isolation out of the box so your personal machine stays out of the equation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are OpenClaw security challenges?
&lt;/h2&gt;

&lt;p&gt;OpenClaw security challenges boil down to a design tension: the tool needs broad system permissions to be useful, but those permissions create a massive attack surface when something goes wrong. The agent runs with whatever privileges your user account has—full disk, terminal, and network access—by design.&lt;/p&gt;

&lt;p&gt;It's also &lt;a href="https://www.digitalocean.com/resources/articles/agentic-ai" rel="noopener noreferrer"&gt;agentic&lt;/a&gt; and self-improving, meaning it can modify its own behavior, update its memory, and install new skills autonomously. This is impressive from a capability standpoint, but it's also another vector that can cause things to spiral when guardrails are missing. Pair that with defaults that skip authentication, an unvetted skill marketplace, and persistent memory storing weeks of context, and trouble follows. The takeaway: approach with caution, isolate from production systems, and carefully scrutinize the defaults.&lt;/p&gt;

&lt;p&gt;To his credit, OpenClaw creator &lt;a href="https://x.com/steipete" rel="noopener noreferrer"&gt;Peter Steinberger&lt;/a&gt; has been openly vocal about these risks and actively encourages running OpenClaw in a &lt;a href="https://docs.openclaw.ai/gateway/sandboxing" rel="noopener noreferrer"&gt;sandboxed environment&lt;/a&gt;, which isolates tool execution inside Docker containers to limit filesystem and process access when the model misbehaves. DigitalOcean's one-click deployment does exactly this out of the box, giving you that isolation without the manual setup.&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/n2MrUtIT1m4"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;h2&gt;
  
  
  7 OpenClaw security challenges to watch out for
&lt;/h2&gt;

&lt;p&gt;We've already seen a security audit &lt;a href="https://www.kaspersky.com/blog/openclaw-vulnerabilities-exposed/55263/" rel="noopener noreferrer"&gt;uncover 512 vulnerabilities&lt;/a&gt; (eight critical) and &lt;a href="https://thehackernews.com/2026/02/researchers-find-341-malicious-clawhub.html" rel="noopener noreferrer"&gt;malicious ClawHub skills&lt;/a&gt; stealing cryptocurrency wallets. None of these challenges are theoretical. They're all based on incidents that have already played out within weeks of OpenClaw’s launch.&lt;/p&gt;

&lt;p&gt;These are the challenges you need to have on your radar if you're experimenting with OpenClaw:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. One-click remote code execution through WebSocket hijacking
&lt;/h3&gt;

&lt;p&gt;One of the most alarming OpenClaw vulnerabilities discovered so far is &lt;a href="https://thehackernews.com/2026/02/openclaw-bug-enables-one-click-remote.html" rel="noopener noreferrer"&gt;CVE-2026-25253&lt;/a&gt;, a one-click remote code execution flaw that Mav Levin, a founding researcher at DepthFirst, disclosed in late January 2026. The attack worked because OpenClaw's local server didn’t validate the WebSocket origin header—so any website you visited could silently connect to your running agent. An attacker just needed you to click one link. From there, they chained a cross-site WebSocket hijack into full code execution on your machine. The compromise happened in milliseconds. This is the core danger of running an agent locally on the same machine you're browsing the web with—one careless click and an attacker is already inside.&lt;/p&gt;

&lt;p&gt;Levin's proof-of-concept showed that visiting a single malicious webpage was enough to steal authentication tokens and gain operator-level access to the gateway API—giving an attacker access to change your config, read your files, and run commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security checks&lt;/strong&gt;: In this instance, the fix landed in &lt;a href="https://github.com/openclaw/openclaw/releases" rel="noopener noreferrer"&gt;version 2026.1.29&lt;/a&gt;, so update immediately if you’re a version behind. Beyond that, best practices include avoiding running OpenClaw while browsing untrusted sites and considering putting the agent behind a reverse proxy with proper origin validation for an additional layer of protection.&lt;/p&gt;
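
&lt;p&gt;For intuition, origin validation is a small check: refuse any WebSocket upgrade whose Origin header isn't on an explicit allowlist. The sketch below is illustrative only; the port and trusted origins are hypothetical, not OpenClaw's actual values:&lt;/p&gt;

```python
from urllib.parse import urlparse

# Hypothetical allowlist: only the local UI should ever open a socket.
TRUSTED_ORIGINS = {"http://localhost:18789", "http://127.0.0.1:18789"}

def allow_websocket(origin_header):
    """Return True only if the Origin header matches a trusted origin.

    Browsers send Origin on cross-site WebSocket handshakes, so a missing
    or unknown value means the upgrade should be refused.
    """
    if not origin_header:
        return False
    parsed = urlparse(origin_header)
    return f"{parsed.scheme}://{parsed.netloc}" in TRUSTED_ORIGINS
```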

&lt;h3&gt;
  
  
  2. Tens of thousands of unprotected OpenClaw instances sitting open on the internet
&lt;/h3&gt;

&lt;p&gt;Here's the thing about OpenClaw's early defaults: the agent trusted any connection from localhost without asking for a password. That sounded fine until the gateway sat behind a misconfigured reverse proxy—at which point every external request got forwarded to 127.0.0.1, and your agent thought the whole internet was a trusted local user. SecurityScorecard's STRIKE team found over &lt;a href="https://www.bitsight.com/blog/openclaw-ai-security-risks-exposed-instances" rel="noopener noreferrer"&gt;30,000 internet-exposed OpenClaw instances&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Security researcher &lt;a href="https://x.com/theonejvo/status/2015401219746128322" rel="noopener noreferrer"&gt;Jamieson O'Reilly showed&lt;/a&gt; just how bad this gets. He accessed Anthropic API keys, Telegram bot tokens, Slack accounts, and complete chat histories from exposed instances, even sending messages on behalf of users and running commands with full admin privileges. No authentication required.&lt;/p&gt;

&lt;p&gt;This has since been addressed—&lt;a href="https://docs.openclaw.ai/gateway#runtime-model" rel="noopener noreferrer"&gt;gateway auth&lt;/a&gt; is now required by default, and the onboarding wizard auto-generates a token even for localhost.&lt;/p&gt;

&lt;p&gt;

&lt;iframe class="tweet-embed" id="tweet-2015401219746128322-801" src="https://platform.twitter.com/embed/Tweet.html?id=2015401219746128322"&gt;
&lt;/iframe&gt;

&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security checks&lt;/strong&gt;: At a minimum, check whether your instance is reachable from the public internet. Use a &lt;a href="https://www.digitalocean.com/resources/articles/cloud-firewall" rel="noopener noreferrer"&gt;firewall&lt;/a&gt; to restrict access, enable gateway token authentication, and never expose the control plane without a &lt;a href="https://www.digitalocean.com/solutions/vpn" rel="noopener noreferrer"&gt;VPN&lt;/a&gt; or &lt;a href="https://www.digitalocean.com/community/tutorials/ssh-essentials-working-with-ssh-servers-clients-and-keys" rel="noopener noreferrer"&gt;SSH tunnel&lt;/a&gt; in front of it. This is a case where a managed cloud deployment can solve the problem outright—because your personal API keys, chat histories, and credentials aren’t sitting on an exposed local machine in the first place.&lt;/p&gt;
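
&lt;p&gt;As a quick first-pass check, you can at least confirm the gateway isn't bound to a public interface. The helper below is an illustrative sketch, not part of OpenClaw; and note that a loopback bind alone isn't proof of safety, since a misconfigured reverse proxy can still forward external traffic to 127.0.0.1:&lt;/p&gt;

```python
import ipaddress

def is_publicly_bound(bind_addr: str) -> bool:
    """True if binding to this address would accept non-local connections."""
    if bind_addr in ("0.0.0.0", "::"):
        return True  # wildcard binds listen on every interface
    # Anything that isn't a loopback address is reachable from the network.
    return not ipaddress.ip_address(bind_addr).is_loopback
```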

&lt;h3&gt;
  
  
  3. Malicious skills on ClawHub are poisoning the supply chain
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/openclaw/clawhub" rel="noopener noreferrer"&gt;ClawHub&lt;/a&gt;, OpenClaw's public skill marketplace, lets anyone publish an extension—the only requirement is a GitHub account older than one week. That low bar has unfortunately turned the marketplace into a target. Koi Security &lt;a href="https://www.koi.ai/blog/clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting" rel="noopener noreferrer"&gt;audited all 2,857 skills on ClawHub&lt;/a&gt; and found 341 that were outright malicious. Bitdefender's independent scan put the number closer to &lt;a href="https://www.bitdefender.com/en-us/blog/businessinsights/technical-advisory-openclaw-exploitation-enterprise-networks" rel="noopener noreferrer"&gt;900 malicious skills&lt;/a&gt;, roughly 20% of all packages. A single account—"hightower6eu"—uploaded 354 malicious packages by itself.&lt;/p&gt;

&lt;p&gt;The attack is clever. You install what looks like a useful skill and the documentation looks professional. But buried in a "Prerequisites" section, it asks you to install something first—and that something is Atomic Stealer (&lt;a href="https://www.darktrace.com/blog/atomic-stealer-darktraces-investigation-of-a-growing-macos-threat" rel="noopener noreferrer"&gt;AMOS&lt;/a&gt;), a macOS credential-stealing malware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security checks&lt;/strong&gt;: OpenClaw has since &lt;a href="https://openclaw.ai/blog/virustotal-partnership" rel="noopener noreferrer"&gt;partnered with VirusTotal&lt;/a&gt; to scan new skill uploads, but Steinberger himself admitted this isn't a silver bullet. At a minimum, before installing any skill, read its source code. Check the publisher's account age and history. Put simply, treat every skill as untrusted code running with your agent's full permissions. Unlike some exposure risks, malicious skills are a threat regardless of where OpenClaw runs—a poisoned skill executes the same way on a cloud server as it does on your laptop.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Credential storage in plaintext and API key leakage
&lt;/h3&gt;

&lt;p&gt;One of the less glamorous but more dangerous issues is how OpenClaw handles secrets. The platform &lt;a href="https://permiso.io/blog/inside-the-openclaw-ecosystem-ai-agents-with-privileged-credentials" rel="noopener noreferrer"&gt;stores credentials in plaintext&lt;/a&gt;—including API keys for your LLM provider and tokens for every messaging platform your agent connects to—and those become targets the moment your instance is accessible to anyone other than you. Prompt injection attacks can also trick the agent into exfiltrating credentials by embedding hidden instructions in content the agent processes.&lt;/p&gt;

&lt;p&gt;Cisco's team tested a skill called &lt;a href="https://blogs.cisco.com/ai/personal-ai-agents-like-openclaw-are-a-security-nightmare" rel="noopener noreferrer"&gt;"What Would Elon Do?"&lt;/a&gt; and surfaced nine security findings, two of them critical. The skill instructed the bot to execute a curl command sending data to an external server controlled by the skill's author. Functionally, it was malware hiding behind a joke name.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security check&lt;/strong&gt;: At a minimum, rotate your API keys regularly and store secrets using environment variables or a dedicated secrets manager rather than config files. It's also worth setting spending limits on your LLM provider accounts. That way, even if a key is compromised, it can't rack up thousands in charges.&lt;/p&gt;
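
&lt;p&gt;A minimal pattern for keeping keys out of config files is to read them from the environment and fail loudly when they're missing. The helper and variable name below are hypothetical, not OpenClaw's actual configuration API:&lt;/p&gt;

```python
import os

def require_secret(name: str) -> str:
    """Read a secret from the environment, failing loudly if unset.

    Keeping keys out of config files means a leaked config or backup
    doesn't hand an attacker your credentials.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value

# Hypothetical variable name; adapt to your provider's convention.
# anthropic_key = require_secret("ANTHROPIC_API_KEY")
```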

&lt;h3&gt;
  
  
  5. Prompt injection attacks amplified by persistent memory
&lt;/h3&gt;

&lt;p&gt;What makes prompt injection in OpenClaw worse than in a typical &lt;a href="https://www.digitalocean.com/resources/articles/ai-agent-vs-ai-chatbot" rel="noopener noreferrer"&gt;chatbot&lt;/a&gt; is the persistent memory. The agent retains long-term context, preferences, and conversation history across sessions—which is one of its best features. But it also means a malicious instruction embedded in a website, email, or document doesn't have to execute immediately. Palo Alto Networks warned that these become "&lt;a href="https://www.paloaltonetworks.com/blog/network-security/why-moltbot-may-signal-ai-crisis/" rel="noopener noreferrer"&gt;stateful, delayed-execution attacks&lt;/a&gt;". A hidden prompt in a PDF you opened last Tuesday could sit dormant in the agent's memory until a future task triggers it days later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security check&lt;/strong&gt;: There's no perfect fix for prompt injection right now; it's an unresolved problem in agentic AI. But you can reduce the blast radius by limiting what tools and permissions your agent has access to, segmenting its access to sensitive systems, and reviewing its memory and context periodically for anything unexpected.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Shadow AI spreading through enterprise networks
&lt;/h3&gt;

&lt;p&gt;This one's for anyone working at a company where developers tinker on their work machines. Token Security found that &lt;a href="https://www.token.security/blog/the-clawdbot-enterprise-ai-risk-one-in-five-have-it-installed" rel="noopener noreferrer"&gt;22% of their enterprise customers&lt;/a&gt; have employees running OpenClaw as shadow AI without IT approval. Bitdefender confirmed the same, showing &lt;a href="https://businessinsights.bitdefender.com/technical-advisory-openclaw-exploitation-enterprise-networks" rel="noopener noreferrer"&gt;employees deploying agents&lt;/a&gt; on corporate machines connected to internal networks. An OpenClaw agent on a developer's laptop with VPN access to production means every vulnerability above is now a business problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security check&lt;/strong&gt;: If you're on a security team, you should scan your network for OpenClaw instances now. Set up detection for its WebSocket traffic patterns, and mandate that any approved use runs in an isolated environment—a VM or cloud server—rather than on laptops with internal access. Giving teams an approved, isolated deployment path is the fastest way to get ahead of shadow AI—it's much easier to enforce guardrails when the alternative isn't 'don't use it at all.'&lt;/p&gt;

&lt;h3&gt;
  
  
  7. The Moltbook database breach exposing millions of agent credentials
&lt;/h3&gt;

&lt;p&gt;The security mess isn't limited to OpenClaw itself. Moltbook, the social network for AI agents built by &lt;a href="https://x.com/MattPRD" rel="noopener noreferrer"&gt;Matt Schlicht&lt;/a&gt;, &lt;a href="https://www.404media.co/exposed-moltbook-database-let-anyone-take-control-of-any-ai-agent-on-the-site/" rel="noopener noreferrer"&gt;suffered a database exposure&lt;/a&gt; that cybersecurity firm Wiz discovered in early February. The database had zero access controls. Anyone who found it could view 1.5 million API tokens, 35,000 email addresses, and private messages between agents—enough to take control of any agent on the platform. China's Ministry of Industry and Information Technology &lt;a href="https://www.reuters.com/world/china/china-warns-security-risks-linked-openclaw-open-source-ai-agent-2026-02-05/" rel="noopener noreferrer"&gt;issued a formal warning&lt;/a&gt; about OpenClaw security risks, citing incidents like this breach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security check&lt;/strong&gt;: If you've used Moltbook, rotate every API key and token associated with your agent. Treat third-party platforms in the OpenClaw ecosystem with the same skepticism you'd apply to any new service asking for your credentials and consider additional security checks.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Any references to third-party companies, trademarks, or logos in this document are for informational purposes only and do not imply any affiliation with, sponsorship by, or endorsement of those third parties.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Pricing and product information accurate as of February 2026.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>openclaw</category>
      <category>security</category>
      <category>learning</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Tue, 10 Mar 2026 18:07:19 +0000</pubDate>
      <link>https://dev.to/digitalocean_staff/-2im1</link>
      <guid>https://dev.to/digitalocean_staff/-2im1</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/digitalocean/gpu-programming-for-beginners-rocm-amd-setup-to-edge-detection-29bm" class="crayons-story__hidden-navigation-link"&gt;GPU Programming for Beginners: ROCm + AMD Setup to Edge Detection&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;
          &lt;a class="crayons-logo crayons-logo--l" href="/digitalocean"&gt;
            &lt;img alt="DigitalOcean logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F175%2F369f1227-0eac-4a88-8d3c-08851bf0b117.png" class="crayons-logo__image"&gt;
          &lt;/a&gt;

          &lt;a href="/digitalocean_staff" class="crayons-avatar  crayons-avatar--s absolute -right-2 -bottom-2 border-solid border-2 border-base-inverted  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F64516%2Fa0c9989b-6d18-46c7-bc66-4c2c1580534e.jpg" alt="digitalocean_staff profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/digitalocean_staff" class="crayons-story__secondary fw-medium m:hidden"&gt;
              DigitalOcean
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                DigitalOcean
                
              
              &lt;div id="story-author-preview-content-3318030" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/digitalocean_staff" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F64516%2Fa0c9989b-6d18-46c7-bc66-4c2c1580534e.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;DigitalOcean&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

            &lt;span&gt;
              &lt;span class="crayons-story__tertiary fw-normal"&gt; for &lt;/span&gt;&lt;a href="/digitalocean" class="crayons-story__secondary fw-medium"&gt;DigitalOcean&lt;/a&gt;
            &lt;/span&gt;
          &lt;/div&gt;
          &lt;a href="https://dev.to/digitalocean/gpu-programming-for-beginners-rocm-amd-setup-to-edge-detection-29bm" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Mar 10&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/digitalocean/gpu-programming-for-beginners-rocm-amd-setup-to-edge-detection-29bm" id="article-link-3318030"&gt;
          GPU Programming for Beginners: ROCm + AMD Setup to Edge Detection
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/gpu"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;gpu&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/amd"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;amd&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/programming"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;programming&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/digitalocean/gpu-programming-for-beginners-rocm-amd-setup-to-edge-detection-29bm" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;1&lt;span class="hidden s:inline"&gt; reaction&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/digitalocean/gpu-programming-for-beginners-rocm-amd-setup-to-edge-detection-29bm#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            2 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>gpu</category>
      <category>amd</category>
      <category>programming</category>
      <category>ai</category>
    </item>
    <item>
      <title>GPU Programming for Beginners: ROCm + AMD Setup to Edge Detection</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Tue, 10 Mar 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/digitalocean/gpu-programming-for-beginners-rocm-amd-setup-to-edge-detection-29bm</link>
      <guid>https://dev.to/digitalocean/gpu-programming-for-beginners-rocm-amd-setup-to-edge-detection-29bm</guid>
      <description>&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/TdHexc0Garg"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;In this hands-on tutorial, we demystify GPU computation and show you how to write your own GPU programs from scratch. Understanding GPU programming is essential for anyone looking to grasp why AI models depend on this specialized hardware.&lt;/p&gt;

&lt;p&gt;We'll use ROCm and HIP (AMD's version of CUDA) to take you from zero to running real GPU code, culminating in a computer vision edge detector that processes images in parallel.&lt;/p&gt;

&lt;p&gt;You can find the code in the &lt;strong&gt;project repository&lt;/strong&gt;: &lt;a href="https://github.com/oconnoob/intro_to_rocm_hip/blob/main/README.md" rel="noopener noreferrer"&gt;https://github.com/oconnoob/intro_to_rocm_hip/blob/main/README.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;👇 WHAT YOU'LL LEARN IN THIS VIDEO 👇&lt;/p&gt;

&lt;p&gt;🔧 &lt;strong&gt;Getting Set Up with ROCm&lt;/strong&gt;: There are two ways to get started: spin up a GPU Droplet on DigitalOcean with ROCm pre-installed, or install ROCm yourself on an Ubuntu system with an AMD GPU. We cover both methods step-by-step.&lt;/p&gt;

&lt;p&gt;➕ &lt;strong&gt;Example 1: Vector Addition (The Basics)&lt;/strong&gt;: Learn the fundamental structure of GPU programs—kernels, threads, blocks, and memory management. We'll add one million elements in parallel and verify our results.&lt;/p&gt;
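
&lt;p&gt;The kernel-and-threads model can be sketched in plain Python: every GPU thread runs the same function body with a different index. This is a conceptual sketch only; the tutorial's real HIP kernels live in the linked repository:&lt;/p&gt;

```python
def vector_add_kernel(i, a, b, out):
    # The body each GPU thread executes: one element per thread index i.
    out[i] = a[i] + b[i]

def launch(n, a, b):
    # On a GPU, all n indices run in parallel across blocks of threads;
    # on a CPU we emulate the launch with a plain loop.
    out = [0] * n
    for i in range(n):
        vector_add_kernel(i, a, b, out)
    return out
```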

&lt;p&gt;⚡ &lt;strong&gt;Example 2: Matrix Multiplication (Why Libraries Matter)&lt;/strong&gt;: Discover why optimized libraries like rocBLAS dramatically outperform naive implementations. This is the operation powering most AI models you use daily.&lt;/p&gt;

&lt;p&gt;👁️ &lt;strong&gt;Example 3: Edge Detection with Sobel Filter (The Cool Stuff)&lt;/strong&gt;: Apply your GPU programming skills to a real computer vision problem—detecting edges in images using a classic Sobel filter, all running massively parallel on the GPU.&lt;/p&gt;
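
&lt;p&gt;The Sobel Gx pass itself is just a 3x3 convolution. Below is a pure-Python sketch (borders skipped rather than padded, for brevity); the repository's HIP version parallelizes the same per-pixel computation across GPU threads:&lt;/p&gt;

```python
# Sobel horizontal-gradient kernel (Gx); it responds to vertical edges.
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]

def sobel_gx(img):
    """Apply the Gx Sobel kernel to a 2D grayscale image (list of lists).

    Returns a (rows-2) x (cols-2) gradient map; border pixels are
    skipped rather than padded to keep the sketch short.
    """
    rows, cols = len(img), len(img[0])
    out = []
    for y in range(1, rows - 1):
        row = []
        for x in range(1, cols - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += GX[ky][kx] * img[y + ky - 1][x + kx - 1]
            row.append(acc)
        out.append(row)
    return out
```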

&lt;p&gt;Whether you're an AI enthusiast wanting to understand the hardware layer or a developer looking to harness GPU compute power, this tutorial gives you the foundation to start writing efficient parallel programs.&lt;/p&gt;

</description>
      <category>gpu</category>
      <category>amd</category>
      <category>programming</category>
      <category>ai</category>
    </item>
    <item>
      <title>In case you haven't heard, we're back! Follow the DigitalOcean organization for updates, tutorials, and hands-on AI learning.</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Fri, 06 Mar 2026 22:25:26 +0000</pubDate>
      <link>https://dev.to/digitalocean_staff/in-case-you-havent-heard-were-back-follow-the-digitalocean-organization-for-updates-tutorials-53oj</link>
      <guid>https://dev.to/digitalocean_staff/in-case-you-havent-heard-were-back-follow-the-digitalocean-organization-for-updates-tutorials-53oj</guid>
      <description>&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/digitalocean/digitalocean-on-devto-practical-ai-insights-for-builders-3g0c" class="crayons-story__hidden-navigation-link"&gt;DigitalOcean on Dev.to: Practical AI Insights for Builders&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;
          &lt;a class="crayons-logo crayons-logo--l" href="/digitalocean"&gt;
            &lt;img alt="DigitalOcean logo" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F175%2F369f1227-0eac-4a88-8d3c-08851bf0b117.png" class="crayons-logo__image"&gt;
          &lt;/a&gt;

          &lt;a href="/jlulks" class="crayons-avatar  crayons-avatar--s absolute -right-2 -bottom-2 border-solid border-2 border-base-inverted  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3476605%2F8f9c9b3a-5b45-42b8-88ca-4f557174dba7.jpg" alt="jlulks profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/jlulks" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Jess Lulka
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Jess Lulka
                
              
              &lt;div id="story-author-preview-content-3222465" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/jlulks" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3476605%2F8f9c9b3a-5b45-42b8-88ca-4f557174dba7.jpg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Jess Lulka&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

            &lt;span&gt;
              &lt;span class="crayons-story__tertiary fw-normal"&gt; for &lt;/span&gt;&lt;a href="/digitalocean" class="crayons-story__secondary fw-medium"&gt;DigitalOcean&lt;/a&gt;
            &lt;/span&gt;
          &lt;/div&gt;
          &lt;a href="https://dev.to/digitalocean/digitalocean-on-devto-practical-ai-insights-for-builders-3g0c" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Feb 2&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/digitalocean/digitalocean-on-devto-practical-ai-insights-for-builders-3g0c" id="article-link-3222465"&gt;
          DigitalOcean on Dev.to: Practical AI Insights for Builders
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/digitalocean"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;digitalocean&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/machinelearning"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;machinelearning&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/learning"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;learning&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/digitalocean/digitalocean-on-devto-practical-ai-insights-for-builders-3g0c" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;22&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/digitalocean/digitalocean-on-devto-practical-ai-insights-for-builders-3g0c#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              1&lt;span class="hidden s:inline"&gt; comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            2 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;




</description>
      <category>ai</category>
      <category>digitalocean</category>
      <category>machinelearning</category>
      <category>learning</category>
    </item>
    <item>
      <title>We're DigitalOcean and we're excited to be here with you!</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Thu, 23 Jul 2020 11:47:06 +0000</pubDate>
      <link>https://dev.to/digitalocean/we-re-digitalocean-and-we-re-excited-to-be-here-with-you-33hc</link>
      <guid>https://dev.to/digitalocean/we-re-digitalocean-and-we-re-excited-to-be-here-with-you-33hc</guid>
      <description>&lt;p&gt;Hey everyone! We're so excited to be here at CodeLand:Distributed. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://digitalocean.com" rel="noopener noreferrer"&gt;DigitalOcean&lt;/a&gt; offers the most easy-to-use and developer-friendly cloud platform. We help you manage and scale apps with an intuitive API, multiple storage options, integrated firewalls load balancers, and more. We're on a mission to simplify cloud computing so developers and businesses can spend more time creating software that changes the world!&lt;/p&gt;

&lt;h3&gt;
  
  
  Got Questions?! Let's chat!
&lt;/h3&gt;

&lt;p&gt;Sammy the shark and our team members are here to connect with any questions you might have. We'll be at our &lt;a href="https://dev.to/join_channel_invitation/digitalocean-5eag?invitation_slug=invitation-link-e9804f"&gt;DEV Connect channel&lt;/a&gt; all day, so stop by and say hello!&lt;/p&gt;

&lt;p&gt;We're also happy to respond to any comments down below. 👇&lt;/p&gt;

&lt;h3&gt;
  
  
  Digital Swag
&lt;/h3&gt;

&lt;p&gt;Today, we're offering all CodeLand attendees a $100 USD free trial. Sign up below and we'll follow up with all the details: &lt;/p&gt;


&lt;div class="ltag__user-subscription-tag"&gt;
  &lt;div class="ltag__user-subscription-tag__container"&gt;

    &lt;div class="ltag__user-subscription-tag__content w-100"&gt;

      &lt;div class="ltag__user-subscription-tag__profile-images signed-out"&gt;

        &lt;span class="crayons-avatar crayons-avatar--xl ltag__user-subscription-tag__author-profile-image m-auto"&gt;
          &lt;img class="crayons-avatar__image ltag__user-subscription-tag__author-profile-image m-0" src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F64516%2Fa0c9989b-6d18-46c7-bc66-4c2c1580534e.jpg"&gt;
        &lt;/span&gt;

        &lt;span class="crayons-avatar crayons-avatar--xl ltag__user-subscription-tag__subscriber-profile-image m-auto"&gt;
          &lt;img class="crayons-avatar__image ltag__user-subscription-tag__subscriber-profile-image m-0" alt=""&gt;
        &lt;/span&gt;

      &lt;/div&gt;

      &lt;h2 class="ltag__user-subscription-tag__cta-text fs-xl mt-0 mb-4 align-center"&gt;
        Sign up for a $100 DigitalOcean Promo!
      &lt;/h2&gt;

      &lt;div class="ltag__user-subscription-tag__subscription-area align-center"&gt;
        &lt;div class="ltag__user-subscription-tag__signed-out"&gt;
          &lt;div class="fs-base mb-2"&gt;
            You must first sign in to DEV Community.
          &lt;/div&gt;
          &lt;a href="/enter" class="c-cta c-cta--default"&gt;
            Sign In
          &lt;/a&gt;
        &lt;/div&gt;

        &lt;div class="ltag__user-subscription-tag__signed-in hidden"&gt;
          
            Subscribe
          
          &lt;div class="ltag__user-subscription-tag__logged-in-text fs-s mb-3"&gt;
            You'll subscribe with the email address associated with your DEV Community account. To use a different email address, you can &lt;a href="/settings"&gt;update your email address in Settings&lt;/a&gt;.
          &lt;/div&gt;
        &lt;/div&gt;

        &lt;div class="ltag__user-subscription-tag__apple-auth fs-s hidden"&gt;
          Subscribe
          &lt;div class="fs-s"&gt;
            Hey, there! It looks like when you created your DEV Community account you signed up with Apple using a private relay email address. If you'd like to subscribe, please &lt;a href="/settings"&gt;update your email address in Settings&lt;/a&gt; first to a different email address.
          &lt;/div&gt;
        &lt;/div&gt;

        &lt;div class="ltag__user-subscription-tag__response-message crayons-notice fs-base w-100 hidden"&gt;&lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="user-subscription-confirmation-modal hidden"&gt;
      &lt;div class="crayons-modal__box__body"&gt;
        &lt;p class="fs-base mb-4 mt-0"&gt;
          You'll share your email address, username, name, and DEV Community profile URL with &lt;span class="ltag__user-subscription-tag__author-username fw-medium"&gt;digitalocean_staff&lt;/span&gt;. Once you do this, you cannot undo this.
        &lt;/p&gt;

&lt;div class="ltag__user-subscription-tag__confirmation-buttons"&gt;
          
            Confirm subscription
          
          
            Cancel
          
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;We also have some &lt;a href="https://imgur.com/a/q6i58" rel="noopener noreferrer"&gt;fun wallpapers&lt;/a&gt; for anyone looking to spruce up their desktop or virtual backgrounds. ✨&lt;/p&gt;

&lt;h3&gt;
  
  
  Job Opportunities at DigitalOcean
&lt;/h3&gt;

&lt;p&gt;DigitalOcean is a values-driven organization. Here is what we believe in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Our community is bigger than just us&lt;/li&gt;
&lt;li&gt;Simplicity in all we DO&lt;/li&gt;
&lt;li&gt;We speak up when we have something to say and listen when others DO&lt;/li&gt;
&lt;li&gt;We are accountable to deliver on our commitments&lt;/li&gt;
&lt;li&gt;Love is at our core&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Come swim with us: &lt;a href="https://do.co/careers" rel="noopener noreferrer"&gt;https://do.co/careers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp0xus8z4qagtrikazlv9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp0xus8z4qagtrikazlv9.png" alt="developer-community"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>codeland</category>
    </item>
    <item>
      <title>How to Code in Go eBook</title>
      <dc:creator>DigitalOcean</dc:creator>
      <pubDate>Mon, 22 Jun 2020 15:46:31 +0000</pubDate>
      <link>https://dev.to/digitalocean/how-to-code-in-go-ebook-ifl</link>
      <guid>https://dev.to/digitalocean/how-to-code-in-go-ebook-ifl</guid>
      <description>&lt;h2&gt;
  
  
  Introduction to the eBook
&lt;/h2&gt;

&lt;p&gt;This book is designed to introduce you to writing programs with the Go programming language. You’ll learn how to write useful tools and applications that can run on remote servers, or on local Windows, macOS, and Linux development systems.&lt;/p&gt;

&lt;p&gt;This book is based on the &lt;a href="https://www.digitalocean.com/community/tutorial_series/how-to-code-in-go"&gt;How To Code in Go&lt;/a&gt; tutorial series found on &lt;a href="https://www.digitalocean.com/community"&gt;DigitalOcean Community&lt;/a&gt;. The topics that it covers include how to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Install and set up a local Go development environment on Windows, macOS, and Linux systems&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Design your programs with conditional logic, including switch statements to control program flow&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Define your own data structures and create interfaces to them for reusable code&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Write custom error handling functions&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Build and install your Go programs so that they can run on different operating systems and CPU architectures&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use flags to pass arguments to your programs and override default options&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each chapter can be read on its own or used as a reference, or you can follow the chapters from beginning to end. Feel free to jump to the chapter or chapters that best suit your purpose as you learn Go with this book.&lt;/p&gt;

&lt;h2&gt;
  
  
  Download the eBook
&lt;/h2&gt;

&lt;p&gt;You can download the eBook in either the EPUB or PDF format by following the links below.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://do.co/go-book-epub"&gt;&lt;em&gt;How To Code in Go&lt;/em&gt; eBook in &lt;strong&gt;EPUB format&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://do.co/go-book-pdf"&gt;&lt;em&gt;How To Code in Go&lt;/em&gt; eBook in &lt;strong&gt;PDF format&lt;/strong&gt;&lt;/a&gt;  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After you’ve finished this book, if you’d like to learn more about how to build tools and applications with Go, visit the DigitalOcean Community’s &lt;a href="https://www.digitalocean.com/community/tags/go"&gt;Go section&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>go</category>
      <category>tutorial</category>
      <category>beginners</category>
      <category>ebook</category>
    </item>
  </channel>
</rss>
