<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Oleg Šelajev</title>
    <description>The latest articles on DEV Community by Oleg Šelajev (@olegshelajev).</description>
    <link>https://dev.to/olegshelajev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F771739%2F3d6232f7-be47-4e5d-9a3c-4a2134440b3e.jpeg</url>
      <title>DEV Community: Oleg Šelajev</title>
      <link>https://dev.to/olegshelajev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/olegshelajev"/>
    <language>en</language>
    <item>
      <title>Running LLMs with Docker on Linux: from local to CI</title>
      <dc:creator>Oleg Šelajev</dc:creator>
      <pubDate>Fri, 20 Jun 2025 09:38:43 +0000</pubDate>
      <link>https://dev.to/olegshelajev/running-llms-with-docker-on-linux-from-local-to-ci-29fe</link>
      <guid>https://dev.to/olegshelajev/running-llms-with-docker-on-linux-from-local-to-ci-29fe</guid>
      <description>&lt;p&gt;Earlier this year, Docker released Docker Model Runner, a component integrated into Docker Desktop that allows you to run Large Language Models (LLMs) locally on your machine. Unlike typical container-based execution, Docker Model Runner can leverage the full capabilities of your GPU hardware directly, offering optimal performance. Initially available on macOS and Windows through Docker Desktop, Docker Model Runner is now also available as part of Docker Community Edition (CE). This expansion means you can integrate it seamlessly into your Linux-based continuous integration (CI) pipelines or even use it directly in production.&lt;/p&gt;

&lt;p&gt;In this article, we’ll explore how to install Docker Model Runner on a Linux VM with Docker already available. We'll go through pulling some LLMs, running them, and clarifying which URLs you'll use to connect to your models from applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting a Linux VM
&lt;/h2&gt;

&lt;p&gt;First, we need a Linux VM. To keep things simple, we’ll use Google Cloud Platform’s Shell console, which provides a convenient Linux VM environment right in your browser without needing to provision custom resources.&lt;/p&gt;

&lt;p&gt;The VM provided through Cloud Shell isn’t particularly powerful, but it has Docker pre-installed, making it ideal for our demonstration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1898ijpxvaatpjigmnsw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1898ijpxvaatpjigmnsw.png" alt="Cloud shell" width="800" height="609"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To launch it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to your &lt;a href="https://console.cloud.google.com" rel="noopener noreferrer"&gt;Google Cloud Platform Console&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Click on the Shell icon at the top-right corner.&lt;/li&gt;
&lt;li&gt;Authorize the browser if prompted.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verify Docker installation by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;docker &lt;span class="nt"&gt;--version&lt;/span&gt;
Docker version 28.2.2, build e6534b4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Installing Docker Model Runner
&lt;/h2&gt;

&lt;p&gt;Docker Model Runner on Linux uses standard Docker primitives like containers and volumes to manage GPU passthrough and LLM lifecycle efficiently.&lt;/p&gt;

&lt;p&gt;First, install the required plugin package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;docker-model-plugin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu803c0yt9hoj0l0hekgc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu803c0yt9hoj0l0hekgc.png" alt="installing docker-model-plugin" width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After installation, you can confirm everything is set up correctly by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker models list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command initially pulls necessary infrastructure components. Once complete, it will display any models available locally. Since we haven't downloaded any yet, it'll show an empty list.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvx32vuycudfm483ynn4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvx32vuycudfm483ynn4.png" alt="Docker model runner works on Linux too" width="800" height="122"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pulling and Running an LLM
&lt;/h2&gt;

&lt;p&gt;Next, let's install a small, resource-efficient model suitable for the Cloud Shell VM. You can choose a model from the Docker AI Hub at &lt;a href="https://hub.docker.com/u/ai" rel="noopener noreferrer"&gt;hub.docker.com/u/ai&lt;/a&gt;. For this demonstration, we'll use a small Qwen model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker model pull ai/qwen3:0.6B-Q4_K_M
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once pulled, verify it by running the model interactively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker model run ai/qwen3:0.6B-Q4_K_M
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can ask questions and get the typical LLM answers: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fniom5wiwqdmb20a8b27x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fniom5wiwqdmb20a8b27x.png" alt="Image description" width="800" height="279"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to the Model
&lt;/h2&gt;

&lt;p&gt;Docker Model Runner hosts an inference server that you can connect to using a standard HTTP endpoint. Internally, from Docker containers, the server is accessible via:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://172.17.0.1:12434/engines/v1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Externally, from your local machine or other environments, use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:12434
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For example, to query your model via an OpenAI-compatible API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:12434/engines/v1/chat/completions &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
  "model": "ai/qwen3:0.6B-Q4_K_M",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Please write 100 words about the fall of Rome."}
  ]
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response will be JSON-formatted and include the completion text provided by the model, something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"finish_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The fall of Rome marked the end of the Roman Empire, which had long been a dominant power in the Mediterranean. The decline was driven by internal struggles, including political instability and weakened central authority, alongside external pressures and shifting alliances. The collapse of the Empire had profound effects on Europe, shaping the course of medieval civilization. As Rome faded, its legacy endured through art, religion, and the enduring influence of its legacy on Western culture."&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1750411314&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ai/qwen3:0.6B-Q4_K_M"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"usage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"completion_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;252&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"prompt_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"total_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;284&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
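&lt;p&gt;From application code, the same endpoint works with any OpenAI-compatible client. Here's a minimal Python sketch using only the standard library; it assumes the model pulled above is available and Docker Model Runner is listening on the host at port 12434 (the helper names are ours, not part of any API):&lt;/p&gt;

```python
import json
from urllib import request

# Host-side base URL of Docker Model Runner's OpenAI-compatible API.
BASE_URL = "http://localhost:12434/engines/v1"
MODEL = "ai/qwen3:0.6B-Q4_K_M"

def build_chat_request(prompt, model=MODEL):
    """Build the JSON body for a chat completion call (OpenAI format)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }

def ask(prompt):
    """POST the request and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = request.Request(
        BASE_URL + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires a running Model Runner):
# print(ask("Please write 100 words about the fall of Rome."))
```

&lt;p&gt;From inside a container on the same machine, you would swap the base URL for the internal &lt;code&gt;http://172.17.0.1:12434/engines/v1&lt;/code&gt; address mentioned earlier.&lt;/p&gt;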



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With &lt;a href="https://docs.docker.com/ai/model-runner/" rel="noopener noreferrer"&gt;Docker Model Runner&lt;/a&gt;, you can now easily run powerful LLMs locally on Windows, macOS, and Linux, significantly simplifying integration into your development workflows. Whether you're using it for local experimentation or in CI environments Docker Model Runner provides a straightforward solution to add AI to your applications without breaking a sweat.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>ai</category>
      <category>howto</category>
    </item>
    <item>
      <title>Implementing MCP Servers in Java: Stockfish example</title>
      <dc:creator>Oleg Šelajev</dc:creator>
      <pubDate>Thu, 29 May 2025 10:24:54 +0000</pubDate>
      <link>https://dev.to/olegshelajev/implementing-mcp-servers-in-java-stockfish-example-4dme</link>
      <guid>https://dev.to/olegshelajev/implementing-mcp-servers-in-java-stockfish-example-4dme</guid>
      <description>&lt;p&gt;Among other things, Model Context Protocol (MCP) enables AI models to interact with external tools and services through a structured interface. Which allows models to defer control to actual software libraries and execute tasks with reproducibility, predictable performance, and security guarantees.&lt;/p&gt;

&lt;p&gt;This blog post demonstrates creating an MCP server in Java that integrates the open-source chess engine Stockfish. We will use this MCP server to equip AI models with the state-of-the-art ability to analyze chess positions and moves.&lt;/p&gt;

&lt;p&gt;We chose Java because it's an enterprise-standard language widely adopted for large-scale applications. Its ecosystem is robust, mature, and continues to power thousands of enterprise solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stockfish Setup
&lt;/h2&gt;

&lt;p&gt;&lt;a href="//stockfishchess.org/"&gt;Stockfish&lt;/a&gt; is a highly popular open source chess engine.&lt;/p&gt;

&lt;p&gt;The MCP server implementation will consist of a Docker image with the Stockfish binary for chess analysis and a Quarkus application implementing the MCP protocol around the Stockfish binary.&lt;/p&gt;

&lt;p&gt;You can check out the complete project on GitHub: &lt;a href="https://github.com/shelajev/mcp-stockfish/tree/main" rel="noopener noreferrer"&gt;https://github.com/shelajev/mcp-stockfish/tree/main&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To implement the MCP server functionality with Quarkus, we need the &lt;a href="https://quarkus.io/extensions/io.quarkiverse.mcp/quarkus-mcp-server-sse/" rel="noopener noreferrer"&gt;quarkus-mcp-server-sse&lt;/a&gt; dependency, which you can install just like any other Quarkus extension using a Maven command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./mvnw quarkus:add-extension &lt;span class="nt"&gt;-Dextensions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"io.quarkiverse.mcp:quarkus-mcp-server-sse"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It will essentially add the necessary dependency section to the &lt;code&gt;pom.xml&lt;/code&gt;, and in the case of more complex extensions it can also add build plugins and change the configuration so that everything works out of the box.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;dependency&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;io.quarkiverse.mcp&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;quarkus-mcp-server-sse&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.2.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/dependency&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool implementation, marked with a &lt;code&gt;@Tool&lt;/code&gt; annotation, will be automatically picked up by Quarkus and registered to be announced when the MCP requests start coming in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Singleton&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyTools&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nd"&gt;@Tool&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Analyze a chess position using Stockfish."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="nc"&gt;ToolResponse&lt;/span&gt; &lt;span class="nf"&gt;stockfish&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@ToolArg&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"FEN of the chess position"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;fen&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;timeoutSeconds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;command&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""
      expect -c "spawn stockfish; \
      send \"uci\r\"; \
      send \"setoption name MultiPV value 2\r\"; \
      send \"position fen %s\r\"; \
      send \"go depth %d\r\"; \
      sleep %d; \
      send \"quit\r\"; interact"
      """&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;formatted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fen&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeoutSeconds&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ToolResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextContent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the very convenient &lt;code&gt;$(command).get()&lt;/code&gt;, which uses &lt;a href="https://github.com/jbangdev/jbang-jash" rel="noopener noreferrer"&gt;Jash&lt;/a&gt;, a Java library that provides a fluent interface to &lt;code&gt;Process&lt;/code&gt;.&lt;br&gt;
It really is a nice API for running shell commands from Java. &lt;/p&gt;

&lt;p&gt;The &lt;code&gt;expect&lt;/code&gt; utility handles the interaction with the Stockfish binary. Unlike typical CLI applications, where a single invocation receives all its parameters and represents a complete unit of computation, the Stockfish CLI starts up and then waits for you to send it commands, both for configuration and for chess analysis. That's more convenient when you need to analyze sequences of moves at once, but a bit awkward to integrate against as a CLI. &lt;/p&gt;
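&lt;p&gt;To make the UCI flow concrete, here is the command sequence the &lt;code&gt;expect&lt;/code&gt; script drives, sketched as plain Python. This is an illustration only, not the article's actual implementation, and the helper name is hypothetical:&lt;/p&gt;

```python
def uci_session(fen, depth=15):
    """Return the UCI commands sent to Stockfish for one analysis,
    mirroring the expect script: handshake, options, position, search."""
    return [
        "uci",                             # start the UCI handshake
        "setoption name MultiPV value 2",  # report the two best lines
        "position fen {}".format(fen),     # load the position to analyze
        "go depth {}".format(depth),       # search to the given depth
        "quit",                            # shut the engine down
    ]

commands = uci_session("7K/8/k1P5/7p/8/8/8/8 w - - 0 1")
```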

&lt;p&gt;Anyway, we package our MCP server as a Docker container using this Dockerfile: &lt;a href="https://github.com/shelajev/mcp-stockfish/blob/main/src/main/docker/Dockerfile.jvm" rel="noopener noreferrer"&gt;https://github.com/shelajev/mcp-stockfish/blob/main/src/main/docker/Dockerfile.jvm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This Dockerfile uses a multi-stage build to compile the Stockfish binary, followed by the standard Quarkus boilerplate instructions to package the app. &lt;/p&gt;

&lt;p&gt;And of course we copy the Stockfish binary to the final image too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;COPY --from=stockfish_builder /opt/Stockfish/src/stockfish /usr/local/bin/stockfish
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Building and Running the Server
&lt;/h3&gt;

&lt;p&gt;Build the Quarkus application using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./mvnw verify &lt;span class="nt"&gt;-Dquarkus&lt;/span&gt;.container-image.build&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Docker image will be tagged as &lt;code&gt;shelajev/mcp-stockfish:0.0.1&lt;/code&gt;. Run the server using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 8080:8080 shelajev/mcp-stockfish:0.0.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing our glorious MCP
&lt;/h2&gt;

&lt;p&gt;Verify the server works using the MCP inspector by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @modelcontextprotocol/inspector
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then connect it to &lt;code&gt;http://localhost:8080/mcp&lt;/code&gt;, where the Quarkus MCP Server extension hosts the endpoint (you can override the config, but for our use the defaults suffice): &lt;/p&gt;

&lt;p&gt;Click on the Tools tab, then the List Tools button. Then manually call the MCP server to verify its functionality. &lt;/p&gt;
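&lt;p&gt;Under the hood, calling a tool from the inspector sends a JSON-RPC message over the MCP transport. Here's a hedged Python sketch of what that request body looks like for our tool (the initialization handshake that precedes it is omitted, and the helper name is ours):&lt;/p&gt;

```python
import json

def tools_call_request(fen, request_id=1):
    """Build an MCP tools/call JSON-RPC request for the stockfish tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "stockfish",        # tool name from the @Tool method
            "arguments": {"fen": fen},  # matches the @ToolArg parameter
        },
    }

body = json.dumps(tools_call_request("7K/8/k1P5/7p/8/8/8/8 w - - 0 1"))
```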

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbs5t4d7wdmktv3cxjve.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbs5t4d7wdmktv3cxjve.png" alt="Reti study" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Consider this &lt;a href="https://en.wikipedia.org/wiki/R%C3%A9ti_endgame_study" rel="noopener noreferrer"&gt;famous endgame study by Richard Réti&lt;/a&gt;. Its FEN, the notation describing a chess position, which we'll be passing to the MCP server and Stockfish, is &lt;code&gt;7K/8/k1P5/7p/8/8/8/8 w - - 0 1&lt;/code&gt;. &lt;/p&gt;
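&lt;p&gt;If the FEN string looks opaque, a few lines of Python show its structure: six space-separated fields covering piece placement, side to move, castling rights, the en passant square, and the move clocks (the function is just an illustration):&lt;/p&gt;

```python
def parse_fen(fen):
    """Split a FEN string into its six standard fields."""
    placement, side, castling, en_passant, halfmove, fullmove = fen.split()
    return {
        "placement": placement,      # ranks 8..1, '/'-separated
        "side_to_move": side,        # 'w' or 'b'
        "castling": castling,        # '-' means no castling rights
        "en_passant": en_passant,    # '-' means no en passant square
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }

reti = parse_fen("7K/8/k1P5/7p/8/8/8/8 w - - 0 1")
# reti["side_to_move"] is "w": it is White to move in the study.
```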

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjcetrh8d9nw1bl4ap4r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjcetrh8d9nw1bl4ap4r.png" alt="Stockfish MCP working" width="800" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see in the output that Stockfish analyzes the position and correctly suggests &lt;code&gt;Kg7&lt;/code&gt; as the optimal move, leading to a surprising draw.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integration with AI assistants
&lt;/h2&gt;

&lt;p&gt;You can of course integrate this MCP server into any AI assistant that supports the protocol. As an exercise for the reader, try to configure &lt;a href="https://dev.to/olegshelajev/easy-private-ai-assistant-with-goose-and-docker-model-runner-4b41"&gt;Goose, which we set up in a previous article&lt;/a&gt; to use our Stockfish MCP.&lt;/p&gt;

&lt;p&gt;We'll integrate it into VS Code: &lt;/p&gt;

&lt;p&gt;In the workspace, create the &lt;code&gt;mcp.json&lt;/code&gt; file and configure the server like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "servers": {
    "stockfish": {
        "name": "Stockfish",
        "url": "http://localhost:8080/mcp",
        "type": "http"
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And click the start button on the server definition so VS Code will connect to it: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklbmpzadhxbfz39eetc9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklbmpzadhxbfz39eetc9.png" alt="VS Code mcp.json" width="800" height="213"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now all that is left is to use the chat in Agent mode, so it has access to tools, and ask it to analyze chess positions: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpusgwqf2zghjtu2k415v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpusgwqf2zghjtu2k415v.png" alt="Stockfish MCP in VS Code" width="800" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Creating an MCP server in Java is super straightforward, and it's a great way to enhance AI assistants with predictable, repeatable functionality.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.docker.com/blog/whats-next-for-mcp-security/" rel="noopener noreferrer"&gt;Running MCP servers in Docker adds isolation, reproducibility, and security benefits&lt;/a&gt;, so in general one should probably prefer that over running naked npx servers and other similar approaches. &lt;/p&gt;

&lt;p&gt;But also you don't have to roll MCP servers yourself for most standard APIs you want to integrate with. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fy9cea2cies65rbt46h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5fy9cea2cies65rbt46h.png" alt="Docker MCP Toolkit" width="800" height="516"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For typical integrations (Slack, Notion, GitHub), there's Docker’s MCP toolkit, which makes working with MCP even better by &lt;a href="https://www.docker.com/blog/announcing-docker-mcp-catalog-and-toolkit-beta/" rel="noopener noreferrer"&gt;providing discovery, simplified installation, secrets management, access control, etc.&lt;/a&gt; -- all the good things you want if you're building production-grade systems. &lt;/p&gt;

</description>
      <category>java</category>
      <category>docker</category>
      <category>ai</category>
      <category>mcp</category>
    </item>
    <item>
      <title>AI-Enhanced Mock APIs with Docker Model Runner and Microcks</title>
      <dc:creator>Oleg Šelajev</dc:creator>
      <pubDate>Mon, 26 May 2025 14:01:06 +0000</pubDate>
      <link>https://dev.to/olegshelajev/ai-enhanced-mock-apis-with-docker-model-runner-and-microcks-4hja</link>
      <guid>https://dev.to/olegshelajev/ai-enhanced-mock-apis-with-docker-model-runner-and-microcks-4hja</guid>
      <description>&lt;p&gt;&lt;a href="https://microcks.io/" rel="noopener noreferrer"&gt;Microcks is a powerful CNCF tool&lt;/a&gt; that allows developers to quickly spin up mock services for development and testing. By providing predefined mock responses or generating them directly from an OpenAPI schema, you can point your applications to consume these mocks instead of hitting real APIs, enabling efficient and safe testing environments.&lt;/p&gt;

&lt;p&gt;Docker Model Runner is a convenient way to run LLMs locally within your Docker Desktop. It provides an OpenAI-compatible API, allowing you to integrate sophisticated AI capabilities into your projects seamlessly, using local hardware resources.&lt;/p&gt;

&lt;p&gt;By integrating Microcks with Docker Model Runner, you can enrich your mock APIs with AI-generated responses, creating realistic and varied data that is less rigid than static examples. &lt;/p&gt;

&lt;p&gt;In this guide, we'll explore how to set up these two tools together, giving you the benefits of dynamic mock generation powered by local AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up Docker Model Runner
&lt;/h2&gt;

&lt;p&gt;To start, ensure you've enabled Docker Model Runner as described in our previous guide on configuring Goose for a local AI assistant setup: &lt;a href="https://dev.to/olegshelajev/easy-private-ai-assistant-with-goose-and-docker-model-runner-4b41"&gt;Easy Private AI Assistant with Goose and Docker Model Runner&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Next, select and pull your desired LLM model from Docker Hub. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker model pull ai/qwen3:8B-F16
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuring Microcks with Docker Model Runner
&lt;/h2&gt;

&lt;p&gt;First, clone the Microcks repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/microcks/microcks &lt;span class="nt"&gt;--depth&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Navigate to the Docker Compose setup directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;microcks/install/docker-compose
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll need to adjust some configurations to enable the AI Copilot feature within Microcks.&lt;br&gt;
In the &lt;code&gt;config/application.properties&lt;/code&gt; file, configure the AI Copilot to use Docker Model Runner:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ai-copilot.enabled=true
ai-copilot.implementation=openai
ai-copilot.openai.api-key=irrelevant
ai-copilot.openai.api-url=http://model-runner.docker.internal:80/engines/llama.cpp/
ai-copilot.openai.timeout=600
ai-copilot.openai.maxTokens=10000
ai-copilot.openai.model=ai/qwen3:8B-F16
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We're using &lt;code&gt;model-runner.docker.internal:80&lt;/code&gt; as the base URL for the OpenAI-compatible API. Docker Model Runner is reachable at that address from containers running in Docker Desktop, so the containers communicate with the model runner directly instead of taking a detour through the host machine's ports.&lt;/p&gt;
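&lt;p&gt;If you want to sanity-check that this endpoint is reachable from inside a container before starting Microcks, one quick way is to list the models the runner knows about. This is a sketch: it assumes the &lt;code&gt;curlimages/curl&lt;/code&gt; image and that your Docker Model Runner version serves the OpenAI-compatible &lt;code&gt;/v1/models&lt;/code&gt; route under this engine path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run --rm curlimages/curl -s http://model-runner.docker.internal:80/engines/llama.cpp/v1/models
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A JSON response listing your pulled models confirms the containers can reach the model runner.&lt;/p&gt;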

&lt;p&gt;Next, enable the copilot feature itself by adding this line to the Microcks &lt;code&gt;config/features.properties&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;features.feature.ai-copilot.enabled=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Running Microcks
&lt;/h2&gt;

&lt;p&gt;Start Microcks with Docker Compose in development mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker-compose &lt;span class="nt"&gt;-f&lt;/span&gt; docker-compose-devmode.yml up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once up, access the Microcks UI at &lt;a href="http://localhost:8080/" rel="noopener noreferrer"&gt;http://localhost:8080&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Install the example API for testing. Click through these buttons on the Microcks page:&lt;br&gt;
&lt;strong&gt;Microcks Hub → MicrocksIO Samples APIs → pastry-api-openapi v.2.0.0 → Install → Direct import → Go&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8tu8mm6fuihc9gmlvc44.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8tu8mm6fuihc9gmlvc44.png" alt="Pastry API service in Microcks" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Using AI Copilot Samples
&lt;/h3&gt;

&lt;p&gt;Within the Microcks UI, &lt;a href="http://localhost:8080/#/services/683467ce41829c4e6acbcdf3" rel="noopener noreferrer"&gt;navigate to the service page of the imported API&lt;/a&gt; and select an operation you'd like to enhance. Open the "AI Copilot Samples" dialog, prompting Microcks to query the configured LLM via Docker Model Runner. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ciu5nwsjmb251mpva4p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ciu5nwsjmb251mpva4p.png" alt="Click the button to trigger AI" width="800" height="232"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You may notice increased GPU activity as the model processes your request.&lt;/p&gt;

&lt;p&gt;After processing, the AI-generated mock responses are displayed, ready to be reviewed or added directly to your mocked operations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6odjluta0gurm1sye7uu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6odjluta0gurm1sye7uu.png" alt="Generated API responses" width="800" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can easily test the generated mocks with a simple &lt;code&gt;curl&lt;/code&gt; command. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; PATCH &lt;span class="s1"&gt;'http://localhost:8080/rest/API+Pastry+-+2.0/2.0.0/pastry/Chocolate+Cake'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'accept: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"status":"out_of_stock"}'&lt;/span&gt;

&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"name"&lt;/span&gt; : &lt;span class="s2"&gt;"Chocolate Cake"&lt;/span&gt;,
  &lt;span class="s2"&gt;"description"&lt;/span&gt; : &lt;span class="s2"&gt;"Rich chocolate cake with vanilla frosting"&lt;/span&gt;,
  &lt;span class="s2"&gt;"size"&lt;/span&gt; : &lt;span class="s2"&gt;"L"&lt;/span&gt;,
  &lt;span class="s2"&gt;"price"&lt;/span&gt; : 12.99,
  &lt;span class="s2"&gt;"status"&lt;/span&gt; : &lt;span class="s2"&gt;"out_of_stock"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This returns a realistic, AI-generated response that enhances the quality and reliability of your test data.&lt;/p&gt;

&lt;p&gt;For better reproducibility, you can specify the Docker Model Runner dependency and the chosen model explicitly in your &lt;code&gt;compose.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ai_runner:
  provider:
    type: model
    options:
      model: ai/qwen3:8B-F16
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this in place, simply starting the Compose setup also pulls the model and waits for it to become available, the same way it does for containers.&lt;/p&gt;
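&lt;p&gt;A service that should wait for the model can then declare a dependency on the provider service. Here is a minimal sketch; the &lt;code&gt;app&lt;/code&gt; service name and image are illustrative, not taken from the Microcks Compose files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;services:
  ai_runner:
    provider:
      type: model
      options:
        model: ai/qwen3:8B-F16
  app:
    image: alpine
    depends_on:
      - ai_runner
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;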

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Docker Model Runner is an excellent local resource for running LLMs and provides compatibility with OpenAI APIs, allowing for seamless integration into existing workflows. &lt;br&gt;
Microcks, for example, can use Docker Model Runner to generate sample responses for the APIs it mocks, giving you richer synthetic data for your integration testing. &lt;/p&gt;

&lt;p&gt;In this article we looked at what it takes to configure these two tools to work together. If you have local AI workflows or just run LLMs locally, please let me know: I'd love to explore more local AI integrations with Docker. &lt;/p&gt;

</description>
    </item>
    <item>
      <title>Easy private AI assistant with Goose and Docker Model Runner</title>
      <dc:creator>Oleg Šelajev</dc:creator>
      <pubDate>Wed, 21 May 2025 11:27:50 +0000</pubDate>
      <link>https://dev.to/olegshelajev/easy-private-ai-assistant-with-goose-and-docker-model-runner-4b41</link>
      <guid>https://dev.to/olegshelajev/easy-private-ai-assistant-with-goose-and-docker-model-runner-4b41</guid>
      <description>&lt;h3&gt;
  
  
  &lt;strong&gt;Using Goose and Docker Model Runner&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://block.github.io/goose/" rel="noopener noreferrer"&gt;Goose&lt;/a&gt; is an innovative CLI assistant designed to automate development tasks using AI models. Docker Model Runner simplifies deploying AI models locally with Docker. Combining these technologies, you get a powerful local environment with advanced AI assistance, ideal for coding and automation.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Install Goose CLI on macOS&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Install Goose with the official install script, using the classic curl-pipe-to-shell one-liner:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://github.com/block/goose/releases/download/stable/download_cli.sh | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  &lt;strong&gt;Enable Docker Model Runner&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;First, ensure you have &lt;a href="https://docs.docker.com/get-docker/" rel="noopener noreferrer"&gt;Docker&lt;/a&gt; Desktop installed, then configure Docker Model Runner with your model of choice. Go to Settings -&amp;gt; Beta features and check the checkboxes for Docker Model Runner.&lt;/p&gt;

&lt;p&gt;As a security precaution, Docker Model Runner is not exposed to your host machine by default, but to simplify the setup we'll enable TCP support as well. The default port for that is 12434, so the base URL for the connection is &lt;code&gt;http://localhost:12434&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzua9gzkap8fpui1bdt7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgzua9gzkap8fpui1bdt7.png" alt="Docker Desktop configuration to enable Docker Model Runner " width="800" height="559"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we can pull models from Docker Hub (&lt;a href="http://hub.docker.com/u/ai" rel="noopener noreferrer"&gt;hub.docker.com/u/ai&lt;/a&gt;) and run them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker model pull ai/qwen3:30B-A3B-Q4_K_M
docker model run ai/qwen3:30B-A3B-Q4_K_M
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second command starts an interactive chat with the model.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Configure Goose for Docker Model Runner&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Edit your Goose config at &lt;code&gt;~/.config/goose/config.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GOOSE_MODEL: ai/qwen3:30B-A3B-Q4_K_M
GOOSE_PROVIDER: openai
extensions:
  developer:
    display_name: null
    enabled: true
    name: developer
    timeout: null
    type: builtin
GOOSE_MODE: auto
GOOSE_CLI_MIN_PRIORITY: 0.8
OPENAI_API_KEY: irrelevant
OPENAI_BASE_PATH: /engines/llama.cpp/v1/chat/completions
OPENAI_HOST: http://localhost:12434
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; value is irrelevant: Docker Model Runner does not require authentication, since the model runs locally and privately on your machine.&lt;/p&gt;

&lt;p&gt;We provide the base path for the OpenAI-compatible API and select the model we pulled earlier with &lt;code&gt;GOOSE_MODEL: ai/qwen3:30B-A3B-Q4_K_M&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Testing It Out&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Try the Goose CLI by running &lt;code&gt;goose&lt;/code&gt; in the terminal. You can see that it automatically connects to the correct model, and when you ask for something, you'll also see GPU activity spike.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmome5n4cffbm4m3xjmr8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmome5n4cffbm4m3xjmr8.png" alt="Goose CLI powered by Docker Model Runner" width="800" height="161"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note that we also configured Goose with the Developer extension enabled. It allows Goose to run various commands on your behalf, making it a much more powerful assistant with access to your machine, rather than just a chat application.&lt;/p&gt;

&lt;p&gt;You can additionally give Goose custom hints to tweak its behaviour using the &lt;a href="https://block.github.io/goose/docs/guides/using-goosehints/" rel="noopener noreferrer"&gt;.goosehints&lt;/a&gt; file.&lt;/p&gt;

&lt;p&gt;Even better, you can script Goose to run tasks on your behalf with a simple one-liner:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;goose run -t "your instructions here"&lt;/code&gt; or &lt;code&gt;goose run -i instructions.md&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;where &lt;code&gt;instructions.md&lt;/code&gt; is a file describing what to do.&lt;/p&gt;
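&lt;p&gt;As an illustration, a hypothetical &lt;code&gt;instructions.md&lt;/code&gt; could be as simple as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Summarize the git log of the current repository for the last week.
Write the summary to weekly-report.md in the repository root.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;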

&lt;p&gt;On macOS you have access to crontab for scheduling recurring scripts, so you can automate Goose with Docker Model Runner to run repeatedly and act on your behalf. For example,&lt;br&gt;
&lt;code&gt;crontab -e&lt;/code&gt; will open the editor for the commands you want to schedule, and a line like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;5 8 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; 1-5 goose run &lt;span class="nt"&gt;-i&lt;/span&gt; fetch_and_summarize_news.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;will make Goose run at 8:05 am every weekday and follow the instructions in the &lt;code&gt;fetch_and_summarize_news.md&lt;/code&gt; file, for example to skim the internet and prioritize news based on what you like.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;All in all, integrating Goose with Docker Model Runner creates a simple but powerful setup for using local AI in your workflows.&lt;br&gt;
You can make it run custom instructions for you, or easily script it to perform repetitive actions intelligently. &lt;br&gt;
It is all powered by a local model running in Docker Model Runner, so you don't compromise on privacy either. &lt;/p&gt;

</description>
      <category>ai</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
