Oleg Šelajev
AI-Enhanced Mock APIs with Docker Model Runner and Microcks

Microcks is a powerful CNCF tool that allows developers to quickly spin up mock services for development and testing. By providing predefined mock responses or generating them directly from an OpenAPI schema, you can point your applications to consume these mocks instead of hitting real APIs, enabling efficient and safe testing environments.

Docker Model Runner is a convenient way to run LLMs locally within your Docker Desktop. It provides an OpenAI-compatible API, allowing you to integrate sophisticated AI capabilities into your projects seamlessly, using local hardware resources.

By integrating Microcks with Docker Model Runner, you can enrich your mock APIs with AI-generated responses, creating realistic and varied data that is less rigid than static examples.

In this guide, we'll explore how to set up these two tools together, giving you the benefits of dynamic mock generation powered by local AI.

Setting Up Docker Model Runner

To start, ensure you've enabled Docker Model Runner as described in our previous guide on configuring Goose for a local AI assistant setup: Easy Private AI Assistant with Goose and Docker Model Runner.

Next, select and pull your desired LLM model from Docker Hub. For example:

docker model pull ai/qwen3:8B-F16
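
Once the pull completes, it's worth a quick sanity check that the model is available locally. A minimal check, assuming a recent Docker Desktop with Model Runner enabled (the one-shot prompt form of docker model run may vary slightly between versions):

docker model list
docker model run ai/qwen3:8B-F16 "Say hello in one short sentence"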

Configuring Microcks with Docker Model Runner

First, clone the Microcks repository:

git clone https://github.com/microcks/microcks --depth 1

Navigate to the Docker Compose setup directory:

cd microcks/install/docker-compose

You'll need to adjust a couple of configuration files to enable the AI Copilot feature within Microcks.
In the config/application.properties file (relative to the docker-compose directory), configure the AI Copilot to use Docker Model Runner:

ai-copilot.enabled=true
ai-copilot.implementation=openai
ai-copilot.openai.api-key=irrelevant
ai-copilot.openai.api-url=http://model-runner.docker.internal:80/engines/llama.cpp/
ai-copilot.openai.timeout=600
ai-copilot.openai.maxTokens=10000
ai-copilot.openai.model=ai/qwen3:8B-F16

We're using model-runner.docker.internal:80 as the base URL for the OpenAI-compatible API. Docker Model Runner is reachable at that address from containers running in Docker Desktop, so the Microcks container talks to the model runner directly instead of going through ports published on the host machine.
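
If you want to confirm that endpoint is reachable before wiring it into Microcks, you can call it from inside a throwaway container. This is a quick sketch that assumes Docker Model Runner exposes the usual OpenAI-style chat completions path under /engines/llama.cpp/v1/ (paths may differ slightly between Docker Desktop versions):

docker run --rm curlimages/curl -s \
  http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "ai/qwen3:8B-F16", "messages": [{"role": "user", "content": "Say hello"}]}'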

Next, enable the copilot feature itself by adding this line to the Microcks config/features.properties file:

features.feature.ai-copilot.enabled=true

Running Microcks

Start Microcks with Docker Compose in development mode:

docker-compose -f docker-compose-devmode.yml up
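
If the UI isn't reachable right away, the containers may still be starting; you can check their status with the usual Compose command:

docker-compose -f docker-compose-devmode.yml ps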

Once up, access the Microcks UI at http://localhost:8080.

Install the example API for testing. Click through these buttons on the Microcks page:
Microcks Hub → MicrocksIO Samples APIs → pastry-api-openapi v.2.0.0 → Install → Direct import → Go.

Pastry API service in Microcks

Using AI Copilot Samples

Within the Microcks UI, navigate to the service page of the imported API and select an operation you'd like to enhance. Open the "AI Copilot Samples" dialog; this prompts Microcks to query the configured LLM via Docker Model Runner.

Click the button to trigger AI

You may notice increased GPU activity as the model processes your request.

After processing, the AI-generated mock responses are displayed, ready to be reviewed or added directly to your mocked operations.

Generated API responses

You can easily test the generated mocks with a simple curl command. For example:

curl -X PATCH 'http://localhost:8080/rest/API+Pastry+-+2.0/2.0.0/pastry/Chocolate+Cake' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"status":"out_of_stock"}'

{
  "name" : "Chocolate Cake",
  "description" : "Rich chocolate cake with vanilla frosting",
  "size" : "L",
  "price" : 12.99,
  "status" : "out_of_stock"
}

This returns a realistic, AI-generated response that enhances the quality and reliability of your test data.

For better reproducibility, you can declare the Docker Model Runner dependency and the chosen model explicitly as a service in your compose.yml:

ai_runner:
  provider:
    type: model
    options:
      model: ai/qwen3:8B-F16

Then simply starting the Compose setup will also pull the model and wait for it to become available, the same way Compose does for containers.
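
For context, here is roughly how that provider service could sit next to the Microcks services in the same Compose file. The service name ai_runner and the depends_on wiring are illustrative rather than part of the stock Microcks compose files, and app stands in for the existing Microcks service; depends_on simply ensures the model is pulled and ready before the dependent service starts:

services:
  ai_runner:
    provider:
      type: model
      options:
        model: ai/qwen3:8B-F16

  app:
    # ... existing Microcks service definition ...
    depends_on:
      - ai_runner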

Conclusion

Docker Model Runner is an excellent way to run LLMs locally, and its OpenAI-compatible API allows for seamless integration into existing workflows.
Microcks, for example, can use Docker Model Runner to generate sample responses for the APIs it mocks, giving you richer synthetic data for your integration tests.

In this article, we looked at what it takes to configure these two tools to work together. If you have local AI workflows or just run LLMs locally, please let me know; I'd love to explore more local AI integrations with Docker.
