I'm a pragmatist at heart. While I don't fully believe in using AI for everything, I did find myself getting very frustrated with my copy-and-paste process for "good" Terraform configuration. I had already written Terraform configuration that ran with many resources and was mostly secure by default. Why did I have to dig up an example from two or three years ago and then update it? Could I really use AI to write some new demo code?
I realized I had a lot of content I could reference to get myself out of the copy-paste whirlpool. Most of the time, I looked up:
- Slides of old talks with accurate diagrams
- Some old code from two or three specific repositories on GitHub
- My book
- Terraform modules in the registry
The problem? I know how to build infrastructure with Terraform, but I know nothing about AI. So I decided to learn.
When I started blogging and trying to learn technology for myself, I ran everything locally and avoided paying for resources. That meant using the free credits for most cloud or managed offerings and working within a resource-constrained system. For this series, I decided on the following tools:
- Ollama for models
- Langflow for no-code/low-code agentic development
- OpenSearch for vector search
- Docling to process my PDF documents
As for the model, I was willing to try some of the "open" models like Granite through Ollama. If they didn't work, I would try others.
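Before committing to a model, I could pull it and prompt it directly to see how it behaved on my laptop. Here's a rough smoke test, assuming Ollama is already installed on the host; the model tags match what I use later in the containers:
# Pull the small Granite chat model and the matching embedding model
ollama pull granite4:tiny-h
ollama pull granite-embedding:30m
# Ask for a trivial piece of Terraform and eyeball the response
ollama run granite4:tiny-h "Write a terraform block that pins the AWS provider version."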
Building something to run models
As a starting point, I ran everything in containers. If I needed more resources, I could move everything to a cloud deployment later. With a Docker Compose file, I deployed Ollama, Langflow, and OpenSearch.
Ollama runs models on your local machine. Since I get impatient waiting for Ollama to start and pull models, I built a custom image with Ollama and the models and embeddings pre-pulled.
FROM ollama/ollama:latest
# Run the init script at build time so the models are already cached in the image
COPY ./init-ollama.sh /tmp/init-ollama.sh
WORKDIR /tmp
RUN chmod +x init-ollama.sh \
    && ./init-ollama.sh
EXPOSE 11434
In this example, I used granite4:tiny-h since I am running it locally on my laptop.
#!/usr/bin/env bash
# Start the Ollama server in the background so models can be pulled during the build
ollama serve &
# Wait until the server responds before pulling
until ollama list > /dev/null 2>&1; do
  sleep 1
done
ollama pull granite4:tiny-h
ollama pull granite-embedding:30m
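To check that the models actually got baked into the image, I could build and run it on its own before adding it to Compose. The image tag here is my own choice; the paths match the Dockerfiles directory referenced later in the Compose file:
# Build the image with the pre-pulled models
docker build -t ollama-preloaded -f Dockerfiles/Dockerfile.ollama Dockerfiles
# Run it and list the models over the API
docker run --rm -d --name ollama-test -p 11434:11434 ollama-preloaded
curl http://localhost:11434/api/tags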
Deploying an agent toolchain
I do not know how to write an AI agent. I also didn't feel like coding a whole agent toolchain to achieve my goal of writing infrastructure for my purposes. Luckily, I found Langflow, which offers a no-code/low-code way to deploy AI agents and MCP servers. I created a Dockerfile for Langflow.
FROM langflowai/langflow:1.7.2
USER root
# Install the system libraries Docling needs, plus the Langflow Docling extra
RUN apt update && apt install -y libgl1 libglib2.0-0 \
    && uv pip install "langflow[docling]"
CMD ["python", "-m", "langflow", "run", "--host", "0.0.0.0", "--port", "7860"]
Initially, I used the stock Langflow image without creating a custom Dockerfile. Unfortunately, the Docling component I wanted to use for processing the PDF chapters of my book needed additional dependencies installed. I built those into my own Langflow image so I didn't have to run the install separately.
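To confirm the Docling dependencies were actually in place before wiring the container into Compose, I could build and run this image by itself (the tag is arbitrary):
docker build -t langflow-docling -f Dockerfiles/Dockerfile.langflow Dockerfiles
docker run --rm -p 7860:7860 langflow-docling
# Then open http://localhost:7860 and add a Docling component to a flow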
Building a RAG stack for context
It turns out that retrieval-augmented generation (RAG) is an important part of making my use case work. I have context that I want my agent to use, so that information needs to be processed and stored in a vector database.
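In practice, "processed and stored" means each chunk of text gets converted into an embedding vector before it lands in the index. As a rough sketch of that first step, this is what a single embedding request against the local Ollama API looks like with the Granite embedding model (the 384-dimension output is what the vector index below is sized for):
# Generate an embedding for one chunk of text
curl http://localhost:11434/api/embeddings -d '{
  "model": "granite-embedding:30m",
  "prompt": "Terraform module that builds a network with three private subnets"
}'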
I chose OpenSearch because I had deployed it before and could run it locally. Unfortunately, it turns out that using OpenSearch as a vector database for the Docling output requires some additional configuration. Supposedly, OpenSearch creates an index if it doesn't already exist, but it's unclear whether auto-creation applies only to a simple index or also to a vector index. Either way, I kept getting errors from Langflow that the index did not exist.
As a workaround, I reverse-engineered the index structure and called the OpenSearch API manually to create an empty index. At this point, I was tired of writing scripts and resorted to asking Project Bob, an AI software agent, for help. I think I asked it to generate a Dockerfile for OpenSearch with a step to create a vector index named "langflow" with an "ef_search" of 512 and a property named "chunk_embeddings" of type "knn_vector" with 384 dimensions. It gave me a pretty good script in response, complete with the proper API call.
#!/bin/bash
# Wait for OpenSearch to be ready
echo "Waiting for OpenSearch to start..."
until curl -s http://localhost:9200/_cluster/health > /dev/null; do
sleep 2
done
echo "OpenSearch is ready. Creating 'langflow' index..."
# Create the langflow index with vector search configuration
curl -X PUT "http://localhost:9200/langflow" -H 'Content-Type: application/json' -d'
{
"settings": {
"index": {
"knn": true,
"knn.algo_param.ef_search": 512
}
},
"mappings": {
"properties": {
"chunk_embedding": {
"type": "knn_vector",
"dimension": 384
}
}
}
}
'
echo ""
echo "Index 'langflow' created successfully!"
Next, Bob created a Dockerfile out of the script. Bob was a bit verbose, but the result did work with some modifications.
FROM opensearchproject/opensearch:3
# Copy the initialization script
COPY ./init-opensearch.sh /usr/share/opensearch/init-opensearch.sh
# Create a wrapper script to run both OpenSearch and the init script
RUN echo '#!/bin/bash' > /usr/share/opensearch/entrypoint-wrapper.sh && \
echo '/usr/share/opensearch/opensearch-docker-entrypoint.sh opensearch &' >> /usr/share/opensearch/entrypoint-wrapper.sh && \
echo 'sleep 5' >> /usr/share/opensearch/entrypoint-wrapper.sh && \
echo '/usr/share/opensearch/init-opensearch.sh' >> /usr/share/opensearch/entrypoint-wrapper.sh && \
echo 'wait' >> /usr/share/opensearch/entrypoint-wrapper.sh && \
chmod +x /usr/share/opensearch/entrypoint-wrapper.sh
# Use the wrapper as the entrypoint
ENTRYPOINT ["/usr/share/opensearch/entrypoint-wrapper.sh"]
As I said before, I am pragmatic about my use of AI for coding. I didn't want to use something like Bob to speed things up, but I got tired and thought, "Why not?" I think the Dockerfile and script were generated in about two minutes, compared to the hour it took me to write and test the Ollama one. The result was functional, but I wouldn't use AI to generate anything I didn't have the confidence to verify or test myself.
Putting it all together
I created the set of containers using Docker Compose on my local machine, including the Terraform MCP server. By using the MCP server for the Terraform registry, I could search the publicly available modules and providers to expedite writing new examples and updating versions of modules I had used before.
Each of the containers includes a set of environment variables so it can run locally. Some variables, like those for OpenSearch, disable the security plugin and skip the demo security configuration for ease of use.
services:
terraform-mcp-server:
image: hashicorp/terraform-mcp-server:0.3.3
container_name: terraform-mcp-server
ports:
- "8080:8080"
environment:
- 'TRANSPORT_MODE=streamable-http'
- 'TRANSPORT_HOST=0.0.0.0'
- 'TFE_TOKEN=${TFE_TOKEN}'
ollama:
build:
context: Dockerfiles
dockerfile: Dockerfile.ollama
container_name: ollama
ports:
- "11434:11434"
volumes:
- ollama_data:/root/.ollama
restart: unless-stopped
environment:
- 'OLLAMA_CONTEXT_LENGTH=131072'
langflow:
build:
context: Dockerfiles
dockerfile: Dockerfile.langflow
container_name: langflow
ports:
- "7860:7860"
environment:
- 'LANGFLOW_HOST=0.0.0.0'
- 'LANGFLOW_OPEN_BROWSER=false'
- 'LANGFLOW_WORKER_TIMEOUT=1800'
volumes:
- langflow_data:/app/langflow
depends_on:
- ollama
restart: unless-stopped
opensearch:
build:
context: Dockerfiles
dockerfile: Dockerfile.opensearch
container_name: opensearch
environment:
- cluster.name=opensearch-cluster
- node.name=opensearch-node1
- discovery.type=single-node
- bootstrap.memory_lock=true
- "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
- "DISABLE_INSTALL_DEMO_CONFIG=true"
- "DISABLE_SECURITY_PLUGIN=true"
ulimits:
memlock:
soft: -1
hard: -1
nofile:
soft: 65536
hard: 65536
volumes:
- opensearch_data:/usr/share/opensearch/data
ports:
- "9200:9200"
- "9600:9600"
restart: unless-stopped
volumes:
ollama_data: {}
langflow_data: {}
opensearch_data: {}
Each component represents an important part of the AI stack: prompts, agents, context, and models. Once the containers came up, I could access Langflow at http://localhost:7860.
I could access the remaining components via their APIs or connect them to a flow in Langflow.
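Bringing up the whole stack and spot-checking each service looked roughly like this, assuming the Compose file is saved as docker-compose.yaml and TFE_TOKEN is exported in my shell:
# Build the custom images and start everything in the background
docker compose up -d --build
# Spot-check each service
curl http://localhost:11434/api/tags          # Ollama lists the pre-pulled models
curl "http://localhost:9200/_cat/indices?v"   # OpenSearch shows the langflow index
curl -I http://localhost:7860                 # Langflow UI responds here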
Conclusion
After much trial and error, I managed to figure out how to create a local stack to build out an AI agent to help me update my examples based on knowledge from my book, talks, and other code examples. As I explore more, I will add more tools and context to improve the agent (and maybe even build multiple agents).
I realized I could probably have used the AI agent built into my coding IDE to do most of this. Project Bob did end up helping me build this stack, and it did make the process faster. The downside to using any AI agent built into my IDE was the overall cost; I quickly realized I had to check my usage to make sure I didn't make too many requests.
I was glad that I could run this locally. The small Granite model really helped; I only had to give Ollama a little more CPU and memory to run it. Running locally let me mitigate the cost of running against hosted LLMs and maybe achieve a similar result. I found the process of deploying each component valuable as a learning experience.
Next, I plan to build a flow in Langflow that processes all of my book chapters, slides, and code examples before passing them to an agent.
