<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zachary Loeber</title>
    <description>The latest articles on DEV Community by Zachary Loeber (@zloeber).</description>
    <link>https://dev.to/zloeber</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F135969%2Fc0b98bdd-480d-46da-bcfd-821ab422983a.jpg</url>
      <title>DEV Community: Zachary Loeber</title>
      <link>https://dev.to/zloeber</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zloeber"/>
    <language>en</language>
    <item>
      <title>3 LLM Underdogs of 2025</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Thu, 08 Jan 2026 17:29:53 +0000</pubDate>
      <link>https://dev.to/zloeber/3-llm-underdogs-of-2025-2e1i</link>
      <guid>https://dev.to/zloeber/3-llm-underdogs-of-2025-2e1i</guid>
      <description>&lt;p&gt;2025 has been a flurry of AI madness that has been hard to keep up with. I've been deep in learning and experimenting in the AI space and noticed that while everyone's hyping up the latest GPT variant or Claude release, there are some genuinely impressive open-source models that feel like they just flew under the radar in 2025. These aren't just "good for their size", they're legit excellent models that you can run locally, for free, and deserve more attention.&lt;/p&gt;




&lt;p&gt;This isn't a benchmark shootout or a comparison article. Instead, I want to shine a light on three models that I think are more important than they're given credit for. They range in parameter size:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Technology Innovation Institute's &lt;a href="https://falconllm.tii.ae/falcon-h1r-7b.html" rel="noopener noreferrer"&gt;Falcon H1R 7B&lt;/a&gt; - A super fast edge-ready math wiz and reasoning model&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;NVIDIA's Nemotron Nano &lt;a href="https://huggingface.co/nvidia/Llama-3.1-Nemotron-Nano-8B-v1" rel="noopener noreferrer"&gt;8B&lt;/a&gt; and &lt;a href="https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-Nano-Technical-Report.pdf" rel="noopener noreferrer"&gt;30B&lt;/a&gt; - These models sport a massive maximum context length of 1 million tokens!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;ServiceNow's &lt;a href="https://huggingface.co/ServiceNow-AI/Apriel-1.6-15b-Thinker" rel="noopener noreferrer"&gt;Apriel 1.6 15b Thinker&lt;/a&gt; - A mid-sized LLM that trounces other models on tool usage&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each brings something unique to the table, so let's dig into what makes them special and where you might want to put them to work for ya.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; If you want to compare and contrast these with other open-source models in this class, they all technically fall into the 'small' category at &lt;a href="https://artificialanalysis.ai/models/open-source/small" rel="noopener noreferrer"&gt;Artificial Analysis&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Falcon H1R 7B: Efficiency Meets Reasoning
&lt;/h2&gt;

&lt;p&gt;Let's start with &lt;a href="https://www.tii.ae/about-us" rel="noopener noreferrer"&gt;TII&lt;/a&gt;'s Falcon H1R 7B, which just dropped (literally, like hours ago as I'm writing this). This one caught my attention immediately because it challenges a fundamental assumption we've all been making: that you need massive models for serious reasoning tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Makes It Special
&lt;/h3&gt;

&lt;p&gt;The Falcon H1R 7B uses a hybrid Transformer-Mamba architecture, which is a fancy way of saying they've combined two different approaches to get better performance with fewer parameters. The result? This 7-billion parameter model is punching way above its weight class. It scored &lt;strong&gt;88.1% on AIME-24 mathematics benchmarks&lt;/strong&gt;, outperforming ServiceNow's Apriel 1.5 at 15B parameters. Yeah, a model with less than half the parameters performing better on advanced math.&lt;/p&gt;

&lt;p&gt;But here's where it gets really interesting: it processes up to 1,500 tokens per second per GPU at batch size 64. That's nearly &lt;strong&gt;double the speed of comparable models&lt;/strong&gt;. For anyone building multi-agent systems or handling high-volume inference workloads, this matters tremendously.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where It Shines
&lt;/h3&gt;

&lt;p&gt;As mentioned already, this model is quite good at math, but that's not its main superpower. The sweet spot for Falcon H1R 7B is anywhere you need reliable reasoning without the compute overhead of larger models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Edge deployments&lt;/strong&gt;: Running on constrained hardware where every parameter counts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time applications&lt;/strong&gt;: That token throughput makes it viable for interactive systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Math and coding tasks&lt;/strong&gt;: It delivers 68.6% accuracy on coding and agentic tasks, best-in-class for models under 8B&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Energy-conscious deployments&lt;/strong&gt;: Lower memory and energy consumption while maintaining near-perfect scores on benchmarks&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why It's Important
&lt;/h3&gt;

&lt;p&gt;The open-source AI community has been in an arms race of parameter counts. Falcon H1R 7B demonstrates that architectural innovations can matter more than raw size. It's released under the Falcon TII License, making it accessible for both research and commercial use. For developers building on limited budgets or those who need to deploy at scale, this efficiency-without-compromise approach is exactly what we need.&lt;/p&gt;




&lt;h2&gt;
  
  
  Nemotron Nano: Massive Context Length
&lt;/h2&gt;

&lt;p&gt;NVIDIA's Nemotron Nano family represents a different kind of innovation. The 8B variant (Llama-3.1-Nemotron-Nano-8B-v1) and the more recent 30B variant are part of what NVIDIA calls their most efficient family of open models with leading accuracy for agentic AI applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Makes Them Special
&lt;/h3&gt;

&lt;p&gt;The Nemotron Nano models use a hybrid Mixture-of-Experts (MoE) architecture combined with Mamba-2 layers. The 30B model is actually a 31.6B total parameter model that activates only about 3.6B parameters per token. This sparse activation approach means you get the intelligence of a much larger model with the speed and memory footprint of a smaller one.&lt;/p&gt;

&lt;p&gt;The context window is another standout feature: &lt;strong&gt;1 million tokens&lt;/strong&gt;! Yes, you read that right. For comparison, that's enough to fit several entire codebases or extensive documentation in a single context. The implications for code review, documentation generation, and long-form reasoning tasks are significant.&lt;/p&gt;
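&lt;p&gt;For a rough sense of what 1 million tokens buys you, the common heuristic of about 4 characters per token lets you estimate whether a whole source tree fits in the window. A minimal sketch (heuristic only, not a real tokenizer; the extension list is just an example):&lt;/p&gt;

```python
import os

CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary by model

def estimate_tokens(root, exts=(".py", ".tf", ".md")):
    """Walk a source tree and roughly estimate its token count."""
    total_chars = 0
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if fname.endswith(exts):
                total_chars += os.path.getsize(os.path.join(dirpath, fname))
    return total_chars // CHARS_PER_TOKEN

def context_fraction(root, context_tokens=1_000_000):
    """Fraction of a 1M-token window the tree would occupy."""
    return estimate_tokens(root) / context_tokens
```

If `context_fraction("./my-repo")` comes back under 1.0, the whole tree should (roughly) fit in a single 1M-token context.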

&lt;h3&gt;
  
  
  Where They Shine
&lt;/h3&gt;

&lt;p&gt;Aside from the large context, the Nemotron Nano models excel in scenarios where you need both reasoning capability and practical throughput:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-agent systems&lt;/strong&gt;: The 30B variant delivers 4x higher throughput than Nemotron 2 Nano, making it ideal for systems where multiple agents need to collaborate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Software development&lt;/strong&gt;: Best-in-class performance on SWE-Bench among models in its size class&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic workflows&lt;/strong&gt;: Built specifically for tasks like software debugging, content summarization, and information retrieval&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-context tasks&lt;/strong&gt;: That 1M token window makes it perfect for analyzing large codebases or extensive documents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The 8B variant is particularly interesting for edge deployments or when you want reasoning capabilities on more modest hardware. NVIDIA optimized it specifically for PC and edge use cases, and it shows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why They're Important
&lt;/h3&gt;

&lt;p&gt;NVIDIA isn't just releasing models, they're releasing the entire ecosystem. The Nemotron 3 family comes with open training datasets (25T tokens worth), reinforcement learning environments through NeMo Gym, and the full training recipe. This level of transparency is rare and incredibly valuable for researchers and practitioners who want to understand not just what works, but why it works.&lt;/p&gt;

&lt;p&gt;The hybrid MoE architecture is also proving to be a game-changer for efficiency. By activating only a subset of parameters per token, these models achieve what researchers call the "Pareto frontier", optimal speed without sacrificing quality. This architectural approach could influence how we think about model design going forward.&lt;/p&gt;




&lt;h2&gt;
  
  
  Apriel 1.6 15B Thinker: Tool Wielding Reasoning Model
&lt;/h2&gt;

&lt;p&gt;Now let's talk about Apriel 1.6 15B Thinker, which might be the most underrated model in this entire lineup. ServiceNow has been quietly building something impressive with their Apriel SLM series, and version 1.6 demonstrates what's possible when you focus on both performance and efficiency.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Makes It Special
&lt;/h3&gt;

&lt;p&gt;Apriel 1.6 is a multimodal reasoning model, meaning it can work with both text and images. It scored 57 on the Artificial Analysis Index, putting it on par with models like Qwen3 235B A22B and DeepSeek-v3.2, models roughly 15x its size.&lt;/p&gt;

&lt;p&gt;If you look closer at this model compared to others in its class, you will find that it absolutely trounces almost all of the others in &lt;a href="https://artificialanalysis.ai/models/open-source/small" rel="noopener noreferrer"&gt;tool use&lt;/a&gt;. It outperforms larger models like gpt-oss-20b by almost 10% on some of the tests. Looking through the various test charts, it is almost funny to see how many models with twice the parameters score below Apriel.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; The ability to use tools well is the difference between a toy LLM you play with and a machine you can use for real work. A model that uses tools well can also be supplemented with MCP servers to give it additional skills and capabilities beyond its training.&lt;/p&gt;
&lt;/blockquote&gt;
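&lt;p&gt;To make the tool-use point concrete, here is a minimal, hypothetical sketch of the dispatch loop that sits behind OpenAI-style function calling (the shape most tool-capable models, Apriel included, emulate): the model emits a JSON tool call, and your code executes it and feeds the result back. The &lt;code&gt;get_weather&lt;/code&gt; tool is an invented stub, not part of any real API.&lt;/p&gt;

```python
import json

# Hypothetical local tool; a real one would call out to an actual service.
def get_weather(city):
    return {"city": city, "forecast": "sunny"}

# Registry mapping tool names the model knows about to local callables.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call):
    """Execute one OpenAI-style tool call dict and return a JSON result."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    result = TOOLS[name](**args)
    # This string would be sent back to the model as a 'tool' role message.
    return json.dumps(result)

# Example of what a tool-capable model might emit:
call = {"function": {"name": "get_weather",
                     "arguments": '{"city": "Chicago"}'}}
print(dispatch(call))  # {"city": "Chicago", "forecast": "sunny"}
```

A model that reliably produces well-formed calls like `call` above is the one that survives contact with real agentic work.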

&lt;h3&gt;
  
  
  Where It Shines
&lt;/h3&gt;

&lt;p&gt;Apriel 1.6 excels in domains where you need both vision and reasoning in the enterprise:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document understanding&lt;/strong&gt;: OCR, chart analysis, and structured data extraction from images&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise applications&lt;/strong&gt;: It scores 69 on Tau2 Bench Telecom and 69 on IFBench (key benchmarks for enterprise domains)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Function calling and tool use&lt;/strong&gt;: The simplified chat template and special tokens make it easier to integrate with agentic systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource-constrained deployments&lt;/strong&gt;: At 15B parameters, it fits on a single GPU while delivering frontier-level performance&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why It's Important
&lt;/h3&gt;

&lt;p&gt;Apriel 1.6 represents a crucial evolution in how we think about multimodal AI. Most multimodal models are either massive (100B+ parameters) or sacrifice significant capability to stay small. ServiceNow has found a middle ground that makes advanced vision-language capabilities accessible.&lt;/p&gt;

&lt;p&gt;The training approach is also noteworthy. Trained on NVIDIA's GB200 Grace Blackwell Superchips, the entire mid-training pipeline required approximately 10,000 GPU hours, a relatively small compute footprint achieved through careful data strategy and training methodology. This efficiency-first mindset shows that throwing more compute at the problem isn't always the answer.&lt;/p&gt;

&lt;p&gt;For developers building enterprise AI applications, Apriel 1.6 offers something unique: production-ready multimodal reasoning that actually fits in a reasonable memory budget. The focus on enterprise benchmarks and tool calling also makes it particularly well-suited for real-world business applications rather than just benchmark chasing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;What ties these three models together isn't just that they're flying under the radar, it's what they represent about where AI development is heading. We're moving away from the "bigger is always better" mentality toward a more nuanced understanding of efficiency, architecture, and targeted optimization.&lt;/p&gt;

&lt;p&gt;Falcon H1R 7B shows that hybrid architectures can achieve remarkable results with fewer parameters. Nemotron Nano demonstrates that sparse activation through MoE can give us the best of both worlds, large model intelligence with small model efficiency. Apriel 1.6 proves that multimodal capabilities don't require massive models if you're thoughtful about training and optimization.&lt;/p&gt;

&lt;p&gt;All three of these models are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fully open and available for local deployment&lt;/li&gt;
&lt;li&gt;Designed with efficiency as a first-class concern&lt;/li&gt;
&lt;li&gt;Backed by transparent research and training methodologies&lt;/li&gt;
&lt;li&gt;Focused on practical, real-world use cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For those of us building AI-powered applications, especially in environments where we can't just throw unlimited compute resources at every problem, these models matter. They represent a future where advanced AI capabilities are accessible to anyone with modest hardware, not just those with access to massive GPU clusters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;If you want to try these models yourself, all three can be run locally using tools like llama.cpp, Ollama, or vLLM.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Falcon H1R 7B&lt;/strong&gt;: Available &lt;a href="https://huggingface.co/blog/tiiuae/falcon-h1r-7b" rel="noopener noreferrer"&gt;on Hugging Face&lt;/a&gt; and Ollama (&lt;code&gt;ollama pull falcon:7b&lt;/code&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;NVIDIA Nemotron Nano 8B/30B&lt;/strong&gt;: Available &lt;a href="https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16" rel="noopener noreferrer"&gt;on Hugging Face&lt;/a&gt;, through NVIDIA's NIM platform, and Ollama (&lt;code&gt;ollama pull nemotron-3-nano:30b&lt;/code&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Apriel 1.6 15B Thinker&lt;/strong&gt;: Available &lt;a href="https://huggingface.co/blog/ServiceNow-AI/apriel-1p6-15b-thinker" rel="noopener noreferrer"&gt;on Hugging Face&lt;/a&gt; and hosted on platforms like Together AI and Ollama (&lt;code&gt;ollama pull ServiceNow-AI/Apriel-1.6-15b-Thinker:Q4_K_M&lt;/code&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
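&lt;p&gt;Whichever model you pull, Ollama exposes the same local HTTP API (by default on &lt;code&gt;localhost:11434&lt;/code&gt;), so a smoke test looks identical across all three. A hedged standard-library sketch; the model name is whichever one you actually pulled:&lt;/p&gt;

```python
import json
import urllib.request

def build_request(model, prompt, host="http://localhost:11434"):
    """Build a urllib Request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("falcon:7b", "What is 2 + 2?")
# To actually run it (requires a local Ollama server):
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read())["response"])
```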

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;The AI landscape moves fast so it's easy to get caught up in the hype around the latest massive model releases. But some of the most interesting innovation is happening in the efficiency space, building models that are genuinely useful for practitioners who don't have access to unlimited compute resources. Falcon H1R 7B, NVIDIA Nemotron Nano, and Apriel 1.6 15B Thinker deserve more attention than they're getting. If you've been thinking about integrating AI into your projects but have been put off by the resource requirements of larger models, these three are worth a serious look.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links and Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Technology Innovation Institute's &lt;a href="https://falconllm.tii.ae/falcon-h1r-7b.html" rel="noopener noreferrer"&gt;Falcon H1R 7B&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;NVIDIA's Nemotron Nano &lt;a href="https://huggingface.co/nvidia/Llama-3.1-Nemotron-Nano-8B-v1" rel="noopener noreferrer"&gt;8B&lt;/a&gt; and &lt;a href="https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16" rel="noopener noreferrer"&gt;30B&lt;/a&gt; and its &lt;a href="https://research.nvidia.com/labs/nemotron/files/NVIDIA-Nemotron-3-Nano-Technical-Report.pdf" rel="noopener noreferrer"&gt;technical paper&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;ServiceNow's &lt;a href="https://huggingface.co/ServiceNow-AI/Apriel-1.6-15b-Thinker" rel="noopener noreferrer"&gt;Apriel 1.6 15b Thinker&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
    </item>
    <item>
      <title>Terraform Module MCP Server</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Fri, 31 Oct 2025 01:07:55 +0000</pubDate>
      <link>https://dev.to/zloeber/terraform-module-mcp-server-m1p</link>
      <guid>https://dev.to/zloeber/terraform-module-mcp-server-m1p</guid>
      <description>&lt;p&gt;I created an MCP server that streamlines access to custom Terraform modules that I'd like to share with the community. The project called &lt;a href="https://github.com/zloeber/terraform-ingest" rel="noopener noreferrer"&gt;terraform-ingest&lt;/a&gt; is a CLI, MCP, and API tool that can be used locally or with an AI agent to tap into your existing code base more effectively.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;I looked high and low for a model context protocol server I could use as an interface to the several dozen custom Terraform modules I've created. My search was futile, so I spent a few cycles and made my own. With &lt;a href="https://github.com/zloeber/terraform-ingest" rel="noopener noreferrer"&gt;this solution&lt;/a&gt; you can use a simple YAML file to define the git sources for all your custom Terraform modules. These are then ingested to extract and index all relevant information for embedding into a vector database. From there you can use this tool as a CLI, API, or MCP server to find the modules that best suit your needs and weave them together to create organization-compliant infrastructure as code!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io/docs" rel="noopener noreferrer"&gt;Model Context Protocol&lt;/a&gt; is the wildly popular interface for AI agentic workloads. I've written about how to use existing MCP Servers to augment AI for vast improvements in its ability to &lt;a href="https://blog.zacharyloeber.com/article/cagent-terraformer-rewrite/" rel="noopener noreferrer"&gt;clean up crappy&lt;/a&gt; and even &lt;a href="https://blog.zacharyloeber.com/article/create-terraform-with-ai-and-github-copilot/" rel="noopener noreferrer"&gt;author new&lt;/a&gt; Terraform. This works quite well for non-modular code. But it lacks the ability to interface with the many custom modules I've personally created for the teams I've worked with. And while I've created &lt;a href="https://github.com/metagit-ai/metagit-cli" rel="noopener noreferrer"&gt;tools&lt;/a&gt; to help manage multi-git project repositories as a single virtual monorepo it still is hard, even for me, to consistently know the right variables and outputs for the modules needed for any given solution. What we all need is an MCP server that better aggregates several dozen terraform modules (both as single projects or recursively as monorepo projects) into one managed source of truth.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;A &lt;a href="https://en.wikipedia.org/wiki/Retrieval-augmented_generation" rel="noopener noreferrer"&gt;RAG&lt;/a&gt; ingestion pipeline for Terraform modules is how I envisioned solving these issues. It makes sense to pull in and ingest key information about each targeted module: the versions of interest, providers, and target paths. We can accomplish this with some simple git cloning, HCL parsing, and finally vector DB embedding of the generated JSON that summarizes each module for similarity searching.&lt;/p&gt;
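&lt;p&gt;As a rough illustration of that pipeline (a toy sketch, not the actual terraform-ingest code), extracting a module summary can be as simple as scanning the HCL for variable and output blocks and serializing the result for embedding:&lt;/p&gt;

```python
import json
import re

# A tiny stand-in for a cloned module's HCL source.
SAMPLE_HCL = '''
variable "vpc_cidr" {
  type = string
}
variable "name" {}
output "vpc_id" {
  value = aws_vpc.this.id
}
'''

def summarize_module(name, hcl_text):
    """Toy HCL scan: collect variable/output names into an embeddable summary."""
    variables = re.findall(r'variable\s+"([^"]+)"', hcl_text)
    outputs = re.findall(r'output\s+"([^"]+)"', hcl_text)
    return {"module": name, "variables": variables, "outputs": outputs}

summary = summarize_module("aws-vpc", SAMPLE_HCL)
# This JSON blob is what would get embedded for similarity search.
print(json.dumps(summary))
```

A real implementation would use a proper HCL parser rather than regexes, but the shape of the output, one searchable summary document per module version, is the same.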

&lt;h2&gt;
  
  
  How It Was Built
&lt;/h2&gt;

&lt;p&gt;In this case the solution was built in a few rounds, using AI for the heavy lifting and scaffolding, in this general order:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a YAML definition and Click CLI interface with a FastAPI server to allow the ingestion of one or more git repos of Terraform modules into local JSON files, one per target git branch or tag.&lt;/li&gt;
&lt;li&gt;Add a FastMCP interface for searching through the results.&lt;/li&gt;
&lt;li&gt;Add embedding of this data into a local vector database using an embedding method of your choice (starting with ChromaDB and a local, less precise embedding model, with optional Ollama or external embedding providers).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In between these steps at various points I added in the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-stage docker images&lt;/li&gt;
&lt;li&gt;A bunch of gitlab pipelines, publishing to PyPI, better unit tests, semantic releases, et cetera&lt;/li&gt;
&lt;li&gt;hatch-vcs for build and automatic versioning&lt;/li&gt;
&lt;li&gt;mkdocs based static website generation for docs&lt;/li&gt;
&lt;li&gt;Minor refactors from what got generated for me to be more cohesive to the way I like to have my Python apps&lt;/li&gt;
&lt;li&gt;Additional MCP dynamic resources and prompt template generation&lt;/li&gt;
&lt;li&gt;Lazy-loading of bulky dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/zloeber/terraform-ingest-example" rel="noopener noreferrer"&gt;This project&lt;/a&gt; includes a practical configuration example that encompasses all the modules for a team of Terraform professionals I happen to hold in the highest regards, &lt;a href="https://github.com/cloudposse/" rel="noopener noreferrer"&gt;Cloudposse&lt;/a&gt;. They have produced almost 200 high-quality open source AWS terraform modules. You can look at this example project to get more information on how to generate and then ingest such a configuration. The configuration file has hundreds of entries like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-access-analyzer&lt;/span&gt;
  &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://github.com/cloudposse-terraform-components/aws-access-analyzer.git&lt;/span&gt;
  &lt;span class="na"&gt;recursive&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[]&lt;/span&gt;
  &lt;span class="na"&gt;include_tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;max_tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./src&lt;/span&gt;
  &lt;span class="na"&gt;exclude_paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[]&lt;/span&gt;
&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once you have ingested the modules into a local cache and embedded them into a vector database, it is easy to search without any AI.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Retrieve the top 5 similar results using the vectordb search&lt;/span&gt;
terraform-ingest search &lt;span class="s2"&gt;"API Gateway Lambda integration"&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt; 5

&lt;span class="c"&gt;# Get all the details about the top search result for 'vpc module for aws' (requires jq)&lt;/span&gt;
terraform-ingest search &lt;span class="s2"&gt;"vpc module for aws"&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt; 1 &lt;span class="nt"&gt;-j&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ./.vscode/cloudposse.yaml | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.results[0].id'&lt;/span&gt; | xargs &lt;span class="nt"&gt;-I&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; terraform-ingest index get &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI also includes the ability to call exposed MCP tools directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform-ingest &lt;span class="k"&gt;function &lt;/span&gt;&lt;span class="nb"&gt;exec &lt;/span&gt;get_module_details &lt;span class="nt"&gt;--arg&lt;/span&gt; repository &lt;span class="s2"&gt;"https://github.com/cloudposse-terraform-components/aws-api-gateway-rest-api.git"&lt;/span&gt; &lt;span class="nt"&gt;--arg&lt;/span&gt; ref &lt;span class="s2"&gt;"v1.535.3"&lt;/span&gt; &lt;span class="nt"&gt;--arg&lt;/span&gt; path &lt;span class="s2"&gt;"src"&lt;/span&gt; &lt;span class="nt"&gt;--output-dir&lt;/span&gt; ./output
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  More Use Cases
&lt;/h2&gt;

&lt;p&gt;Some possible use cases include (but are not limited to):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On-demand module documentation and example generation&lt;/li&gt;
&lt;li&gt;Query your authored modules via any LLM&lt;/li&gt;
&lt;li&gt;Module upgrade planning and risk analysis&lt;/li&gt;
&lt;li&gt;Greenfield deployment using your own organization's modules&lt;/li&gt;
&lt;li&gt;Running a self-updating internal MCP server for inline validation of module use via any internal agent&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;I learned a few things about developing an MCP server for RAG that are worth mentioning.&lt;/p&gt;

&lt;h3&gt;
  
  
  STDIO Is Goofy
&lt;/h3&gt;

&lt;p&gt;This is the default mode in which many local MCP servers run. The name says it all: it uses stdio streams to communicate with the server. As such, you want to prevent superfluous output to the console when running in this mode, otherwise MCP clients will get confused and start throwing JSON serialization errors.&lt;/p&gt;
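&lt;p&gt;In practice that means routing all diagnostics to stderr, since stdout must carry only the JSON-RPC frames. A minimal, framework-agnostic Python example:&lt;/p&gt;

```python
import logging
import sys

# stdout is reserved for the MCP JSON-RPC stream in STDIO mode,
# so send every log line to stderr instead.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

log = logging.getLogger("mcp-server")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("startup complete")  # goes to stderr, not stdout
```

Anything that prints to stdout (progress bars, library banners, stray `print` calls) is what corrupts the protocol stream.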

&lt;p&gt;STDIO mode is just a local process that gets kicked off on behalf of the MCP client. If you want to speed things along, you probably don't want to run long imports or embeddings at app startup. Including a means of pre-processing long-running workflows gets around this. In my case, I include a CLI that can do that prep work for terraform-ingest before the MCP services start.&lt;/p&gt;

&lt;h3&gt;
  
  
  Local Embedding Is Heavy
&lt;/h3&gt;

&lt;p&gt;Using local vector databases and embeddings is really nice for development. But they are also quite large dependencies that can add multiple GB to your distribution. This makes for large Docker images and PyPI downloads. More importantly, it really drags out your CI/CD pipelines, and I'm exceedingly impatient when it comes to long-running pipelines.&lt;/p&gt;

&lt;p&gt;To get around this, we lazy-load models and dependencies only when they are needed. My code architecture now looks something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────┐
│  User Interface Layer                               │
│  ┌──────────────┬──────────────┬─────────────────┐  │
│  │     CLI      │     API      │  Programmatic   │  │
│  └──────────────┴──────────────┴─────────────────┘  │
│         ↓           ↓              ↓                │
├─────────────────────────────────────────────────────┤
│  TerraformIngest                                    │
│  ├─ auto_install_deps parameter                     │
│  └─ calls ensure_embeddings_available()             │
├─────────────────────────────────────────────────────┤
│  dependency_installer.py                            │
│  ├─ DependencyInstaller class                       │
│  │  ├─ check_package_installed()                    │
│  │  ├─ get_missing_packages()                       │
│  │  ├─ install_packages()                           │
│  │  └─ ensure_embedding_packages()                  │
│  └─ ensure_embeddings_available() function          │
├─────────────────────────────────────────────────────┤
│  embeddings.py                                      │
│  └─ VectorDBManager (no changes, already works)     │
└─────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
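&lt;p&gt;The lazy-loading itself is nothing exotic. A hypothetical helper along these lines (not the actual terraform-ingest code) defers the heavy import until the first embedding call and fails with a useful hint otherwise:&lt;/p&gt;

```python
import importlib

def lazy_import(name, hint=None):
    """Import a module on first use; raise a helpful error if it's absent."""
    try:
        return importlib.import_module(name)
    except ImportError:
        msg = f"optional dependency '{name}' is not installed"
        if hint:
            msg += f" (try: {hint})"
        raise ImportError(msg)

# Heavy deps like chromadb stay out of startup; here a stdlib module
# stands in to show the mechanism:
json_mod = lazy_import("json")
print(json_mod.dumps({"ok": True}))  # {"ok": true}
```

Callers that never touch embeddings never pay the multi-GB import cost, which is exactly what keeps the base image and pipeline times small.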



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;WARNING&lt;/strong&gt; I usually do not recommend this approach for production use as it has security ramifications and defies best practices for artifact immutability. If you are repackaging this server up to use in production bake a new image with the appropriate embedding models in place. Then let me know as well cause that would definitely fill my bucket a bit!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I'm pretty happy with the results of this little project and plan on configuring it for any large set of modules I author. I've included some other nice features like automatic updating, automatic import of a GitHub organization, and indexing. I could easily see updating the caching to be more efficient and adding more and better searching, but even as it currently stands the MCP server is quite functional, and I encourage the community to put it through its paces so it can be further improved!&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>terraform</category>
      <category>ai</category>
      <category>devops</category>
    </item>
    <item>
      <title>Terraform with AI and Github Copilot</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Tue, 21 Oct 2025 02:32:56 +0000</pubDate>
      <link>https://dev.to/zloeber/terraform-with-ai-and-github-copilot-2n39</link>
      <guid>https://dev.to/zloeber/terraform-with-ai-and-github-copilot-2n39</guid>
      <description>&lt;p&gt;Creating terraform or other infrastructure as code for a new project can be daunting for some. This shows how you can easily crank out a new deployment to meet your requirements using Github copilot prompt files and a few free MCP servers. For the heck of it, we will also convert between two totally different cloud providers to deploy the same infrastructure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;GitHub Copilot is getting more powerful with each update, and I've been enjoying using it quite a bit to write quick scripts and even initialize whole project repositories for me. But I've never been very impressed with its (or any other LLM's) ability to create solid Terraform. I've been exploring model context protocol (MCP) servers quite a bit lately and figured perhaps they could augment an agent with enough additional capabilities to upset me less with its Terraform output. Turns out that providing Copilot with the right tools can really amplify its results!&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;I'm using GitHub Copilot in VSCode along with several MCP servers for this exercise. You can set up a project locally with MCP servers easily enough by creating a file named &lt;code&gt;./.vscode/mcp.json&lt;/code&gt; in your project. Here is what mine looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sequential-thinking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-sequential-thinking"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"."&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"terraform"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docker"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"-i"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--rm"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"hashicorp/terraform-mcp-server"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mcp-feedback-enhanced"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"uvx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"mcp-feedback-enhanced@latest"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"aws-knowledge"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://knowledge-mcp.global.api.aws"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"azure-knowledge"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://learn.microsoft.com/api/mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"inputs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are the MCP servers used:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;sequential-thinking&lt;/td&gt;
&lt;td&gt;Very popular MCP for helping an LLM organize its thoughts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;server-filesystem&lt;/td&gt;
&lt;td&gt;Reading/writing to the filesystem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;terraform&lt;/td&gt;
&lt;td&gt;Terraform best practices and provider documentation lookup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mcp-feedback-enhanced&lt;/td&gt;
&lt;td&gt;(Optional) User feedback forms for more interactive data gathering from the user&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;aws-knowledge&lt;/td&gt;
&lt;td&gt;Official AWS online knowledge datastore&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;azure-knowledge&lt;/td&gt;
&lt;td&gt;Official Azure online knowledge datastore&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I don't leave my MCP servers running all the time. If you want to start one, open the mcp.json file in the editor; above each server definition there is a little start button you can click to get it going.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; mcp-feedback-enhanced is optional because I believe Copilot will handle interfacing with you on questions just fine. But I recognize that I personally am not always going to be using Copilot for my solutions and wanted a less vendor-locked option. I'm also simply interested in human-in-the-loop MCP servers, and this one was the best of the three I tested out.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Prompts
&lt;/h2&gt;

&lt;p&gt;To create a re-usable interface you can use &lt;a href="https://docs.github.com/en/copilot/tutorials/customization-library/prompt-files/your-first-prompt-file" rel="noopener noreferrer"&gt;GitHub Copilot prompt files&lt;/a&gt; in your project by creating them in the &lt;code&gt;./.github/prompts/&lt;/code&gt; folder with a name like &lt;code&gt;*.prompt.md&lt;/code&gt;. Once created, you can kick them off at any time in the Copilot agent chat window with a &lt;code&gt;/&amp;lt;prompt&amp;gt;&lt;/code&gt; command.&lt;/p&gt;
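&lt;p&gt;If you bootstrap repos with a script, the convention is easy to capture. Here is a minimal sketch (the file name and prompt body below are placeholders for illustration, not files from this project):&lt;/p&gt;

```python
from pathlib import Path

# Hypothetical prompt file; the name before ".prompt.md" becomes the
# slash command available in the Copilot agent chat window.
prompt_dir = Path(".github/prompts")
prompt_dir.mkdir(parents=True, exist_ok=True)

body = """---
mode: 'agent'
description: 'Example placeholder prompt.'
---

Create Terraform code for: ${input:requirements:What infrastructure do you need?}
"""
(prompt_dir / "terraform-example.prompt.md").write_text(body)
```

&lt;p&gt;With a file like this in place, &lt;code&gt;/terraform-example&lt;/code&gt; would show up as a runnable command in the agent chat.&lt;/p&gt;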

&lt;p&gt;Here is one I created to walk a user through creating an AWS terraform deployment from scratch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
mode: 'agent'
description: 'Create AWS terraform code for given requirements with interactive feedback.'
---

Create AWS Terraform code for the following requirements with step-by-step reasoning and interactive feedback:

Requirements: ${input:requirements:What infrastructure do you need? Please be as detailed as possible.}

Use the interactive_feedback tool to gather any additional necessary information from the user to refine their requirements.

Use the aws-knowledge MCP tool to ensure accuracy and best practices in AWS services and Terraform code.
Use the terraform-mcp-server tool to generate the Terraform code to meet the refined requirements for AWS infrastructure.
Output the final Terraform code only after confirming all requirements with the user, including any refinements made through interactive feedback.
Include a markdown file with all the requirements gathered along with any you have inferred along with the final Terraform code.
Refine all infrastructure requirements to be AWS-specific and aligned with best practices, security, and compliance standards. Be thorough and detailed in your analysis.
If you need to gather more information from the user to refine the requirements, use the interactive_feedback tool to ask clarifying questions before generating the code.

Rules:
    - Terraform should be written using HCL (HashiCorp Configuration Language) syntax.
    - Use the latest AWS provider version compatible with the required resources.
    - Follow best practices for Terraform code structure, including the use of variables, outputs, and modules.
    - Ensure that the generated code is well-documented with comments explaining the purpose of each resource and configuration.
    - Always try to use implicit dependencies over explicit dependencies where possible in Terraform.
    - When generating Terraform resource names, ensure they are unique and descriptive, lower-case, and snake_case.
    - Be sure to include any necessary provider configurations, backend settings, and required variables in the generated code.
    - Ensure the generated terraform code always includes a top level `tag` variable map that is used on all taggable resources, with at least the following tags: `Environment`, `Project`, and `Owner`.
    - Ensure that sensitive information such as passwords, API keys, and secrets are not hardcoded in the Terraform code. Use variables and secret management solutions instead.
    - Do not assume any prior knowledge about the user's AWS environment; always seek clarification when in doubt.
    - Do not ask for AWS specific information like instance types, instead focus on high level requirements and attempt to map them to AWS services for the user.
    - Before finalizing the Terraform code, always confirm with the user that all requirements have been accurately captured and addressed.
    - All output should be created in the `output/aws/` directory with appropriate filenames.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And one for Azure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
mode: 'agent'
description: 'Create azure terraform code for given requirements with interactive feedback.'
---

Create Azure Terraform code for the following requirements with step-by-step reasoning and interactive feedback:

Requirements: ${input:requirements:What infrastructure do you need? Please be as detailed as possible.}

Use the interactive_feedback tool to gather any additional necessary information from the user to refine their requirements.

Use the azure-knowledge MCP tool to ensure accuracy and best practices in Azure services.
Use the terraform-mcp-server tool to generate the Terraform code to meet the refined requirements for Azure infrastructure.
Output the final Terraform code only after confirming all requirements with the user, including any refinements made through interactive feedback.
Include a markdown file with all the requirements gathered along with any you have inferred along with the final Terraform code.
Refine all infrastructure requirements to be Azure-specific and aligned with best practices, security, and compliance standards. Be thorough and detailed in your analysis.
If you need to gather more information from the user to refine the requirements, use the interactive_feedback tool to ask clarifying questions before generating the code.

Rules:
    - Terraform should be written using HCL (HashiCorp Configuration Language) syntax.
    - Use the latest Azure provider version compatible with the required resources.
    - Follow best practices for Terraform code structure, including the use of variables, outputs, and modules.
    - Ensure that the generated code is well-documented with comments explaining the purpose of each resource and configuration.
    - Always try to use implicit dependencies over explicit dependencies where possible in Terraform.
    - When generating Terraform resource names, ensure they are unique and descriptive, lower-case, and snake_case.
    - Be sure to include any necessary provider configurations, backend settings, and required variables in the generated code.
    - Ensure the generated terraform code always includes a top level `tag` variable map that is used on all taggable resources, with at least the following tags: `Environment`, `Project`, and `Owner`.
    - Ensure that sensitive information such as passwords, API keys, and secrets are not hardcoded in the Terraform code. Use variables and secret management solutions instead.
    - Do not assume any prior knowledge about the user's Azure environment; always seek clarification when in doubt.
    - Do not ask for Azure specific information like instance types, instead focus on high level requirements and attempt to map them to Azure services for the user.
    - Before finalizing the Terraform code, always confirm with the user that all requirements have been accurately captured and addressed.
    - All output should be created in the `output/azure/` directory with appropriate filenames.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are ready to bootstrap either an AWS or Azure Terraform project via Copilot, go ahead and do so using the prompt. For example, &lt;code&gt;/terraform-azure-bootstrap&lt;/code&gt; starts the process for an Azure-based Terraform project. It will start by asking what you want, then ask refining questions to figure out what needs to be created. You do not need to close the feedback window that comes up; it will automatically be reused and refresh its contents when further information or approval is needed from you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Additional Prompts
&lt;/h2&gt;

&lt;p&gt;For the heck of it I also created a few more prompts that can be used to convert a Terraform project from Azure to AWS and vice versa. These use the same MCP servers but with different prompts. I'll let you look at the examples I constructed for each in the &lt;a href="https://github.com/zloeber/terraform-copilot-prompts" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; for this exercise. I created two fictitious projects off the top of my head, one for AWS and another for Azure. I then used the conversion prompt for each to create the equivalent project for the other cloud provider.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pleasant Surprises
&lt;/h2&gt;

&lt;p&gt;When it works the way I want, AI can be extremely satisfying to wield, even more so when it yields more than what you asked for. In this case I found that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For the managed Kubernetes deployment it generated functioning &lt;code&gt;Makefile&lt;/code&gt;s with a plethora of commands useful for the deployment.&lt;/li&gt;
&lt;li&gt;The Terraform conversion from one provider to another included cost comparisons between the two deployments.&lt;/li&gt;
&lt;li&gt;The feedback tool used can remain open and be used for all prompts back and forth with the agent.&lt;/li&gt;
&lt;li&gt;The generated requirements.md is quite comprehensive and a nice addition to the deployment for user comprehension.&lt;/li&gt;
&lt;li&gt;Both the AWS and Azure MCP servers were easily used by the agent with very little extra prompting.&lt;/li&gt;
&lt;li&gt;For the virtual machines I put in some rather complex logic for how I wanted the disks done and was surprised to find that the appropriate &lt;code&gt;user-data.sh&lt;/code&gt; bash script for AWS and &lt;code&gt;cloud-init.yml&lt;/code&gt; file for Azure were created for me, not only with the disks done as I had requested (LVM and mounted to &lt;code&gt;/opt&lt;/code&gt;) but much more. For instance, it also generated a pretty decent nginx deployment for WordPress, test scripts for cloud storage access (which I purposefully included as a requirement to try to trip things up), and cloud-specific agent installs for disk and memory monitoring. Pretty slick!&lt;/li&gt;
&lt;li&gt;Both example deployments came with a corpus of additional documentation containing a good deal of extra info that I might personally include in a project were I delivering it to a team to manage. &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Irksome Things
&lt;/h2&gt;

&lt;p&gt;The results are not all positive. I have a few minor gripes as well.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An abundance of emojis, while visually pretty, just screams LLM-generated to the trained eye. Their use can probably be reduced with minor prompt adjustments.&lt;/li&gt;
&lt;li&gt;The nondeterministic nature of LLMs means documentation results were wildly different between each project. I specifically requested that requirements.md be generated in the bootstrap process but forgot to say anything about it in the migration prompts. The first example I migrated from AWS to Azure left the file mostly intact. The second example migration from Azure to AWS turned it into a 500+ line operational guide (which was cool and all, but still makes my point here).&lt;/li&gt;
&lt;li&gt;As mentioned before, this can chew through your premium tokens pretty quickly depending on your requirements.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;So would I use any of this Terraform without reviewing it first? Of course not. Heck, it probably wouldn't even run without some modifications. But I certainly would use it to get things started for a project. It produces quite clean and easy-to-read Terraform with the correct naming conventions, variables, and documentation to get things off to a very nice start. I will not use it to scaffold out every project I do, though, mainly because it does seem to burn through premium tokens which I'd rather use for more complex work. I'm on a standard plan, and creating the 4 examples you can find in the project repository ate almost 10% of my premium tokens.&lt;/p&gt;

&lt;p&gt;This combo of MCP servers is quite good at overcoming some of AI's issues with building proper Terraform as well. I'm quite happy that this is the case, as repeated bizarre LLM results on Terraform generation were starting to get upsetting. Next up: an MCP server that will allow you to use your own organizational modules. I'm hoping to have such a tool ready to test out sometime next month (if anyone already has one, please reach out to me so I can collaborate with ya!).&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>ai</category>
      <category>githubcopilot</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI Lessons Learned</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Wed, 01 Oct 2025 19:16:15 +0000</pubDate>
      <link>https://dev.to/zloeber/ai-lessons-learned-3k6c</link>
      <guid>https://dev.to/zloeber/ai-lessons-learned-3k6c</guid>
      <description>&lt;p&gt;I've got a tendency to take tools and frameworks in IT and immediately push them to their limits and beyond. Sadly, this often lands me into the trough of disillusionment quite quickly when exploring any new technology. On the flip side, it is through this process I often learn some great lessons. This article will cover lessons learned as it pertains to AI in an effort to help shortcut some of those that are starting to dive further into this incredible new world we are entering with AI.&lt;/p&gt;




&lt;h2&gt;
  
  
  Context Management Is Key
&lt;/h2&gt;

&lt;p&gt;Context is the length of your prompt in its entirety. This includes any conversation history, custom instructions, additional rules, available tool instructions, RAG results, and verbalized reasoning output. This adds up quickly in smaller local models and needs to be factored into your overall context management strategy. One decent strategy is to look into multi-agent frameworks where each agent has its own unit of context. It is quite easy to cram all your needs into a single agent because you, as a human, could do the whole workflow from end to end. But if you give it just a bit more thought and logically break things out into sub-units of work for various sub-agents, you will be less likely to run into context limit issues.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; Is your agent reading in several dozen files from the filesystem? This is one area where you can easily blow up your context if not thought out carefully!&lt;/p&gt;
&lt;/blockquote&gt;
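&lt;p&gt;That risk is easy to sanity-check up front. Here is a minimal Python sketch using the rough ~4 characters-per-token heuristic for English text (an approximation, not a real tokenizer, so the numbers are only illustrative):&lt;/p&gt;

```python
# Rough context budgeting: ~4 characters per token is a common heuristic
# for English text; real tokenizers vary by model.
def estimate_tokens(text: str) -> int:
    return len(text) // 4

def fits_context(chunks: list[str], context_limit: int, reserve: int = 1024) -> bool:
    """Check whether a set of files/messages fits, leaving room for the reply."""
    used = sum(estimate_tokens(c) for c in chunks)
    return used + reserve <= context_limit

# e.g. an agent slurping a dozen 8 KB files into a small 8k-token window:
print(fits_context(["x" * 8192] * 12, context_limit=8192))  # → False
```

&lt;p&gt;Swapping in your model's actual tokenizer gives exact numbers, but even this crude estimate catches the "agent just read thirty files" blowup before it happens.&lt;/p&gt;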

&lt;h2&gt;
  
  
  Sometimes a Dumber AI is Better
&lt;/h2&gt;

&lt;p&gt;Many LLM models include reasoning or thinking modes of operation that you may reflexively want to use. Why wouldn't you want your LLM to be a bit more thoughtful in how it responds, right? I can give you a few reasons you may want to dial back the deep thoughts on these things. First, it can cause token bloat, which directly equates to additional cost and latency. Second, not all LLMs separate out the thoughts from the output the same way. Ollama will inline the thoughts with standard responses in tags like &lt;code&gt;&amp;lt;thought&amp;gt;&amp;lt;/thought&amp;gt;&lt;/code&gt;. This can be a bit of a bummer to deal with in some applications. While it can be fascinating to read how they are thinking through a process, it can really pollute output if not handled properly. Third, I've found that enabling thinking in my requests sometimes led to worse results overall. These are only anecdotal observations, but I believe some models may overthink simpler tasks or, in the case of multi-agent interactions, simply confuse agents reading the reasoning output of other sub-agents.&lt;/p&gt;
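&lt;p&gt;If you do need to consume thinking-enabled output downstream, a small sketch like this can scrub the inline reasoning first (the &lt;code&gt;&amp;lt;thought&amp;gt;&lt;/code&gt; tag name follows the example above; some models emit a different tag, so adjust the pattern to whatever your model produces):&lt;/p&gt;

```python
import re

# Strip inline reasoning blocks before handing output to another agent or app.
# Tag name is an assumption based on the article's example; verify against
# your model's actual output.
THOUGHT_RE = re.compile(r"<thought>.*?</thought>\s*", re.DOTALL)

def strip_thoughts(response: str) -> str:
    return THOUGHT_RE.sub("", response)

print(strip_thoughts("<thought>Let me reason...</thought>The answer is 4."))
# → The answer is 4.
```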

&lt;p&gt;If you are employing a multi-agent workflow, I'd consider only allowing the orchestrator/master agents to use additional thinking models. Or, if that is not suitable, enable thinking selectively and make a bunch of purpose-driven sub-agents that can be a bit dumber.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP Is Sweet But Fickle
&lt;/h2&gt;

&lt;p&gt;I've run into several issues with MCP tools that were driving me crazy. There are some great MCP inspection tools, but in a pinch you can also simply ask the LLM to give you a report of the tools that have been exposed to it. Here is a &lt;a href="https://github.com/docker/cagent" rel="noopener noreferrer"&gt;cagent&lt;/a&gt; definition I put together that does this for a local Ollama model I was testing out with some tools I was tinkering with on my local workstation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;#!/usr/bin/env cagent run&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2"&lt;/span&gt;

&lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;thismodel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openai&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gpt-oss&lt;/span&gt;
    &lt;span class="na"&gt;base_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://localhost:11434/v1&lt;/span&gt;
    &lt;span class="na"&gt;api_key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
    &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;16000&lt;/span&gt;
    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.1&lt;/span&gt;

&lt;span class="na"&gt;agents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;root&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;thismodel&lt;/span&gt;
    &lt;span class="na"&gt;add_date&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Creates&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;documentation&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;on&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;available&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;this&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;agent"&lt;/span&gt;
    &lt;span class="na"&gt;instruction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;You are evaluating the functionality of various tools available to you as an AI agent.&lt;/span&gt;
      &lt;span class="s"&gt;Your goal is to generate a comprehensive report on the functionality of any tools that you can use to assist you in your tasks.&lt;/span&gt;
      &lt;span class="s"&gt;You will use the filesystem tool to read and write your final report in markdown format as a .md file with a name like tool-report-&amp;lt;date&amp;gt;.md.&lt;/span&gt;
      &lt;span class="s"&gt;No other tools are to be used by you directly but you can query the list of tools available to you.&lt;/span&gt;
      &lt;span class="s"&gt;You will instead generate a list of all the tools you can use and their functionality.&lt;/span&gt;
    &lt;span class="na"&gt;toolsets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;filesystem&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;think&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;memory&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mcp&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform-mcp-server&lt;/span&gt;
        &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stdio"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mcp&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx&lt;/span&gt;
        &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-y"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp-searxng"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;SEARXNG_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Model Selection Is Hard
&lt;/h2&gt;

&lt;p&gt;There are just so many models out there to choose from. It would be easy to think that local models are good enough but honestly, no they are not. Aside from their smaller context lengths, there is no standard way to really even look them up. This makes finding effective context length and max token counts a chore at best. Ollama has their own online catalog (an API for it is forthcoming, I've read) and there are some other minor lifelines such as &lt;a href="https://github.com/BerriAI/litellm/blob/main/litellm/model_prices_and_context_window_backup.json" rel="noopener noreferrer"&gt;this gem&lt;/a&gt; buried in the LiteLLM repo. And this is just the hard details of the models, not their numerous scores, capabilities, and more. OpenRouter.ai has &lt;a href="https://openrouter.ai/docs/api-reference/list-available-models" rel="noopener noreferrer"&gt;an API endpoint&lt;/a&gt; that makes searching for some of this a bit easier for the models it supports.&lt;/p&gt;

&lt;p&gt;This is all only for language models, by the way. Additional servers and considerations come into play for image, video, or audio generation. So if you are planning on doing something multi-modal, the efforts begin to stack up rather quickly.&lt;/p&gt;

&lt;p&gt;All this said, often simply choosing a decent frontier model is the fastest and easiest way to go. Grok is nice for more recent research, Claude is a good bet for coding, and OpenAI fits in with the broadest ecosystem of tools and community support. &lt;/p&gt;

&lt;h2&gt;
  
  
  Don't Forget Embedding Models
&lt;/h2&gt;

&lt;p&gt;Let's not forget that both RAG and (most) memory-related tasks require &lt;a href="https://ollama.com/blog/embedding-models" rel="noopener noreferrer"&gt;embedding models&lt;/a&gt;. In most cases this means a vector database, which in turn means encoding your data into vectors via an embedding model. These are smaller, purpose-built models that convert your language (or code AST blocks, or &lt;code&gt;&amp;lt;some other esoteric data&amp;gt;&lt;/code&gt;) into similarity vectors. If you are doing local RAG for privacy, you will need a local embedding model and a vector database to target. I've been using Ollama with one of the few embedding models it offers, and Qdrant as my local vector store since it has a nice little UI I can use to further explore vectorized data. Towards the end of this article I'll include a Docker Compose file that brings up this vector database quite easily.&lt;/p&gt;
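To make the idea concrete, here is a minimal sketch of the two halves of the pipeline: asking a local Ollama instance for an embedding (assuming its default port and the `nomic-embed-text` model are available), and comparing two embeddings with cosine similarity, which is what the vector database does for you at scale:

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default local Ollama endpoint

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Ask a local Ollama instance to embed a chunk of text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps({"model": model, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """How semantically close two embeddings are: near 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

In practice you would hand the vectors straight to Qdrant rather than comparing them yourself, but it helps to see that there is no magic underneath.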

&lt;p&gt;If you are embedding RAG data you will often still need to get it into an embedding-model-friendly format. I've taken a liking to &lt;a href="https://github.com/datalab-to/marker" rel="noopener noreferrer"&gt;marker&lt;/a&gt; for this task to process PDFs and other document formats. Once installed, you can process a single document against a local Ollama model to produce a markdown file quite easily: &lt;code&gt;marker_single --llm_service=marker.services.ollama.OllamaService --ollama_base_url=http://localhost:11434 --ollama_model=gpt-oss ./some.pdf&lt;/code&gt;. There are so many options for marker that I think the author must be partially insane (in a good way, I dig it), so check it out if you get a few free cycles. The project is impressive in its scope.&lt;/p&gt;
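If you have a whole folder of PDFs, the same invocation is easy to script. A small sketch that just mirrors the flags from the single-document command above (the model name and URL are whatever your local setup uses):

```python
import subprocess
from pathlib import Path

def marker_cmd(pdf: Path, model: str = "gpt-oss",
               base_url: str = "http://localhost:11434") -> list[str]:
    """Build the marker_single invocation for one PDF, using the flags shown above."""
    return [
        "marker_single",
        "--llm_service=marker.services.ollama.OllamaService",
        f"--ollama_base_url={base_url}",
        f"--ollama_model={model}",
        str(pdf),
    ]

def convert_all(folder: Path) -> None:
    """Run marker_single over every PDF in a folder."""
    for pdf in sorted(folder.glob("*.pdf")):
        subprocess.run(marker_cmd(pdf), check=True)
```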

&lt;p&gt;Back to embedding models: there are several local ones you can choose from. Here are a few of the most popular open-source options, as generated via AI.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;Dimensions&lt;/th&gt;
&lt;th&gt;Max Input Tokens&lt;/th&gt;
&lt;th&gt;Perf. (MTEB/Accuracy Score)&lt;/th&gt;
&lt;th&gt;Multilingual Support&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;mistral-embed&lt;/td&gt;
&lt;td&gt;1024&lt;/td&gt;
&lt;td&gt;8000&lt;/td&gt;
&lt;td&gt;77.8% (highest in benchmarks)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;nomic-embed-text&lt;/td&gt;
&lt;td&gt;768&lt;/td&gt;
&lt;td&gt;8192&lt;/td&gt;
&lt;td&gt;High (state-of-the-art)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mxbai-embed-large&lt;/td&gt;
&lt;td&gt;1024&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;High (state-of-the-art)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EmbeddingGemma&lt;/td&gt;
&lt;td&gt;N/A (small model)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;High (best under 500M params)&lt;/td&gt;
&lt;td&gt;Yes (100+ languages)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3 8B Embedding&lt;/td&gt;
&lt;td&gt;N/A (8B params)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;70.58 (top in multilingual)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Some additional notes on each model as well:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Name&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;mistral-embed&lt;/td&gt;
&lt;td&gt;Strong semantic understanding; open weights available on Hugging Face.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;nomic-embed-text&lt;/td&gt;
&lt;td&gt;Offline-capable via Ollama; privacy-focused for local deployments.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mxbai-embed-large&lt;/td&gt;
&lt;td&gt;Efficient open-source option; available via Ollama or Hugging Face.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EmbeddingGemma&lt;/td&gt;
&lt;td&gt;Mobile-ready; Matryoshka learning; ideal for edge devices or fine-tuning.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen3 8B Embedding&lt;/td&gt;
&lt;td&gt;Excels in diverse topics; Apache 2.0 license for customization.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Here is a simple diagram of the choices involved in selecting one of the free embedding models for your own projects.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9wh0shekjza2nqc9wfww.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9wh0shekjza2nqc9wfww.png" alt="Choosing a local embedding model" width="800" height="731"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Matryoshka Support?&lt;/strong&gt; This was new to me when writing this article. A model that supports this might embed a chunk of data with 1024 dimensions but be trained to concentrate the most important information into the first 256 or 512 dimensions. A truncated embedding therefore captures most of the semantic meaning, with only a slight loss of precision compared to the full vector. Pretty nifty, as it allows a single model to generate embeddings at multiple dimensionalities. The name is inspired by Matryoshka dolls, where smaller dolls nest within larger ones, and the technique is formally known as Matryoshka Representation Learning (MRL).&lt;/p&gt;
&lt;/blockquote&gt;
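Using an MRL embedding at a smaller dimensionality really is just "keep the first N dimensions and renormalize". A toy sketch:

```python
import math

def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` dimensions of an MRL embedding, renormalized to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]
```

The renormalization matters: cosine similarity assumes unit-length vectors, so after chopping off the tail you rescale what remains.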

&lt;h2&gt;
  
  
  Web Search Without Limits/Keys
&lt;/h2&gt;

&lt;p&gt;When you start to develop AI agents, one of the first activities will be searching the web for content and then scraping it. This seems like a very innocuous task, as it is something you might do every day without thought. But doing so automatically as an agent often requires an API key for an outside service (like Serper or any of a dozen others) or a free but heavily rate-limited target such as DuckDuckGo.&lt;/p&gt;

&lt;p&gt;With MCP and a local &lt;a href="https://github.com/searxng/searxng" rel="noopener noreferrer"&gt;SearXNG&lt;/a&gt; instance you can get around this snafu fairly easily. SearXNG is a locally running search aggregator. Remember dogpile.com? It's kinda like that, but self-hosted and more expansive in scope. You need only expose it to your agents through a local MCP server and they can search and scrape the web freely. I've included it in this docker compose file for your convenience (along with the Valkey caching integration). The compose file is self-contained; all configuration can be done via the config blocks at the bottom.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Exposes the following services:&lt;/span&gt;
&lt;span class="c1"&gt;# - http://localhost:6333/dashboard - qdrant (ui)&lt;/span&gt;
&lt;span class="c1"&gt;# - http://localhost:8080 - searxng (ui)&lt;/span&gt;
&lt;span class="c1"&gt;# - valkey (internal, for searxng)&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;valkey&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;valkey&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker.io/valkey/valkey:8-alpine&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;valkey-server --save 30 1 --loglevel warning&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;valkey-data2:/data&lt;/span&gt;
    &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json-file"&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;max-size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1m"&lt;/span&gt;
        &lt;span class="na"&gt;max-file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;valkey-cli"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ping"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;

  &lt;span class="na"&gt;searxng&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;searxng&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker.io/searxng/searxng:latest&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:8080:8080"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;searxng-data:/var/cache/searxng:rw&lt;/span&gt;
    &lt;span class="na"&gt;configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;searxng_limiter_config&lt;/span&gt;
        &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/etc/searxng/limiter.toml&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;searxng_config&lt;/span&gt;
        &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/etc/searxng/settings.yml&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;SEARXNG_BASE_URL=https://${SEARXNG_HOSTNAME:-localhost}/&lt;/span&gt;
    &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json-file"&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;max-size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1m"&lt;/span&gt;
        &lt;span class="na"&gt;max-file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;valkey&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;service_healthy&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wget"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--no-verbose"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--tries=1"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--spider"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080/"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;

  &lt;span class="na"&gt;qdrant&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;qdrant/qdrant:latest&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;qdrant&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;6333:6333&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;6334:6334&lt;/span&gt;
    &lt;span class="na"&gt;expose&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="m"&gt;6333&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="m"&gt;6334&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="m"&gt;6335&lt;/span&gt;
    &lt;span class="na"&gt;configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;qdrant_config&lt;/span&gt;
        &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/qdrant/config/production.yaml&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./data/qdrant:/qdrant/storage&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bash"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-c"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;exec&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;3&amp;lt;&amp;gt;/dev/tcp/127.0.0.1/6333&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;echo&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-e&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;'GET&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;/readyz&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;HTTP/1.1&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;nHost:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;nConnection:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;close&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;n&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;n'&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&amp;amp;3&lt;/span&gt;&lt;span 
class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;grep&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-q&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;'HTTP/1.1&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;200'&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&amp;amp;3"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;valkey-data2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;searxng-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;searxng_limiter_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;# This configuration file updates the default configuration file&lt;/span&gt;
      &lt;span class="s"&gt;# See https://github.com/searxng/searxng/blob/master/searx/limiter.toml&lt;/span&gt;

      &lt;span class="s"&gt;[botdetection.ip_limit]&lt;/span&gt;
      &lt;span class="s"&gt;# activate advanced bot protection&lt;/span&gt;
      &lt;span class="s"&gt;# enable this when running the instance for a public usage on the internet&lt;/span&gt;
      &lt;span class="s"&gt;link_token = false&lt;/span&gt;
  &lt;span class="na"&gt;searxng_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;# see https://docs.searxng.org/admin/settings/settings.html#settings-use-default-settings&lt;/span&gt;
      &lt;span class="s"&gt;use_default_settings: true&lt;/span&gt;
        &lt;span class="s"&gt;# engines:&lt;/span&gt;
        &lt;span class="s"&gt;#   keep_only:&lt;/span&gt;
        &lt;span class="s"&gt;#     - google&lt;/span&gt;
        &lt;span class="s"&gt;#     - duckduckgo&lt;/span&gt;
      &lt;span class="s"&gt;server:&lt;/span&gt;
        &lt;span class="s"&gt;# base_url is defined in the SEARXNG_BASE_URL environment variable, see .env and docker-compose.yml&lt;/span&gt;
        &lt;span class="s"&gt;secret_key: "some_secret_key123"  # change this!&lt;/span&gt;
        &lt;span class="s"&gt;limiter: false  # enable this when running the instance for a public usage on the internet&lt;/span&gt;
        &lt;span class="s"&gt;image_proxy: true&lt;/span&gt;
      &lt;span class="s"&gt;search:&lt;/span&gt;
        &lt;span class="s"&gt;formats:&lt;/span&gt;
          &lt;span class="s"&gt;- html&lt;/span&gt;
          &lt;span class="s"&gt;- csv&lt;/span&gt;
          &lt;span class="s"&gt;- rss&lt;/span&gt;
          &lt;span class="s"&gt;- json&lt;/span&gt;
      &lt;span class="s"&gt;valkey:&lt;/span&gt;
        &lt;span class="s"&gt;url: valkey://valkey:6379/0&lt;/span&gt;
  &lt;span class="na"&gt;qdrant_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;log_level: INFO&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
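Because the settings above enable the &lt;code&gt;json&lt;/code&gt; search format, you can also hit the instance directly without MCP. A minimal sketch against SearXNG's JSON API (assuming the localhost:8080 port mapping from the compose file):

```python
import json
import urllib.parse
import urllib.request

SEARXNG_URL = "http://localhost:8080"  # the instance from the compose file

def search(query: str, base_url: str = SEARXNG_URL) -> list[dict]:
    """Query the SearXNG JSON API (enabled via the 'json' format in settings.yml)."""
    url = f"{base_url}/search?" + urllib.parse.urlencode(
        {"q": query, "format": "json"}
    )
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["results"]

def summarize(results: list[dict], limit: int = 5) -> list[str]:
    """Reduce raw results to 'title - url' lines suitable for an agent prompt."""
    return [f"{r['title']} - {r['url']}" for r in results[:limit]]
```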



&lt;p&gt;See my prior cagent YAML example for an MCP server that can use this local instance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;...&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mcp&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npx&lt;/span&gt;
        &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-y"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp-searxng"&lt;/span&gt; &lt;span class="pi"&gt;]&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;SEARXNG_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080"&lt;/span&gt;
&lt;span class="nn"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AI development is a rapidly evolving field, and the lessons learned along the way can save you time, frustration, and resources. By understanding the nuances of context management, model selection, embedding strategies, and practical tooling, you can build more robust and efficient AI workflows. Embrace experimentation, but also leverage the growing ecosystem of open-source tools and best practices. As the landscape continues to shift, staying curious and adaptable will be your greatest assets. Happy building!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>learning</category>
    </item>
    <item>
      <title>Terraforming With AI</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Wed, 24 Sep 2025 19:29:43 +0000</pubDate>
      <link>https://dev.to/zloeber/terraforming-with-ai-g0o</link>
      <guid>https://dev.to/zloeber/terraforming-with-ai-g0o</guid>
      <description>&lt;p&gt;This article will go over using a team of AI agents in conjunction with the &lt;a href="https://github.com/hashicorp/terraform-mcp-server" rel="noopener noreferrer"&gt;Terraform MCP server&lt;/a&gt; and Docker's &lt;a href="https://github.com/docker/cagent" rel="noopener noreferrer"&gt;cagent&lt;/a&gt; tool to clean up some rather gnarly autogenerated terraform without needing to write any code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I've been digging quite deeply into the morass of AI-related tools, models, and agent-based frameworks as of late. It is hard not to be fascinated by the prospect of a human-language-driven declarative engine, regardless of how non-deterministic LLM output can be. I use tools like Cursor, Copilot, Dyad, or any number of agentic CLI tools to code out solutions (or parts of them) daily. But I've not had the opportunity to create an agent-based workflow, mainly because there have been few issues worthy of such attention that couldn't be resolved by using AI to create more deterministic solutions (aka code/scripts). Using generated code is far less costly and resource-intensive than pushing everything through an LLM to get the results you are looking to achieve.&lt;/p&gt;

&lt;p&gt;Recently I found a good reason to use an AI workflow and instead of custom coding out something with CrewAI+Python or similar I opted to give Docker's &lt;a href="https://github.com/docker/cagent" rel="noopener noreferrer"&gt;cagent&lt;/a&gt; tool a spin.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem to Solve
&lt;/h2&gt;

&lt;p&gt;Being asked to turn an existing infrastructure deployment into code kinda stinks as a whole. But this kind of task is often a necessity if the environment was hastily constructed via click-ops or existed before you came on board. &lt;a href="https://github.com/GoogleCloudPlatform/terraformer" rel="noopener noreferrer"&gt;Terraformer&lt;/a&gt; was released by Google for this very purpose. It generates terraform from existing resources for a large number of terraform providers. As such it is a great place to start.&lt;/p&gt;

&lt;p&gt;The workflow is not so hard really:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic103vj6viyw0ib2tl5p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic103vj6viyw0ib2tl5p.png" alt="Basic terraformer workflow" width="223" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The terraformer tool can also import directly into remote state but I'm opting out of doing this as I wish to rewrite the generated manifests to not give me seizures when reading them.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Using Terraformer
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/GoogleCloudPlatform/terraformer" rel="noopener noreferrer"&gt;Terraformer&lt;/a&gt; is a single-binary tool that can create terraform for several provider types using some kind of demonic pact or wizardry that is beyond my mere mortal brain. The process for using it is pretty easy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create your &lt;code&gt;version.tf&lt;/code&gt; file in an empty folder with your provider requirements and backend state target.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# version.tf&lt;/span&gt;

terraform &lt;span class="o"&gt;{&lt;/span&gt;
  required_version &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 1.11"&lt;/span&gt;
  backend &lt;span class="s2"&gt;"local"&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    path &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform.tfstate"&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
  required_providers &lt;span class="o"&gt;{&lt;/span&gt;
    aws &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="nb"&gt;source&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      version &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 6.0"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Initialize the folder via terraform to pull down the provider(s)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Use &lt;code&gt;terraformer list&lt;/code&gt; to determine the provider resources you wish to import for the provider you are targeting. In my case this would be AWS.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraformer import aws list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Let 'er rip! This example targets a specific AWS profile I'm already authenticated with for 1 region and several network related resources.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraformer import aws &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--resources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;route_table,transit_gateway,vpc,vpc_endpoint,vpc_peering,igw,nat,subnet &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--regions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-2 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--profile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;AWSAdministratorAccess-1111111111111 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--connect&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--path-pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./generated &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result will be a &lt;code&gt;./generated&lt;/code&gt; folder in the current directory with a bunch of terraform manifests and a tfstate file. Without the &lt;code&gt;--path-pattern=./generated&lt;/code&gt; flag, each provider and resource type extracted would be created as a separate subfolder under a &lt;code&gt;./generated&lt;/code&gt; folder with its own state (in our case &lt;code&gt;generated/aws/route_table&lt;/code&gt;, &lt;code&gt;generated/aws/transit_gateway&lt;/code&gt;, et cetera). I also opted to export everything as JSON instead of HCL, as it is far easier to parse and use in automation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE 1&lt;/strong&gt; If you supply multiple regions like &lt;code&gt;us-east-1,us-east-2&lt;/code&gt; then a folder for each region will be created instead. You can also use &lt;code&gt;--path-pattern=./&lt;/code&gt; to remove the &lt;code&gt;./generated&lt;/code&gt; folder from the mix to drop it all into the local path as well.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Export Issues
&lt;/h2&gt;

&lt;p&gt;Terraformer is pretty cool in what it does, but the code it generates is abysmal: an unsustainable mess of terraform manifests. Some issues with the autogenerated terraform include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Resource and other terraform block names with &lt;code&gt;--&lt;/code&gt; (ugly).&lt;/li&gt;
&lt;li&gt;Resource names that include upper-case and dashes instead of lower-case &lt;code&gt;snake_case&lt;/code&gt; names.&lt;/li&gt;
&lt;li&gt;A large number of attributes with superfluous default or constructed values being defined (ie. &lt;code&gt;all_tags&lt;/code&gt; and &lt;code&gt;region&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;variables.tf&lt;/code&gt; file with no actual variables (includes a data source for the tfstate file instead).&lt;/li&gt;
&lt;li&gt;Remote state data sources that feed the export's own outputs back in as inputs to other resources (???).&lt;/li&gt;
&lt;li&gt;Hard-coded attribute values for ids of resources being created elsewhere in the output.&lt;/li&gt;
&lt;li&gt;Only supporting terraform 0.13 and below for the state being generated.&lt;/li&gt;
&lt;li&gt;Generating exports for resources that cannot be imported.&lt;/li&gt;
&lt;li&gt;No implicit dependencies at all.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some of this can be fixed deterministically with scripts if you know the nuances of the provider. Other issues are less predictable: the exact mix of exported resources, providers, and deployment specifics can lead to a wide variety of results.&lt;/p&gt;
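
&lt;p&gt;As a sketch of the deterministic side, here is roughly what a couple of those cleanup steps might look like as shell (the manifest content, file names, and attribute choice are hypothetical; assumes &lt;code&gt;jq&lt;/code&gt; is installed):&lt;/p&gt;

```shell
# Create a stand-in for a terraformer-generated manifest (hypothetical content).
printf '%s' '{"resource":{"aws_subnet":{"tfer--subnet--001":{"region":"us-east-1","cidr_block":"10.0.0.0/24"}}}}' > subnet.tf.json

# 1. Replace double dashes in resource names with single dashes.
sed 's/--/-/g' subnet.tf.json > subnet.fixed.tf.json

# 2. Strip a superfluous attribute (here: region) from every resource with jq.
jq '.resource |= map_values(map_values(del(.region)))' subnet.fixed.tf.json > subnet.clean.tf.json

jq -c . subnet.clean.tf.json
# prints {"resource":{"aws_subnet":{"tfer-subnet-001":{"cidr_block":"10.0.0.0/24"}}}}
```

&lt;p&gt;Each fix like this is trivially scriptable on its own; the problem is the sheer number of them and the provider-specific knowledge each one needs.&lt;/p&gt;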

&lt;p&gt;There are just too many nuanced issues here for a single script to solve. Normally I'd spend hours hand-crafting the generated output for use in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Approach
&lt;/h2&gt;

&lt;p&gt;Since most of the exported terraform is named in a way that embeds the ids of the underlying resources, let us take the following approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the export process to create the initial terraform and state locally as JSON.&lt;/li&gt;
&lt;li&gt;Clean up the code base to address many of the issues as noted above (Use the MCP terraform server as needed here).&lt;/li&gt;
&lt;li&gt;Create modern terraform import blocks for all the resources.&lt;/li&gt;
&lt;li&gt;Delete the local state file.&lt;/li&gt;
&lt;li&gt;Add implicit dependencies where it makes sense to do so (Use the MCP terraform server as needed here as well).&lt;/li&gt;
&lt;li&gt;Optional: Convert final output to HCL.&lt;/li&gt;
&lt;li&gt;Optional: Create reports on what was done.&lt;/li&gt;
&lt;/ul&gt;
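
&lt;p&gt;The import-block step is the most mechanical of the bunch. As a rough sketch (the inventory format, resource names, and ids here are hypothetical; assumes &lt;code&gt;jq&lt;/code&gt; is installed), generating a modern &lt;code&gt;imports.tf.json&lt;/code&gt; from a list of resource addresses and ids could look like this:&lt;/p&gt;

```shell
# Hypothetical inventory of exported resources: terraform address parts plus cloud id.
printf '%s' '[{"type":"aws_vpc","name":"tfer_vpc-0b009e1e","id":"vpc-0b009e1e"},{"type":"aws_subnet","name":"tfer_subnet-001","id":"subnet-001"}]' > inventory.json

# Emit a terraform JSON file with one import block per resource.
jq '{import: [ .[] | {to: (.type + "." + .name), id: .id} ]}' inventory.json > imports.tf.json

jq -r '.import[0].to' imports.tf.json
# prints aws_vpc.tfer_vpc-0b009e1e
```

&lt;p&gt;With import blocks in place and the old state deleted, a fresh &lt;code&gt;terraform plan&lt;/code&gt; will show the resources being imported rather than created, and &lt;code&gt;terraform apply&lt;/code&gt; rebuilds the state from the live infrastructure.&lt;/p&gt;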

&lt;p&gt;Basically, we will use AI agents to do all the work I might otherwise do manually (that word...'manual'...yuk, sorry for my filthy language).&lt;/p&gt;

&lt;p&gt;This should let us repeat the process for any terraformer export moving forward with only minor provider-specific changes.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; After going through this whole process I'm now considering just using JSON for all my terraform. Honestly, it is pretty easy to read and use compared to HCL and its many nuances.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Chosen Tool
&lt;/h2&gt;

&lt;p&gt;Agentic AI has several frameworks and tools to choose from. I'm proficient in multiple languages so there are many doors open to me. But I chose a largely no-code solution by Docker called &lt;code&gt;cagent&lt;/code&gt; for a few reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No Code&lt;/strong&gt; - Mostly no code, anyway. This allowed me to scaffold out and test the required prompts with little up-front development.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity&lt;/strong&gt; - With this tool I'm defining a single yaml file with multiple agents, their tools, and the models to use.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Support&lt;/strong&gt; - MCP is supported natively, which we need for the &lt;a href="https://github.com/hashicorp/terraform-mcp-server" rel="noopener noreferrer"&gt;terraform-mcp-server&lt;/a&gt; used for some of the more advanced tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scripts as Tools&lt;/strong&gt; - Some of the required tasks are deterministic enough to complete with simple scripting once you have the right information. For instance, if you can look up the provider documentation for an attribute assigned in your manifest and see that it is set to its default value, removing it becomes a simple shell script with &lt;code&gt;jq&lt;/code&gt;. Cagent supports defining these scripts as custom tools with parameters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Curiosity&lt;/strong&gt; - I just wanted to check this project out, so sue me.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;My whole solution can be found in &lt;a href="https://github.com/zloeber/terraformer-cleanup" rel="noopener noreferrer"&gt;this project repo&lt;/a&gt; to clone and use as you see fit. It includes some additional scripts for downloading the required binaries and setting up the environment. &lt;/p&gt;

&lt;p&gt;I use a multi-agent workflow that runs sequentially to break down the steps into manageable parts and reduce overall token context usage. It processes terraform export data you drop into the &lt;code&gt;./input&lt;/code&gt; directory to make clean and usable Terraform in the &lt;code&gt;./output&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1al2j69nbkuyjx6qsnmj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1al2j69nbkuyjx6qsnmj.png" alt="Agent workflow" width="153" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;root&lt;/td&gt;
&lt;td&gt;orchestrate the workflow of subagent calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cleaner&lt;/td&gt;
&lt;td&gt;Performs a number of terraform cleanup tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connecter&lt;/td&gt;
&lt;td&gt;Connects exported resources to create implicit dependencies where possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Importer&lt;/td&gt;
&lt;td&gt;Recreates state using terraform import blocks for any elements able to be imported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Finalizer&lt;/td&gt;
&lt;td&gt;Performs final Terraform best practice scan of results, converts to hcl&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This single YAML file is the entire workflow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/usr/bin/env cagent run
version: "2"

models:
  # You can use ollama local models. These sort of worked for me
  gptoss:
    provider: openai
    model: gpt-oss
    base_url: http://localhost:11434/v1
  # Or OpenAPI compliant endpoints like OpenRouter.ai
  openrouter:
    provider: openai
    model: x-ai/grok-4-fast:free
    base_url: https://openrouter.ai/api/v1

agents:
  root:
    model: openrouter
    description: Beautifies and refactors Terraform code that was automatically generated by the terraformer tool
    sub_agents:
      - cleaner
      - connecter
      - importer
      - finalizer
    instruction: |
      You are an expert Terraform developer that specializes in writing clean, maintainable, and efficient Terraform code. 
      You manage a team of terraform experts that perform various tasks for your workflow.

      &amp;lt;AGENTS&amp;gt;
      - cleaner - A cleaner Agent that performs a series of cleanup tasks on the terraform codebase
      - connecter - A connecter Agent that connects resources together by updating static values to use implicit dependencies instead
      - importer - An Importer Agent that creates the state import blocks for the resources defined in the terraform codebase
      - finalizer - A Finalizer Agent that reviews the changes made by the other agents and ensures that everything is correct and complete
      &amp;lt;/AGENTS&amp;gt;

      You will start the workflow below to improve the quality and maintainability of a terraform codebase in the ./input path. 
      &amp;lt;WORKFLOW&amp;gt;
        1. call the cleaner agent to perform the cleanup tasks
        2. call the connecter agent to connect resources together by updating static values to use implicit dependencies instead
        3. call the importer agent to create the import blocks for all resources defined in the terraform codebase
        4. call the finalizer agent to review the changes made by the other agents and ensure that everything is correct and complete
      &amp;lt;/WORKFLOW&amp;gt;

      ** Rules
      - Use the transfer_to_agent tool to call the right agent at the right time to complete the workflow.
      - DO NOT transfer to multiple agents at once
      - ONLY CALL ONE AGENT AT A TIME
      - When using the `transfer_to_agent` tool, make exactly one call and wait for the result before making another. 
      - Do not batch or parallelize tool calls.
      - Do not skip any steps or change the order of the steps. 
      - Do not add any additional steps or modify the workflow in any way.
    toolsets:
      - type: think
      - type: todo

  cleaner:
    model: openrouter
    description: A cleaner Agent that performs a series of cleanup tasks on the terraform codebase
    instruction: |
      You are an expert Terraform developer that specializes in writing clean, maintainable, and efficient Terraform code.

      You will perform the following tasks in order to clean up the terraform codebase in the ./input path:
        1. Use the remove_all_attributes script on `./input` directory to remove the 'tags_all' and 'region' attributes
        2. Use the replace_double_dashes script on the `./input` directory
        3. Update the `./input/provider.tf.json` file to remove the terraform block if it exists
        4. Delete the `./input/terraform.tfstate` file if it exists
        5. Delete the `./input/terraform.tfstate.backup` file if it exists
        6. Delete any `./input/.terraform` directories if they exist
        7. Delete any `./input/.terraform.lock.hcl` files if they exist
        8. Update all references found that look like this: `"${data.terraform_remote_state.local.outputs.*}"` with the associated output value in `./input/outputs.tf.json` that is being referenced.
        9. Delete the `./input/outputs.tf.json` file
        10. Delete the `./input/variables.tf.json` file
        11. Use the terraform-mcp-server tool to find and remove all resource attributes defined with default attribute values using the `remove_default_attributes` script.

      ** Rules
      - Do not make any changes outside of the `./input` path.
      - Do not add any additional steps or modify the workflow in any way.
      - Do NOT make recommendations, instead just follow the instructions and make the changes directly to files using the provided scripts and tools.
      - Follow the instructions exactly and in order. 
      - Do not skip any steps or change the order of the steps.
      - If you are unsure about a step, just do your best to follow the instructions and move on to the next step.
      - Only use the provided scripts and tools to make changes to the codebase.
    toolsets:
      - type: filesystem
      - type: think
      - type: mcp
        command: terraform-mcp-server
        args: [ "stdio" ]
      - type: script
        shell:
          remove_default_attributes:
            cmd: "./scripts/remove-default-attributes.sh $filename $attribute $value"
            description: "Remove resource attributes that are set to default values"
            args:
              filename:
                description: "The Terraform file to modify"
                type: "string"
              attribute:
                description: "The resource attribute to remove"
                type: "string"
              value:
                description: "The default value to match"
                type: "string"
          remove_all_attributes:
            cmd: "./scripts/remove-all-attributes.sh $pathname $attribute"
            description: "Remove all resource attributes from the Terraform files regardless of value"
            args:
              pathname:
                description: "The path to run this script against"
                type: "string"
              attribute:
                description: "The resource attribute to remove"
                type: "string"
          replace_double_dashes:
            cmd: "./scripts/replace-double-dashes.sh $targetpath"
            description: "Replace double dashes with single dashes in resource names"
            args:
              targetpath:
                description: "The path to replace double dashes in"
                type: "string"

  connecter:
    model: openrouter
    description: A connecter Agent that connects resources together by updating static values to use implicit dependencies instead
    instruction: |
      You are an expert Terraform developer that specializes in writing clean, maintainable, and efficient Terraform code.
      You will perform the following tasks in order to connect resources together in the terraform codebase in the ./input path:
        1. Find all defined resource attributes with static values that are logically connected to the output attributes of other resources in the deployment 
        and update their assignments to be implicit dependencies of the generated resources instead.
        (For example: A vpc endpoint defined with `"vpc_id": "vpc-0b009e1ed52947d16"` when we create that vpc as `resource.aws_vpc.tfer_vpc-0b009e1ed52947d16` 
        should become `"vpc_id": "${aws_vpc.tfer_vpc-0b009e1ed52947d16.id}"`). Look for other common attributes that are often statically defined that could be converted 
        to implicit dependencies as well. These include but are not limited to:
          - subnet_id
          - security_group_id
          - vpc_id
          - iam_role_arn
          - cluster_id
          - instance_id
          - bucket_name
          - key_name
          - db_instance_identifier
          - db_subnet_group_name
          - route_table_id
          - network_interface_id
          - elastic_ip
          - nat_gateway_id
          - load_balancer_arn
          - target_group_arn
          - certificate_arn
          - log_group_name
          - topic_arn
          - queue_url
      ** Rules
      - Do not make any changes outside of the `./input` path.
      - Use the terraform-mcp-server tool to lookup resource output attributes when needed 
      - Do NOT make recommendations, instead just follow the instructions and make the changes directly to files 
    toolsets:
      - type: filesystem
      - type: think
      - type: mcp
        command: terraform-mcp-server
        args: [ "stdio" ]

  importer:
    model: openrouter
    description: Creates Terraform import blocks for resources that were automatically generated by the terraformer tool
    instruction: |
      You are an expert Terraform developer that specializes in writing clean, maintainable, and efficient Terraform code.

      You will create import blocks for all resources in the Terraform manifests found in `./input` by
      using the terraform-mcp-server tool to lookup each resource that is defined in ./input/*.tf.json files to
      generate import terraform code blocks within a new `./input/imports.tf.json` file. If the resource
      does not support import, remove it from the codebase.

      `./input/imports.tf.json` should be a valid terraform json file with the following structure:
      ```json
      {
        "import": [
          {
            "to": "${resource_type.resource_name}",
            "id": "resource_id"
          },
          ...
        ]
      }
      ```
      ** Rules
      - Only make changes inside of the `./input` or `./output` paths.
      - When looking up resources using the terraform-mcp-server tool; 
          only lookup resources that were created in the `./input` path,
          if a resource is not able to be imported, remove it from the codebase and do not include it in the imports.tf.json file.
          only use the resource name as the identifier (do not use any other attributes or values).
          if you are unable to find a resource, just move on to the next step without making any changes.
          lookup the most recent version of the provider
      - When done processing do not display your final report to the screen, instead create a report in ./output/import_report.md in markdown format with your notes, suggestions, and changes made.
    toolsets:
      - type: filesystem
      - type: think
      - type: mcp
        command: terraform-mcp-server
        args: [ "stdio" ]
      - type: shell

  finalizer:
    model: openrouter
    description: Reviews the changes made by the other agents and ensures that everything is correct and complete.
    instruction: |
      You are an expert Terraform developer that specializes in writing clean, maintainable, and efficient Terraform code.
      Your job is to review the changes made by the cleaner and importer agents in the `./input` path and ensure that everything is correct and complete.

      You will make any final adjustments or corrections to the terraform codebase in the ./input path as needed to ensure that it is ready for production use.
      You can use the terraform-mcp-server tool to lookup the terraform style guidelines for any resources that you are unsure about.
      You will also ensure that the imports.tf.json file is correctly formatted and contains all necessary import blocks for the resources defined in the terraform codebase.

      When completed, you will create the `./output` directory and copy the cleaned and finalized terraform codebase from the `./input` directory to the `./output` directory
      converting the codebase from json into valid hcl format for use in production terraform pipelines as you go.

      ** Rules
      - Do not make any changes outside of the `./input` and `./output` paths.
      - If you find any issues or inconsistencies you are able to resolve, you will correct them directly in the codebase.
      - If there are any issues or inconsistencies you cannot resolve, add a comment to the top of the relevant file describing the issue and suggesting a possible solution.
      - Do not display your thought process or reasoning, just make the changes directly to the codebase.
      - Do not display your final report to the screen, instead create a report in ./output/final_report.md in markdown format with your notes and suggestions for next steps.
      - Return success when complete.
    toolsets:
      - type: filesystem
      - type: think
      - type: mcp
        command: terraform-mcp-server
        args: [ "stdio" ]
      - type: shell


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;I ran this solution against a rather large network deployment in AWS and was immensely satisfied with the results. It was approximately 95% accurate after running the results through a terraform init and plan. It only missed a single AWS EIP import for a NAT gateway.&lt;/p&gt;

&lt;p&gt;This is of course a warning to ALWAYS validate the output of a nondeterministic workflow like this. Had I just accepted it as-is, a key resource would have been recreated, causing an outage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons
&lt;/h2&gt;

&lt;p&gt;I learned a few lessons in my endeavors that are worth noting. &lt;/p&gt;

&lt;h3&gt;
  
  
  1. Frontier Models are Just Better
&lt;/h3&gt;

&lt;p&gt;The models you choose really make a difference. No matter the billions of parameters, mixture of experts, or any other tricks a model uses, if it is not good at tool calling it will fumble about, freeze, and produce subpar results. If you aren't using Claude or OpenAI then just create an OpenRouter account and become part of the training data of some of the larger, more mature models offered for free.&lt;/p&gt;

&lt;p&gt;I started with a capable local ollama host and tried a few decent models and had painfully random results. One out of ten runs would get me near my goals. It was maddening. The moment I jumped over to a frontier model things started working as designed consistently.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. New Tools == Inconsistent Documentation
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/docker/cagent" rel="noopener noreferrer"&gt;cagent&lt;/a&gt; worked well, but there is invalid documentation right in the project's README (it is &lt;code&gt;toolsets&lt;/code&gt;, not &lt;code&gt;toolset&lt;/code&gt;). Because the YAML schema doesn't seem to be validated, you won't even know there is an issue. Additionally, nowhere in the included &lt;a href="https://github.com/docker/cagent/blob/main/docs/USAGE.md" rel="noopener noreferrer"&gt;usage docs&lt;/a&gt; does it explain how to use custom scripts as tools. I only found out about them from some of the several dozen included &lt;a href="https://github.com/docker/cagent/tree/main/examples" rel="noopener noreferrer"&gt;examples&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Also, nowhere is it documented that you can simply use an OpenAI-compatible API. But it does work, promise! The only caveat is that, whether or not the endpoint actually requires an API key, you must have &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; set in your environment.&lt;/p&gt;
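
&lt;p&gt;In practice that means defaulting the variable before launching anything. A minimal sketch (the placeholder value is arbitrary, and the &lt;code&gt;workflow.yaml&lt;/code&gt; file name in the comment is just illustrative):&lt;/p&gt;

```shell
# cagent expects OPENAI_API_KEY to exist even for keyless
# OpenAI-compatible endpoints, so default it to a placeholder.
export OPENAI_API_KEY="${OPENAI_API_KEY:-placeholder}"
echo "OPENAI_API_KEY set: ${OPENAI_API_KEY:+yes}"
# prints OPENAI_API_KEY set: yes
# then launch the workflow, e.g.: cagent run ./workflow.yaml
```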

&lt;p&gt;This is an open source app, so when in doubt it is often best to just roll up your sleeves and dig into the code, as I did, to get the answers you are looking for.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. MCP is Nice
&lt;/h3&gt;

&lt;p&gt;I'm pretty happy with the results of my efforts but had to diverge from using the Docker MCP registry for the solution. Since cagent supports calling locally installed tools with arguments, I opted to just install the terraform-mcp-server binary locally using a mise http provider (also a nifty trick worth looking at in my &lt;code&gt;mise.toml&lt;/code&gt; file). This allows the entire solution to run without docker or their pre-approved MCP images.&lt;/p&gt;

&lt;p&gt;I'm convinced that using MCP was what made this solution work well. Without it I've yet to see any LLM model produce very good terraform, ever.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Mixed Solutions are Sweet
&lt;/h3&gt;

&lt;p&gt;I've used the terms 'deterministic' and 'nondeterministic' like 20x in this article. Using LLMs in any solution tends to give you different results each run, thus they are nondeterministic. Any other code that produces the same results every time is deterministic. Using both kinds of tools in a solution like this is a powerhouse hit in my mind. Provide your team with the right tools to get the job done, and it need not produce the same result every time; it just needs to accomplish the tasks you give it.&lt;/p&gt;

&lt;p&gt;The cagent tool is nice in that you are not required to create a whole MCP server for a few scripts to be exposed as tools to the agents. This allowed me to create a team of agents yet tell them to use some specific scripts as tools. This reduced the amount of effort being asked of the agents which, in turn, increased the quality of the produced results. As in real life, I don't really care if the results produced are always the same 100% of the time. I do care if they are technically infeasible or unusable though.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Pleasant Surprises!
&lt;/h3&gt;

&lt;p&gt;I got a few unexpected but pleasant surprises with the addition of my Finalizer sub-agent. It took the liberty of renaming some of the resources to be more descriptive of what they actually were in the environment. It was able to discern the purpose of some of the VPCs and subnets, renamed them to reflect it, and in general did WAY more than I expected. I believe this is because it was instructed to use the terraform MCP server to look up style guidelines and create a production-worthy release.&lt;/p&gt;

&lt;p&gt;On a whim I also had it do the HCL conversion and this worked way better than having to deal with a custom tool or script as well. I was expecting to just use the JSON results when finished but this ended up not being the case at all.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Automating the cleanup and refactoring of Terraformer output using agentic AI workflows has proven to be both practical and efficient. By leveraging tools like Docker's cagent and the Terraform MCP server, it's possible to transform messy, autogenerated Terraform code into production-ready infrastructure as code with minimal manual intervention. While some nondeterminism remains inherent to LLM-driven solutions, combining deterministic scripts with AI agents yields high-quality, maintainable results. As these tools and frameworks mature, expect even more streamlined and reliable workflows for infrastructure automation. Always remember to validate the final output, but with the right approach, AI-powered refactoring can save significant time and effort for DevOps teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/GoogleCloudPlatform/terraformer" rel="noopener noreferrer"&gt;terraformer&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/docker/cagent" rel="noopener noreferrer"&gt;cagent&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/hashicorp/terraform-mcp-server" rel="noopener noreferrer"&gt;terraform-mcp-server&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/zloeber/terraformer-cleanup" rel="noopener noreferrer"&gt;Code Repo&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>terraform</category>
      <category>nocode</category>
    </item>
    <item>
      <title>Free Tokens for AI Exploration</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Mon, 14 Jul 2025 21:19:26 +0000</pubDate>
      <link>https://dev.to/zloeber/free-tokens-for-ai-exploration-3ie4</link>
      <guid>https://dev.to/zloeber/free-tokens-for-ai-exploration-3ie4</guid>
<description>&lt;p&gt;Using ChatGPT, Gemini, Grok, or any other chat-based LLM service is a great way to start with AI. But to bring things to the next level you will either need some beefy hardware to run models locally or access to an online API with models you can use. This article will walk you through the latter of those two options, for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is OpenRouter?
&lt;/h2&gt;

&lt;p&gt;OpenRouter.ai is an API service that acts as an umbrella over several dozen LLM providers, with some of the models free to use for training purposes. You can use these free models to develop AI at no cost if you don't mind being part of that training set. Sadly, OpenRouter does not include any embedding endpoints. Embedding is the conversion of knowledge/data for later retrieval. This is essentially how AI memory works, so without this essential component your development efforts will be crippled.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Learning Point&lt;/strong&gt; Embedding models are used in AI to convert text or other data into numerical vectors that capture semantic meaning. These vectors enable efficient comparison, search, and retrieval of information, making them essential for tasks like semantic search, recommendation systems, and enabling memory in AI agents.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://github.com/zloeber/crewai-openrouter-lab/" rel="noopener noreferrer"&gt;This project&lt;/a&gt; works around the lack of an embedding model endpoint by using the ollama endpoint locally.&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;p&gt;Create your own local &lt;code&gt;.env&lt;/code&gt; file from the included &lt;code&gt;.env_example&lt;/code&gt; and update the OpenRouter API key to be your own. Other dependencies can be installed on macOS/Linux using the included configuration script, &lt;code&gt;./configure.sh&lt;/code&gt;.&lt;/p&gt;
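
&lt;p&gt;For reference, a minimal &lt;code&gt;.env&lt;/code&gt; could look like the following (the variable name is an assumption based on what LiteLLM expects for OpenRouter; the value is a placeholder):&lt;/p&gt;

```shell
# Hypothetical .env contents; substitute your real OpenRouter key.
printf 'OPENROUTER_API_KEY=%s\n' 'replace-with-your-key' > .env
cat .env
# prints OPENROUTER_API_KEY=replace-with-your-key
```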

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; I use &lt;a href="https://mise.jdx.dev/" rel="noopener noreferrer"&gt;mise&lt;/a&gt; for installing the required binaries here and recommend you install and use it if you do not already. Otherwise you can get away with just having the ollama binary, docker, and python 3.12+.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Starting The Embedder
&lt;/h2&gt;

&lt;p&gt;Start ollama locally and pull down an embedding model to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Configure ollama to run as a server and pull in the embedder used for storing 'memories'&lt;/span&gt;
ollama serve &amp;amp;
ollama pull nomic-embed-text

&lt;span class="c"&gt;# Optionally test it out (the final output should be a list of vectors representing the embedded data)&lt;/span&gt;
curl http://localhost:11434/api/embeddings &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "nomic-embed-text",
    "prompt": "What is semantic search"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; &lt;a href="https://ollama.com/blog/embedding-models" rel="noopener noreferrer"&gt;Here&lt;/a&gt; is a more detailed description with some example code of using an embedding model via ollama to store embeddings into a vector database (Chromadb) locally.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With this running, we can then focus on finding an appropriate (free) LLM model from OpenRouter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding a Free LLM
&lt;/h2&gt;

&lt;p&gt;I've included a script you can use to query OpenRouter for LLMs of any sort, including the free ones. In our case we are looking for zero-cost LLMs that also support tools as a feature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ./.venv/bin/activate
python &lt;span class="nt"&gt;-m&lt;/span&gt; src.select-openrouter-model &lt;span class="nt"&gt;--max-cost&lt;/span&gt; 0 &lt;span class="nt"&gt;--limit&lt;/span&gt; 10 &lt;span class="nt"&gt;--output&lt;/span&gt; brief &lt;span class="nt"&gt;--features&lt;/span&gt; &lt;span class="s1"&gt;'tools'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This should provide a list of models you can use for free that support tool capabilities. The list will likely be different in the future, thus the script.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Mistral: Devstral Small (free) (ID: mistralai/devstral-small:free)
2. Meta: Llama 3.3 8B Instruct (free) (ID: meta-llama/llama-3.3-8b-instruct:free)
3. Meta: Llama 4 Maverick (free) (ID: meta-llama/llama-4-maverick:free)
4. Meta: Llama 4 Scout (free) (ID: meta-llama/llama-4-scout:free)
5. Google: Gemini 2.5 Pro Experimental (ID: google/gemini-2.5-pro-exp-03-25)
6. Mistral: Mistral Small 3.1 24B (free) (ID: mistralai/mistral-small-3.1-24b-instruct:free)
7. Meta: Llama 3.3 70B Instruct (free) (ID: meta-llama/llama-3.3-70b-instruct:free)
8. Mistral: Mistral 7B Instruct (free) (ID: mistralai/mistral-7b-instruct:free)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Demo: Interactive Human-AI Chat with CrewAI
&lt;/h2&gt;

&lt;p&gt;This demonstrates using the ollama embedder and openrouter.ai LLM with chainlit and crewai to prompt a user for more information.&lt;/p&gt;

&lt;h3&gt;
  
  
  Demo: Overview
&lt;/h3&gt;

&lt;p&gt;This script creates a conversational AI assistant that collects personal information through natural dialogue. Using CrewAI's agent framework and Chainlit's user interface, it demonstrates how to build interactive AI systems that gather specific information while maintaining a natural conversation flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Demo: How It Works
&lt;/h3&gt;

&lt;p&gt;When a user sends a message, two specialized AI agents work together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Information Collector&lt;/strong&gt;: Asks follow-up questions to gather name and location details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Information Summarizer&lt;/strong&gt;: Transforms collected data into a natural, friendly summary&lt;/li&gt;
&lt;/ul&gt;
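&lt;p&gt;Stripped of the framework, the collect-then-summarize loop is simple. Here is a minimal pure-Python sketch of the pattern; the field names, helper names, and canned answers are illustrative, not from the demo code:&lt;/p&gt;

```python
# Minimal sketch of the collector/summarizer pattern, independent of CrewAI.
# Field names, helpers, and the canned answers are illustrative only.
REQUIRED_FIELDS = ["name", "location"]

def collector(known):
    """Return the next follow-up question, or None when all fields are known."""
    for field in REQUIRED_FIELDS:
        if field not in known:
            return f"Could you share your {field}?"
    return None

def summarizer(known):
    """Turn the collected fields into a friendly summary."""
    return f"Nice to meet you, {known['name']} from {known['location']}!"

# Simulated conversation loop; in the real demo Chainlit relays each
# question to the user and feeds the reply back to the crew.
answers = iter(["Zach", "Chicago"])
known = {}
while (question := collector(known)) is not None:
    field = question.split()[-1].rstrip("?")
    known[field] = next(answers)

print(summarizer(known))  # -> Nice to meet you, Zach from Chicago!
```

&lt;p&gt;The CrewAI version replaces this loop with two agents and their tasks, but the control flow is the same.&lt;/p&gt;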

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Natural back-and-forth conversation with AI&lt;/li&gt;
&lt;li&gt;Dynamic follow-up questions when more context is needed&lt;/li&gt;
&lt;li&gt;Friendly web interface using Chainlit&lt;/li&gt;
&lt;li&gt;Structured information collection in a conversational format&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Running
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ./.venv/bin/activate
chainlit run ./src/human_input/crewai_chainlit_human_input.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This example demonstrates how AI systems can be made more interactive by combining structured task workflows with natural human conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Good Takeaway Knowledge
&lt;/h2&gt;

&lt;p&gt;Some things to understand about all of this.&lt;/p&gt;

&lt;h3&gt;
  
  
  CrewAI uses LiteLLM
&lt;/h3&gt;

&lt;p&gt;CrewAI uses LiteLLM to proxy most connection requests to the various LLM providers. This can lead to confusing results when you run your crew and find that the model, endpoint, and key you passed manually do not work: LiteLLM sources environment variables and uses those instead. In our case, we load a &lt;code&gt;.env&lt;/code&gt; file into the current process's environment, so the variable names must align with what the LiteLLM provider expects. In other words, the names are not fungible. You must define them as follows for CrewAI to work with OpenRouter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;...
OPENROUTER_API_KEY="&amp;lt;your-key&amp;gt;"
OPENAI_MODEL_NAME="&amp;lt;model&amp;gt;"
OPENAI_API_BASE="https://openrouter.ai/api/v1"
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
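&lt;p&gt;For clarity, here is roughly what "loading a &lt;code&gt;.env&lt;/code&gt; file into the environment" amounts to. The demos use a dotenv library for this; the sketch below only illustrates the effect that matters to LiteLLM:&lt;/p&gt;

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: push KEY="value" pairs into the process
    environment, which is where LiteLLM looks them up. Illustration only;
    a real dotenv library handles quoting, exports, and other edge cases."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"')
```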



&lt;h3&gt;
  
  
  CrewAI Memory
&lt;/h3&gt;

&lt;p&gt;CrewAI &lt;a href="https://docs.crewai.com/concepts/memory" rel="noopener noreferrer"&gt;Memory&lt;/a&gt; is stored locally but still requires an embedding API endpoint to function. By default this is OpenAI's endpoint. Without modification, your logs will fill with invalid-credential errors for OpenAI even when you aren't using it for your LLM calls.&lt;/p&gt;

&lt;p&gt;In my examples I override the memory storage target, moving it from its default location in your home directory to the local project. See the CrewAI docs on how to review the collections in this data and do other troubleshooting around it.&lt;/p&gt;
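&lt;p&gt;A sketch of what that override can look like. Treat the &lt;code&gt;CREWAI_STORAGE_DIR&lt;/code&gt; variable name and the embedder dict shape as assumptions to verify against the CrewAI version you run:&lt;/p&gt;

```python
import os

# Keep CrewAI's memory databases inside the project instead of the default
# home-directory location (variable name per the CrewAI memory docs; verify
# it against your installed version).
os.environ["CREWAI_STORAGE_DIR"] = "./memory"

# Point memory embeddings at a local Ollama model instead of OpenAI's API so
# no OpenAI credentials are needed. The model name here is illustrative.
embedder_config = {
    "provider": "ollama",
    "config": {"model": "nomic-embed-text"},
}
# embedder_config would then be passed along the lines of
# Crew(..., memory=True, embedder=embedder_config)
```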

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;This was an example using CrewAI with OpenRouter, but the same principles should apply to most any agent framework you decide to use. Enjoy coding up your AI app!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>llm</category>
      <category>api</category>
    </item>
    <item>
      <title>Semi Auto Importing Terraform State</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Mon, 14 Jul 2025 20:06:18 +0000</pubDate>
      <link>https://dev.to/zloeber/semi-auto-importing-terraform-state-374k</link>
      <guid>https://dev.to/zloeber/semi-auto-importing-terraform-state-374k</guid>
      <description>&lt;p&gt;On more than one occasion I've longed for the mean's to automatically import terraform state. But this is a feature I know will likely never be added for a number of very good reasons. In some cases it is possible to use only a plan file to automate the generation of terraform import blocks though. Here is how it can be done.&lt;/p&gt;

&lt;h2&gt;
  
  
  TLDR
&lt;/h2&gt;

&lt;p&gt;This blog, script, and example can be found &lt;a href="https://github.com/zloeber/terraform-semi-auto-import/" rel="noopener noreferrer"&gt;here&lt;/a&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Use Case
&lt;/h2&gt;

&lt;p&gt;You have created a slew of terraform modules or code for resources that already exist and would like to import them into your state via terraform &lt;code&gt;import&lt;/code&gt; blocks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why No Auto Import?
&lt;/h2&gt;

&lt;p&gt;For a long time I've wondered why terraform does not include a native ability to automatically import targeted state elements. It 'feels' like it would be such a killer feature to have when you need to refactor a bunch of infrastructure as code or add it where none existed before. But a deeper inspection yields several reasons why this feature will never be added.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Ambiguity
&lt;/h3&gt;

&lt;p&gt;Terraform relies on explicit code definitions in &lt;code&gt;.tf&lt;/code&gt; files to know what resources to manage.&lt;/p&gt;

&lt;p&gt;Auto-import would require Terraform to guess the appropriate HCL configuration for each resource. This would be error-prone or incomplete, especially for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Resources with complex dependencies&lt;/li&gt;
&lt;li&gt;Resources using computed values, modules, or for_each/count&lt;/li&gt;
&lt;li&gt;Custom logic or dynamic blocks&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Lack of 1:1 Mapping from API → HCL
&lt;/h3&gt;

&lt;p&gt;Many cloud resources have non-obvious or lossy mappings between API responses and HCL syntax.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: AWS IAM policies, ECS task definitions, security group rules, etc., may include generated or optional data not present in Terraform configs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Some fields are ignored by Terraform or only exist as computed outputs, meaning Terraform can't regenerate full HCL from state.&lt;/p&gt;

&lt;p&gt;Analogy: It's like trying to reverse-engineer source code from a compiled binary — possible in some cases, but often messy.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Tooling Complexity
&lt;/h3&gt;

&lt;p&gt;Implementing robust auto-import across providers would require:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parsing provider schemas for every resource type&lt;/li&gt;
&lt;li&gt;Generating idiomatic HCL, including nested blocks&lt;/li&gt;
&lt;li&gt;Ensuring generated code matches best practices&lt;/li&gt;
&lt;li&gt;Handling drift or manual configuration inconsistencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is difficult to maintain across hundreds of providers and thousands of resource types.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Risk of State Drift or Mismanagement
&lt;/h3&gt;

&lt;p&gt;Automatically importing resources could lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accidental overwrites of unmanaged resources&lt;/li&gt;
&lt;li&gt;Misalignment between actual infrastructure and expectations in code&lt;/li&gt;
&lt;li&gt;Users thinking resources are safely managed when they aren't&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Terraform's import is deliberately manual and opt-in to prevent such surprises.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Terraform's Design Philosophy: Explicit is Better
&lt;/h3&gt;

&lt;p&gt;HashiCorp prefers a conservative, explicit workflow where users:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Define resources in HCL&lt;/li&gt;
&lt;li&gt;Import them manually using terraform import&lt;/li&gt;
&lt;li&gt;Verify state and code alignment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes changes and intentions clear, especially in regulated or production environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  So What? I Need This!
&lt;/h2&gt;

&lt;p&gt;The built in options are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use terraform cli to manually import each state element with the correct ID and target after running your plan.&lt;/li&gt;
&lt;li&gt;Use terraform's built in import blocks to perform one-time state import in your pipeline.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Either option is a manual slog. But in some cases we can use a code generation approach against an existing plan file (in json) to emit all the import statements for new resources you know already exist. This can be done with clever mapping for predictable provider resources and the process works like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Author the initial terraform manifests&lt;/li&gt;
&lt;li&gt;&lt;code&gt;terraform init&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;terraform plan -out=plan.tfplan&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;terraform show -no-color -json plan.tfplan | jq &amp;gt; plan.json&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Look up the import schema to determine the import id format and, more importantly, if you already have all the data you need to construct the id within your existing plan file.&lt;/li&gt;
&lt;li&gt;Create an id map for your import data. This should target one or more providers and can include any known data you already have or that can be scraped from the existing plan file (see example further on).&lt;/li&gt;
&lt;li&gt;Run &lt;a href="https://github.com/zloeber/terraform-semi-auto-import/" rel="noopener noreferrer"&gt;this script&lt;/a&gt; with your plan json data and the map file to create a new set of import commands. &lt;code&gt;uv run ./import-terraform.py ./plan.json new_imports.tf --id-map ./import_map.yaml&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;terraform plan -out=plan.tfplan&lt;/code&gt; --&amp;gt; If this shows only imports and additions then you are likely ready to apply. If not, review what went wrong or how your mappings are defined to ensure they are accurate.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE 1&lt;/strong&gt; In step 4 I use jq to make the resulting output prettier for you to visually parse later when making your map file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NOTE 2&lt;/strong&gt; Each map file is going to be highly dependent on your needs! I've yet to figure out how to import the appropriate schema to automate this process for a target provider. Import ids can be wildly different based on the provider and resource. See &lt;code&gt;./import_map.aws.example.yaml&lt;/code&gt; for one that merges several data points into a single id for instance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How The Script Works
&lt;/h2&gt;

&lt;p&gt;This script first parses the &lt;code&gt;resource_changes&lt;/code&gt; of the plan file for anything with a &lt;code&gt;change.action&lt;/code&gt; of &lt;code&gt;create&lt;/code&gt;. It then correlates each item found to a map file entry based on the provider and resource name. If one exists, it expands the map's ID template using the same resource's &lt;code&gt;change.after&lt;/code&gt; data. Finally, it uses all of this to generate a valid import block for the target resource.&lt;/p&gt;
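&lt;p&gt;The core of that logic can be sketched in a few lines of Python. This is an illustration of the approach rather than the actual script; the function name and the simplified plan/map shapes are mine:&lt;/p&gt;

```python
def generate_import_blocks(plan, id_map):
    """Emit terraform import blocks for every planned 'create' whose
    provider/type pair has an id template in the map file. Sketch of the
    logic described above; the real script adds validation and file I/O."""
    blocks = []
    for rc in plan.get("resource_changes", []):
        if "create" not in rc["change"]["actions"]:
            continue
        template = id_map.get(rc["provider_name"], {}).get(rc["type"], {}).get("id")
        if template is None:
            continue  # no mapping for this resource type: skip it
        # Expand placeholders like {path} or {mount} from the planned attributes.
        import_id = template.format(**rc["change"]["after"])
        blocks.append(f'import {{\n  to = {rc["address"]}\n  id = "{import_id}"\n}}')
    return "\n\n".join(blocks)
```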

&lt;p&gt;This means we can only auto-import predictable terraform based on named elements. We do not pull data from any outside resources, which limits the scope of what can be imported using this method. For example, a new EC2 instance would be impossible to auto-import this way, because the instance id required for the import block would never be available in our plan file data. But one could still produce template import blocks using this script, then post-process the results with another script that replaces the output with found AWS instances! &lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;p&gt;Install requirements via uv: &lt;code&gt;uv sync&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You can also use mise to install terraform, python, and uv if required: &lt;code&gt;mise install -y&lt;/code&gt; (also included in &lt;code&gt;./configure.sh&lt;/code&gt;)&lt;/p&gt;

&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;

&lt;p&gt;A fairly poor but working example of how to do this can be found in the &lt;code&gt;./example&lt;/code&gt; path. Local resources do not lend themselves well to importing state so I used a local vault deployment with the hashicorp/vault terraform provider instead. Here is how you can run through this example locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;examples

&lt;span class="c"&gt;# Start local vault dev instance&lt;/span&gt;
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# export the dev root token&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;VAULT_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;dev-token-12345

&lt;span class="c"&gt;# perform initial deployment&lt;/span&gt;

terraform init
terraform plan &lt;span class="nt"&gt;-out&lt;/span&gt; plan.tfplan
terraform apply plan.tfplan

&lt;span class="c"&gt;# delete the local state then get your plan as json again.&lt;/span&gt;
&lt;span class="nb"&gt;rm&lt;/span&gt; ./terraform.tfstate ./terraform.tfstate.backup
terraform show &lt;span class="nt"&gt;-no-color&lt;/span&gt; &lt;span class="nt"&gt;-json&lt;/span&gt; plan.tfplan | jq &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; plan.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point you will need to figure out the import id by visiting the terraform provider documentation (I know of no way to scrape import id schemas automatically). So I dropped into the terraform provider &lt;a href="https://registry.terraform.io/providers/hashicorp/vault/latest" rel="noopener noreferrer"&gt;website&lt;/a&gt; for vault and looked up the &lt;code&gt;vault_mount&lt;/code&gt; and &lt;code&gt;vault_kv_secret_v2&lt;/code&gt; resources. I discovered that they both require just a path to import. Sweet! Let's start with the &lt;code&gt;vault_mount&lt;/code&gt; resource. I inspect the plan json output for the &lt;code&gt;vault_mount&lt;/code&gt; create resource data and zero in on the &lt;code&gt;after&lt;/code&gt; section for the created resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vault_mount.kv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"managed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vault_mount"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"provider_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"registry.terraform.io/hashicorp/vault"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"change"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"actions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"create"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"after"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"allowed_managed_keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"allowed_response_headers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"delegated_auth_accessors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"KV Version 2 secret engine"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"external_entropy_access"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"identity_token_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"listing_visibility"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"local"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"namespace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"options"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"passthrough_request_headers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"plugin_version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kv"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Looks like they give us &lt;code&gt;path&lt;/code&gt; straight away so the start of our map file looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;registry.terraform.io/hashicorp/vault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;vault_mount&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{path}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now if we look at the next resource, &lt;code&gt;vault_kv_secret_v2&lt;/code&gt;, we see something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vault_kv_secret_v2.secrets[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;user-credentials&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"managed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vault_kv_secret_v2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"secrets"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user-credentials"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"provider_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"registry.terraform.io/hashicorp/vault"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"change"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"actions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"create"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"before"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"after"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"cas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"data_json"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;admin_password&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;secure_password_123&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;admin_username&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;admin&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;last_backup&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;2024-01-15T10:30:00Z&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;user_count&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;150&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"data_json_wo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"data_json_wo_version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"delete_all_versions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"disable_read"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"mount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user-credentials"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"namespace"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"options"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No path! But we can construct the correct path from &lt;code&gt;mount&lt;/code&gt; and &lt;code&gt;name&lt;/code&gt;, so that becomes the mapping that completes our map file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;registry.terraform.io/hashicorp/vault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;vault_mount&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{path}"&lt;/span&gt;
  &lt;span class="na"&gt;vault_kv_secret_v2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{mount}/data/{name}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; An astute reader will notice that I included the 'data' section in that path. This is just a nuance of Vault kv version 2 that I happen to know already. It is also a good example of how you can manually tweak these mappings.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now that we have this we can create the import block file and proceed to replan with it in place to import all the existing paths.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run ../import-terraform.py ./plan.json kv_imports.tf &lt;span class="nt"&gt;--id-map&lt;/span&gt; ./import_map.yaml
terraform plan &lt;span class="nt"&gt;-out&lt;/span&gt; plan.tfplan
terraform apply plan.tfplan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will pull in the existing secrets as state and recreate your state file as it was before you deleted it.&lt;/p&gt;
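&lt;p&gt;For reference, the generated &lt;code&gt;kv_imports.tf&lt;/code&gt; should contain import blocks along these lines (hypothetical output based on the plan data above; exact formatting depends on the script):&lt;/p&gt;

```hcl
import {
  to = vault_mount.kv
  id = "kv"
}

import {
  to = vault_kv_secret_v2.secrets["user-credentials"]
  id = "kv/data/user-credentials"
}
```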

&lt;h2&gt;
  
  
  Improvements
&lt;/h2&gt;

&lt;p&gt;The manual part of this, creating map file entries, is pretty hard to automate. We could theoretically automate it somewhat with AI that scrapes the terraform registry documentation for each resource found and seeks out the import requirements to then generate the map entries. Or we could (somewhat dangerously) try to invoke terraform import commands with a bogus id and scrape the returned errors, as some providers (such as AWS) will give useful errors for id format issues.&lt;/p&gt;

&lt;p&gt;I am also fascinated by the &lt;a href="https://github.com/GoogleCloudPlatform/terraformer" rel="noopener noreferrer"&gt;terraformer&lt;/a&gt; project's ability to create terraform from existing resources for several dozen providers. Perhaps there is some way to generate import commands instead of painfully large terraform manifests with some clever engineering of its source code.&lt;/p&gt;

&lt;p&gt;If anyone has better options for this arduous process, leave a comment; I'm keen to know what I'm missing here. Otherwise, maybe this script will help get you part of the way to clean terraform state. May all your terraform pipelines run green, my friends!&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>python</category>
      <category>devops</category>
    </item>
    <item>
      <title>Pre-Cache Terraform Provider Plugins</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Wed, 19 Mar 2025 17:04:38 +0000</pubDate>
      <link>https://dev.to/zloeber/pre-cache-terraform-provider-plugins-42kn</link>
      <guid>https://dev.to/zloeber/pre-cache-terraform-provider-plugins-42kn</guid>
      <description>&lt;p&gt;Pre-caching terraform providers in your CICD pipeline images is awesome but hardly anyone does it. I've created a project that makes this task easier than ever.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why
&lt;/h2&gt;

&lt;p&gt;In a very active platform-as-a-service inside a larger organization you can see hundreds if not thousands of pipelines run in a day for various terraform provisioning. This can add up to quite a bit of network activity and time wasted downloading the same terraform providers over and over again. But terraform is pretty smart and will not redownload providers that already exist in the local plugin cache. Pre-caching these is beneficial for two main reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reduce provisioning pipeline run time - Eliminates the near-constant re-downloading of these external binary packages from the terraform registry.&lt;/li&gt;
&lt;li&gt;Reduce external dependencies - The terraform registry has gone down at least once in the last few years, causing provisioning outages that you cannot resolve on your own.&lt;/li&gt;
&lt;/ol&gt;
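&lt;p&gt;The plugin cache that makes this work is a stock Terraform feature. As a minimal sketch (the cache path here is just an example), you can point Terraform at a shared cache directory via the CLI configuration file or an environment variable:&lt;/p&gt;

```shell
# Create a shared plugin cache directory (example path, adjust to taste).
mkdir -p "$HOME/.terraform.d/plugin-cache"

# Option 1: declare it in the Terraform CLI config file. Terraform
# expands $HOME in this setting itself, hence the single quotes.
printf '%s\n' 'plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"' > "$HOME/.terraformrc"

# Option 2: the TF_PLUGIN_CACHE_DIR environment variable does the same.
export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"
```

&lt;p&gt;With either setting in place, &lt;code&gt;terraform init&lt;/code&gt; pulls providers from the cache instead of re-downloading them on every run.&lt;/p&gt;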




&lt;h2&gt;
  
  
  How
&lt;/h2&gt;

&lt;p&gt;I've created &lt;a href="https://github.com/zloeber/terraform-cicd-image" rel="noopener noreferrer"&gt;a project&lt;/a&gt; that includes a Dockerfile and some scripts that process a yaml file that contains target git repos (and any subpaths) that the image would be used within. When built, this image will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pre-cache providers for the defined target git projects/folders&lt;/li&gt;
&lt;li&gt;Install multiple versions of the terraform and other binaries via &lt;a href="https://mise.jdx.dev" rel="noopener noreferrer"&gt;mise&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Usage
&lt;/h2&gt;

&lt;p&gt;Clone &lt;a href="https://github.com/zloeber/terraform-cicd-image" rel="noopener noreferrer"&gt;this repo&lt;/a&gt; into your organization then make updates as needed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Update the &lt;code&gt;config/provisioners.yml&lt;/code&gt; file with all of your downstream terraform provisioning projects, their branches, and target folders that will be processed.&lt;/li&gt;
&lt;li&gt;Update the &lt;code&gt;mise.toml&lt;/code&gt; file to include terraform and other binary versions you wish to have included.&lt;/li&gt;
&lt;li&gt;Add CICD pipeline code for your organization to build and push your image. &lt;/li&gt;
&lt;/ol&gt;
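&lt;p&gt;For illustration, an entry in &lt;code&gt;config/provisioners.yml&lt;/code&gt; might look something like the following. This is a hypothetical sketch (the field names and repos are mine, not the project's); check the example file bundled in the repo for the actual schema.&lt;/p&gt;

```yaml
# Hypothetical sketch of config/provisioners.yml; see the project's
# bundled example file for the real schema.
provisioners:
  - repo: https://github.com/example-org/terraform-network.git
    branch: main
    paths:
      - environments/dev
      - environments/prod
  - repo: https://github.com/example-org/terraform-compute.git
    branch: main
    paths:
      - .
```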

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; The order of versions in &lt;code&gt;mise.toml&lt;/code&gt; matters. The first one in the list will be used by default. See the &lt;a href="https://mise.jdx.dev/configuration.html" rel="noopener noreferrer"&gt;configuration of mise&lt;/a&gt; for more details on this wonderful tool.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Manual Providers
&lt;/h2&gt;

&lt;p&gt;If you need to include the latest version of a provider, or need to define one manually, you can easily do that as well. Edit the local &lt;code&gt;config/provisioners.yml&lt;/code&gt; file and add a local path that contains a terraform &lt;code&gt;version.tf&lt;/code&gt; file within the local &lt;code&gt;config&lt;/code&gt; directory. Examples are provided in this project (and can be removed if you do not need them).&lt;/p&gt;




&lt;h2&gt;
  
  
  Local Testing
&lt;/h2&gt;

&lt;p&gt;To see how this works, you can run everything locally using the included Taskfile tasks.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;task providers&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This should produce a local &lt;code&gt;tempproviders&lt;/code&gt; folder with all of the provider plugins for your downstream terraform projects.&lt;/p&gt;

&lt;p&gt;Additionally, helper tasks for building and shelling into the container image are included.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;task docker:build docker:shell
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Shaving off 10 seconds per pipeline may seem like a fool's errand, but the benefits from such an exercise are hard to ignore. Eliminating external dependencies while speeding up your frequently used pipelines should always be in your scope when engineering your solutions.&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>devops</category>
      <category>docker</category>
    </item>
    <item>
      <title>The Technical Interview</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Thu, 27 Feb 2025 20:57:21 +0000</pubDate>
      <link>https://dev.to/zloeber/the-technical-interview-3nfb</link>
      <guid>https://dev.to/zloeber/the-technical-interview-3nfb</guid>
      <description>&lt;p&gt;In the last several years I've had the great privilege of being the final technical interviewer for a large number of candidates. This article is an inside scoop on how I perform these interviews with tips on how you can shine as a candidate.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why My Opinion Matters
&lt;/h2&gt;

&lt;p&gt;I come from a background of 25+ years of experience across a wide range of technical areas with a large number of certifications under my belt. I often joke that I got into this industry almost 30 years ago with the goal of knowing all there is, yet here I am feeling like I know less than ever. &lt;/p&gt;

&lt;p&gt;Sadly, unless the interview is a direct hire for my own team there is a strong chance we will never actually work with one another. But that is OK, I still get the benefit of having deep technical discussions with a wide range of geeks across all walks of life and technical specialties. I just happen to be the one chosen to separate candidates who are fibbing about their abilities from those who are not. This must be done in under an hour with little more than your resume in most cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Basics
&lt;/h2&gt;

&lt;p&gt;Obviously, not all technical positions are made the same. A Sr. DevOps role will demand entirely different skills than a Cloud Architect. But there are common elements of all technical roles that will surface in any high-level technical interview with me. Regardless of your skill set, be prepared to talk on one or preferably all of these topics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Software development life-cycle (SDLC)&lt;/strong&gt; - This includes varying degrees of knowledge about git and CICD and how to deliver artifacts to production. This may feel strange to list first but it is part of most roles currently in some form or another.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Automation&lt;/strong&gt;  - Ansible, Terraform, or some declarative infrastructure as code is a huge plus in any technical role. If you have none of this then the next section is your friend.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scripting&lt;/strong&gt; - Pick a language of your choice and know something about it. PowerShell, Bash, JavaScript, or Python are easy ones but any form of scripting knowledge is a sign you are not just an average mouse clicker. If you come from a development background then even better (unless that language is something like FoxPro or QuickBasic)!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AI&lt;/strong&gt; - Yes, I expect you to be able to explain some ways you have used it to make your life easier and your job more productive. It shows you are sticking with the times as well.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Your Specialty
&lt;/h2&gt;

&lt;p&gt;This is the reason why we are talking, your moment to shine as it were. Here are my general tips for how to handle me during this most essential portion of our discussion.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Know It&lt;/strong&gt; - Simple tip but worth putting out there first. I promise you cannot fake it with me very easily. I've done hundreds of interviews across many technical specialties and have a very high success rate for weeding out fakers. Your resume should reflect what we will be talking about on the very first page somewhere and be recent enough for you to go into depth on projects you have worked on that exhibit that knowledge. I will dig as deep as I can and then keep digging until you cannot answer my questions in most cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Teach Me&lt;/strong&gt; - Adding to the prior tip, I gain supreme satisfaction learning something new. If you can teach me about something in our industry I never knew before I'm going to be not only impressed but also more grateful for our time together.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be an Expert in Something&lt;/strong&gt; - This is a bit more nuanced but I find that in most interviews I end up asking the candidates what they feel they are best at. This is intentional because then I know you should have answers in that topic and I should be able to dig and learn from you. This is one way I'll litmus test your brain on things. For example, if you are an expert in Ansible be ready to know what template language playbooks are written in (Jinja2 and YAML), how you can automate them, and the protocol it runs over.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be Curious&lt;/strong&gt; - No one likes a know it all. Curiosity is how we grow ourselves in this industry and push our knowledge boundaries to new levels. I'm going to want to know where you would like to expand on your current skills and grow yourself further in your journey.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be A Problem Solver&lt;/strong&gt; - Have at least one recent experience you can talk about in great detail that exhibits that you were in the trenches and getting things done in a self-directed manner. It should be something that you are proud of and can speak towards. This should easily be found on the first or second page of your resume. &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Some Other Tips
&lt;/h2&gt;

&lt;p&gt;While each discussion I have is bound to be somewhat different there are some common elements that will come up. Often I'll even start my discussion with these tips because, believe it or not, I'm rooting for you to do well!&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Focus on Yourself&lt;/strong&gt; - If you got this far in your career then I already know you must work somewhat well with others and so I don't really care about what you have done 'as a team'. I'm only interested in your personal accomplishments and solutions as that is how I can further assess your actual knowledge.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be Honest&lt;/strong&gt; - If you don't know, just say so and we will pivot to something else. I will never hold honesty against you. But I will hold it against you if you waste our limited time spitting out totally wrong answers just because.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be Passionate&lt;/strong&gt; - No one wants to have a dull conversation. You will stick out if you are into what you do far more than if you drone on about technology and your accomplishments. Passion for this industry is why I am part of it, ideally you reflect that insatiable desire to learn and grow in our industry of endless possibilities as well.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be Visible&lt;/strong&gt; - If you don't show your face it severely lessens your chances of being selected among a list of candidates that do. Additionally I will rely on this to better read how flustered you get and how well you handle things when I start digging deep into your brain for what you know.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be Patient&lt;/strong&gt; - Specifically with me. Obviously I don't know everything and sometimes I may ask questions because I selfishly want to walk away learning something new. I also might make hard pivots on purpose to keep things going and to see how well you handle it. It's not personal; it's by design.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Final Notes
&lt;/h2&gt;

&lt;p&gt;This was a rather non-technical article but one that felt right to put out there given the climate of uncertainty within our industry. I genuinely hope you take this information and do well in your technical interviews. Maybe, if we are both lucky, you and I will do one of these together and you can shine brightly like the incredible geek that you are!&lt;/p&gt;

</description>
      <category>career</category>
      <category>careerdevelopment</category>
      <category>interview</category>
    </item>
    <item>
      <title>Mise - Jump between Per-Folder Dev Environments like a Wizard</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Mon, 03 Feb 2025 22:09:45 +0000</pubDate>
      <link>https://dev.to/zloeber/mise-jump-between-per-folder-dev-environments-like-a-wizard-19db</link>
      <guid>https://dev.to/zloeber/mise-jump-between-per-folder-dev-environments-like-a-wizard-19db</guid>
      <description>&lt;p&gt;Developing software across multiple programming languages and projects can feel like juggling flaming chainsaws while riding a unicycle. The additional cognitive load of getting things in place locally for your workstation for some other team's project can really slow a day down. Each language comes with its own ecosystem, version managers, and dependency hell. One project might require Python 3.11, another Node.js 18, and yet another Go 1.20. Keeping your local development environment in sync with these requirements is a nightmare. Containers like Docker can help, but they often feel heavy-handed for local development. And then there’s Nix—powerful, but let’s be honest, it’s like learning a new language just to manage your tools.&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;asdf-vm&lt;/strong&gt;, the tool that promised simplicity. It was a breath of fresh air—a single tool to manage multiple runtime versions. But as I dug deeper, I stumbled upon &lt;strong&gt;aqua&lt;/strong&gt;, which added a layer of declarative configuration. Finally, I discovered &lt;strong&gt;mise&lt;/strong&gt; (formerly known as &lt;strong&gt;rtx&lt;/strong&gt;), and it felt like the missing piece of the puzzle. Mise combines the simplicity of asdf-vm with the declarative power of aqua, making it my go-to tool for managing local development environments.&lt;/p&gt;

&lt;p&gt;In this post, I’ll walk you through how I use &lt;a href="https://mise.jdx.dev" rel="noopener noreferrer"&gt;mise&lt;/a&gt; to tame the chaos of multi-language development, complete with a &lt;code&gt;mise.toml&lt;/code&gt; example to configure a project.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Mise?
&lt;/h2&gt;

&lt;p&gt;Mise is a tool that allows you to define your project’s runtime requirements in a simple, declarative way. It’s like having a personal assistant who knows exactly which tools and versions you need for each project. Here’s why I love it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Declarative Configuration&lt;/strong&gt;: Define your tools and versions in a &lt;code&gt;mise.toml&lt;/code&gt; file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Language Support&lt;/strong&gt;: Works with Python, Node.js, Go, Ruby, and more.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Package Manager Support:&lt;/strong&gt; Can install packages from asdf-vm, aqua, npm, pipx, ubi, go, gem, dotnet, cargo, spm, and vfox (vfox?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simple Setup&lt;/strong&gt;: No need to wrestle with containers or learn a new ecosystem like Nix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seamless Integration&lt;/strong&gt;: Works alongside your existing tools and workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment Support:&lt;/strong&gt; Can inject local &lt;code&gt;.env&lt;/code&gt; files into your environment.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You get lots of wins using mise.&lt;/p&gt;




&lt;h2&gt;
  
  
  Example: Using &lt;code&gt;mise.toml&lt;/code&gt; to Configure a Project
&lt;/h2&gt;

&lt;p&gt;Let’s say you’re working on a project developed by some maniac that requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.11
&lt;/li&gt;
&lt;li&gt;Node.js 18&lt;/li&gt;
&lt;li&gt;Go 1.20&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additionally, perhaps this project requires some environment variables as secrets in the git ignored &lt;code&gt;./.SECRETS.env&lt;/code&gt; as well as non-secret environment variables in &lt;code&gt;.env&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here’s how you can define these requirements in a &lt;code&gt;mise.toml&lt;/code&gt; file. You would drop this into the project folder you are working on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[tools]&lt;/span&gt;
&lt;span class="py"&gt;python&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"3.11"&lt;/span&gt;
&lt;span class="py"&gt;nodejs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"18"&lt;/span&gt;
&lt;span class="py"&gt;go&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.20"&lt;/span&gt;

&lt;span class="nn"&gt;[[env]]&lt;/span&gt;
&lt;span class="py"&gt;_.source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'./.env'&lt;/span&gt;
&lt;span class="py"&gt;PYTHONPATH&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"./src"&lt;/span&gt;
&lt;span class="py"&gt;NODE_ENV&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"development"&lt;/span&gt;

&lt;span class="nn"&gt;[[env]]&lt;/span&gt;
&lt;span class="py"&gt;_.source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'./.SECRETS.env'&lt;/span&gt;
&lt;span class="py"&gt;SOMETHING&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"nope"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this file in place, running &lt;code&gt;mise install -y&lt;/code&gt; will automatically install the specified versions of Python, Node.js, and Go. Mise also sets up the defined environment variables and sources in the secrets. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; This is all predicated on mise being injected into your console session. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You may notice that the versions are pinned at the minor semver level. This should net you the latest patch release for that version. You can, and probably should, pin to exact versions for a project you may be inheriting.&lt;/p&gt;
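&lt;p&gt;For example, exact pins in the same &lt;code&gt;mise.toml&lt;/code&gt; would look like this (the patch numbers here are illustrative, not a recommendation):&lt;/p&gt;

```toml
# Exact pins instead of floating minor versions (illustrative values).
[tools]
python = "3.11.9"
nodejs = "18.20.4"
go = "1.20.14"
```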




&lt;h2&gt;
  
  
  Visualizing the Workflow with Mermaid
&lt;/h2&gt;

&lt;p&gt;Let’s break down the workflow with a Mermaid diagram:&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5eiqjcms0ltujnyjxmi1.png" alt="Mise workflow" width="798" height="661"&gt;
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Managing development environments shouldn’t be a full-time job. With mise, you can spend less time configuring tools and more time writing code. It’s simple, declarative, and works seamlessly across multiple languages. Whether you’re a solo developer or part of a team, mise can help you standardize your development environment and reduce onboarding friction.&lt;/p&gt;




&lt;h2&gt;
  
  
  Give Mise a Try
&lt;/h2&gt;

&lt;p&gt;If you’re tired of juggling versions and wrestling with containers, give mise a shot. It’s a game-changer for multi-language development. Here’s how to get started:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://mise.jdx.dev/getting-started.html" rel="noopener noreferrer"&gt;Install mise&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://mise.jdx.dev/install.sh | sh
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'eval "$(~/.local/bin/mise activate zsh)"'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.zshrc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Create a &lt;code&gt;mise.toml&lt;/code&gt; file in your project.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Run &lt;code&gt;mise install&lt;/code&gt; and start coding!&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Retrofitting An Existing Project
&lt;/h2&gt;

&lt;p&gt;Typically I'll just add the &lt;code&gt;mise.toml&lt;/code&gt; file at the project root including only the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Used programming languages&lt;/li&gt;
&lt;li&gt;Build tools (jq, yq, task, et cetera)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I then add a local &lt;code&gt;./configure.sh&lt;/code&gt; file at the root like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# Check for GITHUB_TOKEN to not be rate limited&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$GITHUB_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"GITHUB_TOKEN is not set. Please set it and try again."&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# check for mise&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nb"&gt;command&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; mise &amp;amp;&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Please install mise first (run 'curl https://mise.run | sh')"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;eval&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;mise activate bash&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;## This is optional if you require any [env] sections be processed&lt;/span&gt;
&lt;span class="c"&gt;#mise settings set experimental true&lt;/span&gt;
&lt;span class="c"&gt;#mise trust&lt;/span&gt;

&lt;span class="c"&gt;# Install all dependencies&lt;/span&gt;
mise &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;WARNING:&lt;/strong&gt; I'd ensure you have the &lt;code&gt;GITHUB_TOKEN&lt;/code&gt; env var in place using your own created PAT. This will ensure rate limiting doesn't ruin your day.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you have some unique tool that is not an asdf-vm plugin you can add software via any one of the supported back-end package providers.  I've found that aqua may have some newer packages. But you can also try your luck with the &lt;a href="https://mise.jdx.dev/dev-tools/backends/ubi.html" rel="noopener noreferrer"&gt;&lt;code&gt;ubi&lt;/code&gt;&lt;/a&gt; backend. This one hits up GitHub releases and tries to guess the latest release for your architecture.&lt;/p&gt;
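&lt;p&gt;As a rough sketch, pulling a tool straight from its GitHub releases via the ubi backend looks like this in &lt;code&gt;mise.toml&lt;/code&gt; (the repo shown is just an example):&lt;/p&gt;

```toml
[tools]
# ubi fetches the GitHub release asset matching your OS/architecture.
"ubi:BurntSushi/ripgrep" = "latest"
```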

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;I'm very happy with how well it integrates with my console sessions and shims the right application versions based on my location seamlessly. You can get rather fancy with the ability to source .env files (they are essentially bash scripts).&lt;/p&gt;

&lt;h2&gt;
  
  
  Bonus - Sops Integration
&lt;/h2&gt;

&lt;p&gt;You can have your secrets encrypted via sops and an age key and mise should handle the decryption entirely behind the scenes. I've created an example project that implements this feature &lt;a href="https://github.com/zloeber/mise-encryption-example" rel="noopener noreferrer"&gt;here&lt;/a&gt; for your convenience. This project uses a generated age key to encrypt a local &lt;code&gt;.secrets.env.json&lt;/code&gt; file for inclusion into the local environment via mise.&lt;/p&gt;




&lt;p&gt;Happy coding, and may your development environments always be easy to use!&lt;/p&gt;

</description>
      <category>development</category>
      <category>devops</category>
      <category>cli</category>
      <category>wizard</category>
    </item>
    <item>
      <title>Atmos - Wield Terraform Like a Boss</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Wed, 29 Jan 2025 19:21:56 +0000</pubDate>
      <link>https://dev.to/zloeber/atmos-wield-terraform-like-a-boss-3bfc</link>
      <guid>https://dev.to/zloeber/atmos-wield-terraform-like-a-boss-3bfc</guid>
      <description>&lt;p&gt;Terraform is great until you have to deal with state. Large state inherently does not scale: the more things grow, the more state needs to be managed, connected, and otherwise understood. &lt;/p&gt;

&lt;p&gt;Atmos is a tool for this and so much more. This article will build on my prior terraform to opentofu encrypted local state example to introduce the use of atmos for deployment of my multi-state project. This time I'll change over to the &lt;a href="https://github.com/zloeber/tofu-exploration/tree/tofu-encrypted-atmos" rel="noopener noreferrer"&gt;&lt;code&gt;tofu-encrypted-atmos&lt;/code&gt; branch&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Atmos?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://atmos.tools/" rel="noopener noreferrer"&gt;Atmos&lt;/a&gt; is an opinionated infrastructure deployment tool from the great minds at &lt;a href="https://cloudposse.com/" rel="noopener noreferrer"&gt;CloudPosse&lt;/a&gt;. This team has a long history of releasing incredible Terraform modules and being deeply involved within the DevOps community. Erik has been running a weekly podcast/open office hours to talk all things DevOps for years now that I highly recommend people check out.&lt;/p&gt;

&lt;p&gt;Anyway, the point is that the makers of this tool are really, really good at flinging Terraform and have strong opinions on how to automate it. Atmos is the culmination of experience gained by being in the trenches with infrastructure automation. It might be comparable to &lt;a href="https://terragrunt.gruntwork.io/" rel="noopener noreferrer"&gt;terragrunt&lt;/a&gt;, Terraform stacks (&lt;a href="https://developer.hashicorp.com/terraform/language/stacks" rel="noopener noreferrer"&gt;public beta&lt;/a&gt;), &lt;a href="https://github.com/cisco-open/stacks" rel="noopener noreferrer"&gt;Cisco's stacks&lt;/a&gt;, or even &lt;a href="https://developer.hashicorp.com/terraform/cdktf" rel="noopener noreferrer"&gt;cdktf&lt;/a&gt;, which also has a stack concept.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stack Origins:&lt;/strong&gt; One may say that AWS CloudFormation is the origin story for treating infrastructure deployments as a 'stack'. It spawned AWS CDK which resulted in the libraries used to generate some other CDKs that build upon the stack concept.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding
&lt;/h2&gt;

&lt;p&gt;As always, there is a short learning curve to grok the atmos view of the world. I'll distill it down as best I can (please read more in their docs for a deeper dive). Let us start by level setting on vocabulary as it will help shortcut understanding of how to look at the atmos project structure layout. Here are some terraform terms and their Atmos equivalent.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Terraform&lt;/th&gt;
&lt;th&gt;Atmos&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;root module&lt;/td&gt;
&lt;td&gt;component&lt;/td&gt;
&lt;td&gt;State lives here&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;multiple root modules&lt;/td&gt;
&lt;td&gt;stack&lt;/td&gt;
&lt;td&gt;Multiple components grouped together&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;group of available root modules&lt;/td&gt;
&lt;td&gt;component library&lt;/td&gt;
&lt;td&gt;Just a bunch of components in a logical group&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;child module&lt;/td&gt;
&lt;td&gt;child module&lt;/td&gt;
&lt;td&gt;same as they ever were&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;TIP:&lt;/strong&gt; A stack is effectively an environment.&lt;/p&gt;

&lt;p&gt;Atmos makes clever use of terraform workspaces for each component defined in a stack. This is pretty efficient and an entirely seamless use of terraform workspaces that looks a little like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F604mmp9koa5x2k7envh1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F604mmp9koa5x2k7envh1.png" alt="Atmos to terraform state workspace diagram" width="391" height="252"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this next diagram we'd have one state element each for the localhost, cluster1, and cluster2 components in the dev workspace. We'd also have one state element each for the baremetal and cluster1 components in the prod workspace. This gives us 5 total state targets when completely deployed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8y8wa616urv2ncj2kjv3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8y8wa616urv2ncj2kjv3.png" alt="Atmos to terraform state" width="800" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; The component library concept allows for an additional vector to parameterize your deployments and allows for dependency mapping between disparate terraform states. &lt;/p&gt;

&lt;h2&gt;
  
  
  Adopting Atmos
&lt;/h2&gt;

&lt;p&gt;To accommodate atmos I had to allow some previously git-ignored paths and trust that my OpenTofu encrypted state process was working properly.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; I effectively relinquished control of local state file locations to atmos. I did end up adding a quick validation task to unit test that the state is encrypted for each deployment: &lt;code&gt;task test:state&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In my root modules (aka. 'components') I had to remove all traces of the local backend configuration as atmos overwrites it otherwise (causing an endless loop of changed backend state migration approval prompts). The documentation for atmos goes over a slew of different state management schemas that allow for deep customized workflows tailored to an organization's structure.&lt;/p&gt;

&lt;p&gt;I also changed the base folders to comply with the atmos way of doing things and ended up with a basic project structure like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidja2vo9iu1vlfk281fy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidja2vo9iu1vlfk281fy.png" alt="My new folder structure" width="195" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I made almost no real changes to my modules or my base terraform but I did move them which required some validations. I also had to create the YAML scaffolding. This included the component library definition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# stacks/catalog/localhost.yaml&lt;/span&gt;
&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;terraform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;localhost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost"&lt;/span&gt;
    &lt;span class="na"&gt;cluster1&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cluster1&lt;/span&gt;
      &lt;span class="na"&gt;settings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;1&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost"&lt;/span&gt;
    &lt;span class="na"&gt;cluster2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cluster2&lt;/span&gt;
      &lt;span class="na"&gt;settings&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;1&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
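&lt;p&gt;With the &lt;code&gt;depends_on&lt;/code&gt; settings above, atmos can reason about ordering between components. As a sketch of querying that mapping from the CLI (hypothetical invocation, guarded since it requires atmos on your PATH; the stack name &lt;code&gt;dev&lt;/code&gt; matches the example stack below):&lt;/p&gt;

```shell
# Sketch: ask atmos which components depend on "localhost" in the dev
# stack. Requires the atmos CLI; skipped when it isn't installed.
if command -v atmos >/dev/null 2>&1; then
  atmos describe dependents localhost -s dev
  status="ran"
else
  status="skipped (atmos not installed)"
fi
echo "dependency query: $status"
```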



&lt;p&gt;The stack definition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# yaml-language-server: $schema=https://atmos.tools/schemas/atmos/atmos-manifest/1.0/atmos-manifest.json&lt;/span&gt;
&lt;span class="c1"&gt;# stacks/deploy/dev.yaml&lt;/span&gt;

&lt;span class="na"&gt;vars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dev&lt;/span&gt;

&lt;span class="na"&gt;import&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;catalog/localhost&lt;/span&gt;

&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;terraform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;localhost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;vars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev"&lt;/span&gt;
        &lt;span class="na"&gt;clusters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cluster1"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cluster2"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
        &lt;span class="na"&gt;secrets_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;../../../secrets/dev"&lt;/span&gt;
    &lt;span class="na"&gt;cluster1&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;vars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;cluster_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cluster1"&lt;/span&gt;
        &lt;span class="na"&gt;key_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;../../../secrets/dev/"&lt;/span&gt;
        &lt;span class="na"&gt;kubeconfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;../../../secrets/dev/cluster1_config"&lt;/span&gt;
    &lt;span class="na"&gt;cluster2&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;vars&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;cluster_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cluster2"&lt;/span&gt;
        &lt;span class="na"&gt;key_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;../../../secrets/dev/"&lt;/span&gt;
        &lt;span class="na"&gt;kubeconfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;../../../secrets/dev/cluster2_config"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the workflow to run them all as a set of atmos tasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# stacks/workflows/localhost.yaml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;localhost&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bring up and configure a few kind clusters&lt;/span&gt;
&lt;span class="na"&gt;workflows&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;up&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;Bring up the local environment&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform apply localhost -auto-approve&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform apply cluster1 -auto-approve&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform apply cluster2 -auto-approve&lt;/span&gt;

  &lt;span class="na"&gt;down&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;Tear it all down&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform destroy cluster2 -auto-approve&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform destroy cluster1 -auto-approve&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform destroy localhost -auto-approve&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
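&lt;p&gt;Workflows like this are invoked by name against their manifest file. A sketch of what that looks like from the CLI (guarded, since it assumes atmos is installed and would actually apply infrastructure; &lt;code&gt;-f&lt;/code&gt; selects the workflow file under &lt;code&gt;stacks/workflows&lt;/code&gt;):&lt;/p&gt;

```shell
# Sketch: run the "up" workflow from the localhost manifest against the
# dev stack. Requires the atmos CLI; skipped when it isn't installed.
if command -v atmos >/dev/null 2>&1; then
  atmos workflow up -f localhost -s dev
  # ...and later, tear everything down:
  # atmos workflow down -f localhost -s dev
  ran="yes"
else
  ran="no (atmos not installed)"
fi
echo "workflow invoked: $ran"
```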



&lt;p&gt;And finally the &lt;code&gt;atmos.yaml&lt;/code&gt; definition that points the binary to tofu (installed via mise):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;base_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;

&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;terraform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;base_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;components/terraform"&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tofu"&lt;/span&gt;
    &lt;span class="na"&gt;apply_auto_approve&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;deploy_run_init&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;init_run_reconfigure&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;auto_generate_backend_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

&lt;span class="na"&gt;stacks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;base_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stacks"&lt;/span&gt;
  &lt;span class="na"&gt;included_paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deploy/*.yaml"&lt;/span&gt;
  &lt;span class="na"&gt;excluded_paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/_defaults.yaml"&lt;/span&gt;
  &lt;span class="na"&gt;name_pattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{stage}"&lt;/span&gt;

&lt;span class="na"&gt;workflows&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;base_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;stacks/workflows&lt;/span&gt;

&lt;span class="na"&gt;logs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/dev/stderr"&lt;/span&gt;
  &lt;span class="na"&gt;level&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Info&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this in place and a few custom &lt;code&gt;Taskfile.yml&lt;/code&gt; tasks, the entire infrastructure can be brought up or down, with secured local encrypted state, via two commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; I use the taskfile for kicking things off, but all workflows and tasks can be run via the atmos CLI directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;task up
task down
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
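&lt;p&gt;The taskfile wiring behind those two commands could be as thin as delegating to the atmos workflows (a hypothetical sketch; the actual &lt;code&gt;Taskfile.yml&lt;/code&gt; in the repo may differ):&lt;/p&gt;

```yaml
# Hypothetical Taskfile.yml fragment: delegate "up"/"down" to atmos workflows.
version: '3'
tasks:
  up:
    desc: Bring up the local environment via the atmos workflow
    cmds:
      - atmos workflow up -f localhost -s dev
  down:
    desc: Tear down the local environment via the atmos workflow
    cmds:
      - atmos workflow down -f localhost -s dev
```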



&lt;h2&gt;
  
  
  Impressions
&lt;/h2&gt;

&lt;p&gt;Overall, I appreciate how succinctly this tool wraps Terraform state operations into manageable and highly customizable deployments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is an interactive TUI app that will delight you when you get your first atmos configuration working properly (if you struggled to get it working, that dopamine hit when the TUI pops up is incredible...)&lt;/li&gt;
&lt;li&gt;There is OPA policy validation as well as jsonschema validation included. I love me some rego!&lt;/li&gt;
&lt;li&gt;Just about every aspect can be configured via declarative configuration.&lt;/li&gt;
&lt;li&gt;The tool's authors are heavily involved with the community, making regular improvements and significant feature additions.&lt;/li&gt;
&lt;li&gt;You can add custom commands and workflows to replace your task runner tool.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;At first, the proposed file structure for Atmos might feel unfamiliar, but that’s only because so many projects lack a consistently enforced, programmatic structure. While any new system comes with a learning curve, these conventions ultimately reduce cognitive load and create efficiencies, especially in a team environment (stick with it, and it’ll ‘click’ before you know it!).&lt;/li&gt;
&lt;li&gt;It felt quite difficult to get my existing deployment working with atmos. Yet I got it all working in an afternoon, so 'difficult' may be relative here.&lt;/li&gt;
&lt;li&gt;Almost every aspect of a deployment can be customized, but it is often not readily apparent just where to change things.&lt;/li&gt;
&lt;li&gt;Additional workflows will need to be created per stack that you are deploying, to automate the deployment of all the components within.&lt;/li&gt;
&lt;li&gt;Just about every aspect can be configured via a slew of YAML.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Interesting&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I'm only mildly uncomfortable with the fact that when atmos runs it generates local &lt;em&gt;.tfvar&lt;/em&gt; files in semi-deeply buried locations within the component folders. I added this to the &lt;code&gt;.gitignore&lt;/code&gt; list as I'm not certain they need to be there (and they do get recreated automatically).&lt;/li&gt;
&lt;li&gt;I dig that atmos supports vendoring (akin to Carvel's &lt;a href="https://carvel.dev/vendir/" rel="noopener noreferrer"&gt;vendir&lt;/a&gt; app or the air-gap deployment tool &lt;a href="https://docs.zarf.dev/" rel="noopener noreferrer"&gt;zarf&lt;/a&gt;). But this is a manual affair of defining your vendored content.&lt;/li&gt;
&lt;li&gt;There are strong tie-ins with one of my other favorite declarative manifest deployment tools, &lt;a href="https://helmfile.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;helmfile&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I really like this tool and see myself using it to bootstrap some other local infrastructure via Terraform. How about you? What tools are you using to manage your terraform deployments?&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>tofu</category>
      <category>devops</category>
      <category>infrastructureascode</category>
    </item>
    <item>
      <title>OpenTofu - Encrypted State + Git to Bootstrap Infrastructure</title>
      <dc:creator>Zachary Loeber</dc:creator>
      <pubDate>Sun, 26 Jan 2025 02:38:56 +0000</pubDate>
      <link>https://dev.to/zloeber/opentofu-encrypted-state-git-to-bootstrap-infrastructure-47lj</link>
      <guid>https://dev.to/zloeber/opentofu-encrypted-state-git-to-bootstrap-infrastructure-47lj</guid>
      <description>&lt;p&gt;In the evolving world of infrastructure-as-code (IaC), tools like OpenTofu are pushing boundaries, enabling developers to efficiently manage and deploy infrastructure. The OpenTofu team has been on a roll with new features to address some of the longest running complaints in the Terraform community.&lt;/p&gt;

&lt;p&gt;Two recent standout features are encrypted state and provider iteration. Both are intriguing and deserve a closer examination to understand their potential impact (and limitations) in real-world scenarios. &lt;/p&gt;

&lt;p&gt;In this article I'll show how to maintain infrastructure bootstrap code and its state in git without the need for a third party vault, cloud storage, or additional secret sprawl. I lay out fully working examples of how this might be done both with standard Terraform and also via OpenTofu's encrypted state feature.&lt;/p&gt;

&lt;p&gt;The example project I'll be covering deploys a couple of local kind clusters with ArgoCD installed. It then creates and pushes an SSH public/private key pair as local Kubernetes secrets.&lt;/p&gt;




&lt;h2&gt;
  
  
  Working Example - Part 1 (Terraform)
&lt;/h2&gt;

&lt;p&gt;To explore this further I'll start with a deployment done entirely via Terraform. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; To follow along you should clone this repo locally and run it in any local bash/zsh shell with Docker running. Further configuration information can be found in the project readme.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;PROJECT&lt;/strong&gt;: &lt;a href="https://github.com/zloeber/tofu-exploration/" rel="noopener noreferrer"&gt;tofu-exploration&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;BRANCH&lt;/strong&gt;: main&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;main&lt;/code&gt; branch includes manifests for deploying 2 kind clusters side by side in the &lt;code&gt;infrastructure/environments/local&lt;/code&gt; folder. The state for each component is stored as a separate Terraform state file in the &lt;code&gt;./secrets&lt;/code&gt; folder. This folder is then targeted with &lt;code&gt;sops&lt;/code&gt; to encrypt the contents within.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Bring cluster1 and cluster2 up&lt;/span&gt;
task deploy:all

&lt;span class="c"&gt;# Here you should review secrets and other state stuff in ./secrets.&lt;/span&gt;
&lt;span class="c"&gt;# Don't commit this to git yet!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this has completed you should have a handful of files in the local &lt;code&gt;./secrets&lt;/code&gt; folder including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes configuration files with full rights to your created clusters&lt;/li&gt;
&lt;li&gt;Additional per-cluster public and private keys&lt;/li&gt;
&lt;li&gt;Infrastructure and per-cluster state files with all applied Terraform (including the generated ssh private keys and other sensitive information)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Encrypting Local State
&lt;/h3&gt;

&lt;p&gt;Both plan and state files are inherently plain text. We can encrypt the state files easily enough though. To start, you will need a private key that is kept locally. I've chosen age keys with &lt;a href="https://github.com/getsops/sops" rel="noopener noreferrer"&gt;sops&lt;/a&gt;. You could use PGP or anything else that sops supports.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;task | &lt;span class="nb"&gt;grep &lt;/span&gt;sops &lt;span class="c"&gt;# Show a list of our convenience tasks&lt;/span&gt;
task sops:show &lt;span class="c"&gt;# Show all the variables setup for the tasks&lt;/span&gt;

task sops:age:keygen &lt;span class="c"&gt;# Generate a local age key&lt;/span&gt;
task sops:init &lt;span class="c"&gt;# Initialize this project repo with your public age key&lt;/span&gt;
task encrypt:all &lt;span class="c"&gt;# Encrypt every file in the ./secrets folder&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can now review the secrets files and see that they have all been encrypted. Binary-looking files like SSH keys will be converted to JSON format with the information required to decrypt them baked into the metadata (obviously minus our private age key).&lt;/p&gt;

&lt;p&gt;With the age private key in &lt;code&gt;~/.config/sops/age/keys.txt&lt;/code&gt; and all secrets files encrypted, you can now safely commit your changes to git.&lt;/p&gt;

&lt;p&gt;When you need to decrypt and run terraform operations again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;task decrypt:all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; You can and should use pre-commit hooks to prevent accidentally committing your secrets!&lt;/p&gt;
&lt;/blockquote&gt;
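&lt;p&gt;One way to wire that up is a &lt;code&gt;.pre-commit-config.yaml&lt;/code&gt; that scans staged files for secrets before every commit (a hypothetical sketch assuming the gitleaks pre-commit hook; a sops-specific check could be scripted similarly):&lt;/p&gt;

```yaml
# Hypothetical .pre-commit-config.yaml: block commits containing secrets.
# Assumes the gitleaks pre-commit hook; pin rev to a current release.
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
```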

&lt;h3&gt;
  
  
  Clean Up
&lt;/h3&gt;

&lt;p&gt;To remove the clusters and clean up your work in preparation for opentofu run this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Tear it down&lt;/span&gt;
task destroy:all
task clean
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Working Example - Part 2 (OpenTofu)
&lt;/h2&gt;

&lt;p&gt;I created, then updated the &lt;code&gt;tofu-encryption&lt;/code&gt; branch from &lt;code&gt;main&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PROJECT&lt;/strong&gt;: &lt;a href="https://github.com/zloeber/tofu-exploration/" rel="noopener noreferrer"&gt;tofu-exploration&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;BRANCH&lt;/strong&gt;: tofu-encryption&lt;/p&gt;

&lt;p&gt;This is the same deployment done using OpenTofu's encrypted state instead of sops. The first big update is that we change the binary used in our main &lt;code&gt;Taskfile.yml&lt;/code&gt; definition to tofu.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt; I did try to use the VSCode plugin for OpenTofu, but it was not very helpful for the more recent features (like the encryption block).&lt;/p&gt;
&lt;h3&gt;
  
  
  State/Plan Encryption
&lt;/h3&gt;

&lt;p&gt;As &lt;a href="https://opentofu.org/docs/language/state/encryption/" rel="noopener noreferrer"&gt;per the docs&lt;/a&gt; we can encrypt state and plan data with native opentofu.&lt;/p&gt;

&lt;p&gt;This can be enabled via the &lt;code&gt;TF_ENCRYPTION&lt;/code&gt; environment variable or in the terraform block. The way this works is that you define a &lt;code&gt;method&lt;/code&gt; which can optionally contain key providers or other configuration for encryption. The set of key providers and methods available is not large currently, but it is still enough to get by.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Vault Transit Support&lt;/strong&gt; is not available if Vault is running a version beyond 1.14 (the license change). It is otherwise experimental for OpenBao.&lt;/p&gt;
&lt;/blockquote&gt;
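&lt;p&gt;As a sketch of the environment variable route (the field names follow the OpenTofu state encryption docs; the passphrase here is a placeholder, not one you should use):&lt;/p&gt;

```shell
# Sketch: supply the encryption configuration via TF_ENCRYPTION instead
# of the terraform block. The HCL body mirrors what would otherwise live
# inside terraform { encryption { ... } }.
export TF_ENCRYPTION='
key_provider "pbkdf2" "mykey" {
  passphrase = "example-passphrase-of-16-plus-chars"
}
method "aes_gcm" "passphrase" {
  keys = key_provider.pbkdf2.mykey
}
state {
  method = method.aes_gcm.passphrase
}'
echo "TF_ENCRYPTION is ${#TF_ENCRYPTION} characters long"
```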

&lt;p&gt;Anyway, the methods are assigned to the &lt;code&gt;state&lt;/code&gt; and/or &lt;code&gt;plan&lt;/code&gt; terraform definitions as either the primary or backup encryption types.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fssj89r1477m1488mxse5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fssj89r1477m1488mxse5.png" alt="Oversimple diagram of components making up OpenTofu's encryption configuration" width="800" height="671"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can infer that your entry point for secret zero in local file-based state encryption will be that passphrase. We need something at least 16 characters long that is kept private. The age private key can be used for this easily enough by setting the &lt;code&gt;TF_VAR_state_passphrase&lt;/code&gt; variable I created just for this purpose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important!&lt;/strong&gt; Ensure you have your local age key pair created with &lt;code&gt;task sops:age:keygen&lt;/code&gt; (an existing key will always be preserved).&lt;/p&gt;
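&lt;p&gt;A sketch of how that sourcing can work (&lt;code&gt;extract_age_key&lt;/code&gt; is an illustrative helper name, and the key file layout matches what &lt;code&gt;age-keygen&lt;/code&gt; writes: two comment lines plus the secret key):&lt;/p&gt;

```shell
# Sketch: pull the AGE-SECRET-KEY line out of the sops age key file and
# export it as the state passphrase. extract_age_key is a hypothetical
# helper name for this illustration.
extract_age_key() {
  grep '^AGE-SECRET-KEY-' "$1" | head -n1
}

# Demo against a throwaway key file instead of ~/.config/sops/age/keys.txt
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# created: 2025-01-01T00:00:00Z
# public key: age1exampleexampleexampleexampleexampleexample
AGE-SECRET-KEY-1EXAMPLEEXAMPLEEXAMPLEEXAMPLEEXAMPLE
EOF
TF_VAR_state_passphrase="$(extract_age_key "$tmp")"
export TF_VAR_state_passphrase
echo "passphrase set (${#TF_VAR_state_passphrase} characters)"
rm -f "$tmp"
```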

&lt;p&gt;With this in place I updated the local &lt;code&gt;Taskfile.yml&lt;/code&gt; manifest to automatically source the private key value into that environment variable so it could be used as the encryption passkey in the relevant terraform block. The result is something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"state_passphrase"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"value of the passphrase used to encrypt the state file"&lt;/span&gt;
  &lt;span class="nx"&gt;validation&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;condition&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state_passphrase&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="err"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
    &lt;span class="nx"&gt;error_message&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The passphrase must be at least 16 characters long."&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 1.9.0"&lt;/span&gt;
  &lt;span class="nx"&gt;encryption&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;## Step 1: Add the desired key provider:&lt;/span&gt;
    &lt;span class="nx"&gt;key_provider&lt;/span&gt; &lt;span class="s2"&gt;"pbkdf2"&lt;/span&gt; &lt;span class="s2"&gt;"mykey"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;passphrase&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state_passphrase&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;## Step 2: Set up your encryption method:&lt;/span&gt;
    &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="s2"&gt;"aes_gcm"&lt;/span&gt; &lt;span class="s2"&gt;"passphrase"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;keys&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;key_provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pbkdf2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mykey&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="s2"&gt;"unencrypted"&lt;/span&gt; &lt;span class="s2"&gt;"insecure"&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;# enforced = true&lt;/span&gt;
      &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aes_gcm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;passphrase&lt;/span&gt;
      &lt;span class="nx"&gt;fallback&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unencrypted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;insecure&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;plan&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;# enforced = true&lt;/span&gt;
      &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aes_gcm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;passphrase&lt;/span&gt;
      &lt;span class="nx"&gt;fallback&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unencrypted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;insecure&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;kind&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tehcyx/kind"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0.7.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="s2"&gt;"local"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;path&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"../../../secrets/local/infrastructure_tfstate.json"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we run the deployment with no further changes, it automatically encrypts the Terraform state files when we deploy via &lt;code&gt;task deploy:all&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The SSH keys I was generating and encrypting via sops before are not covered in this case. But that data is sourced from our state, so we simply start ignoring them via &lt;code&gt;.gitignore&lt;/code&gt;, knowing we can always recreate them later.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Interesting:&lt;/strong&gt; Because the kind provider I used doesn't track the local config file resource when it gets created, I needed to make changes to isolate the kubeconfig files to their own generated file resources instead.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With this in place you should be able to push state up to your git repo directly after any state-altering task has been run, clone it later to another machine with the same age private key, and run through the deployment lifecycle again seamlessly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Impressions
&lt;/h2&gt;

&lt;p&gt;I'm really happy with how fluidly encrypted state works and will definitely be using it for some personal projects. Remember that all of your secrets live in state when doing this, so be extra careful about what you commit, and of course protect/backup that private age key.&lt;/p&gt;

</description>
      <category>tofu</category>
      <category>terraform</category>
      <category>infrastructureascode</category>
      <category>security</category>
    </item>
  </channel>
</rss>
