Ok so you're using Terraform to deploy and manage your Azure resources? It's pretty neat, but at some point it's likely you will stumble over an Azure resource or configuration that's not available in Terraform. The Terraform Provider for Azure covers a huge number of resources in Azure, but there are always going to be gaps, given how quickly Azure evolves.
So what can you do? Well, you can find a workaround (like using Azure CLI or even ARM Templates), or raise an issue on the Azure Provider GitHub project and hope some kind soul picks it up, or <deep intake of breath> roll up your sleeves, open your IDE of choice (VS Code of course!) and start work on adding it yourself.
In this post I'll talk through some of the planning, pre-work and other technical investigation you're going to need to do, before you go diving headfirst into the code base. This is a replay of my own journey of adding to the provider, and trust me this isn't something you want to start doing without more than a little planning.
Basics & Pre-reqs
In Dante's Inferno, hell is described as having nine different layers or circles. Thankfully we have just five to worry about; however, the amount of suffering involved might be the same as a trip through Dante's vision of unending torment.
Our layers are a little different, but no less infernal 😉
- The Azure REST API Spec - This repo contains the OpenAPI/Swagger definitions which describe the shape of every API in Azure. Despite being at the top of this stack, most of the time you can pretty much ignore this, unless you think you've hit a bug with the way the API is behaving.
- The Azure REST API - This is the main management interface to Azure and implements the above API spec. This is going to dictate & shape everything else we touch.
- Azure SDK for Go - This wraps the above Azure API with a set of packages & client code, making it "easier" than calling the API directly. Easier is a matter of some debate.
- Terraform Provider for Azure - Also called 'azurerm' (which is the provider name), this in turn wraps the Go SDK, with a set of CRUD operations that Terraform understands.
- Terraform - The Azure provider is a plugin and extension to the core Terraform system.
Needless to say you're going to need some Terraform experience, at least with the basics. It's also worth reading the docs on provider plugins to get familiar with the concepts and main interfaces around providers.
You're also going to need to be able to program in Go. You won't need to be a Go uber-coder, but if you've never touched it, you're going to have a very rough time. This is not the sort of project I would advise you to start learning Go with.
It also helps if you've some prior experience with the Azure REST APIs (also called the ARM APIs), such as how they are versioned, the sorts of operations you can perform and concepts such as authentication, resource providers, scopes etc.
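If those concepts are a bit rusty, here's a minimal Go sketch of how a typical ARM request for a resource in a resource group is put together: the scope (subscription and resource group), the resource provider namespace, the resource type and name, the `api-version` query parameter and a bearer token. The provider `Microsoft.Example`, the `things` resource type and all the values are made-up placeholders, and the helper names are my own for illustration. The request is only built, never sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// armResourceURL builds the management URL for a resource that lives in a
// resource group. The api-version query parameter selects the API version.
func armResourceURL(subscriptionID, resourceGroup, provider, resourceType, name, apiVersion string) string {
	u := url.URL{
		Scheme: "https",
		Host:   "management.azure.com",
		Path: fmt.Sprintf("/subscriptions/%s/resourceGroups/%s/providers/%s/%s/%s",
			subscriptionID, resourceGroup, provider, resourceType, name),
	}
	q := url.Values{}
	q.Set("api-version", apiVersion)
	u.RawQuery = q.Encode()
	return u.String()
}

// newARMRequest prepares (but does not send) a GET request carrying the
// bearer token in the Authorization header.
func newARMRequest(rawURL, bearerToken string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+bearerToken)
	return req, nil
}

func main() {
	// Placeholder values only; Microsoft.Example is not a real provider.
	u := armResourceURL("00000000-0000-0000-0000-000000000000", "my-rg",
		"Microsoft.Example", "things", "thing1", "2020-01-01")
	req, err := newARMRequest(u, "<access-token>")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```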
So brave traveller, are you ready to descend?
The Azure API
First you need to identify the API or APIs in Azure that are going to plug the gap you need. This might be something you can deploy or configure from the Azure portal or the CLI, but not through Terraform. You're going to spend a lot of time in this section of the Azure docs: https://docs.microsoft.com/en-us/rest/api/azure/
Let's assume you find what you need (if you didn't your journey has been a short one!)
Take plenty of time to research the API, try it out & experiment with it. My suggestion is to use the fantastic REST Client for VS Code. Build up a set of calls in a .http or .rest file which you can refer back to.
I wrote a short post on using this extension to authenticate and call Azure APIs, and how easy it makes it, so it's worth referring to that: "Calling Azure APIs with the REST Extension for VS Code" (Ben Coleman for CSE Dev Crews, Nov 18 '20).
Test the API thoroughly, time spent here will be time well spent later. Poke at all the edge cases: "what if that array is empty?", "what if I put an invalid string here?", "if I remove that value, what is the default?". The docs will be some help, but generally are pretty bare bones.
Once you're happy that you've got a good handle on calling the API directly, it's time to move on to the next layer...
So traveller, I see you are prepared to descend deeper, let us journey on.
Build Go Prototype / Code Spike
The temptation might be to now jump into the Terraform provider code, but I'd hold off. Next it's worth spending some time working with the Azure SDK for Go to call the API in question.
Azure / azure-sdk-for-go
This repository is for active development of the Azure SDK for Go.
The Go SDK is auto-generated from the aforementioned Azure REST API Spec, which means it's not exactly the most friendly SDK to work with. Its extremely rigid adherence to having everything well typed also means it can be laborious work to even see what the API is returning.
I would suggest using a code spike approach and building a simple console Go app, and in doing so take a look at:
- Which package in the SDK you need. They are versioned to match the Azure API versions, but it can be a challenge to even find the right package in the complex tree within the SDK.
- How the API client in that package is instantiated, normally with a subscription ID and possibly location, but each client is different.
- How to make calls to the create, update and get operations. It's important you know how to build the input struct for a create operation, which might not be as trivial as it sounds, and also how to extract data from the response, which can often involve some "walking" down through all the properties and casting/asserting as you go.
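To give a feel for that "walking", here's a small sketch. The `Thing`, `ThingProperties` and `ThingRule` types are hypothetical stand-ins I've made up to mimic the pointer-heavy shape of the generated models; they are not real SDK types, and your actual response struct will differ. The point is the habit of checking every pointer before dereferencing it.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins (not real SDK types) mimicking generated models,
// where nearly every field is a pointer and may be nil.
type ThingRule struct {
	Name *string
}

type ThingProperties struct {
	Rules *[]ThingRule
}

type Thing struct {
	Name       *string
	Properties *ThingProperties
}

// ruleNames walks down the response, checking each pointer before
// dereferencing it, and collects the rule names that are actually set.
func ruleNames(t *Thing) ([]string, error) {
	if t == nil || t.Properties == nil {
		return nil, errors.New("response has no properties")
	}
	names := []string{}
	if t.Properties.Rules == nil {
		return names, nil // no rules is valid, just empty
	}
	for _, r := range *t.Properties.Rules {
		if r.Name != nil {
			names = append(names, *r.Name)
		}
	}
	return names, nil
}

// strPtr is a tiny helper for building pointer fields in test data.
func strPtr(s string) *string { return &s }

func main() {
	resp := &Thing{
		Name: strPtr("thing1"),
		Properties: &ThingProperties{
			Rules: &[]ThingRule{{Name: strPtr("rule 1")}, {Name: nil}, {Name: strPtr("rule 2")}},
		},
	}
	names, err := ruleNames(resp)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Helpers like this are exactly the sort of code from the spike that can later be lifted into the provider.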
This might seem like busy work and a waste of time, but you'll learn a lot in doing this, and it's much easier to test at this point before needing to go through an entire Terraform plan & apply loop. In addition a lot of this code you'll be able to copy and paste into the provider code.
Are you confident to push on, to our final infernal layer?
The Azure Terraform Provider
This is the final part of the "journey", but there's still a long way to go, as this part will by far take the longest. I'm not going to go super deep into the low level code & implementation specifics in this section, as that could fill another post (or in fact several!). Instead I want to focus on the design, approach & thinking you'll need to do.
Start with designing the shape of your resource, and sketch out some pseudo code HCL. By now you've probably got a good idea what this will look like based on the body of your API requests, but think about whether you can make it a little more user friendly. For example, when it comes to field names, think about what the end user likely knows this field/feature by. There can often be a disconnect between the "internal" names used in the API and the "external" names used in the Portal & CLI.
One area to consider is where the API accepts an array of objects. This can be simplified in the HCL as a repeated block with the same name, which will automatically become an array. Therefore you will need to switch from a pluralized name to a singular one. For example:
Example API JSON payload
{
  "rules": [
    {
      "name": "rule 1"
    },
    {
      "name": "rule 2"
    }
  ]
}
Example equivalent Terraform HCL
resource "azurerm_amazing_new_thing" "example" {
  rule {
    name = "rule 1"
  }
  rule {
    name = "rule 2"
  }
}
Moving onward finally to the provider code itself which is hosted here:
hashicorp / terraform-provider-azurerm
Terraform provider for Azure Resource Manager
You're contributing to a public open source project on GitHub, ultimately ending with a pull request which (hopefully!) will be merged into the main codebase and released. There's a certain way of working when contributing to OSS projects, and there are plenty of guides online to help with this, e.g. first-contributions and MarcDiethelm/contributing.
Take plenty of time to explore the codebase. Thankfully you don't need to understand it all, so focus on digging into the azurerm/internal/services tree. Take some time to look at some Azure resources you are familiar with, and try to find something not too complex (e.g. App Service) but not so simple it's trivial.
The resources are grouped by their Azure Provider API, and under each group there will be some shared code common to all resources in that package. The key ones being:
- registration.go - This should be a short file containing a SupportedResources() function which maps the Terraform resource names, e.g. azurerm_new_thing, to the Go function that provides that resource, e.g. resourceArmNewThing(). You should be able to copy and paste an existing line in here.
- client/client.go - This is where all the API clients from the SDK are created for the given API package. Really it's another case of copying and pasting an existing one, as there's a clear pattern to things.
Implementing your resource will be a matter of copying an existing one and making a LOT of changes. There will be a set of callback functions for the CRUD operations Create, Read, Update and Delete. Implementing those functions is the heart of the task at hand, and is detailed here. The Schema is also defined in here, and this is the implementation of the HCL design we discussed a moment ago. You can read the docs, but also take a good look at how other resources define their schemas, and learn by example.
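To make the shape of those callbacks a bit more concrete, here's a toy sketch. Everything in it is a simplified stand-in I've invented for illustration: `resourceData`, `resource`, `fakeAPI` and the `newThing...` functions are NOT the plugin SDK's real types or the provider's real code, and the "API" is just an in-memory map. It only shows the general flow, assuming the common pattern where create/update finish by calling read, and read clears the ID when the resource no longer exists.

```go
package main

import (
	"errors"
	"fmt"
)

// resourceData is a simplified, hypothetical stand-in for the state that
// Terraform hands to each callback.
type resourceData struct {
	ID     string
	Fields map[string]interface{}
}

// fakeAPI is an in-memory pretend Azure API, keyed by resource ID.
type fakeAPI map[string]map[string]interface{}

type crudFunc func(d *resourceData, api fakeAPI) error

// resource is a hypothetical stand-in holding the four CRUD callbacks.
type resource struct {
	Create, Read, Update, Delete crudFunc
}

func resourceArmNewThing() *resource {
	return &resource{
		Create: newThingCreateUpdate,
		Read:   newThingRead,
		Update: newThingCreateUpdate,
		Delete: newThingDelete,
	}
}

// newThingCreateUpdate sends the desired fields to the API, records the ID,
// then calls read so state reflects what the API actually stored.
func newThingCreateUpdate(d *resourceData, api fakeAPI) error {
	name, ok := d.Fields["name"].(string)
	if !ok || name == "" {
		return errors.New("name is required")
	}
	d.ID = "/things/" + name
	stored := map[string]interface{}{}
	for k, v := range d.Fields {
		stored[k] = v
	}
	api[d.ID] = stored
	return newThingRead(d, api)
}

// newThingRead refreshes state from the API; an empty ID signals the
// resource is gone.
func newThingRead(d *resourceData, api fakeAPI) error {
	stored, found := api[d.ID]
	if !found {
		d.ID = ""
		return nil
	}
	d.Fields = map[string]interface{}{}
	for k, v := range stored {
		d.Fields[k] = v
	}
	return nil
}

func newThingDelete(d *resourceData, api fakeAPI) error {
	delete(api, d.ID)
	d.ID = ""
	return nil
}

// supportedResources mirrors the idea of registration.go: resource name
// mapped to the function that provides it.
func supportedResources() map[string]*resource {
	return map[string]*resource{"azurerm_new_thing": resourceArmNewThing()}
}

func main() {
	api := fakeAPI{}
	r := supportedResources()["azurerm_new_thing"]
	d := &resourceData{Fields: map[string]interface{}{"name": "thing1"}}
	if err := r.Create(d, api); err != nil {
		panic(err)
	}
	fmt.Println("created:", d.ID)
	if err := r.Delete(d, api); err != nil {
		panic(err)
	}
	fmt.Println("ID empty after delete:", d.ID == "")
}
```

The real callbacks work against the SDK clients and the provider's schema types, so treat this purely as a map of the territory before you read an existing resource.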
Testing your code will take two shapes. Informal testing locally with some Terraform HCL is the first place to start. In order to load your locally built version of the provider there are some hoops to jump through, which have been made a lot more difficult since Terraform 0.13.
The script below, which I created, helps with this. It builds the azurerm code, and then copies the resulting provider binary to where the Terraform CLI can find it. It then runs terraform init with -plugin-dir set and does a standard Terraform plan & apply.
Build provider & load plugin test harness script:
https://github.com/benc-uk/terraform-sandbox/blob/master/provider-test-harness/run.sh
Note: In the script I pick a deliberately out of range version for the plugin, as a belt and braces measure to make sure my copy is really being picked up, so the version of azurerm in your test Terraform would need to be:
version = ">=99.0.0"
You'll run this inner loop test many, MANY times until you get things working. The next set of tests are more formal acceptance tests. These live in the tests folder of the tree you are working in. All acceptance tests start with TestAcc, which means they won't run in the standard unit test cycle. They all contain embedded inline HCL which defines the test resources to create & destroy.
These tests are only run when certain flags are specified, namely calling make acctest. Again I suggest learning by example: copy some existing tests and build on what you find. Note these tests will deploy real resources in Azure, which means they can be very slow to run and incur costs!
The final stage before submitting your PR is to run through the test/check suite locally. This is done automatically in the project's CI pipeline in GitHub Actions, but there's no point submitting your PR if the CI validation is going to fail, so get it working locally first. The main checks to run locally are:
make tools, make test, make lint, make tflint, make depscheck
These scripts can take a little while to run and tend to hit your machine quite hard, but once they are passing, you can finally submit your PR! Sit back and wait for the team to check it over, and get back to you with feedback. Be patient, they are a very busy team!
Conclusion
PHEW! What a journey, but hopefully like Dante's Virgil you made it through unscathed! Undoubtedly you learned a lot in the process. Contributing to the Azure Terraform Provider isn't something you do lightly, but can be very rewarding.
Your addition can make Terraform just a little bit better for everyone using it with Azure, and for that reason alone it's sometimes worth going on a bit of an adventure.