Christian Leinweber

Posted on Dec 27, 2021 • Edited on Jan 18, 2022

Moving Azure Functions from AKS to Container Apps

#azure #containerapps #functions #keda

On my current project, we run more than 100 Azure Functions inside an Azure Kubernetes Service (AKS) cluster, and it works fine!

We use a lot of event triggers with Azure Event Hub bindings, HTTP triggers and some timer functions.

Why do we do it this way?

One reason is that the whole backend had to run inside a closed network. We could have used a Premium plan, but that comes with a "pay per VM" model. We also knew that we would need a cluster for other, non-Azure Functions workloads anyway, so with a Premium plan we would have paid for VM pools in addition to the AKS nodes.

At the same time, KEDA became ready for production, so we decided to try out Azure Functions on AKS. And it works very well!
For deployment, we currently use Helmfile, so we have a single file that lists all functions. With helmfile sync, we deploy all our functions with a single command. The corresponding helmfile.lock is stored in Git, so the same deployed workload can be transferred to any other environment.

At a high level, our solution looks like this:

What are the downsides?

First of all, we pay for nodes, not for calls. We use very small node sizes to scale down as much as possible, but there will always be times when we pay for instances that we don't need.
Aside from that, our HTTP triggers cannot scale to zero because we use the "avg call per minute" metric from Azure Monitor as an indicator. For scaling up it works very well, but the information flow is too slow for scaling down to zero.

Finally, even though running AKS is not that hard, we still have to maintain a Kubernetes cluster.

Last year, I gave a talk about this project. I closed with a slide saying how great it would be if Azure came up with a fully managed service for KEDA.

So I was very excited when Azure released the public preview of Azure Container Apps! And additionally, they put Dapr on it ;)

Now I will share my first experience of whether we could move our functions to Container Apps.

Let's try it out with an event-triggered function

First, I started with a copy of our position-handler function and kept the folder and C# project structure. I put it into a new GitHub repo, removed the domain logic and replaced it with a simple log statement for this demo case.

christle / azure-functions-goes-container-apps

So the function itself looks like this:

[FunctionName(nameof(UpdatePosition))]
public async Task RunAsync(
  [EventHubTrigger("hub-positions",
    Connection = "positionEventhubConnection",
    ConsumerGroup = "capp-consumer-group")] EventData[] events,
  ILogger log)
{
  // Events arrive as a batch, so each one is decoded and logged individually.
  foreach (EventData eventData in events)
  {
    string messageBody = Encoding.UTF8.GetString(eventData.Body.Array, eventData.Body.Offset, eventData.Body.Count);
    var incomingEvent = JsonConvert.DeserializeObject<Event>(messageBody);
    log.LogInformation("UpdatePosition: longitude={longitude}, latitude={latitude}", incomingEvent.Longitude, incomingEvent.Latitude);
  }
}

For publishing, I can leave the Dockerfile as it is and push the image to the Azure Container Registry:

cd position-handler
docker build -t $registry/mc/capp-position-handler:1.0.0 .
docker push $registry/mc/capp-position-handler:1.0.0

I decided to use Bicep for the deployment. I put a simple kubeEnvironment and a Log Analytics workspace into my main.bicep file.
For the Azure Container App itself, I need some secrets like registry credentials and connection strings. So the first try without autoscaling looks like this:

resource positionHandlerApp 'Microsoft.Web/containerApps@2021-03-01' = {
  name: 'position-handler-capp'
  kind: 'containerapp'
  location: location
  properties: {
    kubeEnvironmentId: env.id
    configuration: {
      secrets: [
        {
          name: 'container-registry-password'
          value: registryPassword
        }
        {
          name: 'eventhub-connectionstring'
          value: connectionString
        }
        {
          name: 'storage-connectionstring'
          value: storageConnectionString
        }
      ]      
      registries: [
        {
          server: registry
          username: registryUsername
          passwordSecretRef: 'container-registry-password'
        }
      ]
    }
    template: {
      containers: [
        {
          image: '${registry}/mc/capp-position-handler:1.0.0'
          name: 'position-handler-capp'
          env: [
            {
              name: 'positionEventhubConnection'
              secretRef: 'eventhub-connectionstring'
            }
            {
              name: 'AzureWebJobsStorage'
              secretRef: 'storage-connectionstring'
            }
          ]
        }
      ]
      scale: {
        minReplicas: 1
      }
    }
  }
}

Before running the deployment, I created a new consumer group named capp-consumer-group on our dev Event Hub, because I don't want to take messages away from my team members. Let's do a deployment:

az group create -n rg-container-app -l northeurope
az deployment group create -n capps \
 -g rg-container-app \
 --template-file ./main.bicep \
 -p registry=$registry \
 registryUsername=$acrUser \
 registryPassword=$acrPassword \
 connectionString=$hubConnectionString \
 storageConnectionString=$storageConnectionString
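The consumer group mentioned before the deployment can also be created from the command line. The following is a minimal sketch, assuming the Azure CLI command `az eventhubs eventhub consumer-group create`. The resource group and namespace names used in the usage note (`rg-eventhub-dev`, `ns-dev`) are placeholders and not values from our project, and the `run` helper is only there so that the command is printed instead of executed by default.

```shell
#!/bin/sh
# Sketch: create a dedicated consumer group for the Container App.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# create_consumer_group RESOURCE_GROUP NAMESPACE EVENT_HUB CONSUMER_GROUP
create_consumer_group() {
  run az eventhubs eventhub consumer-group create \
    --resource-group "$1" \
    --namespace-name "$2" \
    --eventhub-name "$3" \
    --name "$4"
}
```

Calling `create_consumer_group rg-eventhub-dev ns-dev hub-positions capp-consumer-group` only prints the command. Setting `DRY_RUN=0` in front of the call would actually run it against the subscription you are logged in to.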

OK, great! If I look into the Azure portal, I can see a Container App with currently 1 replica, and the logs look like this:

But these are the raw, unformatted console logs. To log the same way as in the running project, I added a new environment variable to the Container App declaration (the referenced instrumentation-key secret has to be declared in the secrets list as well):

env: [
  ...
  {
    name: 'APPINSIGHTS_INSTRUMENTATIONKEY'
    secretRef: 'instrumentation-key'
  }
]

If I check our Application Insights instance, I find my new function there with no information missing.

Add KEDA autoscaling

Until now, it was very easy to get our function running, but with only one replica. In our AKS setup, we use KEDA for autoscaling, which is built into Container Apps. Next, I will try to move the current configuration to the Container App.

KEDA comes with a lot of different scaler types for specific resources. A ScaledObject is a Kubernetes custom resource. So on AKS, every Helm chart for an event function contains a resource definition like this:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "position-handler-hub.fullname" . }}
  labels:
    deploymentName: {{ include "position-handler-hub.fullname" . }}
spec:
  scaleTargetRef:
    name: {{ include "position-handler-hub.fullname" . }}
  minReplicaCount: {{ .Values.Scale.minReplicaCount }}
  maxReplicaCount: {{ .Values.Scale.maxReplicaCount }}
  pollingInterval: {{ .Values.Scale.pollingInterval }}
  triggers:
  - type: azure-eventhub
    metadata:
      type: eventHubTrigger
      connectionFromEnv: positionEventhubConnection
      eventHubName: hub-positions
      storageConnectionFromEnv: AzureWebJobsStorage

After some research, I replaced the scale part in main.bicep with this:

scale: {
  minReplicas: 0
  maxReplicas: 5
  rules: [
    {
      name: 'azure-eventhub'
      custom: {
        type: 'azure-eventhub'
        metadata: {
          eventHubName: 'hub-positions'
          consumerGroup: 'capp-consumer-group'
        }
        auth: [
          {
            secretRef: 'eventhub-connectionstring'
            triggerParameter: 'connection'
          }
          {
            secretRef: 'storage-connectionstring'
            triggerParameter: 'storageConnection'
          }
        ]
      }
    }
  ]
}

Great! If I check my Container App, it just runs with one replica:

And it scales down to zero if I stop the workload for testing.

OK, my first impression is that it could be very easy to move the event-based workload from AKS to Azure Container Apps.

Going ahead with an HTTP trigger

Next step! I copied an HTTP function from the project context and did all the same steps as for the position handler. I removed the domain logic, but preserved the project structure.

Now I create a new Container App resource in the Bicep file. For testing purposes this is OK, but later I should create Bicep modules for each function type, like an event handler, HTTP API, or timer. The configuration is only slightly different between those types.

How about HTTP scaling?

Inside the Kubernetes cluster, we use the azure-monitor scaler for HTTP-based scaling depending on requests, like this:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "worker-api.fullname" . }}
  labels:
    app: {{ include "worker-api.fullname" . }}
spec:
  scaleTargetRef:
    name: {{ include "worker-api.fullname" . }}
  minReplicaCount: {{ .Values.Scale.minReplicaCount }}
  maxReplicaCount: {{ .Values.Scale.maxReplicaCount }}
  triggers:
  - type: azure-monitor
    metadata:
      resourceURI: Microsoft.Insights/components/{{ .Values.Scale.componentName }}
      tenantId: {{ .Values.Scale.tenantId }}
      subscriptionId: {{ .Values.Scale.subscriptionId }}
      resourceGroupName: {{ .Values.Scale.resourceGroupName }}
      metricName: requests/rate
      metricFilter: cloud/roleName eq '{{ .Values.Scale.roleName }}'
      metricAggregationType: Average
      metricAggregationInterval: "0:1:0"
      targetValue: "4"
    authenticationRef:
      name: azure-monitor-trigger-auth

Scaling based on the requests/rate metric works quite well, but it has one disadvantage: "scale to zero" is not possible, because the delay of the Azure Monitor metrics is too large.
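To make the effect of targetValue more concrete, here is a rough sketch of how a replica count follows from the metric. It assumes the simplified rule that KEDA hands the metric to the Kubernetes Horizontal Pod Autoscaler as an average-value target, so the desired count is roughly the metric divided by the target, rounded up and clamped between the configured minimum and maximum. The function name and the integer inputs are only illustrative and not part of our charts.

```shell
#!/bin/sh
# Simplified sketch of the replica calculation, not KEDA's actual code.

# desired_replicas METRIC TARGET MIN MAX
desired_replicas() {
  metric=$1; target=$2; min=$3; max=$4
  # Integer ceiling of metric / target.
  desired=$(( (metric + target - 1) / target ))
  # Clamp to the configured replica range.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}
```

With a target of 4, `desired_replicas 10 4 1 5` prints 3, while `desired_replicas 0 4 1 5` stays at the minimum of 1, which mirrors our AKS setup where the HTTP functions never go below one replica.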

With Container Apps, there is a much better integrated way of HTTP scaling. It is based on the HTTP queue length and can react very fast, so scaling from zero is no problem.
That means the scaling configuration of the HTTP function is reduced to just this:

scale: {
  minReplicas: 0
}

This is awesome! After a successful deployment, the logs confirm that the service scales to zero replicas once there are no more calls:

Introduce a function to function call

The current project application comes with some simple function-to-function calls. All function projects are built on a hexagonal pattern, so currently we use an adapter class for HTTP calls to other HTTP-triggered functions.

Now I will try to set up a simple adapter class in the same way and see whether it runs inside Container Apps.
For demo purposes, I add the following adapter to the position-handler:

public class WorkerApiProvider : IWorkerApiProvider
{
  private readonly string url;
  private readonly HttpClient client;

  public WorkerApiProvider(string url) 
  {
    this.url = url;
    this.client = new HttpClient();
  }

  public async Task<string> GetWorkerNameAsync()
  {
    var response = await client.GetAsync($"http://{this.url}/api/worker/1");
    var content = await response.Content.ReadAsAsync<Worker>();
    return content.Name;
  }
}

The URL is configured via an environment variable on startup:

public override void Configure(IFunctionsHostBuilder builder)
{
  var urlWorkerApi = Environment.GetEnvironmentVariable("WORKER_API");
  builder.Services.AddSingleton<ITelemetryInitializer, CloudRoleName>();
  builder.Services.AddSingleton<IWorkerApiProvider>(x=> new WorkerApiProvider(urlWorkerApi));
  builder.Services.AddLogging();
}

The call is just a demo call inside the position function itself:

var workerName = await provider.GetWorkerNameAsync();
log.LogInformation("UpdatePosition: found Workername {WorkerName}", workerName);

The last step is the initialization of the environment variable for the URL. Inside the AKS cluster, it is just the logical name of the Kubernetes service resource.
On Container Apps, we can use the following Bicep expression in the environment section:

{
  name: 'WORKER_API'
  value: workerApiApp.properties.configuration.ingress.fqdn
}

After redeploying the position-handler app, I found this in the logs:

At this point, I'm very happy that the whole thing could just be a matter of configuration for us. Even a function-to-function call needs no change at the code level!
I'm very confident that we could migrate all 100 functions from AKS to Azure Container Apps when they become GA! This is great!

Dapr

From the migration perspective, moving all functions without code changes should be step 1.

But besides KEDA, Container Apps has another integrated open source technology named Dapr, with the same easy-to-use experience.
Dapr uses sidecars to simplify microservice connectivity, so developers can focus more on business logic.
But we use Azure Functions for the same reason: the input and output bindings remove a lot of cross-cutting code. So does it make sense to use Dapr too when function bindings are already in place?

Let's see whether it gets even better with Dapr.

Using Dapr for a function to function call

First of all, I have to enable Dapr on both Container Apps (the appPort isn't important for the position handler):

dapr: {
 enabled: true
 appPort: 80
 appId: 'worker-api-capp'
}

In the next step, I am going to change the adapter implementation. For this, I need a new .NET dependency:

dotnet add package Dapr.Client

Inside the adapter class, I replace the HttpClient with the newly installed DaprClient. Instead of passing the worker API FQDN, I call the local sidecar through an HTTP based Dapr client and use the logical app name:

public async Task<string> GetWorkerNameAsync()
{
  var client = DaprClient.CreateInvokeHttpClient();

  var response = await client.GetAsync("http://worker-api-capp/api/worker/1");
  var content = await response.Content.ReadAsAsync<Worker>();

  return content.Name;
}

After a short try, the function-to-function communication works as well as before! But the remote call is not only more straightforward with Dapr: the communication between both sidecars is also TLS encrypted.

Also, the Dapr extension for Azure Functions makes new bindings available.
For example, you can define an HTTP output binding like this on a function:

[DaprInvoke(AppId = "{appId}", MethodName = "{methodName}", HttpVerb = "post")] IAsyncCollector<InvokeMethodParameters> output

This is pretty cool for functions that receive any kind of data, maybe from a queue, transform it, and send the output via a POST method to an HTTP endpoint. And there is no need for any kind of adapter implementation like the one above.

Processing event data with Dapr

But with Dapr I can go even further. I will try to replace the native Azure Functions binding for Event Hub data with a Dapr binding. Why does this make sense? The advantage of the Dapr binding is that you get rid of the concrete dependency on Azure Event Hub at the code level. Then it is just a matter of configuration.

Here we go with the Dapr extension:

dotnet add package Dapr.AzureFunctions.Extension

The function itself now uses the generic DaprBindingTrigger instead of the concrete EventHubTrigger. This trigger doesn't need a loop over the JSON data, because the current implementation is not able to process data as a batch.

[FunctionName(nameof(UpdatePosition))]
public async Task RunAsync([DaprBindingTrigger(BindingName = "positionhandler")] JObject triggerData, ILogger log)
{
  var incomingEvent = JsonConvert.DeserializeObject<Event>(triggerData.ToString());
  log.LogInformation("UpdatePosition: longitude={longitude}, latitude={latitude}", incomingEvent.Longitude, incomingEvent.Latitude);

  var workerName = await provider.GetWorkerNameAsync();
  log.LogInformation("UpdatePosition: found Workername {WorkerName}", workerName);
}

Adding a Component to the Dapr configuration

The Dapr configuration for the Container App grows a little bit, but at the same time I could remove all env variables except the instrumentation key. This is because the Azure Function itself doesn't need this information anymore. The Dapr configuration part looks like this:

dapr: {
  enabled: true
  appId: 'position-handler-capp'
  appPort: 3001
  components: [
  {
    name: 'positionhandler'
    type: 'bindings.azure.eventhubs'
    version: 'v1'
    metadata: [
    {
      name: 'connectionString'
      secretRef: 'eventhub-connectionstring'
    }
    {
      name: 'consumerGroup'
      value: 'capp-consumer-group'
    }
    {
      name: 'storageAccountName'
      value: storageAccountName
    }
    {
      name: 'storageAccountKey'
      secretRef: 'storage-account-key'
    }
    {
      name: 'storageContainerName'
      value: 'positionhandler'
    }
    ]
  }
  ]
}

The important changes:

The appPort has to be 3001. The function trigger extension communicates with Dapr on this port.
The new component is the concrete Azure Event Hub binding.
I need two more parameters, storageAccountName and storageAccountKey, because I cannot use the storage-connectionstring here.
The storageContainerName is the target container name inside the account. This is important for the KEDA configuration.

Autoscaling Dapr Event Hub binding with KEDA

Last but not least, we have to adjust the KEDA configuration, if we want to keep using auto-scaling.

In the metadata part, we need two additional parameters, checkpointStrategy and blobContainer:

metadata: {
  eventHubName: 'hub-positions'
  consumerGroup: 'capp-consumer-group'
  checkpointStrategy: 'goSdk'
  blobContainer: 'positionhandler'
}

With Azure Event Hub, everything is about checkpoints. A checkpoint is like a bookmark for the data that has already been read. Common Event Hub clients store this checkpoint inside an Azure Storage Account, and the behavior of the KEDA Event Hub scaler is based on this data. The blobContainer marks the checkpoint location inside the Storage Account.
Unfortunately, not all Event Hub clients store the checkpoint in the same format. The Golang Event Hub client differs from all other implementations. Dapr uses the Golang SDK for Event Hub, so I have to set the checkpointStrategy to goSdk.
The ability of the KEDA Event Hub scaler to handle several checkpoint formats is relatively new. I'm happy that I could be part of that change :). It's pretty cool that this redesign helps Dapr and Container Apps to work with Event Hub scaling. More information about the checkpointStrategy can be found in the KEDA docs.
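The idea behind checkpoint-based scaling can be sketched as follows. For every partition, the latest sequence number in the Event Hub is compared with the sequence number stored in the checkpoint, and the sum of these differences is the number of unprocessed events. This is a simplified sketch and not KEDA's actual implementation: the `latest:checkpoint` input format and the function names are made up for illustration, and the threshold of 64 in the usage note is assumed to match the scaler's default unprocessedEventThreshold.

```shell
#!/bin/sh
# Simplified sketch of lag-based scaling for Event Hub partitions.

# total_lag LATEST:CHECKPOINT...
# One argument per partition, e.g. "120:100" means the hub is at
# sequence number 120 and the checkpoint is at 100.
total_lag() {
  lag=0
  for pair in "$@"; do
    latest=${pair%%:*}
    checkpoint=${pair##*:}
    lag=$(( lag + latest - checkpoint ))
  done
  echo "$lag"
}

# replicas_for_lag LAG THRESHOLD MAX
replicas_for_lag() {
  # Integer ceiling of lag / threshold, capped at the maximum.
  replicas=$(( ($1 + $2 - 1) / $2 ))
  if [ "$replicas" -gt "$3" ]; then replicas=$3; fi
  echo "$replicas"
}
```

For three partitions, `total_lag 120:100 64:64 300:172` prints 148, and `replicas_for_lag 148 64 5` prints 3. A lag of 0 gives 0 replicas, which is the scale-to-zero case. If the scaler cannot read the checkpoint in the format the client wrote it, it has no reliable checkpoint value to compare against, which is why the checkpointStrategy setting matters.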

I redeployed again and created some load. Everything just works and the scaling runs as expected!

All Dapr-related changes can be found on the dapr branch.

Conclusion

What do we gain from moving from AKS to Azure Container Apps?

As mentioned, we use AKS for the function workload. Why? Because everything has to run in a closed network and we want the Azure Function Apps to share the same nodes with other apps. With KEDA, it was very easy to run it all together on AKS.

But from a serverless point of view, we pay for running VMs. With Container Apps, we only pay per call and only for real computing time. We don't have to care about the right VM sizes or the scale-in and scale-out behavior of the nodes in the underlying cluster.

Another big advantage is the easy scale-to-zero mechanism for an HTTP trigger. On AKS, we use a minReplica value of 1 for HTTP triggers, because scaling up from zero based on the Azure Monitor metrics is too slow. With Azure Container Apps, we don't have to pay for a minimum HTTP workload in the idle state, which is very nice.

A migration could be very easy for us once Container Apps reaches GA. I see no real changes to our codebase.
The only downside is that we cannot use our simple Helmfile deployment anymore, so we need some scripting for it. But that is no big deal.
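Such a script could be as small as a loop over the function names. The sketch below assumes a hypothetical layout with one Bicep file per function at `./<function-name>/main.bicep` (the demo repo uses a single main.bicep), and it only prints the `az deployment group create` commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: deploy a list of functions one by one, similar to "helmfile sync".
# The per-function folder layout is an assumption for this example.

# deploy_all RESOURCE_GROUP FUNCTION_NAME...
deploy_all() {
  rg=$1
  shift
  for name in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run (default): only show what would be deployed.
      echo "az deployment group create -n $name -g $rg --template-file ./$name/main.bicep"
    else
      az deployment group create -n "$name" -g "$rg" \
        --template-file "./$name/main.bicep" || return 1
    fi
  done
}
```

Calling `deploy_all rg-container-app position-handler worker-api` prints one deployment command per function. Parameters such as the registry credentials and connection strings would still have to be passed with `-p`, as in the deployment command used for the demo.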

After a migration, using Dapr makes sense for us, even with Azure Functions. In particular, the HTTP-based function-to-function communication is easier with Dapr, and it comes with new output bindings. This could remove some adapter logic.
The Dapr Event Hub binding is currently in alpha state and only supports fetching one message at a time. But there are already open issues for that, so we should wait for a stable state.

Overall, Azure Container Apps is a huge improvement. I very much like that it is based only on open source solutions with big communities behind them!

On my wish list for Container Apps are a live logging view and a way to view the KEDA logs. That would have helped in some situations.

Top comments (3)

Akshay Shaha • Apr 18 '22

Thanks Christian for this amazing article on your migration story from AKS to Container Apps. This is very helpful for our use cases too. Learnt so many things from this article, especially the Dapr bindings for Azure Functions, which will make a lot of difference for implementations in a containerized environment.

Andrea Grillo • Oct 27 '23

Hello, after two years how do you consider this move? I'm planning to move one product from AKS to Container Apps (CA) and I only see pros at the moment. My only concern is about performance on heavy load since I haven't had that experience yet with CA. Do you have any suggestion or feedback? Thanks

Narayana R • Aug 8 '22

Very nice article... Wondering, what is your experience with cold start times for container apps, when you scale down to 0?