Russ Hammett

Posted on Dec 18, 2020 • Originally published at blog.kritner.com on Dec 4, 2020

Prettifying HealthChecks

#programming #csharp #netcore #infrastructure

Previously I wrote about creating Health Checks for Microsoft Orleans, but the response was too minimal. In this post we’ll see about prettifying that output!

In the previous post we learned a bit about health checks, how to create them, and view their “health” from the perspective of Microsoft Orleans. The end result was a single word response of “Healthy”, “Degraded”, or “Unhealthy”; not very exciting stuff.

In this post, I’d like to quickly go over how you’d go about not only reporting on the “overarching status”, but giving details on the individual health checks that make up that overarching status.

(Note: I had an issue open for this with a hacktoberfest label, and I did get a PR for it. That PR was integrated into master for some time, but in the end I wanted to go a slightly different route.)

The Health check documentation does go into some detail about how to accomplish this health check prettifying, but I’m not a huge fan of “manually” writing out JSON; instead I opted for an anonymous object.

Prettifying Property Potentials

A few new things I want to report on from the response to the health check GET endpoint:

Health Check Name
Health Check Description
Individual Health Check Status
Some additional information specific to the health check that describes what makes the health check return “Degraded” or “Unhealthy”

Luckily all of this information can be made available to us via the HealthReport generated as a part of the health check.
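As a rough illustration of where each of those items lives, here is a minimal sketch that builds a HealthReport by hand and walks its entries. The "cpu" entry name, its description text, and the threshold values are made up for the example; in the real application the health check middleware builds the report for us. It assumes the Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions package is referenced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class HealthReportWalkthrough
{
    // Builds a one-entry report by hand; the entry name and values are illustrative only.
    public static HealthReport BuildSampleReport()
    {
        var data = new Dictionary<string, object>
        {
            { "Degraded Threshold", 70f },
            { "Unhealthy Threshold", 90f }
        };

        var entries = new Dictionary<string, HealthReportEntry>
        {
            ["cpu"] = new HealthReportEntry(
                HealthStatus.Degraded,
                "CPU utilization is degraded at 75.00%.",
                TimeSpan.FromMilliseconds(5),
                exception: null,
                data: data)
        };

        return new HealthReport(entries, TimeSpan.FromMilliseconds(5));
    }

    public static void Main()
    {
        var report = BuildSampleReport();

        // The overall status is derived from the individual entries.
        Console.WriteLine($"Overall: {report.Status}");

        foreach (var entry in report.Entries)
        {
            Console.WriteLine($"Name: {entry.Key}");
            Console.WriteLine($"Description: {entry.Value.Description}");
            Console.WriteLine($"Status: {entry.Value.Status}");

            // The per-check "additional information" lives in the Data dictionary.
            foreach (var item in entry.Value.Data)
            {
                Console.WriteLine($"  {item.Key}: {item.Value}");
            }
        }
    }
}
```

The entry key is the health check's name, while the description, status, and data all hang off the HealthReportEntry value.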

Health Check Response Writer

We’re going to introduce a new method that writes a custom response for our health check endpoint. From startup, we’ll want to provide a custom ResponseWriter within MapHealthChecks:

app.UseEndpoints(endpoints =>
{
    endpoints.MapHealthChecks(
            "/health",
            new HealthCheckOptions
            {
                AllowCachingResponses = false,
                ResponseWriter = HealthCheckResponseWriter.WriteResponse
            })
        .WithMetadata(new AllowAnonymousAttribute());
    endpoints.MapControllers();
});

The referenced HealthCheckResponseWriter is a new static class we’ll be introducing next.

The ResponseWriter expects a method with the following signature:
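For reference, HealthCheckOptions.ResponseWriter is a Func<HttpContext, HealthReport, Task>, so any method of roughly this shape can be assigned to it. The class and method names below are placeholders for illustration, not the ones used in the project:

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class ExampleWriter // placeholder name for illustration
{
    // Matches Func<HttpContext, HealthReport, Task>,
    // the type of HealthCheckOptions.ResponseWriter.
    public static Task Write(HttpContext context, HealthReport report)
    {
        context.Response.ContentType = "text/plain";
        return context.Response.WriteAsync(report.Status.ToString());
    }
}
```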

You’ll notice above the method receives an HttpContext as well as HealthReport. This HealthReport will make available to us several pieces of data that we can report on, specific to each individual health check.

As for our actual response writer implementation, here is the original that was merged into master from L-Dogg’s PR:

private static Task WriteResponse(HttpContext context, HealthReport result)
{
    context.Response.ContentType = "application/json; charset=utf-8";

    var options = new JsonWriterOptions
    {
        Indented = true
    };

    using var stream = new MemoryStream();
    using (var writer = new Utf8JsonWriter(stream, options))
    {
        writer.WriteStartObject();
        writer.WriteString("status", result.Status.ToString());
        writer.WriteStartObject("results");
        foreach (var entry in result.Entries)
        {
            writer.WriteStartObject(entry.Key);
            writer.WriteString("status", entry.Value.Status.ToString());
            writer.WriteString("description", entry.Value.Description);
            writer.WriteStartObject("data");
            foreach (var item in entry.Value.Data)
            {
                writer.WritePropertyName(item.Key);
                JsonSerializer.Serialize(
                    writer, item.Value, item.Value?.GetType() ??
                                        typeof(object));
            }
            writer.WriteEndObject();
            writer.WriteEndObject();
        }
        writer.WriteEndObject();
        writer.WriteEndObject();
    }

    var json = Encoding.UTF8.GetString(stream.ToArray());

    return context.Response.WriteAsync(json);
}

The above definitely works, but I’m not huge on writing the JSON “manually” (if that makes sense). I wanted to write another blog post on this anyway, as I already had a branch going (and didn’t actually expect a PR :O), so here’s my solution:

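using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Newtonsoft.Json;

internal static class HealthCheckResponseWriter
{
    public static Task WriteResponse(HttpContext context, HealthReport healthReport)
    {
        context.Response.ContentType = "application/json; charset=utf-8";

        // Project the report into an anonymous object and let Json.NET do the writing.
        var result = JsonConvert.SerializeObject(new
        {
            // Overall status across all registered health checks
            status = healthReport.Status.ToString(),
            // One item per individual health check
            details = healthReport.Entries.Select(e => new
            {
                key = e.Key,
                description = e.Value.Description,
                status = e.Value.Status.ToString(),
                data = e.Value.Data
            })
        }, Formatting.Indented);

        return context.Response.WriteAsync(result);
    }
}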

I find it a bit more concise working with the anonymous object.
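If you’d rather not take the Newtonsoft.Json dependency, the same anonymous-object approach should work with System.Text.Json as well. This is only a sketch of that swap, not what’s in the repository, and the class name is hypothetical:

```csharp
using System.Linq;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

internal static class SystemTextJsonHealthCheckResponseWriter // hypothetical name
{
    private static readonly JsonSerializerOptions Options = new JsonSerializerOptions
    {
        WriteIndented = true
    };

    // Projects the report into an anonymous object, then serializes it with System.Text.Json.
    public static string ToJson(HealthReport healthReport)
    {
        return JsonSerializer.Serialize(new
        {
            status = healthReport.Status.ToString(),
            details = healthReport.Entries.Select(e => new
            {
                key = e.Key,
                description = e.Value.Description,
                status = e.Value.Status.ToString(),
                data = e.Value.Data
            })
        }, Options);
    }

    public static Task WriteResponse(HttpContext context, HealthReport healthReport)
    {
        context.Response.ContentType = "application/json; charset=utf-8";
        return context.Response.WriteAsync(ToJson(healthReport));
    }
}
```

Splitting out ToJson keeps the serialization testable without needing an HttpContext.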

Health Check updates

We’re not currently generating “data” information from the health checks that the HealthCheckResponseWriter would be able to make use of, so let’s take a look at what we could do there.

My intention for the “data” property of the anonymous object is to describe what would make the specific health check return “Degraded” or “Unhealthy”. Anything aside from those two statuses can be assumed to be “Healthy”.

If you recall, we already built thresholds into the health checks to represent the degraded and unhealthy statuses. Now we just need to make those thresholds available to the health report.

Taking a look at the HealthCheckResult class, you’ll see that its static methods take in an optional IReadOnlyDictionary<string, object> data = null, which happens to be the “data” member we made sure to return from our WriteResponse method in the previous section of the post.
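As a minimal sketch of that parameter in isolation (the key name, value, and description here are made up for the example), a dictionary passed as data comes straight back out on the result’s Data property:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class HealthCheckDataExample
{
    public static HealthCheckResult BuildResult()
    {
        // Dictionary<string, object> implements IReadOnlyDictionary<string, object>,
        // so it can be passed directly as the data argument.
        var data = new Dictionary<string, object>
        {
            { "Degraded Threshold", 70f }
        };

        return HealthCheckResult.Degraded("Illustrative description.", data: data);
    }

    public static void Main()
    {
        var result = BuildResult();
        Console.WriteLine(result.Status);                     // Degraded
        Console.WriteLine(result.Data["Degraded Threshold"]); // 70
    }
}
```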

We will make use of this IReadOnlyDictionary to provide our “threshold” information on a per-grain basis. I will be putting this threshold information into both the CPU and Memory grains, but just as an example, here’s what one of those will look like:

[StatelessWorker(1)]
public class CpuHealthCheckGrain : Grain, ICpuHealthCheckGrain
{
    private const float UnhealthyThreshold = 90;
    private const float DegradedThreshold = 70;
    // Threshold details surfaced through the "data" member of the health check response.
    private static readonly ReadOnlyDictionary<string, object> HealthCheckData = new ReadOnlyDictionary<string, object>(
        new Dictionary<string, object>()
        {
            { "Unhealthy Threshold", UnhealthyThreshold },
            { "Degraded Threshold", DegradedThreshold }
        });

    private readonly IHostEnvironmentStatistics _hostEnvironmentStatistics;

    public CpuHealthCheckGrain(IHostEnvironmentStatistics hostEnvironmentStatistics)
    {
        _hostEnvironmentStatistics = hostEnvironmentStatistics;
    }

    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        if (_hostEnvironmentStatistics.CpuUsage == null)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy("Could not determine CPU usage.", data: HealthCheckData));
        }

        if (_hostEnvironmentStatistics.CpuUsage > UnhealthyThreshold)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy(
                $"CPU utilization is unhealthy at {_hostEnvironmentStatistics.CpuUsage:0.00}%.", data: HealthCheckData));
        }

        if (_hostEnvironmentStatistics.CpuUsage > DegradedThreshold)
        {
            return Task.FromResult(HealthCheckResult.Degraded(
                $"CPU utilization is degraded at {_hostEnvironmentStatistics.CpuUsage:0.00}%.", data: HealthCheckData));
        }

        return Task.FromResult(HealthCheckResult.Healthy(
            $"CPU utilization is healthy at {_hostEnvironmentStatistics.CpuUsage:0.00}%.", data: HealthCheckData));
    }
}

You should notice in the above that we introduced a ReadOnlyDictionary with the thresholds for degraded and unhealthy, then passed that ReadOnlyDictionary to the data parameter of the static methods within HealthCheckResult.

Testing it out

The only thing left to do is test it out! You may have seen the cover image, which contained spoilers, but just to wrap things up, here’s what it looks like when hitting the “/health” endpoint after our changes:

[Image: the JSON response returned by the “/health” endpoint]
