Hardly anybody argues the importance of instrumenting your code and there are some built-in log options in the .Net Framework, not to mention countless great 3rd party libraries like Nlog, Serilog and Log4Net. But when it comes to reporting metrics (Think: requests per minute, usage, message rate and things like that) the built-in support is very limited. Of course we have the Performance Counters but they cannot be used cross-platform as they are Windows only. Luckily there is a relative simple way to report metrics using the EventSource class that is cross-platform and a part of the Framework so there is no need for external dependencies.
Introducing EventCounters
ETW (Event Tracing for Windows) and the EventSource
class have long been a part of the .Net Framework. EventSources are heavily used by the framework itself, and by a lot of (3rd party) libraries as well. It allows you to use structured logging in your application code with a minimal performance overhead. The data can be collected in-process or out-of-process using tools like PerfView.
In this blogpost we will take a closer look to the not so well-known EventCounter
class that lets us define and report metrics using an EventSource
. I specifically want to show how you can read those counters in-process using an EventListener
as this is not very well documented.
As a starter, let us see what the CoreFx team has to say about EventCounters (taken from their tutorial):
While EventSource is fast, logging too many events for very frequent events is still a performance hit. In this tutorial, we will introduce EventCounter, a mechanism for measuring performance for very frequent events.
For event that happen very frequently (for example, if it happen every few milliseconds), in general, you will want the performance overhead per event to be very low (e.g. less than a millisecond), otherwise it is going to cost a significant performance overhead. Logging an event, at the end of the day, need to write something to the disk. If the disk is not fast enough, you will lost events. We need a solution other than logging the event itself.
When dealing with large number of events, knowing the measure per event is not very useful either. Most of the time all we need is just some statistics out of it. So we could crank the statistics within the process itself and then write an event once in a while to report the statistics, that's what EventCounter will do for us.
Sounds great, doesn't it? It is a matter of defining and reporting metrics and the framework will do the aggregation and calculation in memory. All we need to do is to specify the interval at which the data is outputted.
The Recipe
First, we need to define an EventSource
that will contain the EventCounter
instances. You can dynamically create EventCounter
instances at runtime but, due to a bug I documented here, you will have the create at least one EventCounter
instance in the constructor.
[EventSource(Name = "My-CustomMetricsEventSource-Minimal")]
public sealed class CustomMetricsEventSource : EventSource
{
private EventCounter methodDurationCounter;
private Dictionary<string, EventCounter> dynamicCounters =
new Dictionary<string, EventCounter>();
public static CustomMetricsEventSource Log = new CustomMetricsEventSource();
public CustomMetricsEventSource()
{
methodDurationCounter = new EventCounter(nameof(methodDurationCounter), this);
}
public void ReportMethodDurationInMs(long milliseconds)
{
methodDurationCounter.WriteMetric(milliseconds);
}
public void ReportMetric(string name, float value)
{
if (!dynamicCounters.TryGetValue(name, out EventCounter counterInstance))
{
counterInstance = new EventCounter(name, this);
dynamicCounters.Add(name, counterInstance);
}
counterInstance.WriteMetric(value);
}
}
We will use CustomMetricsEventSource
to log our metrics in a console application. But we also need a listener that picks up the logged metrics and sends them to a destination. In this example we will write the data to the console using a custom implementation of an EventListener
:
internal class CustomMetricsEventListener : EventListener
{
protected override void OnEventWritten(EventWrittenEventArgs eventData)
{
var counterData = eventData.ToEventCounterData();
// Only write to console if actual data has been reported
if (counterData?.Count == 0)
return;
Console.WriteLine(
$"Counter {counterData.Name} reported values " +
$"Min: {counterData.Min}, " +
$"Max: {counterData.Max}, " +
$"Count {counterData.Count}, " +
$"Mean {counterData.Mean}, " +
$"StandardDeviation: {counterData.StandardDeviation}, " +
$"IntervalSec: {counterData.IntervalSec}");
}
}
The code listened above uses the ToEventCounterData
method. This is actually an extension method. It returns an EventCounterData
instance which makes it easier to work with the reported metric data:
public static class EventWrittenEventArgsExtensions
{
public static bool IsEventCounter(this EventWrittenEventArgs eventData)
{
return eventData.EventName == "EventCounters";
}
public static EventCounterData ToEventCounterData(this EventWrittenEventArgs eventData)
{
if (!eventData.IsEventCounter())
return null;
return new EventCounterData(eventData);
}
}
public class EventCounterData
{
public EventCounterData(EventWrittenEventArgs eventData)
{
var payload = (IDictionary<string, object>) eventData.Payload[0];
Name = payload["Name"].ToString();
Mean = (float)payload["Mean"];
StandardDeviation = (float)payload["StandardDeviation"];
Count = (int)payload["Count"];
IntervalSec = (float)payload["IntervalSec"];
Min = (float)payload["Min"];
Max = (float)payload["Max"];
}
public string Name { get; }
public float Mean { get; }
public float StandardDeviation { get; }
public int Count { get; }
public float IntervalSec { get; }
public float Min { get; }
public float Max { get; }
}
Finally we can create the console application that will call CustomMetricsEventSource
to log metric values. The CustomMetricsEventListener
is constructed in such a way that the metrics are outputted once per second.
It is important to understand that it is the configuration of the CustomMetricsEventListener
that determines the metric reporting interval. This is done by passing a numeric value specifying the interval in seconds to the EventCounterIntervalSec
argument.
class Program
{
static void Main(string[] args)
{
var reader = new CustomMetricsEventListener();
var arguments = new Dictionary<string, string>
{
{"EventCounterIntervalSec", "1"}
};
reader.EnableEvents(CustomMetricsEventSource.Log, EventLevel.LogAlways, EventKeywords.All, arguments);
var random = new Random();
for (int i = 0; i <= 10; i++)
{
SleepingBeauty(random.Next(10, 200));
}
Console.ReadKey();
}
static void SleepingBeauty(int sleepTimeInMs)
{
var stopwatch = Stopwatch.StartNew();
Thread.Sleep(sleepTimeInMs);
stopwatch.Stop();
CustomMetricsEventSource.Log.ReportMethodDurationInMs(stopwatch.ElapsedMilliseconds);
CustomMetricsEventSource.Log.ReportMetric("someCounter", DateTime.Now.Millisecond);
}
}
When we run the program we will see the metric values being reported to the console:
Counter methodDurationCounter reported values Min: 23, Max: 215, Count 8, Mean 113,125, StandardDeviation: 64,10429, IntervalSec: 1,026394
Counter someCounter reported values Min: 87, Max: 982, Count 9, Mean 603,1111, StandardDeviation: 269,2832, IntervalSec: 1,026394
Do mind that the metrics will always be reported at the specified interval (using EventCounterIntervalSec
) even if no call to the WriteMetric
method of an EventCounter
instance is made. All numeric values, except IntervalSec, will have the default value 0.
You can find the whole working example in my GitHub repository:
Blog
Placeholder for code I blogged about
Folders in this repository
- Metrics: Reporting Metrics Using .Net (Core)
What is next?
Of course, reporting those values to the console does not add much value, it is up to you to send the data to your own logging backend. You could for example send them as custom metrics to Application Insights using the EventCounterCollectionModule, log to your ELK stack or whatever log system you are using.
I really wish reporting metrics gets a bit more attention since having metrics data available is very important in order to have a detailed insight in how your application is behaving. For example, you can use metrics to scale in or out or get insights in the usage of certain parts of your application. Most logging frameworks I know of do not cater for metrics.
There was a lot of discussion about whether or not the support writing metric using the new ILogger
interface of .Net Core but in the end there is no support for it right know. More details about the how and the why can be read in the issue created for this discussion:
Add support for reporting application metrics via logging #708
Overview
We want to enable applications to emit arbitrary named metrics and have them captured by telemetry/logging systems (App Insights, Elasticsearch/ELK, etc.) as well as other metrics systems (Statsd, InfluxDB, etc.). An abstraction is desired in order to allow applications to emit metrics without being coupled to the destination of the metrics, particularly with features like App Insights auto-lightup where the app doesn't directly depend upon App Insights.
Proposal
We will add support for metrics reporting to ILogger
. It will be done via a new companion interface IMetricsLogger
, which loggers may optionally implement. Applications can emit metrics through this interface (see API below), and logger providers which support metrics can receive and process those metrics. Metrics are (name, value)
tuples, where value
is always floating point number. Metrics are expected to be viewed in aggregate, hence it doesn't make sense to use a generic data type like object
for the metric, so metric values are always double
s. Supporting arbitrary types of metric values is a non-goal. Reporting metrics should be as efficient as possible, especially in the case where no logger provider is registered to listen for them, or they are filtered out.
ILogger
?
Why add to Initially, I considered developing a new abstraction for Metrics. However, I encountered a few concerns:
- We would need to develop a structure for naming/organizing metrics (likely using type/namespace names as a base)
- We would need to develop a structure for filtering metrics (by categories, names, granularities, etc.)
- Many telemetry systems provide support for both log and metric collection, and would have to provide a new system
- Consumers have to pull in two different diagnostics reporters from DI (
ILogger<T>
,IMetricReporter<T>
, etc.) - This would create Yet Another Diagnostics Abstraction (on top of EventSource, EventCounters, DiagnosticSource, Perf Counters, yada yada yada (see what I did there? ;P))
The fact that 1 and 2 above already exist in the Logging pipeline, and many metric sinks are also logging sinks, led to the idea of integrating metrics into Logging rather than creating a completely new abstraction.
The main reason why we would not want to add this to ILogger
would be a risk
The API
IMetricLogger
Optional companion interface for ILogger
implementations to indicate their support for metrics. The ILogger
instances returned by ILoggerProvider
are expected to implement this interface if and only if they wish to receive metrics.
namespace Microsoft.Extensions.Logging
{
public interface IMetricLogger
{
/// <summary>
/// Define a new metric with the provided name and return an <see cref="IMetric"/> that can be used to report values for that metric.
/// </summary>
/// <param name="name">The name of the metric to define</param>
/// <returns>An <see cref="IMetric"/> that can be used to report values for the metric</returns>
IMetric DefineMetric(string name);
}
}
IMetric
Interface that allows metric data to be reported. Metric values
namespace Microsoft.Extensions.Logging
{
public interface IMetric
{
/// <summary>
/// Record a new value for this metric.
/// </summary>
/// <param name="value">The value to record for this metric</param>
void RecordValue(double value);
}
}
LoggerMetricsExtensions
Provides an extension method on ILogger
to enable consumers to emit metrics, even if the underlying logger provider doesn't support it (of course, the metrics will be ignored in that case).
namespace Microsoft.Extensions.Logging
{
public static class LoggerMetricsExtensions
{
/// <summary>
/// Define a new metric with the provided name and return an <see cref="IMetric"/> that can be used to report values for that metric.
/// </summary>
/// <remarks>
/// If none of the registered logger providers support metrics, values recorded by this metric will be lost.
/// </remarks>
/// <param name="name">The name of the metric to define</param>
/// <returns>An <see cref="IMetric"/> that can be used to report values for the metric</returns>
public static IMetric DefineMetric(this ILogger logger, string name)
{
if(logger is IMetricLogger metricLogger)
{
return metricLogger.DefineMetric(name);
}
return NullMetric.Instance;
}
}
}
MetricValueExtensions
Extension methods for IMetric
to support other common metric value types (as mentioned above, the underlying type is still double
, so the value must be representable as a double
).
Suggestion: We could have a .Time()
API that returns an IDisposable
which uses a Stopwatch to do the timing for you as well. That would be easy to add later though.
using System;
namespace Microsoft.Extensions.Logging
{
public static class MetricValueExtensions
{
/// <summary>
/// Record a new value for this metric.
/// </summary>
/// <remarks>
/// This is a convenience method that will convert the <see cref="TimeSpan"/> to a <see cref="double"/> via
/// the <see cref="TimeSpan.TotalMilliseconds"/> property.
/// </remarks>
/// <param name="metric">The metric to record the value on.</param>
/// <param name="value">The value to record for this metric.</param>
public static void RecordValue(this IMetric metric, TimeSpan value) => metric.RecordValue(value.TotalMilliseconds);
}
}
Logger
updates
The aggregate Logger
class we return from LoggerFactory.CreateLogger
would be updated to support IMetricsLogger
, even if none of the ILogger
s it aggregates support it. It would only send metrics to the loggers that are known to support the interface.
Define
/Record
pattern
In order to keep the performance cost as low as possible for recording the actual metric value, the API uses a Define
/Record
pattern. Consumers should expect to call DefineMetric("foo")
once for a particular value of "foo" and store the result as globally as possible (in singleton types, etc.). It is DefineMetric
that does the bulk of the work to set up the metric recording. After calling DefineMetric
, the RecordValue
call on the IMetric
interface can choose how to store the value based on the providers needs. For example, in a provider that plans to send pre-aggregated metrics, the RecordValue
could simply update rolling aggregate values and then drop the actual recording (thus having constant storage requirements for each metric). If a provider needs to report individual metrics, it can use dynamically-allocated buffers or ring buffers, depending upon the configuration and the needs of the provider. As an example of an IMetric
implementation based on pre-aggregating data, see https://gist.github.com/anurse/03c6746204317bd3616c372b4df9dbba
Filtering Metrics
Metrics should be filterable, and they should participate in the existing Logger filtering pipeline. My current proposal is that Metrics are treated like LogLevel.Critical
messages, in that if the category is enabled at all, the metrics are written. We could consider capturing a LogLevel
in .DefineMetric
and using that to provide different "levels" of metrics. Since the existing filters only let you filter by Provider, Category, and Level, it is very difficult to filter out specific metrics by name (as that would require a new Filter structure). This may make LogLevel
support for metrics more important.
@pakrym and I talked and came up with a specific suggestion
Metrics can be filtered at the provider and category name, but there is no concept of a LogLevel
for metrics. However, filters can disable all metrics for a particular provider/category combination. To do this, we'll add an AllowMetrics
boolean to LoggerFilterRule
which defaults to true
. When writing a metric, we'll check the provider and category name against the filter options as well as checking this boolean.
Scopes and Property Bags
FEEDBACK REQUESTED
My initial thought is that Metrics exist outside of scopes, as they are generally aggregated and Scopes imply per-event data. Providers could, in theory, give their IMetric
instances access to read active scope data, and attach them to metrics if they choose (for example, systems that report unaggregated metrics may want to). For similar reasons, property bags (i.e. Log Values/Structured Logging, etc.) don't really make sense for metrics in general as they are usually aggregated. Having said that, many providers do support arbitrary properties (InfluxDB, App Insights), so some level of support may be useful. We could easily add an optional parameter to .RecordValue
to allow adding arbitrary properties to each metric, and providers could choose to disregard it if it doesn't make sense for that provider.
Impact on existing Providers
There is minimal impact on existing Providers. Because the interface provides an opt-in model for a provider to receive metrics, existing providers are unaffected. I'd imagine providers like Serilog would not participate in this system as they don't have an infrastructure for metrics (/cc @nblumhardt , I may be wrong!). Providers like App Insights may choose to either add metrics support to their existing providers or add new providers that only handle metrics (and ignore log messages) that are co-registered.
Top comments (8)
Great post - just what I was looking for to help with metrics in a .Net Standard library... But... I think that there might be another bug in there.
Have you ever tried having multiple EventSources?
While WriteEvent() calls do appear to be filtered based on the EventSource used to enable to EventListener, I appear to be seeing all EventCounter events on all my EventListeners regardless of how I enabled them.
I've tried to reproduce it but I can't. Can you share some details (or code) how you defined and enabled the EventSources?
I've forked your repo to github.com/SteveHarveyUK/Blog.git
In there I've moved things around to provide a factory base example that demonstrates my usage pattern and the issue I see.
If you change the enableFiltering parameter on the factory it'll turn on my own implementation of filtering in the EventListener.
So let me get this clear: you want to be able to enable or disable collection metrics of a specific EventSource?
I want to be able to have a number of independent EventSources with EventCounters collecting different metrics with each having its own EventListener that only receives the events from the EventSource and EventCounter that was enabled against it at potentially different frequencies.
So for instance I can have one instance monitoring cache table usage and delivering me the update rate per second and perhaps another giving me the stats against all queues in an application every five seconds.
I believe that I've achieved that with my example code, BUT I've had to manually filter the EventCounter events as they don't seem to be filtered by the framework based on the EventSource passed into EventListener.EnableEvents() method.
I'm just trying to check that I've not missed something as I'd have expected the built in filtering that appears to work on EventSource.WriteEvent() to work for the EventCounter events too.
If you do not want to collect metrics just don't pass the EventCounterIntervalSec argument in the dictionary. Otherwise just use another interval. I made some changes to your repo but can't push them to your repo. Can you add me as a contributer? (I can't fork your repo since it is a fork of mine)
Added as collaborator. Thanks for your help!!
I've created a pull request in which I demonstrate how I would do it. Probably better to communicate using the PR from now on.