Alexandr Bandurchin for Uptrace

Stop Drowning in Log Noise: How Grouping Rules Turn Chaos into Signal

You open your observability dashboard at 2am. 847 log groups. Alerts firing. You start triaging — and realize half of the groups are the same problem repeated across hundreds of slightly different messages.

User browsing product TSLA   → Group #1
User browsing product AAPL   → Group #2
User browsing product MSFT   → Group #3
...

Same code path. Same bug. 300 different groups. Zero useful signal.

This is what log noise looks like in production. And it's not a logging problem — it's a grouping problem. The fix isn't rewriting your instrumentation. It's a single rule that takes 30 seconds to create.


What Are Grouping Rules?

Grouping rules tell Uptrace how to normalize log messages before assigning them to a group. You define a pattern — literal words mixed with typed placeholders — and all matching messages collapse into one group, with variable parts extracted as structured attributes.

The pattern for the example above:

User browsing product %{IDENT:product}

Result: one group, with product as a queryable attribute on every event. You can now filter by product, aggregate, alert — without touching a line of application code.
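Conceptually, what a grouping rule does can be sketched in a few lines — here's a hedged Python illustration (a regex stands in for the rule; this is not Uptrace's implementation) showing the three messages above collapsing into one group:

```python
import re
from collections import defaultdict

# Hypothetical regex equivalent of: User browsing product %{IDENT:product}
# (assumption: IDENT matches a run of word characters).
rule = re.compile(r"^User browsing product (?P<product>\w+)$")

groups = defaultdict(list)
for msg in [
    "User browsing product TSLA",
    "User browsing product AAPL",
    "User browsing product MSFT",
]:
    m = rule.match(msg)
    if m:
        # Every match shares one normalized group key; the variable
        # part becomes an attribute instead of spawning a new group.
        groups["User browsing product %{IDENT:product}"].append(m.group("product"))

print(len(groups))  # 1 group instead of 3
print(groups["User browsing product %{IDENT:product}"])  # ['TSLA', 'AAPL', 'MSFT']
```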


The Syntax

A pattern is a sequence of literal words and typed placeholders.

Literals match exactly:

error connecting to database

Typed placeholders match variable parts by type:

%{INT:status_code}      → captures an integer as "status_code"
%{UUID}                 → matches a UUID, discards it
%{IP:remote_addr}       → captures an IP address
%{LOG_LEVEL:severity}   → captures INFO / WARN / ERROR / etc.

The format is always %{TYPE} or %{TYPE:capture_name}. Add a name and you get a structured attribute. Skip the name and the value is matched but discarded.
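To make the syntax concrete, here's a minimal sketch of how `%{TYPE:capture_name}` placeholders could be translated into a regex with named groups. The type regexes are simplified assumptions, not Uptrace's actual definitions:

```python
import re

# Simplified stand-ins for a couple of placeholder types (assumptions).
TYPE_REGEX = {
    "INT": r"\d+",
    "IDENT": r"[A-Za-z_][A-Za-z0-9_]*",
}

def compile_pattern(pattern: str) -> re.Pattern:
    """Translate literals + %{TYPE:name} placeholders into one regex."""
    parts, pos = [], 0
    for m in re.finditer(r"%\{(\w+)(?::(\w+))?\}", pattern):
        parts.append(re.escape(pattern[pos:m.start()]))  # literal text
        type_name, capture = m.group(1), m.group(2)
        body = TYPE_REGEX[type_name]
        # Named placeholder -> named group; unnamed -> matched and discarded.
        parts.append(f"(?P<{capture}>{body})" if capture else f"(?:{body})")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("^" + "".join(parts) + "$")

rx = compile_pattern("error code %{INT:status_code} from %{IDENT}")
m = rx.match("error code 503 from billing")
print(m.group("status_code"))  # 503
```

Note how the unnamed `%{IDENT}` still has to match, but contributes no attribute — exactly the named/unnamed distinction described above.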


Available Types — At a Glance

Category     Types
Text         IDENT, WORD, QUOTED, ANY, ANY+
Numbers      INT, FLOAT, BYTE_SIZE, NUMBER
Network      IP, URI, EMAIL, HOST_PORT
System       LOG_LEVEL, HTTP_STATUS, HTTP_METHOD, UUID
Time         TIMESTAMP, DATE, TIME
Structured   JSON, ATTR (key=value pairs)

NUMBER and IP are virtual types — they expand to cover all their subtypes automatically. ANY+ matches one or more tokens of any kind, which is useful for catching variable-length tails.

Full type reference


Extracted Attributes: Where the Real Value Is

Matching logs into groups is the basic use case. The real power is what happens to the captured values.

Take a cart log:

User a3f2c1d4-... adding 12 of product NVDA to cart

Pattern:

User %{UUID} adding %{INT:num_products} of product %{IDENT:product} to cart

Now every log in this group has num_products and product as structured attributes. In Uptrace you can immediately:

  • Filter by product=NVDA
  • Chart max(num_products) over time
  • Set an alert if avg(num_products) drops below a threshold

The %{UUID} without a capture name just absorbs the user ID — you don't need it in the group, so you skip it.
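To see why extraction pays off, here's a hedged sketch (a regex stands in for the pattern; the log lines and UUIDs are made up for illustration) that pulls the attributes out of raw messages and answers a numeric question directly:

```python
import re

# Regex stand-in for:
# User %{UUID} adding %{INT:num_products} of product %{IDENT:product} to cart
rule = re.compile(
    r"User [0-9a-f-]+ adding (?P<num_products>\d+) "
    r"of product (?P<product>\w+) to cart"
)

# Hypothetical raw logs for illustration.
logs = [
    "User a3f2c1d4-0000-0000-0000-000000000000 adding 12 of product NVDA to cart",
    "User b7e1aa02-0000-0000-0000-000000000000 adding 3 of product NVDA to cart",
    "User c9d4ff21-0000-0000-0000-000000000000 adding 7 of product TSLA to cart",
]

# Once variable parts are attributes, filtering and aggregation are trivial.
attrs = [m.groupdict() for m in map(rule.match, logs) if m]
nvda = [int(a["num_products"]) for a in attrs if a["product"] == "NVDA"]
print(max(nvda))  # 12
```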


Fingerprinting: One Group Per Unique Value

By default, all logs matching a pattern land in one group. Sometimes you need the opposite — a separate group per unique value.

Add # before the capture name:

%{IDENT:#function_name} failed

Now SendEmail failed and ParseConfig failed become two separate groups, not one. This matters for alerting — you want to be paged once per failing function, not once for all failures combined.

The # prefix includes the captured value in the fingerprint hash. If you don't want a capture name at all, use the fingerprint option instead:

%{IDENT,fingerprint} failed

Same result.

PostgreSQL example — separate group per unknown column:

%{LOG_LEVEL:log_severity} column %{QUOTED:#column} does not exist %{ATTR:sqlstate}

Raw logs:

ERROR: column "event.created_at" does not exist (SQLSTATE=42703)
ERROR: column "updated_at" does not exist (SQLSTATE=42703)

Result: two groups, each alertable independently. log_severity and sqlstate captured as attributes on every event.


Units: Normalize Numeric Values

Got duration logs with mixed units across services? The unit option normalizes everything automatically:

%{NUMBER:duration,unit=ms}

Uptrace converts the value to a base unit and stores it consistently — so you can aggregate durations across services that log in milliseconds, seconds, or microseconds without any manual conversion.

Supported units: ms, s, us, ns, bytes, kb, mb, gb, %, count, and more.
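The conversion itself is just scaling to a common base unit. A sketch under the assumption that nanoseconds are the base for durations (the article doesn't specify which base unit Uptrace actually stores):

```python
# Illustrative scale factors to a base unit of nanoseconds (assumption).
TO_NANOS = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}

def normalize_duration(value: int, unit: str) -> int:
    """Convert a (value, unit) pair to the base unit so logs with
    mixed units can be aggregated together."""
    return value * TO_NANOS[unit]

# Services logging in different units become directly comparable.
print(normalize_duration(250, "ms"))  # 250000000
print(normalize_duration(250000, "us") == normalize_duration(250, "ms"))  # True
```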


Advanced: Regex Inside Quoted Strings

When a log contains a quoted message with structured data inside, the extract option lets you reach in with a regex:

ERROR %{QUOTED:msg,extract=`(?P<name>\w+) is (?P<age>\d+)`}

Given ERROR "Alice is 25" — this captures msg=Alice is 25, name=Alice, and age=25. Useful for legacy logs that pack multiple values into a quoted string.
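The same two-stage idea — match the quoted string first, then run a regex inside the captured text — can be sketched like this (regexes are illustrative stand-ins for the pattern above):

```python
import re

outer = re.compile(r'ERROR "(?P<msg>[^"]*)"')          # stage 1: the QUOTED capture
inner = re.compile(r"(?P<name>\w+) is (?P<age>\d+)")   # stage 2: the extract regex

line = 'ERROR "Alice is 25"'
attrs = {}
m = outer.match(line)
if m:
    attrs["msg"] = m.group("msg")          # the whole quoted string
    nested = inner.search(attrs["msg"])
    if nested:
        attrs.update(nested.groupdict())   # values packed inside it

print(attrs)  # {'msg': 'Alice is 25', 'name': 'Alice', 'age': '25'}
```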


Optional Matchers and Alternatives

Patterns don't have to be rigid. Make any placeholder optional with ?:

error code %{NUMBER:code}? occurred

Matches both error code 500 occurred and error code occurred.

For logs that phrase the same error differently across services:

(%{LOG_LEVEL:level}|%{WORD:level}) %{WORD:msg}

Or declare multiple patterns in a single rule — any match fires the rule:

can't find item %{NUMBER:item_id}
can not find item %{NUMBER:item_id}
%{NUMBER:item_id} not found

Setting Fingerprints Programmatically

If you need full control, set grouping.fingerprint directly when creating a log event — it overrides the automatically derived fingerprint entirely:

span.AddEvent("exception", trace.WithAttributes(
    attribute.String("exception.type", "*exec.ExitError"),
    attribute.String("exception.message", "exit status 1"),
    attribute.String("grouping.fingerprint", "exec.ExitError"),
))

Useful for cases where the pattern-based approach isn't precise enough.


When to Use Grouping Rules

✅ High-cardinality messages with IDs, names, or values baked in

✅ You want numeric aggregations on log data (durations, counts, sizes)

✅ Per-value alerting — one alert per failing function, one per unknown column

✅ Legacy or third-party logs you can't re-instrument

Grouping rules are not a replacement for structured logging — if you can emit product=TSLA as a proper OpenTelemetry attribute from your application, do that. But for everything else, they close the gap in minutes.


Try It in Uptrace

Grouping rules are available in Uptrace under Logs & Errors → Grouping Rules. Paste a raw log message, click Extract Pattern, and Uptrace generates the pattern for you — you just add capture names to the parts you want to keep.

Uptrace is an OpenTelemetry-native APM that handles traces, metrics, and logs in one place. You can get started in minutes.
