<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: abyo software LLC</title>
    <description>The latest articles on DEV Community by abyo software LLC (@abyo-software).</description>
    <link>https://dev.to/abyo-software</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4006751%2Fd9aafe6b-cb8d-4511-98e4-95e31a761f98.png</url>
      <title>DEV Community: abyo software LLC</title>
      <link>https://dev.to/abyo-software</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/abyo-software"/>
    <language>en</language>
    <item>
      <title>Cutting CloudWatch Logs ingest costs with an OSS Rust gateway</title>
      <dc:creator>abyo software LLC</dc:creator>
      <pubDate>Sun, 28 Jun 2026 16:51:02 +0000</pubDate>
      <link>https://dev.to/abyo-software/cutting-cloudwatch-logs-ingest-costs-with-an-oss-rust-gateway-5h1o</link>
      <guid>https://dev.to/abyo-software/cutting-cloudwatch-logs-ingest-costs-with-an-oss-rust-gateway-5h1o</guid>
      <description>&lt;p&gt;If you've ever stared at a CloudWatch Logs bill for an app that writes a lot of events nobody reads, you've probably noticed that &lt;strong&gt;for write-heavy, rarely-read workloads ingest dominates storage&lt;/strong&gt;. AWS bills CloudWatch Logs ingest at $0.50/GB of raw bytes plus a 26 B/event metered overhead (us-east-1 list pricing, 2026-06), and in those workloads the expensive path is writing logs that almost no one queries afterwards.&lt;/p&gt;
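
&lt;p&gt;To make that pricing concrete, here is a minimal sketch of the ingest arithmetic using only the two figures quoted above. The function name, the decimal-GB convention, and the sample workload are assumptions for illustration, not values from the S4 Logs repository or from an AWS bill.&lt;/p&gt;

```python
# Sketch of the ingest arithmetic described above, using the list price
# quoted in this post (us-east-1, 2026-06). The workload is hypothetical.
INGEST_USD_PER_GB = 0.50        # per-GB ingest price quoted in the post
OVERHEAD_BYTES_PER_EVENT = 26   # metered per-event overhead quoted in the post
BYTES_PER_GB = 1_000_000_000    # assumption: decimal GB


def billed_ingest_usd(raw_bytes: int, events: int) -> float:
    """Estimate the ingest charge for a batch of log traffic."""
    billed_bytes = raw_bytes + events * OVERHEAD_BYTES_PER_EVENT
    return billed_bytes / BYTES_PER_GB * INGEST_USD_PER_GB


# Hypothetical month: 1 billion events averaging 200 bytes each.
events = 1_000_000_000
raw_bytes = events * 200
print(f"${billed_ingest_usd(raw_bytes, events):.2f}")  # prints $113.00
```

&lt;p&gt;With short events the per-event overhead is a visible share of the bill: in this made-up workload it adds 13% on top of the raw bytes.&lt;/p&gt;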

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/abyo-software/s4-logs" rel="noopener noreferrer"&gt;S4 Logs&lt;/a&gt;&lt;/strong&gt; is an Apache-2.0 Rust workspace for that pattern. It has two independent modes. This post walks through what each one does, the code surface needed to deploy them, and the measured numbers from a real AWS validation run.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mode A — drain existing log groups to S3
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;s4logs drain&lt;/code&gt; pulls events from CloudWatch Logs via &lt;code&gt;FilterLogEvents&lt;/code&gt; and writes them to S3 as &lt;strong&gt;standard RFC 8878 zstd frames&lt;/strong&gt; containing one JSONL event per line. Work is bucketed into UTC-grid-aligned windows (one hour by default, configurable via &lt;code&gt;--window&lt;/code&gt;), each window produces one manifest, and re-runs skip windows that already have a manifest. That makes drains idempotent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Read-only plan: project per-group savings across the account&lt;/span&gt;
s4logs plan &lt;span class="nt"&gt;--all&lt;/span&gt;

&lt;span class="c"&gt;# Real drain to S3 Glacier Instant Retrieval (use --dry-run to preview without writing)&lt;/span&gt;
s4logs drain &lt;span class="nt"&gt;--log-group&lt;/span&gt; /aws/lambda/payments &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--bucket&lt;/span&gt; my-archive-bucket &lt;span class="nt"&gt;--prefix&lt;/span&gt; s4logs &lt;span class="nt"&gt;--account&lt;/span&gt; 123456789012 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--storage-class&lt;/span&gt; glacier-ir
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
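
&lt;p&gt;The windowing and skip logic can be pictured with a short Python sketch. This is not the S4 Logs implementation: the function names are hypothetical, and a plain set stands in for the manifests that the real tool looks up in S3.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts: datetime, window: timedelta) -> datetime:
    """Floor a timestamp to the UTC grid defined by the window size."""
    elapsed = ts.astimezone(timezone.utc) - EPOCH
    return EPOCH + (elapsed // window) * window


def windows_to_drain(start, end, window, manifested):
    """Yield grid-aligned window starts before `end` that lack a manifest."""
    cursor = window_start(start, window)
    while end - cursor > timedelta(0):
        if cursor not in manifested:  # already drained windows are skipped
            yield cursor
        cursor += window


hour = timedelta(hours=1)
start = datetime(2026, 6, 10, 0, 20, tzinfo=timezone.utc)
end = datetime(2026, 6, 10, 3, 0, tzinfo=timezone.utc)
done = {datetime(2026, 6, 10, 1, 0, tzinfo=timezone.utc)}  # stand-in manifest set
pending = list(windows_to_drain(start, end, hour, done))
print([w.strftime("%H:%M") for w in pending])  # prints ['00:00', '02:00']
```

&lt;p&gt;Because every window start is derived from the grid rather than from the moment the run began, a re-run computes the same window keys and can skip the ones that already have a manifest.&lt;/p&gt;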



&lt;p&gt;Mode A only reclaims the &lt;em&gt;storage&lt;/em&gt; line on what's already been ingested; the ingest you've already paid for is sunk. Until you shorten retention, Mode A adds an S3 archive copy on top of the existing CloudWatch storage, so the net storage saving appears only after the retention shrink. That shrink is fail-closed: you can shorten the log group's retention only once every window older than your proposed cutoff has a verified manifest. It is a separate step (&lt;code&gt;--apply-retention&lt;/code&gt;), report-only by default.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mode B — bypass ingest with a CloudWatch Logs API-subset gateway
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;s4logs serve&lt;/code&gt; speaks a subset of the CloudWatch Logs AWS JSON 1.1 API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PutLogEvents
CreateLogGroup
CreateLogStream
DescribeLogGroups
DescribeLogStreams
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Point Fluent Bit, the CloudWatch Agent, or any SDK that stays within that subset at the gateway by overriding the endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# fluent-bit.conf
&lt;/span&gt;&lt;span class="nn"&gt;[OUTPUT]&lt;/span&gt;
    &lt;span class="err"&gt;Name&lt;/span&gt;              &lt;span class="err"&gt;cloudwatch_logs&lt;/span&gt;
    &lt;span class="err"&gt;Match&lt;/span&gt;             &lt;span class="err"&gt;*&lt;/span&gt;
    &lt;span class="err"&gt;Region&lt;/span&gt;            &lt;span class="err"&gt;us-east-1&lt;/span&gt;
    &lt;span class="err"&gt;log_group_name&lt;/span&gt;    &lt;span class="err"&gt;/app/api&lt;/span&gt;
    &lt;span class="err"&gt;log_stream_prefix&lt;/span&gt; &lt;span class="err"&gt;node-&lt;/span&gt;
    &lt;span class="err"&gt;auto_create_group&lt;/span&gt; &lt;span class="err"&gt;On&lt;/span&gt;
    &lt;span class="err"&gt;Endpoint&lt;/span&gt;          &lt;span class="err"&gt;s4logs-gateway.internal&lt;/span&gt;
    &lt;span class="err"&gt;Port&lt;/span&gt;              &lt;span class="err"&gt;8080&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(Fluent Bit's &lt;code&gt;cloudwatch_logs&lt;/code&gt; plugin documents a custom &lt;code&gt;endpoint&lt;/code&gt;; the host + port split is the form shown in its LocalStack docs. Verify against your exact Fluent Bit image — the plugin behavior has shifted across versions.)&lt;/p&gt;

&lt;p&gt;For traffic routed with the &lt;code&gt;s3&lt;/code&gt; action (not &lt;code&gt;both&lt;/code&gt;, which keeps the CloudWatch copy), CloudWatch ingest charges on that traffic drop to $0; you pay for S3 PUT requests and S3 storage instead. No application payload transform is required on the ingest path: clients still speak the CloudWatch Logs API subset and only need an endpoint override (plus matching static credentials in SigV4 mode).&lt;/p&gt;

&lt;p&gt;Routing is a first-match TOML config so you can keep your alert-driven log groups on real CloudWatch and bypass the rest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# routing.toml&lt;/span&gt;
&lt;span class="py"&gt;default_action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"s3"&lt;/span&gt;          &lt;span class="c"&gt;# s3 | cloudwatch | both | drop&lt;/span&gt;

&lt;span class="nn"&gt;[[rule]]&lt;/span&gt;
&lt;span class="py"&gt;log_group&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/aws/lambda/payments-*"&lt;/span&gt;
&lt;span class="py"&gt;action&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"cloudwatch"&lt;/span&gt;        &lt;span class="c"&gt;# keep alerting paths intact&lt;/span&gt;

&lt;span class="nn"&gt;[[rule]]&lt;/span&gt;
&lt;span class="py"&gt;log_group&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/aws/lambda/batch-*"&lt;/span&gt;
&lt;span class="py"&gt;action&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"s3"&lt;/span&gt;                &lt;span class="c"&gt;# archive only&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
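
&lt;p&gt;First-match evaluation of that config can be sketched in a few lines of Python. The rules mirror the &lt;code&gt;routing.toml&lt;/code&gt; example above; treating the patterns as shell-style globs via &lt;code&gt;fnmatch&lt;/code&gt; is an assumption, so check the repository docs for the exact matching semantics.&lt;/p&gt;

```python
from fnmatch import fnmatchcase

# Mirrors the routing.toml example above. Glob semantics are an assumption.
DEFAULT_ACTION = "s3"
RULES = [
    ("/aws/lambda/payments-*", "cloudwatch"),  # keep alerting paths intact
    ("/aws/lambda/batch-*", "s3"),             # archive only
]


def route(log_group: str) -> str:
    """Return the action of the first rule whose pattern matches."""
    for pattern, action in RULES:
        if fnmatchcase(log_group, pattern):
            return action
    return DEFAULT_ACTION  # no rule matched


print(route("/aws/lambda/payments-api"))  # prints cloudwatch
print(route("/app/api"))                  # prints s3
```

&lt;p&gt;Since the first matching rule wins, narrow exceptions belong above broader patterns.&lt;/p&gt;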



&lt;p&gt;Crash durability is opt-in. With &lt;code&gt;--wal-dir&lt;/code&gt;, events are fsynced before ack and replayed on restart. Delivery is at-least-once: duplicates are possible after a crash, but acknowledged events are not silently lost on a normal ext4/xfs disk. FUSE/NFS mounts that reject directory fsync degrade to best-effort, and that degradation is surfaced in a metric. Without &lt;code&gt;--wal-dir&lt;/code&gt;, anything below the flush thresholds can be lost on a hard kill.&lt;/p&gt;

&lt;p&gt;Request auth defaults to off — the gateway accepts unsigned requests to the supported API subset and trusts the caller, so deploy it behind TLS + a network boundary regardless. If you want request signing on top, &lt;code&gt;--auth-mode sigv4&lt;/code&gt; verifies against a single static key pair (not IAM-backed). Use &lt;code&gt;S4LOGS_AUTH_SECRET&lt;/code&gt; rather than the &lt;code&gt;--auth-secret&lt;/code&gt; flag because process lists leak flag values; &lt;code&gt;S4LOGS_AUTH_ACCESS_KEY&lt;/code&gt; is also supported.&lt;/p&gt;

&lt;h2&gt;
  
  
  v1.1.1 additions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ACME (Let's Encrypt) CLI surface&lt;/strong&gt; for the gateway. It uses TLS-ALPN-01, so cert acquisition and renewal share the same bound TLS socket (typically &lt;code&gt;:443&lt;/code&gt;), and the ACME validator must be able to reach that listener (typically public TCP/443). Requires &lt;code&gt;--acme-domain&lt;/code&gt;, &lt;code&gt;--acme-contact&lt;/code&gt;, and &lt;code&gt;--acme-cache-dir&lt;/code&gt;; mutually exclusive with &lt;code&gt;--tls-cert/--tls-key&lt;/code&gt;. Use &lt;code&gt;--acme-staging&lt;/code&gt; to verify against Let's Encrypt's staging environment before production. The cache directory is mandatory because re-registering on every restart trips Let's Encrypt rate limits; make sure it is writable by the s4logs process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;s4-emf::EmfDocument::samples_bounded()&lt;/code&gt;&lt;/strong&gt; — a pre-materialization expansion guard for untrusted EMF input. EMF flattens (directive × metric × dimension_set), so a ~100 KB document with a 10k-element metric value array and 10k dimension sets expands to ~800 MB of f64 values. &lt;code&gt;samples_bounded(max_samples, max_values)&lt;/code&gt; computes the expansion size and rejects over-cap inputs with &lt;code&gt;EmfError::ExpansionTooLarge&lt;/code&gt; &lt;em&gt;before&lt;/em&gt; allocating the flattened &lt;code&gt;Vec&amp;lt;EmfSample&amp;gt;&lt;/code&gt; or cloned f64 value/count arrays. Use it after an upstream body-size cap on any boundary that accepts untrusted EMF — it bounds f64 expansion, not total document bytes, so the upstream cap is the first line of defence.&lt;/li&gt;
&lt;/ul&gt;
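
&lt;p&gt;The EMF expansion figure above is easy to reproduce. The Python below is only the arithmetic plus a generic pre-allocation check. It is not the &lt;code&gt;s4-emf&lt;/code&gt; API, and the way it interprets the two caps (a sample count and a total f64 value count) is an assumption.&lt;/p&gt;

```python
BYTES_PER_F64 = 8


def expansion_bytes(values_per_metric: int, dimension_sets: int,
                    metrics: int = 1, directives: int = 1) -> int:
    """Bytes of f64 data if every flattened sample carries its own value array."""
    samples = directives * metrics * dimension_sets
    return samples * values_per_metric * BYTES_PER_F64


def check_expansion(samples: int, values: int,
                    max_samples: int, max_values: int) -> None:
    """Hypothetical guard: compare computed sizes to caps before allocating."""
    if samples > max_samples or values > max_values:
        raise ValueError("expansion too large")


# 10k values x 10k dimension sets, one metric, one directive.
print(expansion_bytes(10_000, 10_000))  # prints 800000000
```

&lt;p&gt;The point of such a guard is ordering: the sizes are multiplied out from the parsed counts first, and nothing is cloned or flattened until the check has passed.&lt;/p&gt;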

&lt;h2&gt;
  
  
  No proprietary archive format
&lt;/h2&gt;

&lt;p&gt;The contract is in &lt;a href="https://github.com/abyo-software/s4-logs/blob/main/DESIGN.md#14-v10-format-stability-contract-2026-06-12" rel="noopener noreferrer"&gt;DESIGN.md §14&lt;/a&gt;: the persisted on-disk format is frozen for the 1.x line. Data objects are plain standard zstd over JSONL — no S4-only container. The archive remains readable without the S4 Logs binary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;s3://bucket/path/file.jsonl.zst - | zstd &lt;span class="nt"&gt;-dc&lt;/span&gt; | jq &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sidecars (&lt;code&gt;.s4index&lt;/code&gt;, &lt;code&gt;.s4lts&lt;/code&gt;) and manifest JSON are S4-specific but documented. The repository ships an Athena DDL recipe in &lt;a href="https://github.com/abyo-software/s4-logs/blob/main/docs/athena.md" rel="noopener noreferrer"&gt;&lt;code&gt;docs/athena.md&lt;/code&gt;&lt;/a&gt;: the explicit &lt;code&gt;ADD PARTITION&lt;/code&gt; form is what's verified end-to-end; a partition-projection variant over &lt;code&gt;account/loggroup/dt&lt;/code&gt; is documented but only parse-validated so far.&lt;/p&gt;

&lt;h2&gt;
  
  
  Validation against real AWS
&lt;/h2&gt;

&lt;p&gt;Numbers measured against a real &lt;code&gt;us-east-1&lt;/code&gt; account with seeded synthetic data. We didn't run this against production-organic logs and we label it as such.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode A&lt;/strong&gt; (2026-06-10): 33,163,647 events / 5.00 GiB of message bytes seeded across 16 streams. The drain over 5 windows at &lt;code&gt;--concurrency 4&lt;/code&gt; completed in 94.6 min with 0 &lt;code&gt;ThrottlingException&lt;/code&gt; errors. Drain output and the Athena full count agreed at &lt;strong&gt;33,163,613 events&lt;/strong&gt;. The 34 events missing relative to the seeder count were never returned by CloudWatch via &lt;code&gt;FilterLogEvents&lt;/code&gt; either, and the gap remains unattributed. JSONL output 9.7 GiB → archive &lt;strong&gt;1.6 GiB zstd (6.2×)&lt;/strong&gt;, 41 objects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mode B + restore&lt;/strong&gt; (2026-06-12, KB-scale): the live validation path used the AWS CLI / SDK endpoint override. Fluent Bit and CloudWatch Agent should work within the same five-action subset, but their exact plugin/version combinations were not part of this real-AWS run. With the AWS CLI: gateway-routed &lt;code&gt;PutLogEvents&lt;/code&gt; landed at the correct S3 layout (&lt;code&gt;dt=...&lt;/code&gt;); &lt;code&gt;both&lt;/code&gt; reached real CloudWatch and S3; &lt;code&gt;s3&lt;/code&gt;-only created no real CloudWatch log group; &lt;code&gt;restore --to-log-group&lt;/code&gt; re-ingested at current time with the original timestamp wrapped, because CW rejects timestamps older than 14 days or older than the group retention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LocalStack soak&lt;/strong&gt; (separate, 2 hours): 715,817 requests / 7,158,170 events acked and durable / 0 loss / 0 failures / RSS +2.3 MiB.&lt;/p&gt;

&lt;p&gt;The Mode A real-AWS experiment cost ~$2.60 by list-price math (Athena and S3 charges of a few cents were observed; the CloudWatch ingest line had not yet posted before teardown). Mode B validation was at KB scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations worth knowing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The gateway is a five-action CloudWatch Logs compatibility subset, not a full CloudWatch Logs emulator. Unsupported actions are out of scope.&lt;/li&gt;
&lt;li&gt;Single AWS account per deployment; multi-account / Organizations support is not implemented here.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--shard-streams&lt;/code&gt; is the parallelism knob for groups with many streams. It defaults to 1 (unsharded). When you set it &amp;gt;1, the partitioner respects AWS's 100-stream filter limit by auto-growing the shard count if needed. We have not benchmarked the path at thousands-of-streams scale yet.&lt;/li&gt;
&lt;li&gt;Late-arriving / backdated events can be missed if you drain a window before they index. The default &lt;code&gt;--to&lt;/code&gt; cutoff already builds in a 15-minute ingestion-lag buffer (measured visibility lag for backdated events was 3–5.5 min in the real-AWS run), but you should still drain mature windows for agent-delivered or backdated stragglers, or pass &lt;code&gt;--reconcile&lt;/code&gt; to re-page a manifested range and append only what's new (dedup by event identity). &lt;code&gt;--reconcile&lt;/code&gt; re-pages the full window, so it costs the same CloudWatch time as a fresh drain of that range — it's a repair tool, not a cheap verification sweep.&lt;/li&gt;
&lt;li&gt;Glacier Instant Retrieval is millisecond-access but has the usual S3 GIR catches: 90-day minimum storage duration, per-GB retrieval charges, and 128 KiB minimum billable object size. &lt;code&gt;--storage-class glacier-ir&lt;/code&gt; applies only to data objects; sidecars and manifests stay on S3 Standard.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;samples_bounded()&lt;/code&gt; bounds the f64 expansion, not total document bytes or per-sample string clones (dimension key/value, namespace, metric name, unit). Pair it with an upstream request-body size cap.&lt;/li&gt;
&lt;li&gt;Default request auth is off (as noted above). SigV4 mode is single static access key/secret only: no IAM policy evaluation, no STS / session tokens, no presigned URLs. &lt;code&gt;/health&lt;/code&gt;, &lt;code&gt;/ready&lt;/code&gt;, and &lt;code&gt;/metrics&lt;/code&gt; remain unauthenticated regardless, so the network boundary stays load-bearing.&lt;/li&gt;
&lt;li&gt;The gateway applies backpressure at &lt;code&gt;--max-buffered-bytes&lt;/code&gt; (default 256 MiB); clients receive a 503 once the buffer fills and are expected to retry.&lt;/li&gt;
&lt;li&gt;Drain output is byte-deterministic across re-runs only at &lt;code&gt;--shard-streams 1&lt;/code&gt;. Shards &amp;gt;1 trade byte-determinism for speed; the record set is the same but the order inside data objects isn't.&lt;/li&gt;
&lt;/ul&gt;
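
&lt;p&gt;As an illustration of the &lt;code&gt;--shard-streams&lt;/code&gt; note above, the auto-grow behaviour amounts to a ceiling division against the 100-stream filter limit. This is a guess at the shape of the calculation, not the partitioner's actual code. The function name is hypothetical, and so is the assumption that the unsharded default uses no stream filter at all.&lt;/p&gt;

```python
FILTER_STREAM_LIMIT = 100  # per-call stream filter limit cited above


def effective_shards(stream_count: int, requested: int = 1) -> int:
    """Grow the requested shard count until each shard fits the limit."""
    if requested == 1:
        return 1  # assumption: the unsharded default applies no stream filter
    needed = -(-stream_count // FILTER_STREAM_LIMIT)  # ceiling division
    return max(requested, needed)


print(effective_shards(1000, requested=4))  # prints 10
print(effective_shards(250, requested=4))   # prints 4
```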

&lt;h2&gt;
  
  
  Installing
&lt;/h2&gt;

&lt;p&gt;Static musl binaries for Linux x86_64 / aarch64 are available from GitHub Releases, or install from source with Cargo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--git&lt;/span&gt; https://github.com/abyo-software/s4-logs s4logs-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Docker: &lt;code&gt;docker build -t s4logs .&lt;/code&gt; (~176 MB runtime).&lt;/p&gt;

&lt;p&gt;Repo, README, DESIGN.md, and the full validation tables at &lt;a href="https://github.com/abyo-software/s4-logs" rel="noopener noreferrer"&gt;github.com/abyo-software/s4-logs&lt;/a&gt;. Apache-2.0.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloudwatchlogs</category>
      <category>rust</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
