<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arya Gorjipour</title>
    <description>The latest articles on DEV Community by Arya Gorjipour (@arysmart).</description>
    <link>https://dev.to/arysmart</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3894767%2Fe25e68fb-6b49-4efa-b77a-a2621b7e6b5c.jpeg</url>
      <title>DEV Community: Arya Gorjipour</title>
      <link>https://dev.to/arysmart</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arysmart"/>
    <language>en</language>
    <item>
      <title>logdive v0.3.0 — the one where I finally added parens (and four more things my heart wanted)</title>
      <dc:creator>Arya Gorjipour</dc:creator>
      <pubDate>Sun, 07 Jun 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/arysmart/logdive-v030-the-one-where-i-finally-added-parens-and-four-more-things-my-heart-wanted-1gbe</link>
      <guid>https://dev.to/arysmart/logdive-v030-the-one-where-i-finally-added-parens-and-four-more-things-my-heart-wanted-1gbe</guid>
      <description>&lt;p&gt;v0.2.0 was a good release. I was happy with it. I used it.&lt;/p&gt;

&lt;p&gt;Then I kept trying to write &lt;code&gt;(level=error OR level=warn) AND service=payments&lt;/code&gt; and the tool just... didn't know what parens were. Three separate times. Same query. Same sigh. Same manual rewrite to flatten it.&lt;/p&gt;

&lt;p&gt;I shipped logdive to scratch my own itch. My itch wasn't done itching.&lt;/p&gt;

&lt;p&gt;v0.3.0 is five things my heart kept asking for while using v0.2. Parenthesised queries are the headline; I literally promised that in the v0.2 article. But there's also pagination in both the CLI and the API, case-insensitive level queries (because &lt;code&gt;level=ERROR&lt;/code&gt; and &lt;code&gt;level=error&lt;/code&gt; should absolutely be the same thing), a distroless Docker image that doesn't need curl to healthcheck itself, and a website that now exists.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# The thing I kept trying to write&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'(level=error OR level=warn) AND service=payments'&lt;/span&gt;

&lt;span class="c"&gt;# Page through results instead of drowning in them&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'service=payments'&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 50 &lt;span class="nt"&gt;--offset&lt;/span&gt; 100

&lt;span class="c"&gt;# Or through the API — same thing, different surface&lt;/span&gt;
curl &lt;span class="s1"&gt;'http://localhost:4000/query?q=service%3Dpayments&amp;amp;limit=50&amp;amp;offset=100'&lt;/span&gt;

&lt;span class="c"&gt;# These are finally identical. They were not.&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=ERROR'&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=error'&lt;/span&gt;

&lt;span class="c"&gt;# Smaller container. No curl. Healthchecks itself.&lt;/span&gt;
docker pull ghcr.io/aryagorjipour/logdive:0.3.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo &lt;span class="nb"&gt;install &lt;/span&gt;logdive logdive-api &lt;span class="nt"&gt;--force&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;417 tests passing. Five milestones. Binaries still 3.9 MB and 4.2 MB.&lt;/p&gt;




&lt;h2&gt;
  
  
  M1 — Parenthesised queries
&lt;/h2&gt;

&lt;p&gt;This was the one. The v0.2 article literally put it at the top of the contributions list and called it "the v0.3 flagship." Accountability shipped.&lt;/p&gt;

&lt;p&gt;The v0.2 grammar had two levels: &lt;code&gt;OR &amp;gt; AND&lt;/code&gt;. AND binds tighter. Good enough until you need an OR of several conditions combined with an AND of something else. Then you're distributing AND over OR by hand, and that's not what you want at midnight.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query     := or_expr [ TIME_RANGE ]
or_expr   := and_expr (OR and_expr)*
and_expr  := clause (AND clause)*
clause    := field OP value
           | field CONTAINS string
           | "(" or_expr ")"    ← new
           | TIME_RANGE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;(level=error OR level=warn) AND service=payments&lt;/code&gt; generates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="k"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;level&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;level&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;json_extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'$.service'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The inner group gets its own SQL sub-expression. You can't construct something that silently breaks precedence — the generator always parenthesises each nesting level.&lt;/p&gt;
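
&lt;p&gt;A minimal sketch of that "always parenthesise" rule, not logdive's actual generator. Every helper name here (&lt;code&gt;level_eq&lt;/code&gt;, &lt;code&gt;json_eq&lt;/code&gt;, &lt;code&gt;or&lt;/code&gt;, &lt;code&gt;and&lt;/code&gt;, &lt;code&gt;group&lt;/code&gt;, &lt;code&gt;where_clause&lt;/code&gt;, &lt;code&gt;example&lt;/code&gt;) is made up for illustration, and it glues strings together instead of walking the real AST.&lt;/p&gt;

```rust
// Illustrative only: these helpers are hypothetical, not logdive-core's API.

// Predicate on the level column, with a bind placeholder.
fn level_eq() -> String {
    "lower(level) = ?".to_string()
}

// Predicate on a JSON field, with a bind placeholder.
fn json_eq(field: String) -> String {
    format!("json_extract(fields, '$.{}') = ?", field)
}

fn or(left: String, right: String) -> String {
    format!("{} OR {}", left, right)
}

fn and(left: String, right: String) -> String {
    format!("{} AND {}", left, right)
}

// A parenthesised group always gets its own pair of parens.
fn group(inner: String) -> String {
    format!("({})", inner)
}

// The top-level expression is wrapped once more.
fn where_clause(expr: String) -> String {
    format!("WHERE ({})", expr)
}

// (level=error OR level=warn) AND service=payments
fn example() -> String {
    where_clause(and(
        group(or(level_eq(), level_eq())),
        json_eq("service".to_string()),
    ))
}
```

&lt;p&gt;Calling &lt;code&gt;example()&lt;/code&gt; yields the same &lt;code&gt;WHERE&lt;/code&gt; clause as the SQL above. Because &lt;code&gt;group&lt;/code&gt; adds its parens unconditionally, nesting depth in the query maps one-to-one to nesting depth in the SQL.&lt;/p&gt;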

&lt;p&gt;The new AST variant is &lt;code&gt;Clause::Group(Box&amp;lt;QueryNode&amp;gt;)&lt;/code&gt;. The &lt;code&gt;Box&lt;/code&gt; is there because Rust won't let you have a recursively-sized type without it, which is a very Rust thing to be strict about. An arena would be cleaner if query parsing were a hot path. It isn't, so — &lt;code&gt;Box&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  M2 — &lt;code&gt;--offset&lt;/code&gt; and the rename I had to make
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The additive part:&lt;/strong&gt; &lt;code&gt;--offset&lt;/code&gt; is now a real flag.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;logdive query &lt;span class="s1"&gt;'service=payments'&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 50
logdive query &lt;span class="s1"&gt;'service=payments'&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 50 &lt;span class="nt"&gt;--offset&lt;/span&gt; 50
logdive query &lt;span class="s1"&gt;'service=payments'&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 50 &lt;span class="nt"&gt;--offset&lt;/span&gt; 100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--offset 0&lt;/code&gt; and no flag at all are the same thing. Default limit is still 1000. &lt;code&gt;--limit 0&lt;/code&gt; still means "all of them, good luck."&lt;/p&gt;
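
&lt;p&gt;If you think in pages rather than offsets, the arithmetic is small enough to sketch. &lt;code&gt;page_offset&lt;/code&gt; and &lt;code&gt;page_flags&lt;/code&gt; are hypothetical helpers for a wrapper script, not part of logdive, and they assume pages are numbered from 1.&lt;/p&gt;

```rust
// Hypothetical helpers for turning a 1-based page number into CLI flags.

// Rows to skip before the requested page starts.
// Page 1 (and, defensively, page 0) starts at offset 0.
fn page_offset(page: usize, limit: usize) -> usize {
    page.saturating_sub(1).saturating_mul(limit)
}

// Flags to append to `logdive query '...'` for that page.
fn page_flags(page: usize, limit: usize) -> String {
    format!("--limit {} --offset {}", limit, page_offset(page, limit))
}
```

&lt;p&gt;With a limit of 50, pages 1, 2 and 3 map to offsets 0, 50 and 100, which are the three commands above.&lt;/p&gt;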

&lt;p&gt;Adding offset meant &lt;code&gt;execute(query, conn, Option&amp;lt;usize&amp;gt;)&lt;/code&gt; had to become &lt;code&gt;execute(query, conn, QueryOptions)&lt;/code&gt;. A bare &lt;code&gt;Option&amp;lt;usize&amp;gt;&lt;/code&gt; for one parameter is fine. For two it starts getting philosophical. The struct should have been there from v0.1. It's there now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The breaking part:&lt;/strong&gt; &lt;code&gt;--format&lt;/code&gt; on the query subcommand is now &lt;code&gt;--output&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# v0.2&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=error'&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; json

&lt;span class="c"&gt;# v0.3&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=error'&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--format&lt;/code&gt; already existed on &lt;code&gt;ingest&lt;/code&gt; to pick the &lt;em&gt;input&lt;/em&gt; log format (JSON, logfmt, plain). Two different &lt;code&gt;--format&lt;/code&gt; flags doing two different things on two different subcommands is a documentation problem that keeps getting worse. One word to fix it. I did not add a deprecated alias — a deprecated alias that silently works is just confusion that lives for three more versions.&lt;/p&gt;




&lt;h2&gt;
  
  
  M3 — HTTP pagination
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;GET /query&lt;/code&gt; now takes &lt;code&gt;?offset=&lt;/code&gt;. Mirrors &lt;code&gt;--offset&lt;/code&gt; exactly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s1"&gt;'http://localhost:4000/query?q=level%3Derror&amp;amp;limit=50&amp;amp;offset=0'&lt;/span&gt;
curl &lt;span class="s1"&gt;'http://localhost:4000/query?q=level%3Derror&amp;amp;limit=50&amp;amp;offset=50'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The benchmark number I didn't expect: page 1 at 100k rows costs ~42 ms, deep page at offset 2450 costs ~50 ms. 8 ms overhead to skip 2450 rows. That's because &lt;code&gt;LIMIT x OFFSET y&lt;/code&gt; in SQLite counts forward from zero — no scroll cursor, no magic. For building a UI on top of the API it's fine. For "give me rows 500,000 through 500,050" — use a time range query, it'll be faster and make more sense anyway.&lt;/p&gt;




&lt;h2&gt;
  
  
  M4 — &lt;code&gt;level=ERROR&lt;/code&gt; and &lt;code&gt;level=error&lt;/code&gt; are the same query now
&lt;/h2&gt;

&lt;p&gt;This one seems like it should have been there from the start. It wasn't. If your service logs &lt;code&gt;WARN&lt;/code&gt; and you searched &lt;code&gt;level=warn&lt;/code&gt;, you got nothing and probably thought the tool was broken.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# All three use the same lower(level) index; the case you type no longer matters&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=ERROR'&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=warn'&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=Warning'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Implementation: a functional expression index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;idx_level_norm&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;log_entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;level&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The executor routes every level field lookup through &lt;code&gt;lower(level) = ?&lt;/code&gt; with a Rust-lowercased bind value.&lt;/p&gt;
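
&lt;p&gt;Roughly, the two halves of that look like the sketch below. The function names are assumptions for illustration, not the real executor code; the only things taken from the release are the &lt;code&gt;lower(level) = ?&lt;/code&gt; predicate and the fact that the bind value is lowercased in Rust.&lt;/p&gt;

```rust
// Hypothetical helpers, not logdive's real executor code.

// SQL fragment that uses the same expression as the index: lower(level).
fn level_predicate() -> String {
    "lower(level) = ?".to_string()
}

// Lowercase the user's value in Rust before binding it to the placeholder.
fn level_bind_value(raw: String) -> String {
    raw.to_lowercase()
}
```

&lt;p&gt;&lt;code&gt;ERROR&lt;/code&gt;, &lt;code&gt;Error&lt;/code&gt; and &lt;code&gt;error&lt;/code&gt; all bind as &lt;code&gt;error&lt;/code&gt;. Only case is normalised: &lt;code&gt;Warning&lt;/code&gt; binds as &lt;code&gt;warning&lt;/code&gt;, which is still a different value from &lt;code&gt;warn&lt;/code&gt;.&lt;/p&gt;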

&lt;p&gt;The wrong path I went down first: &lt;code&gt;ALTER TABLE ADD COLUMN level_norm TEXT GENERATED ALWAYS AS (lower(level)) STORED&lt;/code&gt;. Generated columns are a real SQLite feature! (One caveat: &lt;code&gt;ALTER TABLE ... ADD COLUMN&lt;/code&gt; only accepts &lt;code&gt;VIRTUAL&lt;/code&gt; generated columns; a &lt;code&gt;STORED&lt;/code&gt; one has to be declared in &lt;code&gt;CREATE TABLE&lt;/code&gt;.) Either way, existing databases need a migration while new installs get the column from &lt;code&gt;CREATE TABLE&lt;/code&gt;, so you need a version guard to tell them apart. The expression index needs none of that: &lt;code&gt;CREATE INDEX IF NOT EXISTS&lt;/code&gt; is idempotent, runs on every &lt;code&gt;Indexer::open()&lt;/code&gt;, and picks up existing databases automatically.&lt;/p&gt;

&lt;p&gt;It's in &lt;code&gt;docs/traps.md&lt;/code&gt; now. That file is starting to earn its name.&lt;/p&gt;

&lt;p&gt;The benchmark result: lowercase, uppercase, mixed-case level queries on 100k rows — all three at ~51 ms. Identical. The index is doing exactly what it's supposed to do.&lt;/p&gt;




&lt;h2&gt;
  
  
  M5 — Distroless Docker and &lt;code&gt;--health-check&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The v0.2 Dockerfile had this healthcheck:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;HEALTHCHECK&lt;/span&gt;&lt;span class="s"&gt; CMD curl -fs http://localhost:4000/version || exit 1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;curl is 3.6 MB. It's in the image to make one TCP connection every 30 seconds. I finally stopped accepting this.&lt;/p&gt;

&lt;p&gt;Runtime stage is now &lt;code&gt;gcr.io/distroless/cc-debian12:nonroot&lt;/code&gt;. No shell, no package manager, no curl, uid 65532. Container drops to ~15 MB.&lt;/p&gt;

&lt;p&gt;Since distroless has no shell, a shell-form &lt;code&gt;CMD curl ...&lt;/code&gt; healthcheck has nothing to run it: no &lt;code&gt;/bin/sh&lt;/code&gt;, no curl. Good. That forced me to do the right thing.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;logdive-api&lt;/code&gt; now takes &lt;code&gt;--health-check&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;logdive-api &lt;span class="nt"&gt;--health-check&lt;/span&gt;
&lt;span class="c"&gt;# opens TcpStream::connect("127.0.0.1:&amp;lt;port&amp;gt;")&lt;/span&gt;
&lt;span class="c"&gt;# exits 0 if the server is up, exits 1 if it isn't&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;HEALTHCHECK&lt;/span&gt;&lt;span class="s"&gt; CMD ["/usr/local/bin/logdive-api", "--health-check"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The binary checks itself. No curl. No shell.&lt;/p&gt;
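
&lt;p&gt;The whole check fits in a few lines of standard library. This is a sketch of the idea rather than the code in &lt;code&gt;logdive-api&lt;/code&gt;: &lt;code&gt;server_is_up&lt;/code&gt; and &lt;code&gt;health_exit_code&lt;/code&gt; are hypothetical names, and it assumes the server listens on loopback.&lt;/p&gt;

```rust
use std::net::TcpStream;

// Assumption: the API listens on 127.0.0.1; the port is whatever the
// server was configured with. A successful TCP connect counts as "up".
fn server_is_up(port: u16) -> bool {
    TcpStream::connect(("127.0.0.1", port)).is_ok()
}

// Exit status for Docker's HEALTHCHECK: 0 = healthy, 1 = unhealthy.
fn health_exit_code(up: bool) -> i32 {
    if up { 0 } else { 1 }
}
```

&lt;p&gt;A &lt;code&gt;main&lt;/code&gt; that handles the flag would end with something like &lt;code&gt;std::process::exit(health_exit_code(server_is_up(4000)))&lt;/code&gt;. Note that this only proves the port accepts connections; it does not send an HTTP request the way the old curl check did.&lt;/p&gt;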

&lt;p&gt;One trap: you can't do &lt;code&gt;RUN mkdir -p /data&lt;/code&gt; in a distroless runtime stage. No shell to interpret it. You have to create the directory in the builder stage and copy it across:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Builder&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /data

&lt;span class="c"&gt;# Runtime&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /data /data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The error message when you forget this is not particularly helpful. Now it's here instead.&lt;/p&gt;




&lt;h2&gt;
  
  
  Breaking changes
&lt;/h2&gt;

&lt;p&gt;Three things that will break something:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;Old&lt;/th&gt;
&lt;th&gt;New&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;query --format json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;query --output json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;one word in your scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;logdive-core&lt;/code&gt; lib&lt;/td&gt;
&lt;td&gt;&lt;code&gt;execute(q, conn, Option&amp;lt;usize&amp;gt;)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;execute(q, conn, QueryOptions)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;see below&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Docker&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;curl GET /version&lt;/code&gt; healthcheck&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--health-check&lt;/code&gt; TCP flag&lt;/td&gt;
&lt;td&gt;update compose / k8s probes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Library migration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// v0.2&lt;/span&gt;
&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// v0.3&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;logdive_core&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;QueryOptions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;QueryOptions&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;None&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;QueryOptions&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;default&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// no limit, no offset&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;QueryOptions::default()&lt;/code&gt; is zero-offset, no-limit. If you were passing &lt;code&gt;None&lt;/code&gt; before, it's a drop-in.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benchmarks
&lt;/h2&gt;

&lt;p&gt;New groups for v0.3 features (100k rows):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Number&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OR queries&lt;/td&gt;
&lt;td&gt;2-branch, 50% match&lt;/td&gt;
&lt;td&gt;68 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OR queries&lt;/td&gt;
&lt;td&gt;4-branch, 100% match&lt;/td&gt;
&lt;td&gt;99 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OR queries&lt;/td&gt;
&lt;td&gt;JSON field&lt;/td&gt;
&lt;td&gt;2.5 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Paren groups&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;(A OR B) AND C&lt;/code&gt;, 12.5% match&lt;/td&gt;
&lt;td&gt;45 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Case-insensitive level&lt;/td&gt;
&lt;td&gt;lowercase / UPPERCASE / Mixed&lt;/td&gt;
&lt;td&gt;~51 ms, identical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pagination&lt;/td&gt;
&lt;td&gt;page 1&lt;/td&gt;
&lt;td&gt;42 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pagination&lt;/td&gt;
&lt;td&gt;deep page (offset 2450)&lt;/td&gt;
&lt;td&gt;50 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Ingest numbers haven't changed (v0.2 ingest paths weren't touched):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Number&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Batched insert, 10k rows&lt;/td&gt;
&lt;td&gt;~189k rows/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parse + insert, 10k rows&lt;/td&gt;
&lt;td&gt;~150k rows/s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 4-branch OR at 99 ms looks alarming until you remember it's returning every single row from a 100k-row corpus. The bottleneck is serialisation. The query engine itself is fine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tradeoffs, I'll be honest
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;--format&lt;/code&gt; → &lt;code&gt;--output&lt;/code&gt; will break your scripts.&lt;/strong&gt; No alias. One word, then it's clean. If you want to be annoyed at me about it — fair, but this was the right call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Distroless means no shell in the container.&lt;/strong&gt; You can't &lt;code&gt;docker exec -it mycontainer bash&lt;/code&gt; anymore. Use &lt;code&gt;gcr.io/distroless/cc-debian12:debug&lt;/code&gt; if you need to poke around. The tradeoff is worth it, but it's a real operational change worth knowing about before you're in an incident.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deep offset pagination has a cost.&lt;/strong&gt; SQLite walks from row zero every time. For page 50 it's fine. For page 50,000, consider a time-range query instead — it'll be faster and is usually what you actually wanted anyway.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The full landing page redesign is still pending.&lt;/strong&gt; The site has v0.3.0 content — accurate numbers, updated terminal preview. The full Astro 5 + Tailwind 4 redesign is waiting on a design file. Not v0.3's problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  The website is live
&lt;/h2&gt;

&lt;p&gt;Speaking of which: &lt;a href="https://aryagorjipour.github.io/logdive" rel="noopener noreferrer"&gt;aryagorjipour.github.io/logdive&lt;/a&gt; is updated and real.&lt;/p&gt;

&lt;p&gt;Stat cards reflect the current benchmarks. Terminal preview shows &lt;code&gt;--output json&lt;/code&gt; and a paren query. The roadmap section is accurate. There's a GitHub stars counter that does a client-side fetch and falls back to &lt;code&gt;—&lt;/code&gt; gracefully if the API is having a day.&lt;/p&gt;

&lt;p&gt;Go look at it. Tell me what's wrong with it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next — and I'm taking a break after this
&lt;/h2&gt;

&lt;p&gt;v0.4.0 planned scope:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Benchmark suite at 500k rows (100k isn't stressful enough for the executor's real hot paths)&lt;/li&gt;
&lt;li&gt;Query latency improvements&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--output yaml&lt;/code&gt; and &lt;code&gt;--output csv&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Windows &lt;code&gt;--follow&lt;/code&gt; — the &lt;code&gt;(dev, ino)&lt;/code&gt; rotation check has been Unix-only since v0.2&lt;/li&gt;
&lt;li&gt;Configurable retention by source/tag&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But honestly: after this release I'm stepping back from logdive for a bit to work on some other projects. v0.3.0 is in a clean state. &lt;code&gt;prerelease-check.sh&lt;/code&gt; passes all 11 steps. 417 tests green. The breaking changes are documented.&lt;/p&gt;

&lt;p&gt;Good place to breathe.&lt;/p&gt;

&lt;p&gt;If something genuinely breaks (security issue, data loss) — file an issue, I'll look. Everything else waits for v0.4.0.&lt;/p&gt;




&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/Aryagorjipour/logdive" rel="noopener noreferrer"&gt;github.com/Aryagorjipour/logdive&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Website:&lt;/strong&gt; &lt;a href="https://aryagorjipour.github.io/logdive" rel="noopener noreferrer"&gt;aryagorjipour.github.io/logdive&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crates:&lt;/strong&gt; &lt;a href="https://crates.io/crates/logdive" rel="noopener noreferrer"&gt;logdive&lt;/a&gt; · &lt;a href="https://crates.io/crates/logdive-core" rel="noopener noreferrer"&gt;logdive-core&lt;/a&gt; · &lt;a href="https://crates.io/crates/logdive-api" rel="noopener noreferrer"&gt;logdive-api&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker:&lt;/strong&gt; &lt;code&gt;ghcr.io/aryagorjipour/logdive&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Arya Gorjipour — backend engineer, logdive maintainer.&lt;br&gt;
&lt;a href="https://github.com/Aryagorjipour" rel="noopener noreferrer"&gt;@Aryagorjipour&lt;/a&gt; · &lt;a href="https://twitter.com/Arysmart1" rel="noopener noreferrer"&gt;@Arysmart1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you run &lt;code&gt;cargo bench&lt;/code&gt; on your machine and the numbers are interesting — I want to see them. If you debug a real incident with v0.3.0 — I &lt;em&gt;really&lt;/em&gt; want to hear about that.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>opensource</category>
      <category>showdev</category>
      <category>devops</category>
    </item>
    <item>
      <title>logdive v0.2.0: OR queries, follow mode, and the features I shipped after I tried using my own tool</title>
      <dc:creator>Arya Gorjipour</dc:creator>
      <pubDate>Tue, 26 May 2026 18:00:00 +0000</pubDate>
      <link>https://dev.to/arysmart/logdive-v020-or-queries-follow-mode-and-the-features-i-shipped-after-i-tried-using-my-own-tool-2b7l</link>
      <guid>https://dev.to/arysmart/logdive-v020-or-queries-follow-mode-and-the-features-i-shipped-after-i-tried-using-my-own-tool-2b7l</guid>
      <description>&lt;p&gt;Two weeks after I shipped logdive v0.1.0, I tried using it.&lt;br&gt;
Within an hour I wanted to write &lt;code&gt;level=error OR level=warn&lt;/code&gt;. The README said no.&lt;br&gt;
Within a day I wanted to tail a growing log file. The README said no.&lt;br&gt;
Within a week I had a 1.8 GB index from a CI pipeline and no way to trim it. The README didn't say anything, because I hadn't thought about it.&lt;br&gt;
logdive v0.2.0 ships everything the v0.1 README told you not to ask for.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://asciinema.org/a/6N30YZro5O7Z7kXJ" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fasciinema.org%2Fa%2F6N30YZro5O7Z7kXJ.svg" alt="asciicast|0" width="1667" height="933"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Twenty seconds: ingest a couple of structured lines, then query them with &lt;code&gt;OR&lt;/code&gt;. No daemon. No config file. One binary.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What's new at a glance
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Tail a live log file — finally.&lt;/span&gt;
logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; ./logs/app.log &lt;span class="nt"&gt;--follow&lt;/span&gt;

&lt;span class="c"&gt;# logfmt and plain text, not just JSON.&lt;/span&gt;
logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; ./nginx-access.log &lt;span class="nt"&gt;--format&lt;/span&gt; logfmt &lt;span class="nt"&gt;--tag&lt;/span&gt; nginx
logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; ./app-old.log &lt;span class="nt"&gt;--format&lt;/span&gt; plain &lt;span class="nt"&gt;--timestamp-now&lt;/span&gt;

&lt;span class="c"&gt;# OR. Just OR.&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=error OR level=warn'&lt;/span&gt;

&lt;span class="c"&gt;# Trim the index before it eats your disk.&lt;/span&gt;
logdive prune &lt;span class="nt"&gt;--older-than&lt;/span&gt; 30d

&lt;span class="c"&gt;# Run the whole thing in Docker. Multi-arch. /data persists.&lt;/span&gt;
docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; logdive-data:/data &lt;span class="nt"&gt;-p&lt;/span&gt; 4000:4000 ghcr.io/aryagorjipour/logdive

&lt;span class="c"&gt;# Tell a client what the running server can do, with one call.&lt;/span&gt;
curl http://localhost:4000/version
&lt;span class="c"&gt;# → {"version":"0.2.0","formats":["json","logfmt","plain"],"capabilities":["query","stats","version"]}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Available now:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo &lt;span class="nb"&gt;install &lt;/span&gt;logdive logdive-api &lt;span class="nt"&gt;--force&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
docker pull ghcr.io/aryagorjipour/logdive:0.2.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Six numbered milestones. 330 tests passing. Both binaries still under 10 MB (&lt;code&gt;logdive&lt;/code&gt; 3.9 MB, &lt;code&gt;logdive-api&lt;/code&gt; 4.2 MB). The compiled Docker image lands at 97 MB.&lt;/p&gt;

&lt;h2&gt;
  
  
  "v1 non-goals" was an aspirational list. Then I read it back.
&lt;/h2&gt;

&lt;p&gt;The v0.1 README had a tidy little section calling six things explicit non-goals. Some of them genuinely should be non-goals (multi-machine clustering, log shipping daemons). Three of them were just "things I didn't want to write yet, in a costume."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OR queries.&lt;/strong&gt; I argued in the v0.1 post that AND covered the dominant query pattern and OR would roughly double the parser. Both were true. Both were also irrelevant the first time I needed &lt;code&gt;level=error OR level=warn&lt;/code&gt; during a real incident and couldn't write it. "I don't want to grow the parser" is a feeling, not an engineering position.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Follow mode.&lt;/strong&gt; v0.1 was a batch tool. Pipe a file in, query it, walk away. The first time I wanted to watch a container that was actively misbehaving, the missing &lt;code&gt;--follow&lt;/code&gt; flag turned a five-second loop into "re-ingest every thirty seconds." That's not a tool — that's a chore the tool is now causing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Non-JSON ingestion.&lt;/strong&gt; "v1 is JSON-only" sounded principled. It stopped sounding principled the first time someone tried logdive against nginx access logs or a legacy Java service emitting Apache-style plaintext with a &lt;code&gt;[ERROR]&lt;/code&gt; prefix.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pruning.&lt;/strong&gt; This one I just didn't think about. Then I ran logdive against a CI pipeline for two weeks and the index hit 1.8 GB. The fix in v0.1 was &lt;code&gt;rm ~/.logdive/index.db &amp;amp;&amp;amp; start over&lt;/code&gt;. The kind of UX I'd mock in someone else's tool.&lt;/p&gt;

&lt;p&gt;So: v0.2.0 ships all four. Plus a Docker image, plus a versioned API, because if you're going to break your own scope rules you may as well do it once.&lt;/p&gt;

&lt;h2&gt;
  
  
  M1 — OR
&lt;/h2&gt;

&lt;p&gt;The query language now has a real two-level grammar:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query    := and_expr (OR and_expr)*
and_expr := clause (AND clause)*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AND binds tighter than OR, the way it does in SQL and the way your fingers expect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;logdive query &lt;span class="s1"&gt;'level=error OR level=warn'&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=error AND service=payments OR level=warn AND tag=worker'&lt;/span&gt;
&lt;span class="c"&gt;# Reads as: (level=error AND service=payments) OR (level=warn AND tag=worker)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parenthesised groups (&lt;code&gt;(a OR b) AND c&lt;/code&gt;) are still out; that's the headline v0.3 milestone. AND+OR covers ~95% of real queries; the remaining 5% you can reshape by distributing AND over OR (&lt;code&gt;a AND c OR b AND c&lt;/code&gt;) and a small sigh.&lt;/p&gt;

&lt;p&gt;Implementation note for anyone curious: it's still a hand-written recursive-descent parser, no parser-combinator library, ~340 lines of pure Rust in &lt;code&gt;crates/core/src/query.rs&lt;/code&gt;. Two new methods (&lt;code&gt;parse_or_expr&lt;/code&gt;, &lt;code&gt;parse_and_expr&lt;/code&gt;), one new AST variant (&lt;code&gt;AndGroup { clauses: Vec&amp;lt;Clause&amp;gt; }&lt;/code&gt;), and a SQL generator that always parenthesises each AND-group so &lt;code&gt;WHERE&lt;/code&gt; clauses come out unambiguous:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;json_extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'$.service'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This milestone carries one breaking change for library users: &lt;code&gt;QueryNode::And(Vec&amp;lt;Clause&amp;gt;)&lt;/code&gt; is now &lt;code&gt;QueryNode::Or(Vec&amp;lt;AndGroup&amp;gt;)&lt;/code&gt;. Even single-clause queries wrap in the two-level structure: uniformity for the executor, slight clumsiness if you were pattern-matching. CLI users see nothing. (The other breaking change, a new &lt;code&gt;parse_line&lt;/code&gt; signature, is covered in the upgrade notes.)&lt;/p&gt;
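
&lt;p&gt;To make the precedence rule concrete, here is a minimal sketch of the two-level grammar. It is not the code in &lt;code&gt;crates/core/src/query.rs&lt;/code&gt;: the &lt;code&gt;parse_query&lt;/code&gt; and &lt;code&gt;render&lt;/code&gt; functions are hypothetical, clauses are kept as opaque strings, and quoting is ignored entirely. The only thing it tries to show is that AND collects clauses into a group while OR starts a new one.&lt;/p&gt;

```rust
/// One AND-group: every clause in it must match.
#[derive(Debug, PartialEq)]
struct AndGroup {
    clauses: Vec<String>,
}

/// Parses `query := and_expr (OR and_expr)*`, `and_expr := clause (AND clause)*`.
/// Clauses stay opaque strings here; quoting and clause validation are skipped.
fn parse_query(input: &str) -> Result<Vec<AndGroup>, String> {
    let mut groups = vec![AndGroup { clauses: Vec::new() }];
    let mut current = String::new();

    for token in input.split_whitespace() {
        let is_or = token.eq_ignore_ascii_case("OR");
        if is_or || token.eq_ignore_ascii_case("AND") {
            if current.is_empty() {
                return Err(format!("expected a clause before `{token}`"));
            }
            // Close the clause collected so far into the open AND-group.
            let clause = std::mem::take(&mut current);
            groups.last_mut().expect("groups is never empty").clauses.push(clause);
            if is_or {
                // OR binds looser than AND, so it starts a new group.
                groups.push(AndGroup { clauses: Vec::new() });
            }
        } else {
            if !current.is_empty() {
                current.push(' ');
            }
            current.push_str(token);
        }
    }

    if current.is_empty() {
        return Err("query is empty or ends with an operator".to_string());
    }
    groups.last_mut().expect("groups is never empty").clauses.push(current);
    Ok(groups)
}

/// Renders the groups with one pair of parentheses per AND-group.
fn render(groups: &[AndGroup]) -> String {
    groups
        .iter()
        .map(|g| format!("({})", g.clauses.join(" AND ")))
        .collect::<Vec<_>>()
        .join(" OR ")
}

fn main() {
    let q = "level=error AND service=payments OR level=warn AND tag=worker";
    match parse_query(q) {
        Ok(groups) => println!("{}", render(&groups)),
        Err(e) => eprintln!("parse error: {e}"),
    }
}
```

&lt;p&gt;Because every group is parenthesised on the way out, the rendered form stays unambiguous however many disjuncts are stacked.&lt;/p&gt;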

&lt;h2&gt;
  
  
  M2 — logfmt and plain
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;logdive ingest&lt;/code&gt; gained a &lt;code&gt;--format&lt;/code&gt; flag with three values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Default — JSON, identical to v0.1&lt;/span&gt;
logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; app.log

&lt;span class="c"&gt;# logfmt — service=payments user_id=4812 duration_ms=120&lt;/span&gt;
logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; legacy.log &lt;span class="nt"&gt;--format&lt;/span&gt; logfmt

&lt;span class="c"&gt;# Plain text — whole line becomes the `message` field&lt;/span&gt;
logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; random-app.log &lt;span class="nt"&gt;--format&lt;/span&gt; plain
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The logfmt parser is hand-written: roughly 200 lines, no &lt;code&gt;nom&lt;/code&gt;, no &lt;code&gt;pest&lt;/code&gt;. It handles escaped quotes (&lt;code&gt;\"&lt;/code&gt;, &lt;code&gt;\\&lt;/code&gt;), bareword booleans (&lt;code&gt;debug&lt;/code&gt; → &lt;code&gt;debug=true&lt;/code&gt;, the Heroku convention), hyphenated and dotted keys (&lt;code&gt;request-id&lt;/code&gt;, &lt;code&gt;user.id&lt;/code&gt;), and the last-write-wins rule on duplicate keys that &lt;code&gt;go-kit/log&lt;/code&gt; uses. A malformed token inside an otherwise-fine line is skipped to the next whitespace boundary; only a truly fatal condition — empty input, no parseable pairs, an unterminated quote — drops the whole line. Real-world logfmt is messy; the parser tries to be the more polite party.&lt;/p&gt;
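
&lt;p&gt;A much smaller sketch of the same rules, for illustration only. The &lt;code&gt;parse_logfmt&lt;/code&gt; function is hypothetical and far simpler than the real parser: it covers quoted values with backslash escapes, bareword booleans, last-write-wins, and dropping a line on an unterminated quote. As a simplification, a backslash keeps whatever character follows it, and a token with an empty key is skipped.&lt;/p&gt;

```rust
use std::collections::BTreeMap;

/// Parses one logfmt line into key/value pairs.
/// Returns None for fatal lines: no pairs at all, or an unterminated quote.
fn parse_logfmt(line: &str) -> Option<BTreeMap<String, String>> {
    let mut pairs = BTreeMap::new();
    let mut chars = line.chars().peekable();

    loop {
        // Skip whitespace between tokens; stop at end of input.
        while chars.next_if(|c| c.is_whitespace()).is_some() {}
        if chars.peek().is_none() {
            break;
        }

        // Read the key up to `=` or whitespace.
        let mut key = String::new();
        while let Some(c) = chars.next_if(|c| *c != '=' && !c.is_whitespace()) {
            key.push(c);
        }

        // Bareword: `debug` is treated as `debug=true`.
        if chars.next_if_eq(&'=').is_none() {
            pairs.insert(key, "true".to_string());
            continue;
        }

        let mut value = String::new();
        if chars.next_if_eq(&'"').is_some() {
            // Quoted value: running out of input means the quote never closed.
            loop {
                match chars.next()? {
                    '"' => break,
                    '\\' => value.push(chars.next()?),
                    c => value.push(c),
                }
            }
        } else {
            while let Some(c) = chars.next_if(|c| !c.is_whitespace()) {
                value.push(c);
            }
        }

        // A token such as `=foo` has no key: skip it, keep the rest of the line.
        if !key.is_empty() {
            pairs.insert(key, value); // a later duplicate overwrites an earlier one
        }
    }

    if pairs.is_empty() { None } else { Some(pairs) }
}

fn main() {
    let line = r#"service=payments msg="card \"declined\"" debug service=billing"#;
    println!("{:?}", parse_logfmt(line));
}
```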

&lt;p&gt;Plain is what it sounds like. The entire line becomes &lt;code&gt;LogEntry::message&lt;/code&gt;. No timestamp parsing, no level extraction, no heuristics — because every plaintext format encodes those differently and a wrong guess silently corrupts your &lt;code&gt;last 2h&lt;/code&gt; queries.&lt;/p&gt;

&lt;p&gt;For formats without their own timestamps, a new &lt;code&gt;--timestamp-now&lt;/code&gt; flag stamps each entry with the current ingestion time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker logs my-container | logdive ingest &lt;span class="nt"&gt;--format&lt;/span&gt; plain &lt;span class="nt"&gt;--timestamp-now&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Opt-in. v0.1's skip-if-no-timestamp policy is still the default, because fabricating timestamps without explicit consent is the kind of foot-gun you discover at 3am.&lt;/p&gt;

&lt;h2&gt;
  
  
  M3 — &lt;code&gt;--follow&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; ./logs/app.log &lt;span class="nt"&gt;--follow&lt;/span&gt;
&lt;span class="c"&gt;# Ctrl-C to exit. Detects rotation. Handles truncation.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The implementation lives in &lt;code&gt;crates/core/src/follow.rs&lt;/code&gt;, gated behind &lt;code&gt;#[cfg(unix)]&lt;/code&gt;. It's built on the &lt;code&gt;notify&lt;/code&gt; crate for inotify/FSEvents/kqueue plus a &lt;code&gt;FileTailer&lt;/code&gt; struct that tracks &lt;code&gt;(dev, ino, offset)&lt;/code&gt; and reacts to two conditions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Rotation&lt;/strong&gt; — &lt;code&gt;(dev, ino)&lt;/code&gt; on the watched path changes. The handle gets dropped, the path is reopened, the offset resets to zero.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Truncation&lt;/strong&gt; — same inode, but the file size shrank below the tracked offset. &lt;code&gt;someone-ran &amp;gt; /var/log/app.log&lt;/code&gt;. Offset resets, reading continues from the top.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Both checks run on every read. The tailer starts at EOF (matches &lt;code&gt;tail -f&lt;/code&gt;), buffers partial lines until the newline arrives, and strips &lt;code&gt;\r\n&lt;/code&gt; for the Windows-line-ending crowd. Fifteen unit tests in &lt;code&gt;follow.rs&lt;/code&gt; cover the boring edges: 20 KiB single line spanning three read buffers, unicode, blank lines, mid-rotation gap where the file briefly doesn't exist.&lt;/p&gt;
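
&lt;p&gt;The two checks reduce to a small decision function. The sketch below is an assumption about the shape of the logic, not a copy of &lt;code&gt;follow.rs&lt;/code&gt;: &lt;code&gt;FileState&lt;/code&gt;, &lt;code&gt;Change&lt;/code&gt;, &lt;code&gt;classify&lt;/code&gt; and &lt;code&gt;check_path&lt;/code&gt; are hypothetical names. Only &lt;code&gt;std::os::unix::fs::MetadataExt&lt;/code&gt; comes from the text above.&lt;/p&gt;

```rust
/// What the tailer remembers about the file it is reading.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FileState {
    dev: u64,
    ino: u64,
    offset: u64,
}

#[derive(Debug, PartialEq)]
enum Change {
    /// A different file now lives at the watched path: reopen, offset = 0.
    Rotated,
    /// Same file, but it shrank below our offset: offset = 0.
    Truncated,
    /// Same file, nothing lost: keep reading from the tracked offset.
    Unchanged,
}

/// Pure decision logic: compare tracked state with fresh metadata values.
fn classify(tracked: FileState, dev: u64, ino: u64, len: u64) -> Change {
    if (dev, ino) != (tracked.dev, tracked.ino) {
        Change::Rotated
    } else if len < tracked.offset {
        Change::Truncated
    } else {
        Change::Unchanged
    }
}

/// Unix-only wrapper that reads (dev, ino, len) from the filesystem.
#[cfg(unix)]
fn check_path(path: &std::path::Path, tracked: FileState) -> std::io::Result<Change> {
    use std::os::unix::fs::MetadataExt;
    let meta = std::fs::metadata(path)?;
    Ok(classify(tracked, meta.dev(), meta.ino(), meta.len()))
}

fn main() {
    let tracked = FileState { dev: 1, ino: 42, offset: 500 };
    println!("{:?}", classify(tracked, 1, 42, 100));

    #[cfg(unix)]
    {
        // Optional: inspect a real file passed as the first argument.
        if let Some(path) = std::env::args().nth(1) {
            let fresh = FileState { dev: 0, ino: 0, offset: 0 };
            println!("{:?}", check_path(std::path::Path::new(&path), fresh));
        }
    }
}
```

&lt;p&gt;Keeping the comparison separate from the &lt;code&gt;metadata&lt;/code&gt; call means the edge cases can be tested without touching a real file.&lt;/p&gt;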

&lt;p&gt;The CLI loop wires it together with &lt;code&gt;ctrlc&lt;/code&gt; for clean SIGINT/SIGTERM handling and &lt;code&gt;std::sync::mpsc&lt;/code&gt; for the watcher → ingest channel. The CLI stays fully synchronous — no &lt;code&gt;tokio&lt;/code&gt; dependency, because watching a file and reading lines does not need an async runtime, and adding one for the sake of fashion was tempting and I resisted.&lt;/p&gt;

&lt;p&gt;One real limitation: Windows. The &lt;code&gt;(dev, ino)&lt;/code&gt; trick uses &lt;code&gt;std::os::unix::fs::MetadataExt&lt;/code&gt;. The module is gated; on Windows the flag is rejected with an error. Cross-platform rotation detection is a clean v0.3 contribution.&lt;/p&gt;

&lt;h2&gt;
  
  
  M4 — &lt;code&gt;prune&lt;/code&gt; and &lt;code&gt;LOGDIVE_DB&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;logdive prune &lt;span class="nt"&gt;--older-than&lt;/span&gt; 30d
logdive prune &lt;span class="nt"&gt;--before&lt;/span&gt; 2026-01-01
logdive prune &lt;span class="nt"&gt;--older-than&lt;/span&gt; 7d &lt;span class="nt"&gt;--yes&lt;/span&gt;  &lt;span class="c"&gt;# skip [y/N] prompt for cron&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two mutually-exclusive flags, one of which is required (clap's &lt;code&gt;ArgGroup&lt;/code&gt; enforces this). &lt;code&gt;--older-than&lt;/code&gt; accepts a count plus a unit (&lt;code&gt;m&lt;/code&gt;/&lt;code&gt;h&lt;/code&gt;/&lt;code&gt;d&lt;/code&gt;); &lt;code&gt;--before&lt;/code&gt; accepts RFC 3339, naive UTC datetime, or a bare date. By default &lt;code&gt;prune&lt;/code&gt; counts the doomed rows, shows you the number, and asks for confirmation. &lt;code&gt;--yes&lt;/code&gt; skips for scripts.&lt;/p&gt;

&lt;p&gt;Under the hood it's &lt;code&gt;DELETE FROM log_entries WHERE timestamp &amp;lt; ?1&lt;/code&gt; followed by &lt;code&gt;VACUUM&lt;/code&gt;. The &lt;code&gt;VACUUM&lt;/code&gt; is a separate statement because SQLite refuses to run it inside an explicit transaction — a thing I learned by writing the obvious code first, hitting the error, and reading the SQLite docs second.&lt;/p&gt;

&lt;p&gt;Comparison is strict &lt;code&gt;&amp;lt;&lt;/code&gt;, so a row whose timestamp exactly equals the cutoff is kept. Surprising? Yes, slightly. Documented? Now it is.&lt;/p&gt;
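
&lt;p&gt;For illustration, a sketch of how a &lt;code&gt;--older-than&lt;/code&gt; value could be turned into a duration, and what the strict comparison means at the boundary. The &lt;code&gt;parse_older_than&lt;/code&gt; and &lt;code&gt;is_pruned&lt;/code&gt; helpers are hypothetical, and timestamps are assumed to be plain integer seconds to keep the example small.&lt;/p&gt;

```rust
use std::time::Duration;

/// Parses a count plus a unit (`m`, `h` or `d`), e.g. "30d".
fn parse_older_than(spec: &str) -> Option<Duration> {
    let unit = spec.chars().last()?;
    let count: u64 = spec[..spec.len() - unit.len_utf8()].parse().ok()?;
    let secs_per_unit = match unit {
        'm' => 60,
        'h' => 3_600,
        'd' => 86_400,
        _ => return None,
    };
    // checked_mul guards against absurdly large counts overflowing.
    count.checked_mul(secs_per_unit).map(Duration::from_secs)
}

/// Mirrors `WHERE timestamp < cutoff`: a row exactly at the cutoff is kept.
fn is_pruned(timestamp_secs: u64, cutoff_secs: u64) -> bool {
    timestamp_secs < cutoff_secs
}

fn main() {
    println!("{:?}", parse_older_than("30d"));
    println!("{}", is_pruned(1_000, 1_000));
}
```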

&lt;p&gt;Also in M4: the API used to honour &lt;code&gt;LOGDIVE_DB&lt;/code&gt; as an environment-variable fallback for &lt;code&gt;--db&lt;/code&gt;. The CLI did not. v0.2 fixes the asymmetry — both binaries now respect it, with the command-line flag taking precedence when both are set.&lt;/p&gt;

&lt;h2&gt;
  
  
  M5 — &lt;code&gt;GET /version&lt;/code&gt; and configurable CORS
&lt;/h2&gt;

&lt;p&gt;The HTTP API got a third endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;curl http://localhost:4000/version
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"version"&lt;/span&gt;:&lt;span class="s2"&gt;"0.2.0"&lt;/span&gt;,&lt;span class="s2"&gt;"formats"&lt;/span&gt;:[&lt;span class="s2"&gt;"json"&lt;/span&gt;,&lt;span class="s2"&gt;"logfmt"&lt;/span&gt;,&lt;span class="s2"&gt;"plain"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;,&lt;span class="s2"&gt;"capabilities"&lt;/span&gt;:[&lt;span class="s2"&gt;"query"&lt;/span&gt;,&lt;span class="s2"&gt;"stats"&lt;/span&gt;,&lt;span class="s2"&gt;"version"&lt;/span&gt;&lt;span class="o"&gt;]}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three fields, all compile-time constants. &lt;code&gt;version&lt;/code&gt; is &lt;code&gt;env!("CARGO_PKG_VERSION")&lt;/code&gt;. &lt;code&gt;formats&lt;/code&gt; is &lt;code&gt;LogFormat::ALL.iter().map(|f| f.name()).collect()&lt;/code&gt; — adding a new format to core automatically propagates here, no manual maintenance. &lt;code&gt;capabilities&lt;/code&gt; is hard-coded today, sorted alphabetically with a test pinning that ordering so future additions can't accidentally break a client that did &lt;code&gt;assert_eq!&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The endpoint never touches the database. It's reachable before the first ingest, after a &lt;code&gt;prune --yes&lt;/code&gt;, during a fresh Docker volume's first millisecond — which is exactly what makes it a good &lt;code&gt;HEALTHCHECK&lt;/code&gt; target (see M6).&lt;/p&gt;

&lt;p&gt;CORS is now configurable via &lt;code&gt;--cors-origins&lt;/code&gt; or &lt;code&gt;LOGDIVE_API_CORS_ORIGINS&lt;/code&gt;. Disabled by default (same-origin only). Pass &lt;code&gt;*&lt;/code&gt; to allow any origin, or a comma-separated list for specific ones. Mixing &lt;code&gt;*&lt;/code&gt; with explicit origins is rejected at startup because it's meaningless and almost certainly a typo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;logdive-api &lt;span class="nt"&gt;--cors-origins&lt;/span&gt; &lt;span class="s1"&gt;'https://app.example.com,https://staging.example.com'&lt;/span&gt;
logdive-api &lt;span class="nt"&gt;--cors-origins&lt;/span&gt; &lt;span class="s1"&gt;'*'&lt;/span&gt;                  &lt;span class="c"&gt;# development convenience&lt;/span&gt;
logdive-api                                      &lt;span class="c"&gt;# production default — no CORS at all&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The implementation is &lt;code&gt;tower-http&lt;/code&gt;'s &lt;code&gt;CorsLayer&lt;/code&gt;, restricted to &lt;code&gt;GET&lt;/code&gt; only because the API is and stays read-only. The router wiring is the only place the methods list lives, so adding write endpoints would force a deliberate code change rather than a one-line config fix. This is the kind of friction you want.&lt;/p&gt;
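
&lt;p&gt;The startup validation can be sketched without any HTTP machinery. This is an assumption about the rules as described above, not the actual code: &lt;code&gt;CorsPolicy&lt;/code&gt; and &lt;code&gt;parse_cors_origins&lt;/code&gt; are hypothetical names, and treating blank entries as absent is my simplification.&lt;/p&gt;

```rust
#[derive(Debug, PartialEq)]
enum CorsPolicy {
    /// No CORS headers at all: same-origin only.
    Disabled,
    /// `*`: any origin may read.
    AnyOrigin,
    /// An explicit allow-list.
    Origins(Vec<String>),
}

/// Interprets the raw flag or environment value (None when unset).
fn parse_cors_origins(raw: Option<&str>) -> Result<CorsPolicy, String> {
    let origins: Vec<&str> = raw
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|o| !o.is_empty())
        .collect();

    let has_wildcard = origins.contains(&"*");
    match (has_wildcard, origins.len()) {
        (_, 0) => Ok(CorsPolicy::Disabled),
        (true, 1) => Ok(CorsPolicy::AnyOrigin),
        // Mixing `*` with anything else is rejected rather than guessed at.
        (true, _) => Err("`*` cannot be combined with explicit origins".to_string()),
        (false, _) => Ok(CorsPolicy::Origins(
            origins.into_iter().map(String::from).collect(),
        )),
    }
}

fn main() {
    println!("{:?}", parse_cors_origins(Some("*,https://app.example.com")));
}
```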

&lt;p&gt;One trap I hit during M5 worth flagging: &lt;code&gt;tower-http = "0.6"&lt;/code&gt; checks &lt;code&gt;axum&lt;/code&gt; version compatibility at the trait-bound level. My first &lt;code&gt;cargo add tower-http --features cors&lt;/code&gt; got &lt;code&gt;axum 0.7.5&lt;/code&gt; from cache where 0.7.6+ is needed. The error said &lt;code&gt;Service&lt;/code&gt; trait not satisfied. &lt;code&gt;cargo update -p axum&lt;/code&gt; fixed it but the error message gave me zero hints about the root cause.&lt;/p&gt;

&lt;h2&gt;
  
  
  M6 — Multi-arch Docker
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker pull ghcr.io/aryagorjipour/logdive:0.2.0   &lt;span class="c"&gt;# or :latest&lt;/span&gt;

docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; logdive &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; logdive-data:/data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 4000:4000 &lt;span class="se"&gt;\&lt;/span&gt;
  ghcr.io/aryagorjipour/logdive

curl http://localhost:4000/version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Multi-stage build with &lt;code&gt;cargo-chef&lt;/code&gt; so dependency compilation is cached across source-only changes. &lt;code&gt;debian:bookworm-slim&lt;/code&gt; runtime. Non-root user (&lt;code&gt;logdive&lt;/code&gt;, UID 1000). &lt;code&gt;/data&lt;/code&gt; volume for index persistence. &lt;code&gt;EXPOSE 4000&lt;/code&gt;. &lt;code&gt;HEALTHCHECK&lt;/code&gt; curls &lt;code&gt;/version&lt;/code&gt; every 30 seconds — no DB access, no false negatives during a long query.&lt;/p&gt;

&lt;p&gt;Both binaries ship in one image. Default &lt;code&gt;ENTRYPOINT&lt;/code&gt; is &lt;code&gt;logdive-api&lt;/code&gt;; the CLI is reachable through &lt;code&gt;--entrypoint logdive&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; logdive-data:/data &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; /var/log/app:/logs:ro &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--entrypoint&lt;/span&gt; logdive &lt;span class="se"&gt;\&lt;/span&gt;
  ghcr.io/aryagorjipour/logdive &lt;span class="se"&gt;\&lt;/span&gt;
  ingest &lt;span class="nt"&gt;--file&lt;/span&gt; /logs/app.log &lt;span class="nt"&gt;--tag&lt;/span&gt; production
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two image-level environment defaults that don't exist on host installs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;LOGDIVE_DB=/data/index.db&lt;/code&gt; — points both binaries at the persistent volume.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LOGDIVE_API_HOST=0.0.0.0&lt;/code&gt; — overrides the binary's loopback default so &lt;code&gt;-p 4000:4000&lt;/code&gt; actually does something.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The second one matters: on a host install the API binds &lt;code&gt;127.0.0.1&lt;/code&gt; for security. In a container, that would make the published port unreachable. The override is the right default &lt;em&gt;for the container&lt;/em&gt;. It is also why the README is explicit that exposing port 4000 with no reverse proxy puts a read-only API on the public internet. Read-only is defence in depth, not access control.&lt;/p&gt;

&lt;p&gt;GitHub Actions builds for &lt;code&gt;linux/amd64&lt;/code&gt; and &lt;code&gt;linux/arm64&lt;/code&gt; via &lt;code&gt;docker buildx&lt;/code&gt; + QEMU, pushes to GHCR with the workflow's built-in &lt;code&gt;GITHUB_TOKEN&lt;/code&gt; (no PAT to rotate). Builds run on every push to &lt;code&gt;main&lt;/code&gt;, every push to a &lt;code&gt;release/v*&lt;/code&gt; branch, every &lt;code&gt;v*&lt;/code&gt; tag, and on PRs without pushing.&lt;/p&gt;

&lt;p&gt;One CI gotcha I'll save you the debug session for: &lt;code&gt;cache-to: type=gha, mode=max&lt;/code&gt; reliably 502s on cache export for this workspace. The cache backend disagrees with how large the intermediate-layer export is. &lt;code&gt;mode=min&lt;/code&gt; works. PR builds skip cache write entirely. The current &lt;code&gt;.github/workflows/docker.yml&lt;/code&gt; has the working config.&lt;/p&gt;

&lt;p&gt;While I was in there, I also fixed a v0.1 UX paper-cut: &lt;code&gt;logdive-api&lt;/code&gt; used to refuse to start if the database file was missing. That made sense for &lt;code&gt;--db /home/me/typo.db&lt;/code&gt;; it was hostile for &lt;code&gt;docker run -v fresh-volume:/data&lt;/code&gt;. v0.2 auto-creates an empty index with the right schema on first run, prints a one-line note to stderr explaining what just happened, and otherwise lets the server come up. Genuinely bad paths (non-existent parent, permission denied) still fail fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  The updated query grammar, in one place
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query    := and_expr (OR and_expr)*
and_expr := clause (AND clause)*
clause   := field OP value
          | field CONTAINS string
          | TIME_RANGE
field    := [a-zA-Z_][a-zA-Z0-9_.]*
OP       := "=" | "!=" | "&amp;gt;" | "&amp;lt;"
TIME_RANGE := "last" duration | "since" datetime
duration := number ("m" | "h" | "d")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;AND&lt;/code&gt;, &lt;code&gt;OR&lt;/code&gt;, &lt;code&gt;CONTAINS&lt;/code&gt;, &lt;code&gt;last&lt;/code&gt;, &lt;code&gt;since&lt;/code&gt;, &lt;code&gt;true&lt;/code&gt;, and &lt;code&gt;false&lt;/code&gt; are case-insensitive. Known fields (&lt;code&gt;timestamp&lt;/code&gt;, &lt;code&gt;level&lt;/code&gt;, &lt;code&gt;message&lt;/code&gt;, &lt;code&gt;tag&lt;/code&gt;) hit indexed columns. Everything else routes through &lt;code&gt;json_extract(fields, '$.&amp;lt;key&amp;gt;')&lt;/code&gt; — slower than a real index, but works against arbitrary JSON shapes without forcing a schema. SQL output always parenthesises each AND-group so the WHERE clause is unambiguous regardless of how many disjuncts you stack.&lt;/p&gt;
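
&lt;p&gt;The routing between indexed columns and &lt;code&gt;json_extract&lt;/code&gt; is simple enough to sketch. The &lt;code&gt;column_expr&lt;/code&gt; and &lt;code&gt;is_valid_field&lt;/code&gt; helpers below are hypothetical; the known-field list and the field pattern are taken from the grammar above. Rejecting anything outside that pattern is what makes it reasonable to splice the field name into SQL text, while values would still go through &lt;code&gt;?&lt;/code&gt; placeholders.&lt;/p&gt;

```rust
/// Fields stored in their own indexed columns.
const KNOWN_FIELDS: [&str; 4] = ["timestamp", "level", "message", "tag"];

/// Checks a name against `[a-zA-Z_][a-zA-Z0-9_.]*`.
fn is_valid_field(field: &str) -> bool {
    let mut chars = field.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '.')
}

/// Returns the SQL expression to compare against, or None for an invalid name.
fn column_expr(field: &str) -> Option<String> {
    if !is_valid_field(field) {
        return None;
    }
    if KNOWN_FIELDS.contains(&field) {
        Some(field.to_string())
    } else {
        Some(format!("json_extract(fields, '$.{field}')"))
    }
}

fn main() {
    for field in ["level", "service", "bad field"] {
        println!("{field:?} -> {:?}", column_expr(field));
    }
}
```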

&lt;h2&gt;
  
  
  Tradeoffs, honestly
&lt;/h2&gt;

&lt;p&gt;A few v0.1 tradeoffs are resolved. New ones replaced them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-format ingestion adds a small dispatch cost.&lt;/strong&gt; Every line goes through a &lt;code&gt;LogFormat&lt;/code&gt; match arm before hitting its parser. ~1–2% overhead on the benchmark suite, which I'd characterise as "not a thing" but felt obliged to mention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Follow mode is Unix-only.&lt;/strong&gt; The rotation check uses &lt;code&gt;(dev, ino)&lt;/code&gt; from &lt;code&gt;std::os::unix::fs::MetadataExt&lt;/code&gt;. Windows compiles fine but rejects &lt;code&gt;--follow&lt;/code&gt; at runtime. A &lt;code&gt;notify&lt;/code&gt;-based fallback for the rotation half is a real contribution opportunity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Docker image is 97 MB.&lt;/strong&gt; &lt;code&gt;debian:bookworm-slim&lt;/code&gt; + two binaries + &lt;code&gt;curl&lt;/code&gt; + &lt;code&gt;ca-certificates&lt;/code&gt;. A &lt;code&gt;musl&lt;/code&gt;-static build against a &lt;code&gt;distroless&lt;/code&gt; runtime would cut this to ~10 MB but adds linker complexity with bundled &lt;code&gt;rusqlite&lt;/code&gt; and &lt;code&gt;blake3&lt;/code&gt;. Deferred.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;LOGDIVE_API_HOST=0.0.0.0&lt;/code&gt; in the container is correct but dangerous.&lt;/strong&gt; The README repeats the warning twice; the API stays read-only specifically so that the worst a misconfigured deployment can do is expose log data, not accept writes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OR without parens covers ~95%.&lt;/strong&gt; &lt;code&gt;(level=error OR level=warn) AND service=payments&lt;/code&gt; requires duplicating the service clause as &lt;code&gt;level=error AND service=payments OR level=warn AND service=payments&lt;/code&gt;. Parenthesised expressions are the v0.3 headline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Fresh &lt;code&gt;cargo bench&lt;/code&gt; numbers from v0.2.0. The ingest and query hot paths changed only marginally in v0.2 (the format dispatch costs ~1–2%, as noted under tradeoffs), so these track v0.1 to within run-to-run variance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Throughput / Latency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ingestion, batched insert (10k rows)&lt;/td&gt;
&lt;td&gt;~206k lines/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ingestion, parse + insert end-to-end (10k rows)&lt;/td&gt;
&lt;td&gt;~164k lines/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on known field, empty result (100k rows)&lt;/td&gt;
&lt;td&gt;~20 µs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on known field, 25% match (100k rows, LIMIT 1000)&lt;/td&gt;
&lt;td&gt;~51 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on JSON field, 25% match (100k rows, LIMIT 1000)&lt;/td&gt;
&lt;td&gt;~4.0 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on JSON field, 0% match — full scan (100k rows)&lt;/td&gt;
&lt;td&gt;~68 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;CONTAINS&lt;/code&gt; full-table scan (100k rows)&lt;/td&gt;
&lt;td&gt;~37–41 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3-clause &lt;code&gt;AND&lt;/code&gt; chain, mixed known + JSON (100k rows)&lt;/td&gt;
&lt;td&gt;~23 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Reproducible from &lt;code&gt;crates/core/benches/&lt;/code&gt;. &lt;code&gt;cargo bench&lt;/code&gt; runs the suite; criterion writes HTML reports to &lt;code&gt;target/criterion&lt;/code&gt;. The 25%-match number is the most variance-sensitive (it walks the result set rather than short-circuiting on a count); your hardware will differ.&lt;/p&gt;

&lt;p&gt;Release binary sizes: &lt;code&gt;logdive&lt;/code&gt; 3.9 MB, &lt;code&gt;logdive-api&lt;/code&gt; 4.2 MB. The "&amp;lt;10 MB" budget from v0.1 still holds, with room.&lt;/p&gt;

&lt;h2&gt;
  
  
  Upgrading from v0.1.0
&lt;/h2&gt;

&lt;p&gt;CLI users:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo &lt;span class="nb"&gt;install &lt;/span&gt;logdive logdive-api &lt;span class="nt"&gt;--force&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
docker pull ghcr.io/aryagorjipour/logdive:0.2.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No CLI flags were removed or renamed. Every v0.1 command still works.&lt;/p&gt;

&lt;p&gt;Library users (&lt;code&gt;logdive-core&lt;/code&gt; directly): two breaking changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before&lt;/span&gt;
&lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;QueryNode&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;And&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clauses&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// After&lt;/span&gt;
&lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;QueryNode&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;Or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;groups&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;group&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;groups&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;clause&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="py"&gt;.clauses&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before&lt;/span&gt;
&lt;span class="nf"&gt;parse_line&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// After&lt;/span&gt;
&lt;span class="nf"&gt;parse_line&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;LogFormat&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// explicit format selector&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both are mechanical migrations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Call for contributions
&lt;/h2&gt;

&lt;p&gt;v0.1's contribution list had seven items. Four shipped in v0.2 (OR, non-JSON formats, follow mode, Docker image). Three remain, plus the work v0.2 exposed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Parenthesised query expressions.&lt;/strong&gt; The v0.3 flagship. Recursive descent on the two-level grammar, SQL generator update that emits balanced parens correctly. Well-scoped.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A browser UI.&lt;/strong&gt; The &lt;code&gt;GET /version&lt;/code&gt; endpoint exists specifically to make feature detection easy. The API is CORS-configurable. A weekend with React/Svelte/HTMX and you have a real UI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generated columns for hot JSON fields.&lt;/strong&gt; The big win on JSON-field query speed. Mark a field as "promote to indexed column" and get known-field performance for it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows rotation detection.&lt;/strong&gt; A &lt;code&gt;notify&lt;/code&gt;-based fallback to replace the &lt;code&gt;(dev, ino)&lt;/code&gt; check that's currently Unix-only.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Distroless / musl Docker image.&lt;/strong&gt; 97 MB → ~10 MB. Real engineering, real win.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More log formats.&lt;/strong&gt; Apache common log format, syslog RFC 5424, journalctl JSON. &lt;code&gt;LogFormat&lt;/code&gt; is set up for additions — one enum variant, one parser module, one dispatcher arm.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmarks on more hardware.&lt;/strong&gt; Run &lt;code&gt;cargo bench&lt;/code&gt; on your machine, PR the README table.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repo has CI, integration tests against tempfile-backed SQLite, conventional-commit history, and a release process that's a documented sequence of one-line commands. Bug reports and PRs at &lt;a href="https://github.com/Aryagorjipour/logdive" rel="noopener noreferrer"&gt;github.com/Aryagorjipour/logdive&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  A short field guide to shipping v0.2 of anything
&lt;/h2&gt;

&lt;p&gt;Five things v0.2 taught me that the v0.1 release didn't:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Auto-create on first run beats fail-fast in containers.&lt;/strong&gt; Host CLIs should fail loudly on missing files. Containers with fresh volumes should not. Same code, two correct behaviours, switched by whether you trust the path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cargo publish --workspace&lt;/code&gt; is a Cargo 1.90+ feature.&lt;/strong&gt; Before that, dry-running all three crates simultaneously fails because the downstream ones can't resolve their unpublished dep. Publish core, wait 30 seconds, publish the rest. The &lt;code&gt;scripts/prerelease-check.sh&lt;/code&gt; in the repo handles both versions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;tower-http&lt;/code&gt; and &lt;code&gt;axum&lt;/code&gt; are coupled at the trait-bound level.&lt;/strong&gt; Bumping one without the other compiles, then explodes when you &lt;code&gt;.layer()&lt;/code&gt; it. &lt;code&gt;cargo update -p axum&lt;/code&gt; after &lt;code&gt;cargo add tower-http&lt;/code&gt; is muscle memory now.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GHA cache &lt;code&gt;mode=max&lt;/code&gt; is a 502 generator on real workspaces.&lt;/strong&gt; &lt;code&gt;mode=min&lt;/code&gt; works. PR builds should skip cache write entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;QEMU multi-arch Rust builds are slow but reliable.&lt;/strong&gt; ~12 minutes cold, ~2 minutes warm with &lt;code&gt;cargo-chef&lt;/code&gt;. Native &lt;code&gt;arm64&lt;/code&gt; runners would halve this. Until then, you wait.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each of these took a CI run or a stack trace to learn. None of them are in any "publishing a Rust workspace" tutorial I can find. They're in this article now, which is the only reason I'm slightly less annoyed about having learned them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/Aryagorjipour/logdive" rel="noopener noreferrer"&gt;github.com/Aryagorjipour/logdive&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crates:&lt;/strong&gt; &lt;a href="https://crates.io/crates/logdive" rel="noopener noreferrer"&gt;logdive&lt;/a&gt;, &lt;a href="https://crates.io/crates/logdive-core" rel="noopener noreferrer"&gt;logdive-core&lt;/a&gt;, &lt;a href="https://crates.io/crates/logdive-api" rel="noopener noreferrer"&gt;logdive-api&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker:&lt;/strong&gt; &lt;code&gt;ghcr.io/aryagorjipour/logdive&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docs:&lt;/strong&gt; &lt;a href="https://docs.rs/logdive-core" rel="noopener noreferrer"&gt;docs.rs/logdive-core&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;v0.1 article:&lt;/strong&gt; &lt;a href="https://dev.to/arysmart/i-wanted-jq-with-memory-time-ranges-and-filters-so-i-built-logdive-2j13"&gt;I wanted jq with memory, time ranges, and filters. So I built logdive&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About
&lt;/h2&gt;

&lt;p&gt;Arya Gorjipour — backend engineer, logdive maintainer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/Aryagorjipour" rel="noopener noreferrer"&gt;@Aryagorjipour&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;X / Twitter: &lt;a href="https://twitter.com/Arysmart1" rel="noopener noreferrer"&gt;@Arysmart1&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/arysmart/" rel="noopener noreferrer"&gt;arysmart&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you ship a v0.3 contribution from the list above, I want to hear about it. If you use logdive to debug a real incident, I want to hear about that too. The most interesting bug reports get a hat tip in the next release post.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>opensource</category>
      <category>showdev</category>
      <category>devops</category>
    </item>
    <item>
      <title>I wanted jq with memory, time ranges, and filters. So I built logdive</title>
      <dc:creator>Arya Gorjipour</dc:creator>
      <pubDate>Tue, 28 Apr 2026 18:00:00 +0000</pubDate>
      <link>https://dev.to/arysmart/i-wanted-jq-with-memory-time-ranges-and-filters-so-i-built-logdive-2j13</link>
      <guid>https://dev.to/arysmart/i-wanted-jq-with-memory-time-ranges-and-filters-so-i-built-logdive-2j13</guid>
      <description>&lt;p&gt;Your app is in production. Something broke at 2am. Your options are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;grep&lt;/code&gt; through a rotated log file, squinting at terminal output.&lt;/li&gt;
&lt;li&gt;Chain together half a dozen &lt;code&gt;jq&lt;/code&gt; pipes until the command line becomes unreadable.&lt;/li&gt;
&lt;li&gt;Page an SRE to query your observability stack, assuming you have one.&lt;/li&gt;
&lt;li&gt;Spin up Loki or Elastic locally, spend two hours on config, and then do the actual investigation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All four of these suck. Either you're limited to flat text tooling, or you're paying for infrastructure complexity you don't need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;logdive&lt;/strong&gt; is what sits in the gap. It's a single Rust binary. You drop it anywhere, point it at a log file or pipe Docker output into it, and you get a fast, queryable index on your local machine. No daemon. No cloud. No YAML. Just &lt;code&gt;cargo install logdive&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Ingest logs from a file or pipe from stdin.&lt;/span&gt;
logdive ingest &lt;span class="nt"&gt;--file&lt;/span&gt; ./logs/app.log
docker logs my-container | logdive ingest &lt;span class="nt"&gt;--tag&lt;/span&gt; my-container

&lt;span class="c"&gt;# Query the index.&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'level=error AND service=payments last 2h'&lt;/span&gt;
logdive query &lt;span class="s1"&gt;'message contains "timeout"'&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; json | jq

&lt;span class="c"&gt;# Inspect what you've indexed.&lt;/span&gt;
logdive stats

&lt;span class="c"&gt;# Optionally expose a read-only HTTP API for remote querying.&lt;/span&gt;
logdive-api &lt;span class="nt"&gt;--db&lt;/span&gt; ./logdive.db &lt;span class="nt"&gt;--port&lt;/span&gt; 4000
curl &lt;span class="s1"&gt;'http://127.0.0.1:4000/query?q=level%3Derror&amp;amp;limit=100'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole product surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why logdive exists
&lt;/h2&gt;

&lt;p&gt;Every backend engineer has hit the wall this is built for: your application emits perfectly good structured JSON logs, but the tools for &lt;em&gt;querying&lt;/em&gt; that JSON locally are stuck at two extremes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;jq&lt;/code&gt; is one-shot and single-file: no memory, no time ranges, no filters across files.&lt;/li&gt;
&lt;li&gt;Loki, Datadog, Elastic, Splunk all demand infrastructure, cost, and configuration that's overkill for a side project, small team, or personal investigation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The target user is a backend engineer who wants &lt;code&gt;jq&lt;/code&gt; with memory, filters, and time ranges — without YAML files, without a running daemon they didn't ask for, without a monthly bill.&lt;/p&gt;

&lt;p&gt;Rust makes this credible in a way no other language quite does: a single self-contained binary with no runtime, zero-copy parsing, SQLite bundled directly into the binary, and real concurrency for ingestion. This is the kind of tool Rust is genuinely good at.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who this is for (and who it isn't)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Good fit:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backend engineers debugging production incidents from local log copies.&lt;/li&gt;
&lt;li&gt;Small teams without a dedicated observability budget.&lt;/li&gt;
&lt;li&gt;Anyone who's ever built a 4-stage &lt;code&gt;jq&lt;/code&gt; pipeline and wished it was searchable afterward.&lt;/li&gt;
&lt;li&gt;Folks running Docker locally who want &lt;code&gt;docker logs my-container | logdive ingest&lt;/code&gt; and instant querying.&lt;/li&gt;
&lt;li&gt;CI pipelines that need to grep through structured output of a previous step.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Bad fit (and I'll be explicit about this):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-machine, networked indexes. logdive is single-host by design.&lt;/li&gt;
&lt;li&gt;Real-time log tailing / &lt;code&gt;tail -f&lt;/code&gt; style follow mode. Not in v0.1.0.&lt;/li&gt;
&lt;li&gt;Anything needing authentication on the HTTP endpoint. The v1 API assumes the network layer handles access control.&lt;/li&gt;
&lt;li&gt;Massive enterprise-scale log volumes. SQLite handles a lot, but if you're indexing 100GB/day, you want Loki.&lt;/li&gt;
&lt;li&gt;Non-JSON log formats (plaintext, logfmt, syslog). v1 is JSON-only.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The whole scope is deliberately small. v0.1.0 ships what a side project or small team needs, nothing more.&lt;/p&gt;

&lt;h2&gt;
  
  
  The query language
&lt;/h2&gt;

&lt;p&gt;Small enough to fit in your head, expressive enough to be useful:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;level=error
level=error AND service=payments
message contains "database timeout"
level=error last 2h
tag=api AND status &amp;gt; 499 since 2026-04-15
user_id=4812 AND duration_ms &amp;gt; 500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Operators: &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;!=&lt;/code&gt;, &lt;code&gt;&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;CONTAINS&lt;/code&gt;. Time ranges: &lt;code&gt;last Nm/Nh/Nd&lt;/code&gt; or &lt;code&gt;since &amp;lt;datetime&amp;gt;&lt;/code&gt;. Clauses chain with &lt;code&gt;AND&lt;/code&gt;. Known fields (&lt;code&gt;timestamp&lt;/code&gt;, &lt;code&gt;level&lt;/code&gt;, &lt;code&gt;message&lt;/code&gt;, &lt;code&gt;tag&lt;/code&gt;) hit SQLite indexes directly. Unknown fields go through &lt;code&gt;json_extract()&lt;/code&gt; on a JSON blob — slower but works for arbitrary JSON shapes.&lt;/p&gt;
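&lt;p&gt;To make the &lt;code&gt;last Nm/Nh/Nd&lt;/code&gt; form concrete, here is a minimal Python sketch of how such a window could be resolved to a cutoff and compared against the &lt;code&gt;timestamp&lt;/code&gt; column. It is an illustration, not logdive's actual Rust code: the &lt;code&gt;cutoff_for&lt;/code&gt; helper, the &lt;code&gt;logs&lt;/code&gt; table name, and the SQL shape are all assumptions.&lt;/p&gt;

```python
import re
from datetime import datetime, timedelta

# Hypothetical mapping from the unit letters in `last Nm/Nh/Nd` to timedelta keywords.
UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def cutoff_for(window, now):
    """Resolve a window such as '2h' to the earliest timestamp it covers."""
    match = re.fullmatch(r"(\d+)([mhd])", window)
    if match is None:
        raise ValueError(f"unsupported time window: {window!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return now - timedelta(**{UNITS[unit]: amount})


now = datetime(2026, 4, 15, 12, 0, 0)
cutoff = cutoff_for("2h", now)

# Assumed SQL shape: the cutoff is bound as ISO-8601 text next to the other clauses.
sql = "SELECT * FROM logs WHERE level = ? AND timestamp >= ?"
params = ["error", cutoff.isoformat()]
print(sql, params)
```

&lt;p&gt;With &lt;code&gt;now&lt;/code&gt; fixed at noon, &lt;code&gt;last 2h&lt;/code&gt; becomes a plain text comparison against &lt;code&gt;2026-04-15T10:00:00&lt;/code&gt;, which is why the indexed &lt;code&gt;timestamp&lt;/code&gt; column can serve it directly.&lt;/p&gt;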

&lt;p&gt;No &lt;code&gt;OR&lt;/code&gt; in v0.1.0 — it's the single biggest v1 non-goal. AND covers the dominant query pattern, and adding OR requires a two-level grammar plus precedence handling that would roughly double the parser. Deferred to v2 deliberately.&lt;/p&gt;

&lt;h2&gt;
  
  
  Under the hood
&lt;/h2&gt;

&lt;p&gt;For readers who like the implementation details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Three-crate Cargo workspace.&lt;/strong&gt; &lt;code&gt;logdive-core&lt;/code&gt; is pure library (parser, indexer, query engine — no I/O at module level), &lt;code&gt;logdive&lt;/code&gt; is the CLI binary, &lt;code&gt;logdive-api&lt;/code&gt; is the HTTP server binary. Each is independently publishable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite via &lt;code&gt;rusqlite&lt;/code&gt;&lt;/strong&gt; with the &lt;code&gt;bundled&lt;/code&gt; feature. Zero infrastructure, battle-tested, ships inside the binary at ~500 KB.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid storage.&lt;/strong&gt; Known fields (&lt;code&gt;timestamp&lt;/code&gt;, &lt;code&gt;level&lt;/code&gt;, &lt;code&gt;message&lt;/code&gt;, &lt;code&gt;tag&lt;/code&gt;) are real indexed columns. Everything else is stored in a JSON blob queryable via SQLite's &lt;code&gt;json_extract()&lt;/code&gt;. This is the only way to handle arbitrary JSON shapes without a schema-bound design.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hand-written recursive descent query parser.&lt;/strong&gt; ~200 lines of Rust enums. No parser-combinator dependency. Better error messages than generated parsers, and honestly, it was one of the most satisfying parts of the project to write.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blake3 row hashing for deduplication.&lt;/strong&gt; &lt;code&gt;INSERT OR IGNORE&lt;/code&gt; on a unique hash column means re-ingesting a file (or dealing with log rotation) is free. No duplicate rows. The hash is cheap — negligible per-line cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batched inserts at 1000 rows per transaction.&lt;/strong&gt; Standard SQLite throughput pattern.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Axum HTTP API.&lt;/strong&gt; Read-only via &lt;code&gt;SQLITE_OPEN_READ_ONLY&lt;/code&gt;, blocking SQLite work wrapped in &lt;code&gt;tokio::task::spawn_blocking&lt;/code&gt; so it doesn't block Tokio's worker threads, graceful shutdown on Ctrl-C and SIGTERM.&lt;/li&gt;
&lt;/ul&gt;
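&lt;p&gt;The hybrid storage, hash-based deduplication, and batched inserts described above can be sketched in a few lines of Python with the standard &lt;code&gt;sqlite3&lt;/code&gt; module. This is an illustration under stated assumptions, not logdive's real schema: the table layout, the &lt;code&gt;fields&lt;/code&gt; and &lt;code&gt;row_hash&lt;/code&gt; column names, and the helper functions are hypothetical, and &lt;code&gt;hashlib.blake2b&lt;/code&gt; stands in for BLAKE3, which the Python standard library does not ship. It also assumes a SQLite build with the JSON functions available.&lt;/p&gt;

```python
import hashlib
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: four known columns, a JSON blob, and a unique row hash.
conn.execute(
    """
    CREATE TABLE logs (
        timestamp TEXT, level TEXT, message TEXT, tag TEXT,
        fields TEXT,
        row_hash TEXT UNIQUE
    )
    """
)

KNOWN = ("timestamp", "level", "message", "tag")


def to_row(line):
    """Split one JSON log line into known columns, a JSON blob, and a hash."""
    record = json.loads(line)
    known = [record.pop(name, None) for name in KNOWN]
    # blake2b is a stand-in for BLAKE3 here (assumption for this sketch).
    digest = hashlib.blake2b(line.encode("utf-8"), digest_size=16).hexdigest()
    return (*known, json.dumps(record), digest)


def ingest(lines):
    """Insert all lines in one transaction; duplicates are ignored by hash."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO logs VALUES (?, ?, ?, ?, ?, ?)",
            [to_row(line) for line in lines],
        )


lines = [
    '{"timestamp": "2026-04-15T10:00:00", "level": "error", "message": "timeout", "duration_ms": 812}',
    '{"timestamp": "2026-04-15T10:00:01", "level": "info", "message": "ok", "duration_ms": 12}',
]
ingest(lines)
ingest(lines)  # re-ingesting the same lines adds no rows

total = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
slow = conn.execute(
    "SELECT message FROM logs WHERE json_extract(fields, '$.duration_ms') > 500"
).fetchall()
print(total, slow)  # 2 [('timeout',)]
```

&lt;p&gt;The sketch commits one transaction per &lt;code&gt;ingest&lt;/code&gt; call; a real ingester would cut the input into fixed-size batches (1000 rows in logdive's case) and commit each batch the same way.&lt;/p&gt;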

&lt;p&gt;The full architecture is documented in the repo's README.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;Representative numbers on an Acer Nitro 5 laptop, measured via criterion:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Throughput / Latency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ingestion, batched insert (10k rows)&lt;/td&gt;
&lt;td&gt;~210k lines/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ingestion, parse + insert end-to-end (10k rows)&lt;/td&gt;
&lt;td&gt;~166k lines/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on known field, empty result (100k rows)&lt;/td&gt;
&lt;td&gt;~17 μs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on known field, 25% match (100k rows, LIMIT 1000)&lt;/td&gt;
&lt;td&gt;~39 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on JSON field, 25% match (100k rows, LIMIT 1000)&lt;/td&gt;
&lt;td&gt;~3.6 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query on JSON field, 0% match (full scan, 100k rows)&lt;/td&gt;
&lt;td&gt;~68 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;CONTAINS&lt;/code&gt; full-table scan (100k rows)&lt;/td&gt;
&lt;td&gt;~36–40 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3-clause &lt;code&gt;AND&lt;/code&gt; chain (100k rows)&lt;/td&gt;
&lt;td&gt;~22 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Release binaries come in at 3.7 MB (&lt;code&gt;logdive&lt;/code&gt;) and 4.1 MB (&lt;code&gt;logdive-api&lt;/code&gt;), well under the 10 MB target, thanks to LTO + strip + panic=abort in the release profile.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;cargo bench&lt;/code&gt; in the repo to get your own baseline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs worth being honest about
&lt;/h2&gt;

&lt;p&gt;A few design decisions have real downsides that users should know:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Timestamps compared as lexical TEXT.&lt;/strong&gt; This is correct for ISO-8601-shaped timestamps (they sort chronologically when compared as strings), but any exotic timestamp format will silently misorder. Default timestamps from modern structured loggers are ISO-8601, so in practice this is rarely a problem — but it's a real sharp edge.&lt;/p&gt;
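&lt;p&gt;A small Python check shows both halves of that claim. The sample timestamps and the &lt;code&gt;sorts_chronologically&lt;/code&gt; helper are made up for illustration: ISO-8601 strings sort the same way as text and as times, while a day-first format does not.&lt;/p&gt;

```python
from datetime import datetime


def sorts_chronologically(stamps, fmt):
    """True if plain string sorting matches chronological sorting."""
    by_text = sorted(stamps)
    by_time = sorted(stamps, key=lambda s: datetime.strptime(s, fmt))
    return by_text == by_time


iso = ["2026-04-15T09:00:00", "2026-05-01T08:00:00", "2026-04-15T10:30:00"]
dmy = ["15/04/2026 09:00:00", "01/05/2026 08:00:00", "15/04/2026 10:30:00"]

print(sorts_chronologically(iso, "%Y-%m-%dT%H:%M:%S"))  # True
print(sorts_chronologically(dmy, "%d/%m/%Y %H:%M:%S"))  # False
```

&lt;p&gt;In the day-first list, &lt;code&gt;01/05/2026&lt;/code&gt; sorts before &lt;code&gt;15/04/2026&lt;/code&gt; as text even though it is the later date, which is exactly the silent misordering to watch for.&lt;/p&gt;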

&lt;p&gt;&lt;strong&gt;No index on json_extract expressions.&lt;/strong&gt; Queries on unknown JSON fields fall back to full table scans. Scanning 100k rows takes ~68 ms, which is still fast, but that is thousands of times slower than the ~17 μs indexed lookup on a known column (both figures are the no-match cases from the table above), and it adds up if you're hammering the same unknown field constantly. A future version could promote frequently-queried JSON fields to real columns.&lt;/p&gt;
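&lt;p&gt;SQLite's &lt;code&gt;EXPLAIN QUERY PLAN&lt;/code&gt; makes the difference visible. The Python sketch below uses a hypothetical two-column table (&lt;code&gt;level&lt;/code&gt; indexed, &lt;code&gt;fields&lt;/code&gt; holding the JSON blob) rather than logdive's real schema, and assumes a SQLite build with the JSON functions available.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: `level` is an indexed column, `fields` holds the JSON blob.
conn.execute("CREATE TABLE logs (level TEXT, fields TEXT)")
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")


def plan(sql):
    """Return the detail column of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


known = plan("SELECT * FROM logs WHERE level = 'error'")
unknown = plan(
    "SELECT * FROM logs WHERE json_extract(fields, '$.service') = 'payments'"
)
print(known)    # a SEARCH step that names idx_logs_level
print(unknown)  # a SCAN step over the whole table
```

&lt;p&gt;The exact wording of the plan varies between SQLite versions, but the known-column query reports a search through the index while the &lt;code&gt;json_extract&lt;/code&gt; query reports a scan.&lt;/p&gt;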

&lt;p&gt;&lt;strong&gt;Single-host only.&lt;/strong&gt; There's no clustering story. If you need distributed query across machines, you want Loki or Elastic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No authentication on the HTTP API.&lt;/strong&gt; Deliberate for v1. If you expose &lt;code&gt;logdive-api&lt;/code&gt; beyond localhost, put a reverse proxy with auth in front of it. The binary defaults to binding &lt;code&gt;127.0.0.1&lt;/code&gt; for a reason.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install and try it
&lt;/h2&gt;

&lt;p&gt;From crates.io:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cargo &lt;span class="nb"&gt;install &lt;/span&gt;logdive logdive-api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From prebuilt binaries: grab the tarball for your platform from the &lt;a href="https://github.com/Aryagorjipour/logdive/releases" rel="noopener noreferrer"&gt;GitHub Releases&lt;/a&gt; page. Linux x86_64 and macOS arm64 are built on every tag push.&lt;/p&gt;

&lt;p&gt;From source:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Aryagorjipour/logdive
&lt;span class="nb"&gt;cd &lt;/span&gt;logdive
cargo build &lt;span class="nt"&gt;--release&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MSRV: Rust 1.85 (edition 2024). Dual-licensed MIT OR Apache-2.0.&lt;/p&gt;

&lt;p&gt;Try the included examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;logdive &lt;span class="nt"&gt;--db&lt;/span&gt; /tmp/demo.db ingest &lt;span class="nt"&gt;--file&lt;/span&gt; examples/app.log
logdive &lt;span class="nt"&gt;--db&lt;/span&gt; /tmp/demo.db ingest &lt;span class="nt"&gt;--file&lt;/span&gt; examples/nginx.log
logdive &lt;span class="nt"&gt;--db&lt;/span&gt; /tmp/demo.db stats
logdive &lt;span class="nt"&gt;--db&lt;/span&gt; /tmp/demo.db query &lt;span class="s1"&gt;'level=error AND service=payments'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Call for contributions
&lt;/h2&gt;

&lt;p&gt;v0.1.0 is deliberately small, but there's a clear set of high-value v2 features that would benefit hugely from community help. If you want to contribute, these are genuine needs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;OR&lt;/code&gt; operator in the query language.&lt;/strong&gt; The single most-requested feature implied by v1's scope. Extends the parser to handle a two-level grammar (clauses joined by OR, OR-groups joined by AND) and the SQL generator to emit parenthesized disjunctions. Non-trivial but well-scoped.&lt;/p&gt;
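&lt;p&gt;For anyone sizing up that task, here is a rough Python sketch of the two-level shape: clauses joined by &lt;code&gt;OR&lt;/code&gt; form a group, groups are joined by &lt;code&gt;AND&lt;/code&gt;, and each group becomes one parenthesized disjunction. It is deliberately simplified and is not the project's parser: it only understands whitespace-separated &lt;code&gt;field=value&lt;/code&gt; clauses, and the &lt;code&gt;parse&lt;/code&gt; and &lt;code&gt;to_sql&lt;/code&gt; names are hypothetical.&lt;/p&gt;

```python
def parse(query):
    """Parse 'a=1 OR a=2 AND b=3' into OR-groups that are ANDed together."""
    groups, current, expect_clause = [], [], True
    for token in query.split():
        if expect_clause:
            field, sep, value = token.partition("=")
            # Only plain field=value clauses with identifier-like names are accepted.
            if not (sep and value and field.isidentifier()):
                raise ValueError(f"expected field=value, got {token!r}")
            current.append((field, value))
            expect_clause = False
        elif token == "OR":
            expect_clause = True
        elif token == "AND":
            groups.append(current)
            current = []
            expect_clause = True
        else:
            raise ValueError(f"expected AND or OR, got {token!r}")
    if expect_clause:
        raise ValueError("query ends where a clause was expected")
    groups.append(current)
    return groups


def to_sql(groups):
    """Emit one parenthesized disjunction per OR-group, joined by AND."""
    parts, params = [], []
    for group in groups:
        parts.append("(" + " OR ".join(f"{field} = ?" for field, _ in group) + ")")
        params.extend(value for _, value in group)
    return " AND ".join(parts), params


sql, params = to_sql(parse("level=error OR level=warn AND service=payments"))
print(sql)     # (level = ? OR level = ?) AND (service = ?)
print(params)  # ['error', 'warn', 'payments']
```

&lt;p&gt;A real implementation would also have to handle the other operators, quoted values, time ranges, and the routing of unknown fields such as &lt;code&gt;service&lt;/code&gt; through &lt;code&gt;json_extract()&lt;/code&gt;; the sketch only shows the grouping.&lt;/p&gt;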

&lt;p&gt;&lt;strong&gt;Non-JSON log format support.&lt;/strong&gt; Plaintext and logfmt are the obvious next formats. Would plug in as additional parser implementations alongside &lt;code&gt;parse_line&lt;/code&gt; in &lt;code&gt;logdive-core&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Follow mode (&lt;code&gt;-f&lt;/code&gt; / tail-and-index).&lt;/strong&gt; Watch a log file for new lines and ingest them as they appear. A good fit for &lt;code&gt;tokio::fs&lt;/code&gt; and the &lt;code&gt;notify&lt;/code&gt; crate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A browser UI.&lt;/strong&gt; The HTTP API is ready for one — someone with frontend chops could build a single-page React/Svelte/HTMX UI that talks to &lt;code&gt;logdive-api&lt;/code&gt; and gives people a browser-based query interface.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generated columns for frequently-queried JSON fields.&lt;/strong&gt; The big performance win. Would let users mark certain JSON fields as "promote to indexed column" and get known-field query performance for those.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmarks on more hardware.&lt;/strong&gt; If you run the existing &lt;code&gt;cargo bench&lt;/code&gt; suite on your machine, an issue/PR updating the README's Performance section with a broader sample would be genuinely useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker image for the HTTP API.&lt;/strong&gt; Dockerfile for &lt;code&gt;logdive-api&lt;/code&gt; with a volume mount for the index database. Natural next step for users who want to run the API as a service.&lt;/p&gt;

&lt;p&gt;The repo has CI, benchmarks, clean test coverage, and a documented contribution workflow. Issues and pull requests at &lt;a href="https://github.com/Aryagorjipour/logdive" rel="noopener noreferrer"&gt;github.com/Aryagorjipour/logdive&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  A note on context
&lt;/h2&gt;

&lt;p&gt;logdive started as the final project in a Rust learning journey. The framing mattered — I wanted a project that was small enough to finish, demanding enough to exercise real Rust (parsers, SQLite, async, concurrency, CLI, HTTP), and useful enough that I'd actually keep using it afterward.&lt;/p&gt;

&lt;p&gt;What I underestimated: how much of the effort lives in the parts that aren't writing code. Setting up a clean workspace. Choosing the right abstractions between core and binaries. Writing benchmarks that actually measure what you think they measure. Testing an HTTP server with &lt;code&gt;tower::ServiceExt::oneshot&lt;/code&gt;. Packaging a three-crate workspace for crates.io when one crate depends on another and you're publishing for the first time. Each of these had at least one subtle gotcha.&lt;/p&gt;

&lt;p&gt;The project is open source because the next person hitting the "jq vs Datadog" wall might as well benefit from it, and because Rust has given me enough that I want to give something back.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/Aryagorjipour/logdive" rel="noopener noreferrer"&gt;github.com/Aryagorjipour/logdive&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crates:&lt;/strong&gt; &lt;a href="https://crates.io/crates/logdive" rel="noopener noreferrer"&gt;logdive&lt;/a&gt;, &lt;a href="https://crates.io/crates/logdive-core" rel="noopener noreferrer"&gt;logdive-core&lt;/a&gt;, &lt;a href="https://crates.io/crates/logdive-api" rel="noopener noreferrer"&gt;logdive-api&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docs:&lt;/strong&gt; &lt;a href="https://docs.rs/logdive-core" rel="noopener noreferrer"&gt;docs.rs/logdive-core&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  About
&lt;/h2&gt;

&lt;p&gt;Arya Gorjipour — backend engineer, Rust learner, logdive maintainer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/Aryagorjipour" rel="noopener noreferrer"&gt;@Aryagorjipour&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Twitter/X: &lt;a href="https://twitter.com/Arysmart1" rel="noopener noreferrer"&gt;@Arysmart1&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/arysmart/" rel="noopener noreferrer"&gt;arysmart&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Issues, bug reports, and pull requests welcome. If you end up using logdive to debug a real production incident, I'd love to hear about it.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>opensource</category>
      <category>showdev</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
