<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: LyricalString</title>
    <description>The latest articles on DEV Community by LyricalString (@lyricalstring).</description>
    <link>https://dev.to/lyricalstring</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1193639%2F5f6d7609-52c4-4060-b2d3-02a326a6cc2b.png</url>
      <title>DEV Community: LyricalString</title>
      <link>https://dev.to/lyricalstring</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lyricalstring"/>
    <language>en</language>
    <item>
      <title>How I serve 12,237 law pages in 0.3 seconds with Astro and zero client JavaScript</title>
      <dc:creator>LyricalString</dc:creator>
      <pubDate>Mon, 06 Apr 2026 10:16:53 +0000</pubDate>
      <link>https://dev.to/lyricalstring/how-i-serve-12237-law-pages-in-03-seconds-with-astro-and-zero-client-javascript-512k</link>
      <guid>https://dev.to/lyricalstring/how-i-serve-12237-law-pages-in-03-seconds-with-astro-and-zero-client-javascript-512k</guid>
      <description>&lt;p&gt;Spanish law is public. Reading it shouldn't cost €200/month.&lt;/p&gt;

&lt;p&gt;That's why I built &lt;a href="https://leyabierta.es" rel="noopener noreferrer"&gt;Ley Abierta&lt;/a&gt;, an open source platform indexing every Spanish law from 1835 to today. 12,237 laws. 42,000 Git commits tracking every reform. A Lighthouse Performance score of 100.&lt;/p&gt;

&lt;p&gt;Here's how it works.&lt;/p&gt;

&lt;h2&gt;The problem&lt;/h2&gt;

&lt;p&gt;Spain's official gazette (BOE) publishes legislation as XML. If you want the consolidated text of a law with all reforms applied, you either:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the BOE website (good luck navigating it)&lt;/li&gt;
&lt;li&gt;Pay for Westlaw, Aranzadi, or similar services&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There's no free, searchable, version-controlled source of Spanish law. So I built one.&lt;/p&gt;

&lt;h2&gt;Architecture overview&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BOE API → Pipeline (Bun) → Git repo (Markdown) → Astro (SSG) → Cloudflare Pages
                         → SQLite + FTS5       → Elysia API  → Hetzner Docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same data lives in three places, each for a different reason:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON cache&lt;/td&gt;
&lt;td&gt;12,231 JSON files&lt;/td&gt;
&lt;td&gt;Pipeline source of truth&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Git repo&lt;/td&gt;
&lt;td&gt;Markdown + YAML frontmatter&lt;/td&gt;
&lt;td&gt;Human readable, version control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SQLite&lt;/td&gt;
&lt;td&gt;14 tables + FTS5 index&lt;/td&gt;
&lt;td&gt;Fast queries, full text search&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The website is fully static. The API handles search and the dynamic stuff (email alerts, omnibus detection). They deploy independently.&lt;/p&gt;

&lt;h2&gt;Astro content collections: 12K pages, one build&lt;/h2&gt;

&lt;p&gt;Each law is a Markdown file in a public Git repo (&lt;a href="https://github.com/leyabierta/leyes" rel="noopener noreferrer"&gt;leyabierta/leyes&lt;/a&gt;). At build time, Astro checks out this repo and treats every file as a content collection entry.&lt;/p&gt;
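&lt;p&gt;The collection definition boils down to a few lines. This is a minimal sketch using Astro's &lt;code&gt;glob&lt;/code&gt; loader and a schema matching the frontmatter shown below; the project's actual config may differ:&lt;/p&gt;

```typescript
// content.config.ts -- illustrative sketch, not the project's real config
import { defineCollection, z } from "astro:content";
import { glob } from "astro/loaders";

const leyes = defineCollection({
  // Every Markdown file in the checked-out leyes repo becomes an entry
  loader: glob({ pattern: "**/*.md", base: "./leyes" }),
  schema: z.object({
    title: z.string(),
    rank: z.string(),
    status: z.string(),
    published_at: z.coerce.date(),
    jurisdiction: z.string(),
    materias: z.array(z.string()),
    reforms_count: z.number(),
  }),
});

export const collections = { leyes };
```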

&lt;p&gt;The frontmatter carries all the metadata:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ley&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;35/2006,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;de&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;28&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;de&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;noviembre,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;del&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Impuesto&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sobre&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;la&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Renta&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;de&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;las&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Personas&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Físicas"&lt;/span&gt;
&lt;span class="na"&gt;rank&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ley"&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vigente"&lt;/span&gt;
&lt;span class="na"&gt;published_at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2006-11-29"&lt;/span&gt;
&lt;span class="na"&gt;jurisdiction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;es"&lt;/span&gt;
&lt;span class="na"&gt;materias&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IRPF"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hacienda&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Pública"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Impuestos"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;reforms_count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;47&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Astro generates a static HTML page for each entry. The build takes ~45-60 seconds for all 12,231 pages using Astro 6.1.1's queued rendering with 4-worker concurrency.&lt;/p&gt;

&lt;p&gt;What comes out the other side is pure HTML on Cloudflare's CDN. There's nothing to hydrate, nothing to parse on the client. Load time is basically network overhead.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Performance:  100
FCP:          0.3s
LCP:          0.7s
TBT:          0ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;The daily pipeline&lt;/h2&gt;

&lt;p&gt;Every morning, a GitHub Actions workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Discovers new laws from the BOE API&lt;/li&gt;
&lt;li&gt;Fetches XML in parallel (6 workers, rate limited)&lt;/li&gt;
&lt;li&gt;Parses metadata, articles, reform history&lt;/li&gt;
&lt;li&gt;Commits each law as a Markdown file with the real publication date as the commit date&lt;/li&gt;
&lt;li&gt;Pushes to the leyes repo&lt;/li&gt;
&lt;li&gt;Triggers an Astro rebuild if anything changed&lt;/li&gt;
&lt;/ol&gt;
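&lt;p&gt;Step 2 boils down to a small worker pool. A minimal sketch (a hypothetical helper, not the pipeline's actual code; the real version adds rate limiting on top):&lt;/p&gt;

```typescript
// Run async tasks with at most `limit` in flight at once (the pipeline uses 6).
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim the next index; safe because JS is single-threaded
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, () => worker()),
  );
  return results;
}
```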

&lt;p&gt;On Sundays, a full re-check catches updates to existing laws. Weekday runs are incremental, picking up only new publications.&lt;/p&gt;

&lt;h3&gt;The pre-1970 problem&lt;/h3&gt;

&lt;p&gt;Git stores dates as Unix timestamps. The oldest law in the database is from 1835, more than a century before the Unix epoch.&lt;/p&gt;

&lt;p&gt;My workaround: the commit date is set to &lt;code&gt;1970-01-02&lt;/code&gt; (one day after the epoch, so the timestamp stays positive in every time zone), while the real publication date lives in the YAML frontmatter and in a custom Git trailer (&lt;code&gt;Source-Date: 1835-05-24&lt;/code&gt;). The web and API always use the real date; Git history shows the placeholder.&lt;/p&gt;

&lt;p&gt;This affects ~334 laws. Not ideal, but it preserves the commit-per-reform model that makes &lt;code&gt;git diff&lt;/code&gt; work across the entire corpus.&lt;/p&gt;
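&lt;p&gt;The clamping itself is tiny. A sketch of the logic (hypothetical helper names; the real pipeline wires this into the Git commit environment):&lt;/p&gt;

```typescript
// Git commit dates are Unix timestamps, so dates before the epoch are unsafe.
// 1970-01-02 keeps the timestamp positive in every time zone.
const EARLIEST_SAFE_DATE = "1970-01-02";

function commitDateFor(publicationDate: string): {
  commitDate: string;
  trailer?: string;
} {
  // ISO dates compare correctly as plain strings
  if (publicationDate < EARLIEST_SAFE_DATE) {
    return {
      commitDate: EARLIEST_SAFE_DATE,
      trailer: `Source-Date: ${publicationDate}`, // real date survives in the trailer
    };
  }
  return { commitDate: publicationDate };
}
```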

&lt;h3&gt;BOE API quirks (hard-won knowledge)&lt;/h3&gt;

&lt;p&gt;A few things the documentation won't tell you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Accept: application/json&lt;/code&gt; returns 400 on the &lt;code&gt;/texto&lt;/code&gt; endpoint. You must parse XML.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;limit=-1&lt;/code&gt; silently caps at 10,000 results. Always paginate with explicit offsets.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;/analisis&lt;/code&gt; endpoint returns a subset of subject categories. For the full list, you need to scrape ELI meta tags from the HTML version.&lt;/li&gt;
&lt;li&gt;Regional laws use IDs from regional bulletins (BOA, BOJA, DOGV), not BOE. Jurisdiction must be extracted from the ELI URL pattern (&lt;code&gt;/eli/es-pv/&lt;/code&gt; → Basque Country).&lt;/li&gt;
&lt;/ul&gt;
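&lt;p&gt;The last quirk reduces to a one-line pattern match. A sketch (the URLs and regex are illustrative; the real resolver may handle more edge cases):&lt;/p&gt;

```typescript
// Pull the jurisdiction code out of an ELI URL, e.g. /eli/es-pv/... -> "es-pv"
function jurisdictionFromEli(eliUrl: string): string | null {
  const match = eliUrl.match(/\/eli\/(es(?:-[a-z]{2,3})?)\//);
  return match ? match[1] : null;
}
```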

&lt;h2&gt;Full-text search with SQLite FTS5&lt;/h2&gt;

&lt;p&gt;Search needs to be fast and accent-insensitive ("politica" should match "política"). SQLite's FTS5 extension handles this natively.&lt;/p&gt;

&lt;p&gt;The search index covers law titles, full text, and citizen-friendly tags:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;VIRTUAL&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;norms_fts&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;fts5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;norm_id&lt;/span&gt; &lt;span class="n"&gt;UNINDEXED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;citizen_tags&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Queries use a two-pass approach: title matches rank higher than content matches. Results are paginated with chunked ID filtering, splitting large ID lists into 5K-item chunks so each query stays under SQLite's bound-variable limit.&lt;/p&gt;
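&lt;p&gt;The chunking step is simple. A sketch (hypothetical helper; one query runs per chunk and the results are concatenated):&lt;/p&gt;

```typescript
// Split a large ID list so each IN (...) query stays under SQLite's
// bound-variable limit.
function chunkIds<T>(ids: T[], size = 5000): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < ids.length; i += size) {
    chunks.push(ids.slice(i, i + size));
  }
  return chunks;
}
```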

&lt;p&gt;The API (Elysia on Bun) exposes this as a REST endpoint with filters for rank, status, jurisdiction, and subject category. Swagger docs at &lt;code&gt;api.leyabierta.es/swagger&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;Omnibus law detection&lt;/h2&gt;

&lt;p&gt;An "omnibus" law bundles unrelated topics into a single piece of legislation. Governments use them to slip unpopular measures past public scrutiny, and in Spain it happens all the time. A tax reform hidden inside a natural disaster decree, that kind of thing. Nobody was tracking it, so I built a detector.&lt;/p&gt;

&lt;h3&gt;How detection works&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;If a law touches 15+ distinct subject categories (after filtering generic ones like "Public Administration"), flag it as omnibus&lt;/li&gt;
&lt;li&gt;Extract the law's structure (titles, chapters, articles) and send to Gemini Flash&lt;/li&gt;
&lt;li&gt;The model generates a label, headline, summary, article count, and a &lt;code&gt;sneaked_in&lt;/code&gt; boolean for each topic&lt;/li&gt;
&lt;/ol&gt;
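&lt;p&gt;Step 1 is a plain threshold check. A sketch (the generic-category list here is illustrative, not the real filter):&lt;/p&gt;

```typescript
// Flag a law as omnibus when it spans many distinct, non-generic subjects.
const GENERIC = new Set(["Administración Pública", "Organización"]); // illustrative

function isOmnibus(materias: string[], threshold = 15): boolean {
  const distinct = new Set(materias.filter((m) => !GENERIC.has(m)));
  return distinct.size >= threshold;
}
```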

&lt;p&gt;The &lt;code&gt;sneaked_in&lt;/code&gt; flag is the interesting part. It catches topics that have nothing to do with the law's official title. Energy regulation buried in a social security update, that sort of thing.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"topic_label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Energía (medida encubierta)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"headline"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"New renewable energy requirements"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"article_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sneaked_in"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cost: ~$0.01/day using Gemini Flash through OpenRouter.&lt;/p&gt;

&lt;p&gt;Results are served via API, rendered on the &lt;code&gt;/omnibus&lt;/code&gt; page, and available as an RSS feed.&lt;/p&gt;

&lt;h2&gt;Email notifications&lt;/h2&gt;

&lt;p&gt;Citizens can subscribe to topics they care about. When a law affecting those topics gets reformed, they get an email with a plain language summary.&lt;/p&gt;

&lt;p&gt;The system is event driven:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Daily cron generates AI summaries for new reforms&lt;/li&gt;
&lt;li&gt;Match subscriber topics against reform subjects&lt;/li&gt;
&lt;li&gt;Send via Resend (transactional email)&lt;/li&gt;
&lt;li&gt;Track in &lt;code&gt;notified_reforms&lt;/code&gt; to prevent duplicates&lt;/li&gt;
&lt;/ol&gt;
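&lt;p&gt;Step 2, the matching, is set intersection plus a dedupe check. A sketch with hypothetical types (&lt;code&gt;notified_reforms&lt;/code&gt; is modeled here as an in-memory Set):&lt;/p&gt;

```typescript
interface Subscriber {
  email: string;
  topics: string[];
}

// Who gets an email for this reform? Skip anything already sent.
function subscribersToNotify(
  reformId: string,
  reformSubjects: string[],
  subscribers: Subscriber[],
  notifiedReforms: Set<string>,
): Subscriber[] {
  if (notifiedReforms.has(reformId)) return [];
  const subjects = new Set(reformSubjects);
  return subscribers.filter((s) => s.topics.some((t) => subjects.has(t)));
}
```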

&lt;p&gt;Double opt-in uses HMAC-signed confirmation links. No authentication is needed; subscriptions are managed by an emailed token.&lt;/p&gt;
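&lt;p&gt;The token scheme can be sketched with Node's built-in crypto (a simplified illustration; the real links may sign more fields than just the email):&lt;/p&gt;

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign and verify a double opt-in confirmation token.
function signToken(email: string, secret: string): string {
  return createHmac("sha256", secret).update(email).digest("hex");
}

function verifyToken(email: string, token: string, secret: string): boolean {
  const expected = signToken(email, secret);
  if (token.length !== expected.length) return false;
  // Constant-time comparison avoids leaking how many characters matched
  return timingSafeEqual(Buffer.from(token), Buffer.from(expected));
}
```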

&lt;h2&gt;Things I'd do differently&lt;/h2&gt;

&lt;p&gt;SQLite from day one. I spent weeks querying the Git repo directly before accepting that Git is not a database. &lt;code&gt;git log --grep&lt;/code&gt; is not a substitute for &lt;code&gt;WHERE materia = 'IRPF' AND status = 'vigente'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I also shouldn't have trusted the BOE API docs. They're incomplete and in some places just wrong. I'd have saved time by probing the endpoints and mapping their actual behavior from the start.&lt;/p&gt;

&lt;p&gt;One Astro gotcha: content collections with 12K+ files will eat your memory during builds if you're not careful. Queued rendering in Astro 6 fixed this but I burned a few afternoons on OOM crashes before finding it.&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Web: &lt;a href="https://leyabierta.es" rel="noopener noreferrer"&gt;leyabierta.es&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;API: &lt;a href="https://api.leyabierta.es/swagger" rel="noopener noreferrer"&gt;api.leyabierta.es/swagger&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Code: &lt;a href="https://github.com/leyabierta/leyabierta" rel="noopener noreferrer"&gt;github.com/leyabierta/leyabierta&lt;/a&gt; (MIT)&lt;/li&gt;
&lt;li&gt;Laws: &lt;a href="https://github.com/leyabierta/leyes" rel="noopener noreferrer"&gt;github.com/leyabierta/leyes&lt;/a&gt; (Public domain)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have ideas, spot bugs, or want to adapt this for your country's legislation, issues and PRs are welcome.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Alex, a solo developer from Spain. You can find me on &lt;a href="https://linkedin.com/in/lyricalstring" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>astro</category>
      <category>webdev</category>
      <category>opensource</category>
      <category>performance</category>
    </item>
    <item>
      <title>Building an AI Coworker That Asks Questions Instead of Guessing</title>
      <dc:creator>LyricalString</dc:creator>
      <pubDate>Fri, 20 Mar 2026 11:47:23 +0000</pubDate>
      <link>https://dev.to/lyricalstring/building-an-ai-coworker-that-asks-questions-instead-of-guessing-32lh</link>
      <guid>https://dev.to/lyricalstring/building-an-ai-coworker-that-asks-questions-instead-of-guessing-32lh</guid>
      <description>&lt;p&gt;You tell your AI coworker: "create a task for the new feature."&lt;/p&gt;

&lt;p&gt;It creates the task. Assigns it to nobody. Sets priority to medium. Picks a random project.&lt;/p&gt;

&lt;p&gt;Nothing is technically wrong. But everything is useless.&lt;/p&gt;

&lt;p&gt;The AI didn't have context. And instead of asking, it guessed.&lt;/p&gt;

&lt;p&gt;This is the default behavior of every LLM tool system I've seen. Missing parameter? Use a default. Ambiguous input? Pick the most likely interpretation. The AI never stops and says "hey, who should I assign this to?"&lt;/p&gt;

&lt;p&gt;So I built a system that does exactly that.&lt;/p&gt;

&lt;h2&gt;The Design: AskUserQuestion as a First-Class Tool&lt;/h2&gt;

&lt;p&gt;The idea is simple: give the LLM a tool called &lt;code&gt;ask_user_question&lt;/code&gt; that it can call like any other tool. Instead of creating a task, sending a message, or querying a database — it asks the human a question.&lt;/p&gt;

&lt;p&gt;Here's the tool definition the LLM sees:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ask_user_question&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Ask the user a clarifying question with a rich interactive UI.
    Use when you need user input before proceeding. Supports free-text,
    single/multi-choice, and yes/no questions.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The question to ask&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;question_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;free_text | single_choice | multi_choice | yes_no&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Option A&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="c1"&gt;// Or for sequences:&lt;/span&gt;
    &lt;span class="nx"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Who should own this?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;question_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;single_choice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What priority?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;question_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;single_choice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM decides when to use it. Not the user. Not the system. The AI recognizes it's missing information and proactively asks before proceeding.&lt;/p&gt;

&lt;p&gt;The AI isn't a chatbot waiting for input. It's an agent executing a task that chooses to pause because it needs clarification.&lt;/p&gt;

&lt;h2&gt;The Hard Part: Blocking Execution&lt;/h2&gt;

&lt;p&gt;When the LLM calls &lt;code&gt;ask_user_question&lt;/code&gt;, the tool needs to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Show the question to the user&lt;/li&gt;
&lt;li&gt;Wait for their answer&lt;/li&gt;
&lt;li&gt;Return the answer as the tool result&lt;/li&gt;
&lt;li&gt;Let the LLM continue in the same execution context&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Steps 1 and 4 are easy. Steps 2 and 3 are the interesting engineering problem.&lt;/p&gt;

&lt;p&gt;The LLM is running inside a tool execution pipeline. When it calls a tool, the pipeline awaits the result before the model can continue. But our "result" depends on a human doing something in a browser, which could take seconds or minutes.&lt;/p&gt;

&lt;h3&gt;The Redis Pub/Sub Parking Pattern&lt;/h3&gt;

&lt;p&gt;Here's how we solved it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AskUserQuestionService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;parkAndWaitForAnswer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;questionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;questionData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;StoredQuestion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;QuestionAnswer&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1. Store the question in Redis with a 5-minute TTL&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redisService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`ask-question:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;questionId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;questionData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// 5 minutes&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Create a dedicated Redis subscriber for this question&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;subscriber&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;redisUrl&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;channel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`ask-question-answer:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;questionId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// 3. Block until we receive the answer (or timeout)&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;QuestionAnswer&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Timed out&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// 5 minutes&lt;/span&gt;

        &lt;span class="nx"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;

      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unsubscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;subscriber&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quit&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redisService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deleteState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`ask-question:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;questionId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool execution blocks on a Promise that resolves when the user answers. Redis pub/sub acts as the bridge between the user's browser and the waiting tool.&lt;/p&gt;

&lt;p&gt;When the user submits their answer, the API endpoint publishes to that specific channel:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;submitAnswer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;questionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Validate caller matches the intended recipient&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redisService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`ask-question:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;questionId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stored&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;workspaceId&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;callerWorkspaceId&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;stored&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;memberId&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;callerMemberId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Publish -&amp;gt; the waiting subscriber resolves -&amp;gt; tool returns -&amp;gt; LLM continues&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;publisher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;publish&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`ask-question-answer:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;questionId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;answeredAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dedicated subscriber per question is important. You can't reuse the main Redis client: once a connection enters subscriber mode it can't issue regular commands. Each pending question gets its own subscriber, its own channel, and its own cleanup.&lt;/p&gt;
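&lt;p&gt;A minimal sketch of that per-question wait, with the subscriber connection injected so the logic stands alone. In the real service this would be a fresh ioredis client per question; the names here are illustrative, not the actual code:&lt;/p&gt;

```typescript
// Per-question wait: one dedicated subscriber, one channel, one cleanup.
// The subscriber is injected so the flow is testable without a Redis server.
export interface AnswerSubscriber {
  subscribe(channel: string): any;
  on(event: string, handler: (channel: string, message: string) => void): void;
  quit(): any;
}

export function waitForAnswer(
  subscriber: AnswerSubscriber,
  questionId: string,
  timeoutMs = 5 * 60 * 1000,
) {
  const channel = "ask-question-answer:" + questionId;

  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error("question timed out"));
    }, timeoutMs);

    function cleanup() {
      // Tear down the dedicated connection once the question is settled.
      clearTimeout(timer);
      subscriber.quit().catch(() => {});
    }

    // Register the handler before subscribing so no message can slip past.
    subscriber.on("message", (incoming, message) => {
      if (incoming !== channel) return;
      cleanup();
      resolve(JSON.parse(message).answer);
    });

    subscriber.subscribe(channel).catch((err: unknown) => {
      cleanup();
      reject(err);
    });
  });
}
```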

&lt;h2&gt;
  
  
  Delivering Questions via WebSocket
&lt;/h2&gt;

&lt;p&gt;The question needs to appear in the user's chat in real time. We use our existing WebSocket infrastructure to push a &lt;code&gt;ask-user-question&lt;/code&gt; event that the frontend listens for.&lt;/p&gt;

&lt;p&gt;On the frontend, when a question arrives:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Zustand store maps &lt;code&gt;conversationId&lt;/code&gt; to &lt;code&gt;pendingQuestion&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The chat input component is replaced with the question card&lt;/li&gt;
&lt;li&gt;The user interacts with the card (selects options, types text)&lt;/li&gt;
&lt;li&gt;On submit, a POST to &lt;code&gt;/ai-ask-question/answer&lt;/code&gt; sends the answer back&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The input replacement is the key UX decision. The question doesn't appear as a message in the chat — it takes over the input area. This makes it clear that the AI is waiting for you, and you can't do anything else in that conversation until you answer (or dismiss the question).&lt;/p&gt;
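&lt;p&gt;Stripped of the Zustand wiring, the store logic is just a map from &lt;code&gt;conversationId&lt;/code&gt; to the pending question. A framework-free sketch with illustrative shapes:&lt;/p&gt;

```typescript
// Framework-free sketch of the pending-question store described above
// (the real app keeps this state in Zustand). Field names are illustrative.
export interface PendingQuestion {
  questionId: string;
  question: string;
  questionType: "single_choice" | "free_text";
  options?: string[];
}

export class PendingQuestionStore {
  private pending: { [conversationId: string]: PendingQuestion } = {};

  // Called when the ask-user-question WebSocket event arrives.
  setPending(conversationId: string, q: PendingQuestion) {
    this.pending[conversationId] = q;
  }

  // The chat input renders the question card whenever this returns a value.
  getPending(conversationId: string) {
    return this.pending[conversationId] ?? null;
  }

  // Called after the POST to /ai-ask-question/answer succeeds, or on dismiss.
  clearPending(conversationId: string) {
    delete this.pending[conversationId];
  }
}
```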

&lt;h2&gt;
  
  
  Multi-Question Sequences
&lt;/h2&gt;

&lt;p&gt;Sometimes the AI needs to ask multiple related things. Instead of calling the tool three times (which would show three separate cards), it can send a sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;ask_user_question&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Which project?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;question_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;single_choice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Who should own it?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;question_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;single_choice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Any additional context?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;question_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;free_text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user sees a paginated card with arrow navigation. Answers are collected locally and submitted all at once. The LLM receives all answers in a single tool result.&lt;/p&gt;

&lt;p&gt;This is better than multiple tool calls because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One round-trip instead of three&lt;/li&gt;
&lt;li&gt;The user sees all questions upfront (progress indicator: "2 of 3")&lt;/li&gt;
&lt;li&gt;They can skip questions they don't want to answer&lt;/li&gt;
&lt;li&gt;The LLM gets all context at once, not incrementally&lt;/li&gt;
&lt;/ul&gt;
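&lt;p&gt;The post doesn't show the tool definition itself, but a schema consistent with the calls above might look like this (hypothetical; the product's actual schema may differ):&lt;/p&gt;

```typescript
// Hypothetical definition of ask_user_question as the LLM sees it,
// reconstructed from the example calls in this post.
export const askUserQuestionTool = {
  name: "ask_user_question",
  description:
    "Ask the user one or more questions when you need their input before proceeding. " +
    "Prefer one call with several questions over several separate calls.",
  parameters: {
    type: "object",
    properties: {
      questions: {
        type: "array",
        minItems: 1,
        items: {
          type: "object",
          properties: {
            question: { type: "string" },
            question_type: {
              type: "string",
              enum: ["single_choice", "free_text"],
            },
            options: {
              type: "array",
              items: { type: "string" },
              description: "Required for single_choice questions.",
            },
          },
          required: ["question", "question_type"],
        },
      },
    },
    required: ["questions"],
  },
};
```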

&lt;h2&gt;
  
  
  Mobile: Same Feature, Different Challenges
&lt;/h2&gt;

&lt;p&gt;We built the same feature in React Native. Same WebSocket delivery, same Zustand store pattern, same question types. But mobile has its own quirks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keyboard management: the input replacement needs to handle the software keyboard showing/hiding&lt;/li&gt;
&lt;li&gt;Haptic feedback: option selection triggers &lt;code&gt;Haptics.impactAsync()&lt;/code&gt; for tactile confirmation&lt;/li&gt;
&lt;li&gt;Scroll behavior: the question card needs to stay visible above the keyboard&lt;/li&gt;
&lt;li&gt;Offline: if the user is offline when the question arrives, the WebSocket reconnect flow needs to re-deliver it&lt;/li&gt;
&lt;/ul&gt;
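&lt;p&gt;The offline case can be sketched as a rehydration step on reconnect. The pending-question lookup here is hypothetical, assumed only for illustration; the post just says reconnect needs to re-deliver:&lt;/p&gt;

```typescript
// Re-delivery after a WebSocket reconnect. fetchPending stands in for a
// hypothetical server lookup of what is still waiting in Redis.
export interface DeliveredQuestion {
  questionId: string;
  conversationId: string;
  question: string;
}

export async function rehydratePendingQuestion(
  conversationId: string,
  // e.g. a GET against a hypothetical pending-question endpoint
  fetchPending: (conversationId: string) => any,
  showCard: (q: DeliveredQuestion) => void,
) {
  // A question younger than the 5-minute Redis TTL is still answerable,
  // so surface it again instead of leaving the normal chat input in place.
  const pending = await fetchPending(conversationId);
  if (pending) {
    showCard(pending);
    return true;
  }
  return false;
}
```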

&lt;h2&gt;
  
  
  What the LLM Actually Receives
&lt;/h2&gt;

&lt;p&gt;When the user answers, the tool returns a plain string. For single questions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Project Alpha"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For sequences, it's formatted as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Q1 (Which project?): Project Alpha
Q2 (Who should own it?): Maria
Q3 (Additional context?): This is for the Q2 release
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple text. No JSON. The LLM reads it naturally and continues its task with full context.&lt;/p&gt;
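&lt;p&gt;A formatter that produces exactly these shapes could look like the following sketch (multi-select rendering is an assumption, since the post doesn't show one):&lt;/p&gt;

```typescript
// Reproduces the plain-text result formats shown above: a bare string for a
// single answer, "Qn (question): answer" lines for a sequence.
export interface AnsweredQuestion {
  question: string;
  answer: string | string[];
}

export function formatToolResult(answers: AnsweredQuestion[]) {
  // Assumption: multi-select answers are joined with commas.
  const render = (a: string | string[]) =>
    Array.isArray(a) ? a.join(", ") : a;

  if (answers.length === 1) {
    return render(answers[0].answer);
  }
  return answers
    .map((a, i) => "Q" + (i + 1) + " (" + a.question + "): " + render(a.answer))
    .join("\n");
}
```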

&lt;p&gt;If the user doesn't answer within the timeout (5 minutes), the tool returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"userAnswer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"timedOut"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The user did not respond within the time limit"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM then decides what to do — usually it falls back to reasonable defaults and notes which values it assumed.&lt;/p&gt;
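&lt;p&gt;The timeout path can be sketched as a race between the pending answer and a timer that resolves to the structured result above:&lt;/p&gt;

```typescript
// Race the pending answer against a timer; on timeout, hand the LLM the
// structured result shown above instead of an answer string.
const TIMEOUT_RESULT = {
  userAnswer: null,
  timedOut: true,
  message: "The user did not respond within the time limit",
};

export function withTimeout(answerPromise: any, timeoutMs = 5 * 60 * 1000) {
  let timer: any;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(TIMEOUT_RESULT), timeoutMs);
  });
  // Whichever settles first wins; always clear the timer so nothing leaks.
  return Promise.race([answerPromise, timeout]).finally(() => clearTimeout(timer));
}
```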

&lt;h2&gt;
  
  
  When Should AI Ask vs. Infer?
&lt;/h2&gt;

&lt;p&gt;This is the real design question. You don't want an AI that asks about everything — that's worse than one that guesses.&lt;/p&gt;

&lt;p&gt;Our heuristic: &lt;strong&gt;ask when the wrong guess has meaningful consequences&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Assigning a task to the wrong person? Ask.&lt;/li&gt;
&lt;li&gt;Picking the wrong project? Ask.&lt;/li&gt;
&lt;li&gt;Choosing between high and medium priority? Infer (low stakes).&lt;/li&gt;
&lt;li&gt;Formatting a message slightly differently? Infer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tool description tells the LLM: "Use when you need user input &lt;strong&gt;before proceeding&lt;/strong&gt;." The emphasis on "before proceeding" signals that this is for blockers, not preferences.&lt;/p&gt;

&lt;p&gt;In practice, the LLM uses it about once every 10-15 tool calls. Just enough to be helpful without being annoying.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Considerations
&lt;/h2&gt;

&lt;p&gt;Every question is scoped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stored in Redis with &lt;code&gt;workspaceId + memberId&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Answer submission validates the caller matches the stored recipient&lt;/li&gt;
&lt;li&gt;Questions auto-expire after 5 minutes (Redis TTL)&lt;/li&gt;
&lt;li&gt;All transport is via authenticated WebSocket&lt;/li&gt;
&lt;li&gt;No question data persists after the flow completes&lt;/li&gt;
&lt;/ul&gt;
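&lt;p&gt;The scoping and expiry checks boil down to one pure function, shown here as a sketch (the stored scope comes from Redis; &lt;code&gt;null&lt;/code&gt; means the TTL expired):&lt;/p&gt;

```typescript
// The answer-endpoint scoping check as a pure function: the caller must
// match the workspace and member stored with the question in Redis.
export interface QuestionScope {
  workspaceId: string;
  memberId: string;
}

export function callerMayAnswer(stored: QuestionScope | null, caller: QuestionScope) {
  // A null stored scope covers expiry: the Redis key has a 5-minute TTL,
  // so a late submission finds nothing and is rejected.
  if (!stored) return false;
  if (stored.workspaceId !== caller.workspaceId) return false;
  if (stored.memberId !== caller.memberId) return false;
  return true;
}
```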

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;If I were building this again:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Batch questions more aggressively.&lt;/strong&gt; The LLM sometimes asks one question, gets the answer, then realizes it needs to ask another. I'd add a system prompt nudge to gather all unknowns before asking.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Persistent questions.&lt;/strong&gt; If the user closes the app and reopens, the pending question is gone. The Redis TTL is 5 minutes. For async workflows, this should be longer and stored in the database, not just Redis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Question templates.&lt;/strong&gt; The LLM generates the question text every time. Pre-defined templates for common patterns (assignee selection, project picker) would be faster and more consistent.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;An AI tool system is incomplete if the AI can't ask questions. Every other tool (create task, send message, query data) assumes the AI has enough context. This one covers the case where it doesn't.&lt;/p&gt;

&lt;p&gt;Small addition to the tool set. Big difference in how the AI actually works with you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm building &lt;a href="https://trilo.app" rel="noopener noreferrer"&gt;Trilo&lt;/a&gt;, a workspace that unifies tasks, chat, and notes for solopreneurs — with an AI coworker that actually understands your work. If you're interested in AI-powered productivity tools, let's connect.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Find me on &lt;a href="https://www.linkedin.com/in/lyricalstring/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://github.com/lyricalstring" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
    </item>
  </channel>
</rss>
