<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rishabh Kharyal</title>
    <description>The latest articles on DEV Community by Rishabh Kharyal (@rishabh_kharyal_5d4e2610b).</description>
    <link>https://dev.to/rishabh_kharyal_5d4e2610b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3706862%2F231635ec-f23a-4427-af67-d9b41bd496d3.png</url>
      <title>DEV Community: Rishabh Kharyal</title>
      <link>https://dev.to/rishabh_kharyal_5d4e2610b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rishabh_kharyal_5d4e2610b"/>
    <language>en</language>
    <item>
      <title>Building a Bit-Accurate Fused QKV + RoPE Kernel for Qwen 2.5 in Triton</title>
      <dc:creator>Rishabh Kharyal</dc:creator>
      <pubDate>Thu, 23 Apr 2026 16:27:23 +0000</pubDate>
      <link>https://dev.to/rishabh_kharyal_5d4e2610b/building-a-bit-accurate-fused-qkv-rope-kernel-for-qwen-25-in-triton-1901</link>
      <guid>https://dev.to/rishabh_kharyal_5d4e2610b/building-a-bit-accurate-fused-qkv-rope-kernel-for-qwen-25-in-triton-1901</guid>
      <description>&lt;p&gt;How to replace 10+ PyTorch operations with a single GPU kernel while keeping the output identical to the original model – down to the last decimal.&lt;/p&gt;

&lt;p&gt;If you’ve ever profiled a small Transformer on a consumer GPU, you know the pain: every decode step launches a swarm of tiny kernels, and Python dispatch overhead eats away your token rate. The solution is kernel fusion – but getting it right, especially with Rotary Position Embeddings, isn’t trivial.&lt;/p&gt;

&lt;p&gt;This post walks through triton_fused_attention_v3.py, a self‑contained Triton kernel that fuses QKV projection + RoPE + KV cache write into a single launch for Qwen 2.5‑0.5B. It delivers a 4.5–5× speedup while maintaining cosine similarity = 1.000000 against the reference HuggingFace output. No special hardware needed – this runs on an RTX 3050 laptop GPU.&lt;/p&gt;

&lt;p&gt;We’ll cover:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Why the kernel exists

The two design rules that guarantee bit‑perfect output

A line‑by‑line walkthrough of the kernel

The benchmark setup that proves both speed and accuracy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Why Fuse? The Death‑by‑a‑Thousand‑Kernels Problem&lt;/p&gt;

&lt;p&gt;Here’s what happens when you call a standard PyTorch attention block:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;q = linear(hidden, W_q)  # kernel launch 1
k = linear(hidden, W_k)  # launch 2
v = linear(hidden, W_v)  # launch 3
q, k = rope(q, k)        # 2 more launches (reshape + complex op)
# ...then KV cache write, SDPA, O projection
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;For Qwen 2.5‑0.5B (24 layers), that’s 240 separate GPU kernel launches per token. Each launch costs ~80 µs of CPU–GPU handshaking – not much alone, but 240 × 80 µs = 19 ms of pure overhead. The RTX 3050 can theoretically process a token in under 6 ms; we were spending over 30 ms because the GPU sat idle waiting for work.&lt;/p&gt;
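The overhead budget above is worth checking by hand. A few lines reproduce the arithmetic, using the per-launch cost and launches-per-layer count quoted in this post:

```python
# Back-of-envelope launch-overhead budget for Qwen 2.5-0.5B decode.
layers = 24
launches_per_layer = 10        # Q, K, V projections, RoPE ops, cache write, ...
launch_cost_s = 80e-6          # ~80 us of CPU-GPU handshaking per launch

launches_per_token = layers * launches_per_layer    # 240
overhead_s = launches_per_token * launch_cost_s     # pure dispatch overhead

print(launches_per_token, round(overhead_s * 1e3, 1))  # prints: 240 19.2
```

That 19 ms of overhead alone already exceeds the ~6 ms physics budget for a whole token, which is why fusion pays off so dramatically here.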

&lt;p&gt;The fix is to pack as many of those operations as possible into one kernel. And that’s exactly what fused_qkv_rope_v3 does.&lt;/p&gt;

&lt;p&gt;What the Kernel Does (In One Shot)&lt;/p&gt;

&lt;p&gt;Inside a single Triton program we do:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tiled matrix‑vector multiply against a concatenated weight matrix [W_q ; W_k ; W_v] → Q, K, V

Apply RoPE on the Q and K heads using the local accumulator (no separate store‑reload)

Write rotated K and V directly into the persistent KV cache

Write rotated Q into an output buffer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;After that, only two more operations remain: the attention softmax (done with PyTorch’s scaled_dot_product_attention) and the output GEMV. The whole decode step collapses from 10+ kernels to just 3.&lt;/p&gt;

&lt;p&gt;And we don’t stop there: the script captures all three into a CUDA Graph, so subsequent tokens are fired with a single graph.replay() – virtually zero dispatch overhead.&lt;/p&gt;

&lt;p&gt;The Two Critical Design Rules&lt;/p&gt;

&lt;p&gt;Fusing RoPE inside a projection kernel is where things get slippery. RoPE rotates pairs of elements (2k, 2k+1) within each attention head. If you break those pairs across thread blocks, or let the partner value get rounded to FP16, you introduce numerical drift. This kernel uses two clean rules to avoid that completely.&lt;/p&gt;

&lt;p&gt;Rule 1: BLOCK_M = head_dim – Never Split a RoPE Pair&lt;/p&gt;

&lt;p&gt;Qwen 2.5‑0.5B has a head dimension of 64. So we set:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;BLOCK_M = hd  # 64
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This ensures that every Triton program instance processes one entire attention head (or one head‑sized chunk of V). All 32 RoPE pairs inside that head are adjacent in the same local accumulator – zero cross‑block reads.&lt;/p&gt;

&lt;p&gt;The grid is simply M_total // BLOCK_M = 18 blocks, which maps beautifully onto the GPU’s SMs.&lt;/p&gt;

&lt;p&gt;Rule 2: Use an FP32 Scratchpad for RoPE Partner Values&lt;/p&gt;

&lt;p&gt;To compute a rotation for element 2k you need the partner value at 2k+1 (and vice versa). In Triton you can’t just index acc[i+1] directly – you have to shift the data. The naive approach stores the accumulator to an FP16 buffer and reloads a shifted version. That FP16 round‑trip adds a small error that grows with sequence length.&lt;/p&gt;

&lt;p&gt;Instead, the kernel uses a tiny FP32 temporary buffer allocated before the kernel launch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;fp32_tmp = torch.empty(q_dim + k_dim, device='cuda', dtype=torch.float32)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Inside the kernel, after computing QKV values into the FP32 accumulator, we store them without rounding:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;tl.store(fp32_tmp_ptr + rows, acc, mask=mask)   # Full FP32
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Then we compute the partner indices (even rows read rows + 1, odd rows read rows - 1) and load from that FP32 buffer:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;partner_rows = tl.where(is_even, rows + 1, rows - 1)
partner = tl.load(fp32_tmp_ptr + partner_rows, mask=mask)  # FP32 load
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
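To see what that partner-index logic computes, here is a pure-Python sketch of the same even/odd rule for one 64-element head (head_dim is from the post; everything else is illustrative, not kernel code):

```python
head_dim = 64

def partner(row):
    # Even rows read their right neighbour, odd rows their left one,
    # mirroring tl.where(is_even, rows + 1, rows - 1).
    if row % 2 == 0:
        return row + 1
    return row - 1

pairs = [(r, partner(r)) for r in range(head_dim)]

# Every element's partner points back at it, and both halves of a
# RoPE pair (2k, 2k+1) stay inside the same head-sized block.
assert all(partner(p) == r for r, p in pairs)
assert all(max(r, p) // head_dim == 0 for r, p in pairs)
```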

&lt;p&gt;Now both acc and partner are in full precision for the rotation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;roped = tl.where(is_even,
                 acc * cos_val - partner * sin_val,
                 partner * sin_val + acc * cos_val)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
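The two branches of that tl.where are exactly a 2-D rotation of each (even, odd) pair. A quick pure-Python check (my own toy values) confirms they agree with multiplication by e^{iθ}:

```python
import cmath
import math

def rope_pair(x_even, x_odd, theta):
    # Same formulas as the kernel's two tl.where branches.
    cos_val, sin_val = math.cos(theta), math.sin(theta)
    even_out = x_even * cos_val - x_odd * sin_val   # acc = even, partner = odd
    odd_out = x_even * sin_val + x_odd * cos_val    # acc = odd, partner = even
    return even_out, odd_out

x_even, x_odd, theta = 0.8, -0.3, 0.5
rotated = complex(x_even, x_odd) * cmath.exp(1j * theta)
even_out, odd_out = rope_pair(x_even, x_odd, theta)
assert round(rotated.real - even_out, 12) == 0
assert round(rotated.imag - odd_out, 12) == 0
```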

&lt;p&gt;Only the final rotated value is cast to FP16 when written to the output or KV cache. This eliminates the precision loss entirely.&lt;/p&gt;

&lt;p&gt;A Walk Through the Kernel Launch Parameters&lt;/p&gt;

&lt;p&gt;The kernel is annotated with @triton.jit and takes a long list of parameters, but here’s what matters:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;W_ptr: The concatenated QKV weight matrix of shape (q_dim + k_dim + v_dim, d).

x_ptr: The input hidden state of size d.

b_ptr: Concatenated bias of the same shape as the output rows.

q_ptr, k_cache_ptr, v_cache_ptr: Output buffers for Q and the KV cache.

cos_ptr, sin_ptr: Pre‑computed RoPE frequencies for the current position.

fp32_tmp_ptr: The FP32 scratchpad used only for RoPE partner access.

M: Total number of Q+K+V rows (1152 for this model).

head_dim: 64. half_hd: 32.

BLOCK_M: Set to 64 (head‑aligned).

BLOCK_K: Tile size along the input dimension (128).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;The kernel iterates over the input dimension in chunks of BLOCK_K, accumulating the dot‑product results into registers.&lt;/p&gt;
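That accumulation loop is an ordinary tiled dot product. A minimal pure-Python analogue (BLOCK_K = 128 as above; the toy sizes and random data are my own, not the kernel's) shows the pattern:

```python
import random

d = 896          # input dimension (Qwen 2.5-0.5B hidden size)
BLOCK_K = 128    # tile size along the input dimension

random.seed(0)
w_row = [random.uniform(-1, 1) for _ in range(d)]
x = [random.uniform(-1, 1) for _ in range(d)]

# Accumulate one output row's dot product tile by tile,
# like the kernel's loop over BLOCK_K-sized chunks of the input.
acc = 0.0
for k0 in range(0, d, BLOCK_K):
    acc += sum(w_row[i] * x[i] for i in range(k0, k0 + BLOCK_K))

# The tiled result matches the untiled dot product.
full = sum(wi * xi for wi, xi in zip(w_row, x))
assert round(acc - full, 9) == 0
```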

&lt;p&gt;After the accumulation, it checks whether the current block of rows belongs to Q, K, or V, applies RoPE if needed, and writes to the appropriate destination. The code uses masks (is_q, is_k, is_v) to steer the output without heavily divergent branches.&lt;/p&gt;

&lt;p&gt;How the Script Benchmarks and Proves Correctness&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;__main__&lt;/code&gt; block in the file runs a thorough comparison for four cache lengths: 64, 128, 256, 512. For each length it:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Runs a standard() function that implements the exact HuggingFace attention logic in pure PyTorch. It times this over 300 iterations after 30 warmup runs, reporting the median kernel time in milliseconds.

Runs triton_v3() in eager mode – the three separate kernel launches – and times it identically. It computes the speedup and, crucially, calculates the cosine similarity between the standard output and the Triton output. The result is 1.000000 at every tested length.

Captures a CUDA Graph with the same three operations. After a few warmup replays, it times 500 graph replays and reports the even lower latency.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
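The correctness metric itself is simple. The script presumably uses torch.nn.functional.cosine_similarity; a pure-Python version of the same idea, with toy vectors of my own choosing:

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|); a value of 1.0 means the two
    # outputs point in exactly the same direction, i.e. lossless.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

reference = [0.5, -1.25, 3.0, 0.75]
assert round(cosine_similarity(reference, reference), 12) == 1.0
assert round(cosine_similarity(reference, [2 * x for x in reference]), 12) == 1.0
```

Note that cosine similarity is scale-invariant, so the script's 1.000000 at six decimal places is a strong (though not bit-level) signal of identical outputs.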
&lt;p&gt;A representative output snippet:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--- cache_len=64 ---
  Standard PyTorch: 2.123ms
  Triton V3 (eager): 0.468ms
  Speedup: 4.54x | cos: 1.000000
  Triton V3 + Graph: 0.393ms (5.40x)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
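The speedup figures in that snippet are just ratios of the reported medians, which is easy to verify:

```python
standard_ms = 2.123   # median time for the pure-PyTorch reference
eager_ms = 0.468      # fused Triton kernels, eager dispatch
graph_ms = 0.393      # same kernels replayed via CUDA Graph

print(round(standard_ms / eager_ms, 2),
      round(standard_ms / graph_ms, 2))  # prints: 4.54 5.4
```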

&lt;p&gt;The graph‑captured version consistently adds a further 15–20% improvement over eager mode, because it eliminates the remaining Python transitions between the three fused kernels.&lt;/p&gt;

&lt;p&gt;Key Takeaways for Triton Developers&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tile alignment is a correctness concern, not just a performance one. Setting BLOCK_M = head_dim ensured RoPE pairs were never split, which was the foundation for bit‑perfect output.

A tiny FP32 buffer (a few KB) can save you from silent precision drift. When the correctness of a rotation depends on a partner value, don’t let that value pass through FP16 – keep it full precision until the final write.

CUDA Graphs amplify the benefit of fusion. After fusing from 10 kernels to 3, a graph capture removes the last bit of Python overhead, squeezing out every bit of memory bandwidth.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Try the Code Yourself&lt;/p&gt;

&lt;p&gt;The file is self‑contained. Install the dependencies and run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip install torch triton transformers
python triton_fused_attention_v3.py
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;It will download Qwen 2.5‑0.5B (if not already cached), print the model dimensions, and run the four‑length benchmark. No other setup needed.&lt;/p&gt;

&lt;p&gt;The full repository – including a CuPy‑based Windows kernel and a batched throughput mode – is on GitHub, but the script we’ve explored here is the beating heart: a clean demonstration of how to fuse attention with RoPE while preserving 100% of the reference model’s output.&lt;/p&gt;

&lt;p&gt;Did you spot an optimisation I missed? Or have your own fused‑kernel war story? Drop a comment – I’d love to hear what the dev.to GPU community is building.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Skills03" rel="noopener noreferrer"&gt;
        Skills03
      &lt;/a&gt; / &lt;a href="https://github.com/Skills03/triton-fused-attention" rel="noopener noreferrer"&gt;
        triton-fused-attention
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;



</description>
      <category>triton</category>
      <category>gpu</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>I Built a 4.75× Faster Qwen 2.5 Engine for a $200 GPU – Here’s How</title>
      <dc:creator>Rishabh Kharyal</dc:creator>
      <pubDate>Thu, 23 Apr 2026 16:10:58 +0000</pubDate>
      <link>https://dev.to/rishabh_kharyal_5d4e2610b/i-built-a-475x-faster-qwen-25-engine-for-a-200-gpu-heres-how-2pj1</link>
      <guid>https://dev.to/rishabh_kharyal_5d4e2610b/i-built-a-475x-faster-qwen-25-engine-for-a-200-gpu-heres-how-2pj1</guid>
      <description>&lt;p&gt;My RTX 3050 laptop GPU was crawling at 30 tokens per second with Qwen 2.5‑0.5B. So I tore apart the attention block, wrote some custom Triton kernels, and got it to 140 tok/s – completely lossless. No paid APIs, no datacenter hardware required.&lt;/p&gt;

&lt;p&gt;Local AI agents need sub‑second responses. When you hit 30 tok/s on a machine that can theoretically process one token in 5.88 ms, something is wrong. That was my starting point: Qwen 2.5‑0.5B, HuggingFace Transformers, an RTX 3050 6 GB laptop GPU – and terribly sluggish generation.&lt;/p&gt;

&lt;p&gt;I didn’t want to sacrifice quality, quantise aggressively, or switch to a smaller model. So I dived into the GPU execution model, fused a pile of kernels, and ended up with a 4.75× lossless speedup. This post walks through exactly what I built and what every developer can learn from it.&lt;/p&gt;

&lt;p&gt;The Baseline: Why HuggingFace Leaves Speed on the Table&lt;/p&gt;

&lt;p&gt;A naive forward pass of a single Transformer attention block launches 10+ separate GPU kernels:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Q projection → K projection → V projection →
RoPE on Q → RoPE on K →
GQA expand → SDPA → reshape → O projection → residual add
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;For Qwen 2.5‑0.5B (24 layers), that’s over 240 kernel launches per token. Each launch wastes about 80 µs on Python‑side dispatch, so across all 24 layers you’re burning roughly 19 ms per token just waiting for the GPU to start working.&lt;/p&gt;

&lt;p&gt;The RTX 3050’s memory bandwidth (168 GB/s) sets a hard physics ceiling of 5.88 ms/token (170 tok/s). At 33 ms/token we were using only 18% of the hardware.&lt;/p&gt;
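The ceiling quoted here falls out of a one-line calculation: every decode step must stream all the weights through the memory bus once. Assuming ~494M parameters at 2 bytes each in FP16 (the exact count is my assumption; the 168 GB/s figure is from above):

```python
params = 0.494e9          # assumed Qwen 2.5-0.5B parameter count (~494M)
bytes_per_param = 2       # FP16
bandwidth = 168e9         # RTX 3050 laptop memory bandwidth, bytes/s

# Best case: one full pass over the weights per token.
t_token = params * bytes_per_param / bandwidth

print(round(t_token * 1e3, 2), round(1 / t_token))  # prints: 5.88 170
```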

&lt;p&gt;The fix is clear: fewer kernel launches. That’s exactly what this engine does.&lt;/p&gt;

&lt;p&gt;The Tech: 3 Kernel Launches and a CUDA Graph&lt;/p&gt;

&lt;p&gt;I wrote a single custom Triton kernel that replaces all the pre‑attention operations. Then I capture the whole decode step in a CUDA graph. The final pipeline is:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Fused QKV+RoPE+KV Cache (1 kernel)
→ SDPA (1 kernel)
→ O projection (1 kernel)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Everything else – residual adds, RMS normalisation, and the entire next‑token loop – happens inside the CUDA graph without Python ever regaining control.&lt;/p&gt;

&lt;p&gt;Result: 140 tok/s, 4.75× the baseline, with exactly the same output as HuggingFace (cosine similarity = 1.000000).&lt;/p&gt;

&lt;p&gt;How the Fused Kernel Works&lt;/p&gt;

&lt;p&gt;The core is triton_fused_attention_v3.py. In one kernel we:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Load the input activation vector.

Perform tiled matrix‑vector multiplies to compute Q, K, and V simultaneously.

Apply Rotary Position Embedding (RoPE) inline – the kernel sees both elements of each RoPE pair.

Write K and V directly into the persistent KV cache.

Store Q ready for the attention softmax.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;All intermediate values live in GPU registers and shared memory. No trips to global memory, no separate torch.roll, no torch.cat for KV cache updates.&lt;/p&gt;

&lt;p&gt;The SDPA kernel is still off‑the‑shelf (PyTorch’s scaled_dot_product_attention), but because it’s launched directly from the graph with zero dispatch lag, the overhead is negligible.&lt;/p&gt;

&lt;p&gt;Three Key Optimisations That Made It Lossless&lt;/p&gt;

&lt;p&gt;Getting speed is one thing. Getting perfect numerical fidelity is harder.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Head‑Aligned Block Size (BLOCK_M = 64)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Qwen 2.5‑0.5B uses a head dimension of 64. RoPE rotates pairs of elements (2k, 2k+1). If our Triton block size didn’t align with 64, some pairs would be split across thread blocks, requiring cross‑block reads that introduce tiny errors.&lt;/p&gt;

&lt;p&gt;Setting BLOCK_M = 64 guarantees that all RoPE pairs stay within a single block. This single tweak eliminated quality degradation at longer sequences.&lt;/p&gt;
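The alignment property can be checked mechanically: with a head-aligned (even) block size, both halves of every pair (2k, 2k+1) land in the same block, while a misaligned size splits some pairs. A toy verification, not kernel code:

```python
head_dim = 64
n_rows = 1152            # total Q+K+V rows for this model

def block_of(row, block_m):
    # Which thread block a row falls into for a given block size.
    return row // block_m

# With BLOCK_M = head_dim, each pair (2k, 2k+1) shares a block...
aligned = all(
    block_of(2 * k, head_dim) == block_of(2 * k + 1, head_dim)
    for k in range(n_rows // 2)
)
assert aligned

# ...whereas an odd, misaligned block size splits some pairs
# (e.g. rows 62 and 63 end up in different blocks of size 63).
misaligned = all(
    block_of(2 * k, 63) == block_of(2 * k + 1, 63)
    for k in range(n_rows // 2)
)
assert not misaligned
```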

&lt;ol start="2"&gt;
&lt;li&gt;FP32 RoPE Accumulation Buffer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The standard FP16 approach stores pre‑rotation values, reloads the partner, and applies the rotation. That store‑reload cycle in FP16 accumulates rounding error.&lt;/p&gt;

&lt;p&gt;I introduced an FP32 temporary buffer: Q and K values are cast to FP32 before rotation, the rotation is computed in full precision, and only the final result is stored back as FP16. With this fix, cosine similarity hit 1.000000 at all sequence lengths (it had been 0.998 at L=512).&lt;/p&gt;
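The drift is easy to reproduce on the CPU: Python's struct module can round a value through IEEE 754 half precision (format 'e'), mimicking the FP16 store-reload the kernel avoids. This is illustrative only (toy values, not the kernel's code path):

```python
import math
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE 754 half precision.
    return struct.unpack('e', struct.pack('e', x))[0]

theta = 0.3
x_even, x_odd = 0.1234567, 0.7654321

# Full-precision path: rotate using the exact partner values.
exact = x_even * math.cos(theta) - x_odd * math.sin(theta)

# FP16 store-reload path: values pass through half precision first.
lossy = to_fp16(x_even) * math.cos(theta) - to_fp16(x_odd) * math.sin(theta)

assert exact != lossy     # the FP16 round-trip already changed the result
print(abs(exact - lossy)) # a small per-step error that compounds over layers
```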

&lt;ol start="3"&gt;
&lt;li&gt;CUDA Graph Capture&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Even with only 3 kernels per step, Python dispatch still adds latency. By capturing the entire decode loop (the sequence of 3 kernel launches) into a CUDA graph, we pay the dispatch cost only once. Subsequent tokens are submitted to the GPU with a single graph.replay(), effectively eliminating CPU‑GPU sync jitter.&lt;/p&gt;

&lt;p&gt;Batched Mode: 211× Throughput (When You Have Many Prompts)&lt;/p&gt;

&lt;p&gt;If you need to process many prompts at once, the engine also includes batched_inference.py. By batching 256 sequences together, the matrix‑vector multiplies (GEMV) become matrix‑matrix multiplies (GEMM), engaging the Tensor Cores. Model weights are loaded only once and reused across the whole batch.&lt;/p&gt;

&lt;p&gt;On the same RTX 3050, aggregate throughput jumps to 2,659 tok/s – a 211× improvement over the single‑stream naive baseline. It’s not a continuous‑batching server, but it’s great for offline batch jobs.&lt;/p&gt;

&lt;p&gt;Quick‑Start: Run It Yourself&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Skills03/qwen-2.5-inference-engine
cd qwen-2.5-inference-engine
pip install torch triton transformers  # Linux/WSL
python triton_fused_attention_v3.py
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Windows users can use the CuPy path (fused_attn_block.py). No special hardware beyond an Ampere+ NVIDIA card with ≥6 GB VRAM.&lt;/p&gt;

&lt;p&gt;How It Stacks Against vLLM&lt;/p&gt;

&lt;p&gt;vLLM is a brilliant production‑grade serving engine with PagedAttention, continuous batching, and support for hundreds of models. On the same hardware, if you configure vLLM with FlashAttention and CUDA graphs, you might see 100–140 tok/s for this model. So does this custom engine beat it? Marginally, but not dramatically – the true win is the bit‑identical output and the educational value of a minimal, hackable codebase. For single‑user, local‑only use, this engine gives you an extra 10–20% latency reduction while staying dead simple.&lt;/p&gt;

&lt;p&gt;If you need a multi‑user serving API, stick with vLLM. The two approaches are complementary, and in principle this fused kernel could even be plugged into vLLM’s custom‑attention backend.&lt;/p&gt;

&lt;p&gt;What I Learned (And You Can Too)&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Kernel launch overhead is the enemy on small models. The less work the GPU does per token, the more Python‑side dispatch dominates. Fuse ruthlessly.

Alignment matters for correctness, not just speed. A simple BLOCK_M = head_dim removed silent numerical errors.

Mixed precision isn’t free. An FP32 scratchpad gave back full fidelity with almost no performance cost.

Physics always wins. Our 140 tok/s is 73% of the memory‑bandwidth ceiling; the rest is irreducible compute and scheduling. That’s okay – you can only optimise so far.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;The entire project is open‑source and ready to hack on. If you’ve got a budget GPU and a desire for snappy local AI, clone it and see how far you can push your own model. I’d love to hear what you build.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Skills03" rel="noopener noreferrer"&gt;
        Skills03
      &lt;/a&gt; / &lt;a href="https://github.com/Skills03/qwen-2.5-inference-engine" rel="noopener noreferrer"&gt;
        qwen-2.5-inference-engine
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Qwen 2.5 Inference Engine&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;4.75x lossless inference speedup&lt;/strong&gt; for Qwen-2.5 on RTX 3050 (6 GB VRAM).&lt;/p&gt;
&lt;p&gt;Built so local AI agents get sub-1s response time on consumer GPUs instead of paid APIs.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Results&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;tok/s&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;th&gt;Quality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;HuggingFace baseline&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;1.0x&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Triton fused + CUDA Graph&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;140&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.75x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;cos=1.000000 (lossless)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batched (batch=256)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2,659&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;211x&lt;/strong&gt; throughput&lt;/td&gt;
&lt;td&gt;lossless&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;p&gt;Measured on RTX 3050 6GB Laptop GPU, Qwen2.5-0.5B, Nsight-verified.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What's Inside&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;&lt;pre class="notranslate"&gt;&lt;code&gt;qwen-engine/
  triton_fused_attention.py      # Triton fused QKV+RoPE+Attention kernel (WSL)
  triton_fused_attention_v3.py   # V3: head-aligned BLOCK_M=64 + FP32 RoPE fix
  fused_attn_block.py            # CuPy fused attention block (Windows)
  batched_inference.py           # 211x batched throughput benchmark
  benchmark.py                   # End-to-end lossless speed benchmark
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Key Techniques&lt;/h2&gt;

&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Triton Fused QKV+RoPE (4.75x lossless)&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fused QKV projection + RoPE + KV cache write&lt;/strong&gt; in ONE Triton kernel&lt;/li&gt;
&lt;li&gt;Replaces 10+ PyTorch operations with 3 kernel launches&lt;/li&gt;
&lt;li&gt;CUDA Graph captures all 3 for zero Python dispatch overhead&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Head-Aligned BLOCK_M = 64&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;RoPE pairs elements (2k…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Skills03/qwen-2.5-inference-engine" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


</description>
      <category>python</category>
      <category>triton</category>
      <category>cuda</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>I Built a Desktop GUI for Claude Code in One Day — Here's How</title>
      <dc:creator>Rishabh Kharyal</dc:creator>
      <pubDate>Thu, 15 Jan 2026 13:41:45 +0000</pubDate>
      <link>https://dev.to/rishabh_kharyal_5d4e2610b/i-built-a-desktop-gui-for-claude-code-in-one-day-heres-how-5af0</link>
      <guid>https://dev.to/rishabh_kharyal_5d4e2610b/i-built-a-desktop-gui-for-claude-code-in-one-day-heres-how-5af0</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81nwqf7rwjkto9pii62l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81nwqf7rwjkto9pii62l.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  I Built a Desktop GUI for Claude Code in One Day — Here's How
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Claude Code is amazing. But I got tired of the terminal.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So I built this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f7lwolutjreyjhgu9rd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f7lwolutjreyjhgu9rd.png" alt="Claude Cowork"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
      &lt;div class="c-embed__body flex items-center justify-between"&gt;
        &lt;a href="https://raw.githubusercontent.com/Skills03/Claude-Cowork/master/docs/demo.mp4" rel="noopener noreferrer" class="c-link fw-bold flex items-center"&gt;
          &lt;span class="mr-2"&gt;raw.githubusercontent.com&lt;/span&gt;
          

        &lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;




&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🚀 Run &lt;strong&gt;3 AI tasks in parallel&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;📝 See &lt;strong&gt;live diffs&lt;/strong&gt; before files change&lt;/li&gt;
&lt;li&gt;💾 &lt;strong&gt;115MB portable&lt;/strong&gt; — works offline&lt;/li&gt;
&lt;li&gt;🖥️ &lt;strong&gt;Windows + Mac + Linux&lt;/strong&gt; from one codebase&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://github.com/Skills03/Claude-Cowork/releases" rel="noopener noreferrer"&gt;⬇️ Download&lt;/a&gt; | &lt;a href="https://github.com/Skills03/Claude-Cowork" rel="noopener noreferrer"&gt;📦 GitHub&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 What I Learned (The Short Version)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Electron isn't dead&lt;/strong&gt; — shipped to 3 platforms in 6 minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zustand &amp;gt; Redux&lt;/strong&gt; — for apps this size, simplicity wins&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ship broken, fix fast&lt;/strong&gt; — users find bugs faster than you&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Want the technical deep-dive? Keep reading. Want to try it? &lt;a href="https://github.com/Skills03/Claude-Cowork/releases" rel="noopener noreferrer"&gt;Download here&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🤔 The Problem: Terminal-Only AI
&lt;/h2&gt;

&lt;p&gt;If you've used &lt;a href="https://docs.anthropic.com/en/docs/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;, you know it's incredibly capable. It can read your codebase, write files, run commands, and reason about complex problems.&lt;/p&gt;

&lt;p&gt;But it runs &lt;strong&gt;only in the terminal&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For quick tasks, that's fine. But when you're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Running multiple AI tasks simultaneously&lt;/li&gt;
&lt;li&gt;Reviewing file changes before approving them&lt;/li&gt;
&lt;li&gt;Context-switching between different projects&lt;/li&gt;
&lt;li&gt;Showing work to non-technical stakeholders&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...the terminal becomes a bottleneck.&lt;/p&gt;

&lt;p&gt;I wanted something that felt more like &lt;strong&gt;pair programming with a colleague&lt;/strong&gt; than typing into a black box.&lt;/p&gt;




&lt;h2&gt;
  
  
  🛠️ The Solution: Claude Cowork
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f7lwolutjreyjhgu9rd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4f7lwolutjreyjhgu9rd.png" alt="Claude Cowork Screenshot"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Claude Cowork is a desktop application that wraps the official Claude Agent SDK with:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parallel Task Queue&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Run up to 3 AI tasks simultaneously&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Live Diff Visualization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;See file changes before they happen&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Session Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Persist conversations in SQLite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Plugin System&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Auto-discover MCP servers and skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-Platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single codebase → Win/Mac/Linux installers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Let me walk you through how I built each piece.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏗️ Architecture Overview
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│                    Electron Main Process                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │ IPC Handler │  │ Session DB  │  │ Settings Manager│  │
│  │             │  │  (SQLite)   │  │  (MCP + Skills) │  │
│  └──────┬──────┘  └──────┬──────┘  └────────┬────────┘  │
│         │                │                   │           │
│         └────────────────┼───────────────────┘           │
│                          │                               │
│              ┌───────────▼───────────┐                   │
│              │  Claude Agent SDK     │                   │
│              │  (Task Runner)        │                   │
│              └───────────────────────┘                   │
└─────────────────────────────────────────────────────────┘
                           │
                           │ IPC Bridge
                           ▼
┌─────────────────────────────────────────────────────────┐
│                   React 19 Frontend                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────┐  │
│  │ Sidebar  │  │ Chat View│  │ Progress │  │Settings │  │
│  │          │  │ + Diffs  │  │  Panel   │  │  Modal  │  │
│  └──────────┘  └──────────┘  └──────────┘  └─────────┘  │
│                                                          │
│                    Zustand Store                         │
└─────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tech Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Electron 39&lt;/strong&gt; — Cross-platform desktop framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;React 19&lt;/strong&gt; — UI with latest features (Suspense, transitions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zustand&lt;/strong&gt; — Lightweight state management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;better-sqlite3&lt;/strong&gt; — Session persistence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;@anthropic-ai/claude-agent-sdk&lt;/strong&gt; — The AI brain&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ⚡ Deep Dive #1: Parallel Task Queue
&lt;/h2&gt;

&lt;p&gt;The killer feature. Most AI coding tools process one request at a time; Claude Cowork runs up to three at once.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;The Claude Agent SDK processes one query at a time: start a task, and you wait until it completes before the next one can begin.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution
&lt;/h3&gt;

&lt;p&gt;I implemented a task queue manager that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Accepts unlimited tasks into a queue&lt;/li&gt;
&lt;li&gt;Runs up to &lt;code&gt;MAX_CONCURRENT_TASKS&lt;/code&gt; (3) simultaneously&lt;/li&gt;
&lt;li&gt;Broadcasts status updates via IPC
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/electron/ipc-handlers.ts&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MAX_CONCURRENT_TASKS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;taskQueue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;QueuedTask&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runningTasks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;QueuedTask&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;processTaskQueue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;runningTasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;MAX_CONCURRENT_TASKS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;nextTask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;taskQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;queued&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Start task in background&lt;/span&gt;
    &lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;running&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;startedAt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;runningTasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// Run Claude (non-blocking)&lt;/span&gt;
    &lt;span class="nf"&gt;runClaude&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;onEvent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;broadcast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;completeTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;nextTask&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nf"&gt;processTaskQueue&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Check for more tasks&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
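&lt;p&gt;The scheduler above calls &lt;code&gt;completeTask&lt;/code&gt;, which the post doesn't show. A minimal sketch of what it plausibly does (the type and field names here are assumptions, not the real source): mark the entry done and free its concurrency slot so the next queued task can start.&lt;/p&gt;

```typescript
// Hypothetical sketch: the post calls completeTask() but never shows it.
type TaskStatus = "queued" | "running" | "completed";

interface QueuedTask {
  id: string;
  prompt: string;
  status: TaskStatus;
  completedAt?: number;
}

const runningTasks = new Map<string, { task: QueuedTask }>();

function completeTask(taskId: string): void {
  const entry = runningTasks.get(taskId);
  if (!entry) return; // already cancelled or unknown id

  // Mark the task done and free its concurrency slot.
  entry.task.status = "completed";
  entry.task.completedAt = Date.now();
  runningTasks.delete(taskId);
}
```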



&lt;h3&gt;
  
  
  The UI
&lt;/h3&gt;

&lt;p&gt;Users see a collapsible panel showing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🟢 Running tasks (with stop button)&lt;/li&gt;
&lt;li&gt;🟡 Queued tasks (with cancel button)&lt;/li&gt;
&lt;li&gt;Toast notifications when tasks complete
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// TaskQueuePanel.tsx&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"task-queue-panel"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;runningTasks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"task-item running"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"pulse-dot"&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;...&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt; &lt;span class="na"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;cancelTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Stop&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Queue "write tests for auth module", "update README", and "fix TypeScript errors" — they all run in parallel.&lt;/p&gt;
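&lt;p&gt;The enqueue side isn't shown in the post; a hypothetical sketch (function name and fields are assumptions) just pushes a new entry with status &lt;code&gt;"queued"&lt;/code&gt; and lets the scheduler pick it up:&lt;/p&gt;

```typescript
// Hypothetical enqueue sketch; the post only shows the scheduler side.
import { randomUUID } from "node:crypto";

interface QueuedTask {
  id: string;
  prompt: string;
  cwd: string;
  status: "queued" | "running";
}

const taskQueue: QueuedTask[] = [];

function queueTask(prompt: string, cwd: string): QueuedTask {
  const task: QueuedTask = {
    id: randomUUID(),
    prompt,
    cwd,
    status: "queued",
  };
  taskQueue.push(task);
  // The real app would call processTaskQueue() here to start the task
  // immediately if a concurrency slot is free.
  return task;
}
```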




&lt;h2&gt;
  
  
  📝 Deep Dive #2: Live Diff Visualization
&lt;/h2&gt;

&lt;p&gt;When Claude edits a file, you should see &lt;em&gt;exactly&lt;/em&gt; what changed before it happens.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Implementation
&lt;/h3&gt;

&lt;p&gt;I intercept &lt;code&gt;Edit&lt;/code&gt; and &lt;code&gt;Write&lt;/code&gt; tool calls and render inline diffs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// EventCard.tsx&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;DiffView&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;oldContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;newContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;filePath&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;DiffProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;changes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;diffLines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;oldContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;newContent&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"diff-container"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"diff-header"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"file-icon"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;📄&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"file-path"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;pre&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"diff-content"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;changes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;change&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;
            &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
            &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;
              &lt;span class="nx"&gt;change&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;added&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;diff-added&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;
              &lt;span class="nx"&gt;change&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;removed&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;diff-removed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;
              &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;diff-unchanged&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
            &lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;change&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;pre&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Before vs After
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Terminal (Claude Code):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Editing src/utils.ts...
Done.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Claude Cowork:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight diff"&gt;&lt;code&gt;// src/utils.ts
  export function formatDate(date: Date) {
&lt;span class="gd"&gt;-   return date.toString();
&lt;/span&gt;&lt;span class="gi"&gt;+   return date.toISOString().split('T')[0];
&lt;/span&gt;  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You see the change, approve it, and move on. No guessing.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔌 Deep Dive #3: Plugin System (MCP + Skills)
&lt;/h2&gt;

&lt;p&gt;Claude Cowork auto-discovers capabilities from two sources:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. MCP Servers
&lt;/h3&gt;

&lt;p&gt;Model Context Protocol (MCP) servers extend Claude's abilities. I built a settings manager that ships two built-in servers and merges in any external ones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// settings-manager.ts&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;BUILT_IN_MCP_SERVERS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;MCPServer&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;filesystem&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Filesystem&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Read and write files&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;builtIn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npx&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-y&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@anthropic-ai/mcp-filesystem&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fetch&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Web Fetch&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Fetch URLs and web content&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;builtIn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npx&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-y&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@anthropic-ai/mcp-fetch&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="c1"&gt;// Also loads external servers from ~/.claude/settings.json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Skills Auto-Discovery
&lt;/h3&gt;

&lt;p&gt;Skills are markdown files with instructions. Claude Cowork scans &lt;code&gt;~/.claude/skills/&lt;/code&gt; and presents them in the settings modal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;discoverSkills&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nx"&gt;SkillInfo&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;skillsDir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;homedir&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.claude&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;skills&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;skills&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SkillInfo&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;folder&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nf"&gt;readdirSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;skillsDir&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;skillPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;skillsDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;folder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SKILL.md&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;existsSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;skillPath&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;readFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;skillPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;utf-8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;triggers&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseSkillFrontmatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;skills&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;folder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;triggers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;skills&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
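&lt;p&gt;&lt;code&gt;parseSkillFrontmatter&lt;/code&gt; isn't shown. A plausible sketch, assuming SKILL.md files begin with a simple &lt;code&gt;---&lt;/code&gt;-fenced frontmatter block containing &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, and comma-separated &lt;code&gt;triggers&lt;/code&gt; fields (the field set is an assumption based on the call site above):&lt;/p&gt;

```typescript
// Hypothetical sketch: parseSkillFrontmatter isn't shown in the post.
// Assumed SKILL.md shape:
// ---
// name: Commit Helper
// description: Writes conventional commits
// triggers: commit, git
// ---

interface SkillFrontmatter {
  name: string;
  description: string;
  triggers: string[];
}

function parseSkillFrontmatter(content: string): SkillFrontmatter {
  // Grab everything between the opening and closing "---" fences.
  const match = content.match(/^---\n([\s\S]*?)\n---/);
  const fields: Record<string, string> = {};

  for (const line of (match?.[1] ?? "").split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }

  return {
    name: fields.name ?? "",
    description: fields.description ?? "",
    triggers: fields.triggers ? fields.triggers.split(",").map(t => t.trim()) : [],
  };
}
```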



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Install any Claude Code skill → it automatically appears in Claude Cowork.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Deep Dive #4: Cross-Platform CI/CD
&lt;/h2&gt;

&lt;p&gt;One codebase should produce installers for all platforms. Here's my GitHub Actions workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/build.yml&lt;/span&gt;

&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build &amp;amp; Release&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;v*'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;windows-latest&lt;/span&gt;
            &lt;span class="na"&gt;platform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;win&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;macos-latest&lt;/span&gt;
            &lt;span class="na"&gt;platform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mac&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
            &lt;span class="na"&gt;platform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;linux&lt;/span&gt;

    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ matrix.os }}&lt;/span&gt;

    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run dist:${{ matrix.platform }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-artifact@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;claude-cowork-${{ matrix.platform }}&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;release/*.exe&lt;/span&gt;
            &lt;span class="s"&gt;release/*.dmg&lt;/span&gt;
            &lt;span class="s"&gt;release/*.AppImage&lt;/span&gt;
            &lt;span class="s"&gt;release/*.deb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Push a tag → Get installers for 3 platforms in ~6 minutes.&lt;/strong&gt;&lt;/p&gt;
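&lt;p&gt;The trigger that kicks this off is just a tag filter. A minimal sketch (the actual workflow may name things differently):&lt;/p&gt;

```yaml
# Hypothetical trigger block: run the build matrix on any v-prefixed tag.
on:
  push:
    tags:
      - "v*"
```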

&lt;h3&gt;
  
  
  Output Sizes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;NSIS Installer&lt;/td&gt;
&lt;td&gt;129 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;Portable EXE&lt;/td&gt;
&lt;td&gt;115 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;DMG&lt;/td&gt;
&lt;td&gt;147 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;AppImage&lt;/td&gt;
&lt;td&gt;131 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;.deb&lt;/td&gt;
&lt;td&gt;89 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🐛 Challenges &amp;amp; Solutions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Challenge 1: Native Modules on Windows
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;better-sqlite3&lt;/code&gt; requires native compilation. electron-builder handles the rebuild automatically, but on Windows I hit symlink permission errors, which (confusingly) come from the code-signing tooling rather than the compiler.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Add &lt;code&gt;signAndEditExecutable: false&lt;/code&gt; to skip code signing during development.&lt;/p&gt;
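&lt;p&gt;In electron-builder's config, that flag lives under the Windows target. A sketch assuming a standalone config file; the same key also works under &lt;code&gt;build.win&lt;/code&gt; in &lt;code&gt;package.json&lt;/code&gt;:&lt;/p&gt;

```yaml
# electron-builder.yml (sketch)
win:
  signAndEditExecutable: false
```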

&lt;h3&gt;
  
  
  Challenge 2: State Sync Between Processes
&lt;/h3&gt;

&lt;p&gt;Electron has two processes (main + renderer). They need to stay in sync.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Zustand store in renderer + IPC event broadcasting from main:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Main process&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;broadcast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ServerEvent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;BrowserWindow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getAllWindows&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;win&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;win&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;webContents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;server-event&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Renderer process&lt;/span&gt;
&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;electron&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;onServerEvent&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;handleServerEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Updates Zustand store&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
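&lt;p&gt;(The renderer's &lt;code&gt;window.electron.onServerEvent&lt;/code&gt; implies a preload bridge, omitted here.) The broadcast-then-handle flow itself can be sketched without Electron, using Node's &lt;code&gt;EventEmitter&lt;/code&gt; as a stand-in for the IPC channel; the event names and shapes below are illustrative, not the app's real types:&lt;/p&gt;

```typescript
import { EventEmitter } from "node:events";

// Stand-in for the main -> renderer IPC channel.
const channel = new EventEmitter();

type ServerEvent =
  | { type: "task-updated"; taskId: string; status: string }
  | { type: "log"; line: string };

// "Main" side: push every state change to all listeners,
// like webContents.send does for each BrowserWindow.
function broadcast(event: ServerEvent): void {
  channel.emit("server-event", event);
}

// "Renderer" side: subscribe once and fold incoming events into
// local state (the Zustand store, in the real app).
const state: ServerEvent[] = [];
channel.on("server-event", (event: ServerEvent) => state.push(event));

broadcast({ type: "task-updated", taskId: "t1", status: "running" });
```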



&lt;h3&gt;
  
  
  Challenge 3: Stopping Running Tasks
&lt;/h3&gt;

&lt;p&gt;Claude Agent SDK doesn't expose a clean abort mechanism.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; AbortController pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runClaude&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;RunOptions&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;RunnerHandle&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;abortController&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AbortController&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// Pass signal to SDK (if supported) or handle manually&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;promise&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;claude&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;abortController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signal&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;promise&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;abortController&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
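&lt;p&gt;The consumer side of that handle is a one-liner: keep the promise, call &lt;code&gt;abort()&lt;/code&gt; when the user hits Stop. A runnable sketch with a dummy task in place of the SDK call (&lt;code&gt;fakeTask&lt;/code&gt; is mine, not part of the Agent SDK):&lt;/p&gt;

```typescript
// A stand-in for a long-running SDK call that respects an AbortSignal.
function fakeTask(signal: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) return reject(new Error("aborted"));
    const timer = setTimeout(() => resolve("done"), 5_000);
    signal.addEventListener("abort", () => {
      clearTimeout(timer); // stop the underlying work
      reject(new Error("aborted"));
    });
  });
}

// Same shape as the RunnerHandle returned by runClaude above.
function runWithHandle() {
  const controller = new AbortController();
  return {
    promise: fakeTask(controller.signal),
    abort: () => controller.abort(),
  };
}

async function demo(): Promise<string> {
  const handle = runWithHandle();
  handle.abort(); // what the Stop button triggers
  try {
    await handle.promise;
    return "completed";
  } catch {
    return "aborted";
  }
}
```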






&lt;h2&gt;
  
  
  🎓 What I Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Electron is still viable&lt;/strong&gt; — Despite the "Electron bad" memes, it ships fast and works everywhere.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;React 19 is nice&lt;/strong&gt; — Suspense boundaries and transitions make async UI smoother.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Zustand &amp;gt; Redux&lt;/strong&gt; — For apps this size, Zustand's simplicity wins.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CI/CD is table stakes&lt;/strong&gt; — Automated builds save hours and catch platform-specific bugs early.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ship early&lt;/strong&gt; — The first version had bugs. Users found them faster than I would have.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  🔮 What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;strong&gt;GUI settings for API keys&lt;/strong&gt; — Remove dependency on Claude Code CLI&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Multi-agent orchestration&lt;/strong&gt; — Spawn specialized agents for different tasks&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Git integration&lt;/strong&gt; — Auto-commit checkpoints during long tasks&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Project memory&lt;/strong&gt; — Remember context across sessions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🎮 Try It Yourself
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Download:&lt;/strong&gt; &lt;a href="https://github.com/Skills03/Claude-Cowork/releases" rel="noopener noreferrer"&gt;GitHub Releases&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code installed and authenticated&lt;/li&gt;
&lt;li&gt;Node.js 18+ (for development)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Run from source:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Skills03/Claude-Cowork.git
&lt;span class="nb"&gt;cd &lt;/span&gt;Claude-Cowork
npm &lt;span class="nb"&gt;install
&lt;/span&gt;npm run dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💭 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Claude Code changed how I write software. Claude Cowork makes that experience visual, parallel, and shareable.&lt;/p&gt;

&lt;p&gt;If you're building AI-powered developer tools, the playbook is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Wrap existing SDKs (don't reinvent)&lt;/li&gt;
&lt;li&gt;Add the UX layer users actually need&lt;/li&gt;
&lt;li&gt;Ship cross-platform from day one&lt;/li&gt;
&lt;li&gt;Open source everything&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The best tools feel like extensions of your brain. That's what I'm building toward.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Thanks for reading!&lt;/strong&gt; If you found this useful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Star the repo on &lt;a href="https://github.com/Skills03/Claude-Cowork" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Follow me on &lt;a href="https://linkedin.com/in/rishabh-kharyal-03" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for more builds&lt;/li&gt;
&lt;li&gt;DM me if you want to collaborate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Built with Claude Code + Claude Cowork (yes, recursively)&lt;/em&gt; 🔄&lt;/p&gt;

</description>
      <category>react</category>
      <category>typescript</category>
      <category>ai</category>
      <category>electron</category>
    </item>
  </channel>
</rss>
