<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: WritHer</title>
    <description>The latest articles on DEV Community by WritHer (@benito_mallamaci_c902e934).</description>
    <link>https://dev.to/benito_mallamaci_c902e934</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3795503%2F11a2c18d-3252-4b19-a0b1-073eb7e3a053.png</url>
      <title>DEV Community: WritHer</title>
      <link>https://dev.to/benito_mallamaci_c902e934</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/benito_mallamaci_c902e934"/>
    <language>en</language>
    <item>
      <title>I built a 100% local Graph RAG engine for my Markdown notes</title>
      <dc:creator>WritHer</dc:creator>
      <pubDate>Sun, 31 May 2026 21:33:08 +0000</pubDate>
      <link>https://dev.to/benito_mallamaci_c902e934/i-built-a-100-local-graph-rag-engine-for-my-markdown-notes-33nf</link>
      <guid>https://dev.to/benito_mallamaci_c902e934/i-built-a-100-local-graph-rag-engine-for-my-markdown-notes-33nf</guid>
      <description>&lt;p&gt;Kwipu turns a folder of Markdown notes (or an Obsidian vault) into a queryable knowledge graph. Ask questions in plain language, get answers that connect facts across files, all running locally on Ollama with no cloud.&lt;/p&gt;

&lt;p&gt;My notes had become a graveyard. Hundreds of Markdown files, years of meeting notes, half-finished ideas and &lt;code&gt;[[wikilinks]]&lt;/code&gt; in an Obsidian vault, and the only way to find anything was full-text search that needed me to remember the exact word I once wrote.&lt;/p&gt;

&lt;p&gt;What I actually wanted was to &lt;em&gt;ask&lt;/em&gt;: “what did I decide about X, and who was involved?”, and get an answer that pulls threads from five different files at once.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;Kwipu&lt;/strong&gt;: a fully local Graph RAG engine that turns a folder of Markdown into a knowledge graph you can talk to. No cloud, no API keys, no data leaving the machine. It runs on &lt;a href="https://ollama.ai/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/benmaster82/Kwipu" rel="noopener noreferrer"&gt;https://github.com/benmaster82/Kwipu&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a &lt;em&gt;graph&lt;/em&gt;, not just vector search
&lt;/h2&gt;

&lt;p&gt;Plain vector RAG retrieves chunks that &lt;em&gt;sound&lt;/em&gt; similar to your question. That’s great for “find me the paragraph about deadlines,” but it falls apart when the answer is spread across notes connected by relationships (person, project, decision, date).&lt;/p&gt;

&lt;p&gt;Kwipu builds a &lt;strong&gt;property graph&lt;/strong&gt; out of your notes first. It extracts entity-relation triples from two sources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structure you already wrote&lt;/strong&gt;: &lt;code&gt;[[wikilinks]]&lt;/code&gt; and YAML frontmatter get parsed straight into graph edges, with no LLM guesswork.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implicit relations&lt;/strong&gt;: an LLM pass extracts additional triples from the prose.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those two layers get merged into one index, so retrieval can actually &lt;em&gt;follow connections&lt;/em&gt; instead of just matching text.&lt;/p&gt;
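&lt;p&gt;To make the structural layer concrete, here is a minimal sketch of how wikilinks and flat frontmatter could become triples without any LLM call. The function names, the &lt;code&gt;links_to&lt;/code&gt; relation, and the sample note are assumptions for illustration, not Kwipu’s actual code, and real YAML would need a proper YAML parser.&lt;/p&gt;

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def split_frontmatter(text):
    """Return (frontmatter_lines, body) for a note that may start with a '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return [], text  # unterminated block: treat the whole file as body


def structural_triples(note_name, text):
    """Build (subject, relation, object) triples without calling an LLM."""
    front, body = split_frontmatter(text)
    triples = []
    for line in front:
        # Only flat "key: value" pairs are handled in this sketch.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            triples.append((note_name, key.strip(), value.strip()))
    for target in WIKILINK.findall(body):
        # "links_to" is a hypothetical relation name chosen for this example.
        triples.append((note_name, "links_to", target.strip()))
    return triples


note = """---
project: Apollo
owner: Maria
---
Met with [[Luca]] about the [[Q3 Roadmap|roadmap]].
"""
print(structural_triples("2024-03-12 Meeting", note))
```

&lt;p&gt;For the sample note this yields four triples: two from the frontmatter (&lt;code&gt;project&lt;/code&gt;, &lt;code&gt;owner&lt;/code&gt;) and two &lt;code&gt;links_to&lt;/code&gt; edges pointing at &lt;code&gt;Luca&lt;/code&gt; and &lt;code&gt;Q3 Roadmap&lt;/code&gt;.&lt;/p&gt;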

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your Notes (.md)
      |
      v
  Pre-processing      (extracts [[wikilinks]], YAML frontmatter)
      |
      v
  LLM extraction      (pulls extra entity-relation triples)
      |
      v
  Property Graph      (merges structural + LLM triples, persisted to disk)
      |
      v
  Hybrid retrieval    (synonym + vector + BM25 + temporal)
      |
      v
  LLM response        (answer generated from retrieved context)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The graph is built once and saved to disk. After that, queries load it instantly, and adding a single new note is incremental (roughly 20 to 60 seconds), not a full rebuild.&lt;/p&gt;
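&lt;p&gt;One common way to get incremental updates is to keep a manifest of content hashes and only re-process notes whose hash changed. The sketch below shows that idea under stated assumptions: the manifest file, the function names, and the hashing approach are hypothetical and may differ from how Kwipu tracks changes.&lt;/p&gt;

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, used as a cheap 'did this note change?' check."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_notes(notes_dir, manifest_path):
    """Return notes that are new or modified since the last run, then update the manifest."""
    manifest_file = Path(manifest_path)
    old = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}
    new = {}
    changed = []
    for path in sorted(Path(notes_dir).rglob("*.md")):
        if ".obsidian" in path.parts:
            continue  # skip Obsidian's own config folder
        key = str(path.relative_to(notes_dir))
        new[key] = file_digest(path)
        if old.get(key) != new[key]:
            changed.append(key)
    manifest_file.write_text(json.dumps(new, indent=2))
    return changed
```

&lt;p&gt;Only the paths returned by &lt;code&gt;changed_notes&lt;/code&gt; would need a fresh extraction pass; everything else can be reused from the graph already on disk.&lt;/p&gt;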

&lt;h2&gt;
  
  
  Hybrid retrieval (4 strategies, one answer)
&lt;/h2&gt;

&lt;p&gt;Instead of betting on a single retriever, Kwipu combines four and lets them complement each other:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LLM synonym expansion&lt;/strong&gt;: broadens the query (optional, turn it off with &lt;code&gt;--fast&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector similarity&lt;/strong&gt;: semantic matches&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BM25 keyword scoring&lt;/strong&gt;: exact-term recall&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporal and metadata matching&lt;/strong&gt;: “what happened last March” actually works&lt;/li&gt;
&lt;/ol&gt;
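&lt;p&gt;The post doesn’t say how the four result lists are merged. One widely used option is reciprocal rank fusion, sketched below as an illustration only: the function name, the constant &lt;code&gt;k=60&lt;/code&gt;, and the file paths are assumptions, and Kwipu may combine its retrievers differently.&lt;/p&gt;

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document near the top of any list earns a larger share of the score.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending), breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from three of the retrievers.
vector_hits = ["notes/roadmap.md", "notes/budget.md", "notes/hiring.md"]
bm25_hits = ["notes/budget.md", "notes/roadmap.md"]
temporal_hits = ["notes/budget.md"]

print(reciprocal_rank_fusion([vector_hits, bm25_hits, temporal_hits]))
# ['notes/budget.md', 'notes/roadmap.md', 'notes/hiring.md']
```

&lt;p&gt;A note that several retrievers agree on rises to the top even if no single retriever ranked it first, which is the point of combining them.&lt;/p&gt;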

&lt;p&gt;There’s also a strict anti-hallucination prompt that forces the model to cite sources and refuse to invent facts, because a knowledge base that makes things up is worse than no knowledge base.&lt;/p&gt;

&lt;p&gt;And it’s &lt;strong&gt;multilingual&lt;/strong&gt; out of the box (Italian, English, French, German, Spanish, Portuguese, auto-detected).&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Install deps&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# 2. Pull models in Ollama&lt;/span&gt;
ollama pull llama3.1:8b
ollama pull nomic-embed-text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Point it at your notes by editing &lt;code&gt;KNOWLEDGE_DIR&lt;/code&gt; in &lt;code&gt;geode_graph.py&lt;/code&gt; (an Obsidian vault path works directly: it reads files without modifying them and ignores &lt;code&gt;.obsidian/&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;KNOWLEDGE_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;C:/Users/YourName/Documents/MyVault&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;MODEL_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama3.1:8b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python geode_graph.py          &lt;span class="c"&gt;# full mode, best quality&lt;/span&gt;
python geode_graph.py &lt;span class="nt"&gt;--fast&lt;/span&gt;   &lt;span class="c"&gt;# skips synonym retriever, ~50% faster on CPU&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It watches the folder for changes and updates the graph automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  My favorite trick: build big, query small
&lt;/h2&gt;

&lt;p&gt;Graph construction is the expensive part: it needs an LLM call per chunk. Queries are cheap.&lt;/p&gt;

&lt;p&gt;So if your hardware is limited, you can build the graph &lt;strong&gt;once&lt;/strong&gt; with a heavy cloud model via Ollama, then switch to a tiny local model for everyday questions. The graph structure doesn’t change when you swap models; only response generation uses the smaller one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build once with a big model (high-quality extraction)&lt;/span&gt;
&lt;span class="c"&gt;# MODEL_NAME = "gpt-oss:20b-cloud"&lt;/span&gt;
python geode_graph.py

&lt;span class="c"&gt;# Then query daily with a small, fast local model&lt;/span&gt;
&lt;span class="c"&gt;# MODEL_NAME = "qwen2.5:3b"&lt;/span&gt;
python geode_graph.py &lt;span class="nt"&gt;--fast&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Best of both worlds: a graph built by a 20B+ model, queried on a 3B.&lt;/p&gt;

&lt;h2&gt;
  
  
  Being honest about the tradeoffs
&lt;/h2&gt;

&lt;p&gt;Graph RAG isn’t free. First-time builds take real time:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;th&gt;GPU (7B)&lt;/th&gt;
&lt;th&gt;CPU (3B)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;~7 min&lt;/td&gt;
&lt;td&gt;~10 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;~35 min&lt;/td&gt;
&lt;td&gt;~50 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;500+&lt;/td&gt;
&lt;td&gt;~3 hrs&lt;/td&gt;
&lt;td&gt;~4 hrs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Recommended minimum is about 16 GB system RAM for a 7B model. The sweet spot for serious use is 7B+ on a GPU. But once the graph exists, queries are fast and lightweight (200 to 500 MB).&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s next
&lt;/h2&gt;

&lt;p&gt;The next thing on the roadmap is a &lt;strong&gt;Telegram bot&lt;/strong&gt; so you can query your vault from your phone, anywhere.&lt;/p&gt;

&lt;p&gt;It’s MIT-licensed and tagged &lt;code&gt;help-wanted&lt;/code&gt;. If local-first AI, knowledge graphs, or Obsidian tooling is your thing, I’d love contributions, issues, or just a star.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/benmaster82/Kwipu" rel="noopener noreferrer"&gt;https://github.com/benmaster82/Kwipu&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What would you want to ask your own notes if you could?&lt;/p&gt;

</description>
      <category>privacy</category>
      <category>obsidian</category>
      <category>rag</category>
      <category>python</category>
    </item>
    <item>
      <title>Stop paying for AI transcription! 🎙️ WritHer: 100% Local Voice Assistant for Windows. Privacy-first, Whisper + Ollama powered. Open Source on GitHub!</title>
      <dc:creator>WritHer</dc:creator>
      <pubDate>Fri, 01 May 2026 09:09:47 +0000</pubDate>
      <link>https://dev.to/benito_mallamaci_c902e934/stop-paying-for-ai-transcription-writher-100-local-voice-assistant-for-windows-42k9</link>
      <guid>https://dev.to/benito_mallamaci_c902e934/stop-paying-for-ai-transcription-writher-100-local-voice-assistant-for-windows-42k9</guid>
      <description>&lt;p&gt;Hey everyone! I wanted to share a small tool I’ve been building called WritHer, a free and open-source alternative to Wispr Flow.&lt;/p&gt;

&lt;p&gt;The idea is simple: it lives in your system tray and gives you two things.&lt;/p&gt;

&lt;p&gt;Hold AltGr anywhere (any app, any text field) and just speak. It transcribes your voice with Whisper and pastes the text right where your cursor is. No clicking, no switching apps.&lt;/p&gt;

&lt;p&gt;Hold Ctrl+R and you get a voice assistant that understands natural language. You can say things like “remind me to call Marco in one hour” or “appointment with the dentist tomorrow at 3pm” and it handles the rest. Notes, to-do lists, shopping lists, reminders with toast notifications, all stored locally in SQLite.&lt;/p&gt;

&lt;p&gt;The part I’m most proud of: everything runs 100% offline. Speech recognition via faster-whisper, intent parsing via Ollama, no cloud, no API keys, no telemetry. Once you download the models it works with no internet at all.&lt;/p&gt;

&lt;p&gt;There’s also a little animated floating widget with eyes that react to what it’s doing (listening, thinking, error…) which is silly but I kind of love it.&lt;/p&gt;

&lt;p&gt;It’s written in Python, MIT-licensed, and Windows 10/11 only for now.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/benmaster82/writher" rel="noopener noreferrer"&gt;https://github.com/benmaster82/writher&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://getwrither.com" rel="noopener noreferrer"&gt;https://getwrither.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Would love feedback, especially from anyone who uses voice input regularly. Still early days but it works well for my daily workflow!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>productivity</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I Made Tkinter Look Like a Modern Glassmorphic App — Here's the Dark Magic I Used</title>
      <dc:creator>WritHer</dc:creator>
      <pubDate>Thu, 26 Feb 2026 22:29:53 +0000</pubDate>
      <link>https://dev.to/benito_mallamaci_c902e934/i-made-tkinter-look-like-a-modern-glassmorphic-app-heres-the-dark-magic-i-used-3718</link>
      <guid>https://dev.to/benito_mallamaci_c902e934/i-made-tkinter-look-like-a-modern-glassmorphic-app-heres-the-dark-magic-i-used-3718</guid>
      <description>&lt;p&gt;If you've ever built a desktop app in Python, you've probably used &lt;strong&gt;Tkinter&lt;/strong&gt;. And if you have, you probably think it's doomed to look like a clunky, grey Windows 95 application.&lt;/p&gt;

&lt;p&gt;I thought so too.&lt;/p&gt;

&lt;p&gt;But recently, I needed a lightweight, floating UI for a &lt;strong&gt;100% local, offline voice assistant&lt;/strong&gt; I was building. I absolutely refused to bundle an entire Chromium instance (Electron) just to render a small widget.&lt;/p&gt;

&lt;p&gt;So I decided to push Tkinter to its absolute limits. The result is &lt;a href="https://github.com/benmaster82/writher" rel="noopener noreferrer"&gt;&lt;strong&gt;Writher&lt;/strong&gt;&lt;/a&gt;: an open-source, privacy-first voice assistant and dictation tool powered by &lt;code&gt;faster-whisper&lt;/code&gt; and &lt;code&gt;Ollama&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In this post, I'll break down the tricks I used to make a legacy Python GUI look modern, and the architecture behind a fully local AI desktop app.&lt;/p&gt;




&lt;h2&gt;
  
  
  🎨 The UI Hack: Making Tkinter Beautiful
&lt;/h2&gt;

&lt;p&gt;To get that modern, &lt;strong&gt;glassmorphic floating pill shape&lt;/strong&gt; with glowing borders, I completely bypassed Tkinter's standard widgets.&lt;/p&gt;

&lt;p&gt;Instead, I used &lt;strong&gt;PIL (Pillow)&lt;/strong&gt; to render high-resolution graphics dynamically on a transparent Tkinter &lt;code&gt;Canvas&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Trick: Borderless &amp;amp; Transparent Windows
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Remove the window frame entirely
&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;overrideredirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Chromakey hack: make a specific color fully transparent
&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wm_attributes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-transparentcolor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#000001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a &lt;strong&gt;frameless, floating window&lt;/strong&gt; — the foundation for any modern-looking widget.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dynamic Glow Rendering
&lt;/h3&gt;

&lt;p&gt;Every frame of the animation (the bot's eyes changing expressions: &lt;em&gt;listening&lt;/em&gt;, &lt;em&gt;thinking&lt;/em&gt;, &lt;em&gt;happy&lt;/em&gt;) is drawn on-the-fly using &lt;code&gt;ImageDraw&lt;/code&gt; and &lt;code&gt;ImageFilter.GaussianBlur&lt;/code&gt; to create a glowing effect that mimics SVG filters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ImageDraw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ImageFilter&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;draw_glow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blur_radius&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;draw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ImageDraw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;draw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rounded_rectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;blur_radius&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blur_radius&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;blur_radius&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;blur_radius&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;color&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ImageFilter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GaussianBlur&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blur_radius&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Ghost Window
&lt;/h3&gt;

&lt;p&gt;Since Writher is a dictation tool, clicking it &lt;strong&gt;shouldn't steal focus&lt;/strong&gt; from your active app (like VSCode or a text editor). I used Win32 &lt;code&gt;ctypes&lt;/code&gt; to apply the &lt;code&gt;WS_EX_NOACTIVATE&lt;/code&gt; style:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ctypes&lt;/span&gt;

&lt;span class="n"&gt;GWL_EXSTYLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;
&lt;span class="n"&gt;WS_EX_NOACTIVATE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0x08000000&lt;/span&gt;
&lt;span class="n"&gt;WS_EX_TOOLWINDOW&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0x00000080&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;make_ghost_window&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hwnd&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;style&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;windll&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user32&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GetWindowLongW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hwnd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GWL_EXSTYLE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ctypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;windll&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user32&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SetWindowLongW&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;hwnd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GWL_EXSTYLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;style&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;WS_EX_NOACTIVATE&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;WS_EX_TOOLWINDOW&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;💡 Pro tip:&lt;/strong&gt; &lt;code&gt;WS_EX_TOOLWINDOW&lt;/code&gt; also hides the window from the Alt+Tab menu, making it behave like a true desktop widget.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🧠 The Brain: 100% Local AI
&lt;/h2&gt;

&lt;p&gt;I'm tired of sending my voice and private notes to the cloud. Writher operates &lt;strong&gt;entirely on your local machine&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speech-to-Text&lt;/strong&gt;: I used &lt;a href="https://github.com/SYSTRAN/faster-whisper" rel="noopener noreferrer"&gt;&lt;code&gt;faster-whisper&lt;/code&gt;&lt;/a&gt; (CTranslate2). It runs on CPU or GPU and transcribes voice in near real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The LLM&lt;/strong&gt;: I hooked the app to &lt;a href="https://ollama.ai" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;. Press &lt;code&gt;Ctrl+R&lt;/code&gt; and Writher listens, transcribes, and passes the text to a local model for processing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Function Calling&lt;/strong&gt;: This is where it gets interesting. Instead of just letting it chat, I configured the LLM with &lt;strong&gt;tool definitions&lt;/strong&gt;. It converts your voice commands into structured function calls (saving notes, scheduling appointments, or setting reminders) that write to a local &lt;strong&gt;SQLite&lt;/strong&gt; database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sqlite3&lt;/span&gt;

&lt;span class="c1"&gt;# SQLite in WAL mode: readers are not blocked while a write is in progress
&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sqlite3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;writher.db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PRAGMA journal_mode=WAL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM doesn't just understand you — it &lt;em&gt;acts&lt;/em&gt; on what you say.&lt;/p&gt;
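&lt;p&gt;To show what "acting" on a command can look like, here is a minimal sketch of dispatching a structured tool call into SQLite. The tool names, table schema, and JSON shape are assumptions made for this example, not Writher's actual definitions, and the JSON string stands in for whatever the model returns.&lt;/p&gt;

```python
import json
import sqlite3


def save_note(conn, text):
    conn.execute("INSERT INTO notes (text) VALUES (?)", (text,))
    conn.commit()
    return "note saved"


def add_reminder(conn, text, due):
    conn.execute("INSERT INTO reminders (text, due) VALUES (?, ?)", (text, due))
    conn.commit()
    return "reminder set"


# Hypothetical tool registry: names the model may call, mapped to Python functions.
TOOLS = {"save_note": save_note, "add_reminder": add_reminder}


def dispatch(conn, tool_call_json):
    """Run one tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    handler = TOOLS.get(call.get("name"))
    if handler is None:
        return "unknown tool"  # never execute something outside the registry
    return handler(conn, **call.get("arguments", {}))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("CREATE TABLE reminders (id INTEGER PRIMARY KEY, text TEXT, due TEXT)")

# In a real app this JSON would come from the model's response.
reply = '{"name": "add_reminder", "arguments": {"text": "call Marco", "due": "2026-03-01T15:00"}}'
print(dispatch(conn, reply))
```

&lt;p&gt;Keeping an explicit registry means the model can only trigger functions you have listed, and parameterized queries keep spoken text from being interpreted as SQL.&lt;/p&gt;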




&lt;h2&gt;
  
  
  ⚙️ The Muscle: Seamless OS Integration
&lt;/h2&gt;

&lt;p&gt;Most dictation tools copy text to your clipboard and force you to paste manually. Writher acts like a &lt;strong&gt;phantom keyboard&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I initially used &lt;code&gt;pyperclip&lt;/code&gt;, but it suffers from race conditions when other apps lock the clipboard. To make it bulletproof, I wrote a &lt;strong&gt;custom clipboard injector&lt;/strong&gt; using the Win32 API (&lt;code&gt;OpenClipboard&lt;/code&gt;, &lt;code&gt;SetClipboardData&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Save&lt;/strong&gt; your current clipboard contents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inject&lt;/strong&gt; the transcribed text and simulate &lt;code&gt;Ctrl+V&lt;/code&gt; via &lt;code&gt;pynput&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Restore&lt;/strong&gt; your original clipboard — all in milliseconds&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And just in case Windows blocks the clipboard? I implemented a fail-safe that automatically appends your dictation to a &lt;code&gt;recovery_notes.txt&lt;/code&gt; file. &lt;strong&gt;You never lose a single word.&lt;/strong&gt;&lt;/p&gt;
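&lt;p&gt;The save, inject, restore flow and its fail-safe can be sketched in a platform-neutral way. In the sketch below the three callables are hypothetical placeholders for the Win32 clipboard and &lt;code&gt;pynput&lt;/code&gt; code, and the function name and the choice of &lt;code&gt;OSError&lt;/code&gt; as the failure signal are assumptions, not Writher's actual implementation.&lt;/p&gt;

```python
from pathlib import Path


def paste_with_restore(text, read_clipboard, write_clipboard, send_paste,
                       recovery_file="recovery_notes.txt"):
    """Paste `text` through the clipboard, then put the previous contents back.

    The three callables are placeholders for platform-specific code.
    Returns True if the paste path ran, False if the text went to the recovery file.
    """
    try:
        previous = read_clipboard()
        write_clipboard(text)
    except OSError:
        # Clipboard unavailable: append the dictation to a file so it is not lost.
        with Path(recovery_file).open("a", encoding="utf-8") as handle:
            handle.write(text + "\n")
        return False
    try:
        send_paste()
    finally:
        write_clipboard(previous)  # restore even if the paste step raised
    return True
```

&lt;p&gt;Passing the clipboard functions in as arguments makes the flow easy to test with fakes. In practice the target app reads the clipboard asynchronously, so a short delay before restoring may be needed.&lt;/p&gt;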




&lt;h2&gt;
  
  
  🔑 What I Learned
&lt;/h2&gt;

&lt;p&gt;Building Writher taught me a few things I wasn't expecting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You don't need Electron for everything.&lt;/strong&gt; A transparent Tkinter canvas + Pillow can produce surprisingly polished UIs. The binary is tiny compared to any web-based alternative.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local AI is ready.&lt;/strong&gt; With &lt;code&gt;faster-whisper&lt;/code&gt; + Ollama, you can build genuinely useful AI tools that never phone home. Privacy doesn't have to mean sacrificing quality.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Win32 APIs are your secret weapon on Windows.&lt;/strong&gt; Ghost windows, clipboard control, focus management — they unlock capabilities that pure Python can't reach alone.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🚀 Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The whole project is open-source. You can check out the UI implementation, the AI pipeline, and run it on your own machine:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;👉 &lt;a href="https://github.com/benmaster82/writher" rel="noopener noreferrer"&gt;Writher on GitHub&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I'd love to hear your thoughts. Have you ever pushed a legacy GUI framework beyond what it was meant to do? What local LLM setup are you running? Drop a comment — I read all of them.&lt;/p&gt;

&lt;p&gt;And if you find the project useful, a ⭐ on the repo helps more than you'd think. 🙏&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
