<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anas Vakyathodi</title>
    <description>The latest articles on DEV Community by Anas Vakyathodi (@anasvakyathodi).</description>
    <link>https://dev.to/anasvakyathodi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F386518%2F0938da93-5c64-4897-a5ef-59f78b31e363.png</url>
      <title>DEV Community: Anas Vakyathodi</title>
      <link>https://dev.to/anasvakyathodi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/anasvakyathodi"/>
    <language>en</language>
    <item>
      <title>I built 4 PDF editors in 12 months. The boring one finally shipped.</title>
      <dc:creator>Anas Vakyathodi</dc:creator>
      <pubDate>Tue, 26 May 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/anasvakyathodi/i-built-4-pdf-editors-in-12-months-the-boring-one-finally-shipped-1hjc</link>
      <guid>https://dev.to/anasvakyathodi/i-built-4-pdf-editors-in-12-months-the-boring-one-finally-shipped-1hjc</guid>
      <description>&lt;p&gt;Last February, I made one commit that threw away nine months of code.&lt;/p&gt;

&lt;p&gt;Gone: a Flask backend, a Tiptap document editor, a Gemini multimodal pipeline, three different PDF-to-HTML conversion experiments. In its place: one engine that edits PDF bytes directly in the browser.&lt;/p&gt;

&lt;p&gt;It's called Folio — &lt;a href="https://pdf-editor.online" rel="noopener noreferrer"&gt;pdf-editor.online&lt;/a&gt;, a Chrome extension and web app. You click a word in your invoice, type a new one, hit save. Same font, same position, same bytes everywhere else.&lt;/p&gt;

&lt;p&gt;But that wasn't the first thing I tried. It was the fourth.&lt;/p&gt;

&lt;h2&gt;Iteration 1: the Fabric.js overlay (early 2025)&lt;/h2&gt;

&lt;p&gt;The first build took what felt like the obvious approach. The user clicks text on a rendered PDF. The editor extracts that text and its coordinates via pdf-lib, drops a Fabric.js text box at the same position with the same font and size, masks the original with a white rectangle, and lets the user edit freely. The white mask and the new text were both independently draggable, the way a Figma or Sketch user would expect.&lt;/p&gt;

&lt;p&gt;On a single-page document, the illusion held beautifully. I demoed it to a friend and felt great.&lt;/p&gt;

&lt;p&gt;Then I tried a 3-page lab report.&lt;/p&gt;

&lt;p&gt;PDF page coordinate systems aren't uniform. The origin can shift. Rotation matrices apply. &lt;code&gt;CropBox&lt;/code&gt; and &lt;code&gt;MediaBox&lt;/code&gt; differ. Fabric.js doesn't natively understand any of this, so position calculations that were pixel-perfect on page 1 drifted by several lines by page 3.&lt;/p&gt;
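&lt;p&gt;A minimal sketch of the mapping an overlay has to get right on every page, written in Python for brevity. It is an illustration, not the code from that build: &lt;code&gt;pdf_to_viewport&lt;/code&gt; and its parameters are hypothetical names, and it assumes the box is already normalized to (x0, y0, x1, y1) and that rotation is a multiple of 90 degrees.&lt;/p&gt;

```python
def pdf_to_viewport(x, y, crop_box, rotate=0, scale=1.0):
    """Map a point in PDF user space to top-left-origin pixel coordinates.

    crop_box is (x0, y0, x1, y1) in PDF units. rotate is the page's /Rotate
    value: clockwise degrees, a multiple of 90. All names are illustrative.
    """
    x0, y0, x1, y1 = crop_box
    width, height = x1 - x0, y1 - y0
    # Measure from the box's lower-left corner, which is not always (0, 0).
    dx, dy = x - x0, y - y0
    mappers = {
        0: lambda: (dx, height - dy),             # flip y: PDF origin is bottom-left
        90: lambda: (dy, dx),                     # displayed size is height x width
        180: lambda: (width - dx, dy),
        270: lambda: (height - dy, width - dx),
    }
    px, py = mappers[rotate % 360]()
    return (px * scale, py * scale)


# The same visual spot on two pages whose boxes start at different origins.
print(pdf_to_viewport(72, 720, (0, 0, 612, 792)))
print(pdf_to_viewport(82, 740, (10, 20, 622, 812)))
```

&lt;p&gt;Skip the box offset or the rotation case on any page and the overlay lands in the wrong place on that page, which is the drift described above.&lt;/p&gt;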

&lt;p&gt;Worse — and this is the part I should have caught sooner — on save, the editor flattened the Fabric.js canvas into an image and overlaid it onto the original PDF via pdf-lib. The output wasn't an &lt;em&gt;edited&lt;/em&gt; PDF. It was a PDF with an &lt;em&gt;image of edited text&lt;/em&gt; sitting on top of the original PDF. Open it in a different reader, print it at higher resolution, or diff it against the original, and the lie became obvious.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwx2jalkbi47qwiths1cq.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwx2jalkbi47qwiths1cq.webp" alt="A two-panel diagram showing the overlay editing approach. Left: a clean Fabric.js text box on top of a PDF. Right: the same page after save, with a pixelated image overlay sitting on top of the original." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I patched coordinate math for weeks — every commit some version of "fix: save bug, again" — before I admitted the foundation was wrong.&lt;/p&gt;

&lt;h2&gt;Iteration 2: the Gemini multimodal pipeline (mid-2025)&lt;/h2&gt;

&lt;p&gt;Hard pivot. If editing PDFs in place was too hard, convert them to clean HTML, edit the HTML in Tiptap, then re-export back to PDF.&lt;/p&gt;

&lt;p&gt;The conversion was the interesting part. Standard extraction libraries like pdfminer and pdfplumber gave me readable text but lost the layout: multi-column reports turned into one long paragraph. So I tried Google's Gemini multimodal model. Fed it page images. Asked for semantic HTML.&lt;/p&gt;

&lt;p&gt;The results were genuinely impressive. Gemini understood column structure, table boundaries, header hierarchy, even color-coded sections in lab reports. Honestly the best PDF understanding I'd seen, AI or otherwise.&lt;/p&gt;

&lt;p&gt;So I built it out. Custom Tiptap extensions. AI-assisted document editing. A PDF import-export pipeline on a Flask backend. The product was finally doing something nothing else free could match.&lt;/p&gt;

&lt;p&gt;And then I had to start disabling things, one by one. PDF-to-Word conversion went first — Vercel's serverless size limit. Then the import pipeline started returning errors during normal traffic.&lt;/p&gt;

&lt;p&gt;Here's what broke: &lt;strong&gt;cost.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Folio was a free product from day one. Every upload was a Gemini API call: roughly $0.02 for a 5-page invoice and $0.20 for a 100-page contract. For a free tool with the ambition to serve real volume, this was an existential expense.&lt;/p&gt;

&lt;p&gt;I tried a free-tier limit of three conversions per day. Users hit it immediately. I never enabled paid plans because the product hadn't proven itself yet. Then Gemini's free quota started returning 429s during normal traffic, and users who opened the upload dialog saw &lt;code&gt;Backend returned 500: 429 Gemini quota&lt;/code&gt;. Meanwhile, the Flask backend hit Vercel's serverless function size limit. Everything was on fire simultaneously.&lt;/p&gt;

&lt;h2&gt;Iteration 3: library-based conversion, no AI (late 2025)&lt;/h2&gt;

&lt;p&gt;Drop the AI. Use pdfplumber server-side to extract text, positions, and font info deterministically. Render to clean HTML. Edit in Tiptap. Re-export with pdf-lib. No model calls, no quota errors, predictable cost.&lt;/p&gt;

&lt;p&gt;This was supposed to be the answer.&lt;/p&gt;

&lt;p&gt;What broke: the round trip lost fidelity. The HTML rebuilt from pdfplumber's output &lt;em&gt;looked&lt;/em&gt; close to the original — but Tiptap is a paragraph-based editor, not a layout editor. The moment a user edited anything more complex than a single line of text, the HTML-to-PDF mapping cracked. Kerning shifted. Fonts substituted to system defaults. Columns realigned. Tables collapsed.&lt;/p&gt;

&lt;p&gt;I shipped this version and used it myself for two weeks. Every PDF I edited came out subtly wrong — kerning a little off, line height slightly different, an invoice with a name change that looked like a &lt;em&gt;different invoice&lt;/em&gt;. The "edit" had quietly become a "rewrite," which wasn't the contract I wanted to ship.&lt;/p&gt;

&lt;h2&gt;Iteration 4: just edit the bytes (early 2026)&lt;/h2&gt;

&lt;p&gt;Back to first principles. I sat down with the PDF 1.7 spec one more time, and this time I actually read it.&lt;/p&gt;

&lt;p&gt;Here's the realization that should have come earlier: &lt;strong&gt;PDF content streams are editable in place if you just edit the bytes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;Tj&lt;/code&gt; operator that draws &lt;code&gt;$317.40&lt;/code&gt; can be rewritten to draw &lt;code&gt;$317.50&lt;/code&gt; by changing a single byte in the content stream: the &lt;code&gt;4&lt;/code&gt; becomes a &lt;code&gt;5&lt;/code&gt;. Every other byte stays untouched. No conversion. No round-trip. No fidelity loss. The original embedded fonts stay in the file, so the new text picks them up automatically, as long as the embedded font (often a subset) already contains the glyphs the new text needs.&lt;/p&gt;
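&lt;p&gt;To make the one-byte claim concrete, here is a small Python sketch on a hand-written, uncompressed content stream. It is an illustration under simplifying assumptions, not Folio's engine: &lt;code&gt;replace_literal&lt;/code&gt; and &lt;code&gt;changed_offsets&lt;/code&gt; are hypothetical helpers, the text is a literal string in a single &lt;code&gt;Tj&lt;/code&gt;, and the replacement has the same length as the original. Real streams are usually compressed and have to be decoded before an edit like this.&lt;/p&gt;

```python
def replace_literal(stream, old, new):
    """Swap the literal string operand of one Tj operator in a decoded stream."""
    needle = b"(" + old + b") Tj"
    if stream.count(needle) != 1:
        raise ValueError("expected exactly one matching Tj operand")
    return stream.replace(needle, b"(" + new + b") Tj")


def changed_offsets(before, after):
    """Return every offset at which two equal-length byte strings differ."""
    if len(before) != len(after):
        raise ValueError("lengths differ, so offsets would not line up")
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


# Hand-written example stream: select a font, move to a position, draw a string.
original = b"BT /F1 12 Tf 72 700 Td ($317.40) Tj ET"
edited = replace_literal(original, b"$317.40", b"$317.50")

print(edited)
print(changed_offsets(original, edited))  # a single offset: the 4 that became a 5
```

&lt;p&gt;If the new text were longer or shorter, the stream's declared length would have to change too, so this same-length case is the friendliest one.&lt;/p&gt;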

&lt;p&gt;pdf-lib could do this in principle via its low-level object-model API. The hard part wasn't writing the bytes — it was &lt;em&gt;finding the right bytes to write to&lt;/em&gt;. Because every PDF generator on Earth emits text in a different shape.&lt;/p&gt;

&lt;p&gt;Quick vocabulary, since the table is dense. A PDF page is a tiny program of drawing instructions, and that program is called a &lt;em&gt;content stream&lt;/em&gt;. The two operators that draw text are &lt;code&gt;Tj&lt;/code&gt; (show a string) and &lt;code&gt;TJ&lt;/code&gt; (show an array of strings with spacing adjustments between them). The string itself can be written as a literal, &lt;code&gt;(Hello) Tj&lt;/code&gt;, or as hex, &lt;code&gt;&amp;lt;48656C6C6F&amp;gt; Tj&lt;/code&gt;, where each pair of hex digits is one byte and the font's encoding maps one or more bytes to a character. Some generators wrap a whole word in one operator. Others split every letter into its own. All of them render identically when you open the file.&lt;/p&gt;
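&lt;p&gt;A short Python sketch of the hex form, to show that it is the same bytes in a different spelling. The helper names are hypothetical, and the functions take only the digits between the angle brackets. The decoded bytes are character codes, so turning them into readable text still depends on the font's encoding, which this sketch does not handle.&lt;/p&gt;

```python
def decode_hex_string(digits):
    """Decode the digits found between the angle brackets of a PDF hex string."""
    cleaned = "".join(digits.split())  # whitespace inside a hex string is ignored
    if len(cleaned) % 2:
        cleaned += "0"  # an odd final digit is treated as if followed by 0
    return bytes.fromhex(cleaned)


def encode_hex_string(data):
    """Encode bytes as the upper-case digits of a PDF hex string."""
    return data.hex().upper()


print(decode_hex_string("48656C6C6F"))  # b'Hello'
print(encode_hex_string(b"Hello"))      # 48656C6C6F
```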

&lt;p&gt;Here's the variety:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generator&lt;/th&gt;
&lt;th&gt;Stream shape&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LaTeX, simple form fillers&lt;/td&gt;
&lt;td&gt;Literal Tj&lt;/td&gt;
&lt;td&gt;&lt;code&gt;(text) Tj&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Word, LibreOffice&lt;/td&gt;
&lt;td&gt;Literal TJ with kerning array&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[(a) -50 (b)] TJ&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;wkhtmltopdf, Markdown pipelines&lt;/td&gt;
&lt;td&gt;Adjacent literal Tj&lt;/td&gt;
&lt;td&gt;&lt;code&gt;(frag) Tj (frag) Tj&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Most publishing tools&lt;/td&gt;
&lt;td&gt;Hex Tj (CID subset font)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;48656C6C6F&amp;gt; Tj&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same, fragmented&lt;/td&gt;
&lt;td&gt;Adjacent hex Tj&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;0041&amp;gt; Tj … &amp;lt;0042&amp;gt; Tj&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Crystal Reports, iText, Jasper&lt;/td&gt;
&lt;td&gt;Hex inside TJ array&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[&amp;lt;0041&amp;gt; &amp;lt;004E&amp;gt;] TJ&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise medical / lab software&lt;/td&gt;
&lt;td&gt;Hex across separate TJ arrays&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[&amp;lt;0041&amp;gt;] TJ [&amp;lt;004E&amp;gt;] TJ&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F890kpb3wg9wjhs7vfwck.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F890kpb3wg9wjhs7vfwck.webp" alt="A specimen sheet showing the same edit — 'HI' to 'YO' — appearing across seven different PDF content-stream encodings used by LaTeX, Word, wkhtmltopdf, hex-encoded publishing tools, fragmented CID fonts, Crystal Reports, and enterprise lab software. The changed bytes in each AFTER pattern are highlighted in forest green." width="799" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These all render identically to the eye through pdf.js. At the byte level they're seven different problems.&lt;/p&gt;

&lt;p&gt;The current engine has seven substitution passes, applied in order of specificity. Each handles one of the shapes above. Most browser-based PDF editors handle only the first two shapes, the easy ones, and silently fail on the rest, which is why "edit any PDF, free" landing pages tend to fall apart the moment you feed them anything from a real enterprise toolchain.&lt;/p&gt;

&lt;p&gt;The seventh pass was the hardest. I added it at 2 AM while debugging a real lab report from a medical center. Every visible character of &lt;code&gt;ANAS&lt;/code&gt;, &lt;code&gt;DIRECT BILIRUBIN&lt;/code&gt;, &lt;code&gt;SGPT&lt;/code&gt; was wrapped in its own one-glyph TJ array inside a single &lt;code&gt;BT…ET&lt;/code&gt; block (the begin-text / end-text markers that bracket a section of text drawing in the content stream). None of the previous six passes could touch it. Pass 7 walks adjacent arrays, joins their decoded text across font boundaries, finds the matching subsequence, and rewrites the first slot.&lt;/p&gt;
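&lt;p&gt;The walk-join-match-rewrite idea can be sketched in a few lines of Python. This is a simplified illustration, not the shipped pass: &lt;code&gt;rewrite_across_slots&lt;/code&gt; is a hypothetical name, the slots are assumed to be already decoded to text, and font boundaries, re-encoding, and glyph widths are ignored. Emptying the other matched slots is also an assumption about what rewriting the first slot implies.&lt;/p&gt;

```python
def rewrite_across_slots(slots, old, new):
    """Replace `old` with `new` in text that is split across consecutive slots.

    `slots` holds the decoded text of adjacent TJ arrays, one entry per array.
    The replacement goes into the first slot the match touches. Other matched
    slots keep only the text that falls outside the match (an assumption).
    Returns a new list, or None when the joined text does not contain `old`.
    """
    if not old:
        raise ValueError("old must not be empty")
    joined = "".join(slots)
    start = joined.find(old)
    if start == -1:
        return None
    end = start + len(old)

    result = []
    pos = 0         # offset of the current slot inside `joined`
    placed = False  # has the replacement been written yet?
    for text in slots:
        slot_start, slot_end = pos, pos + len(text)
        pos = slot_end
        if start >= slot_end or slot_start >= end:
            result.append(text)  # slot lies entirely outside the match
            continue
        head = text[: max(start - slot_start, 0)]  # text before the match
        tail = text[end - slot_start:]             # text after the match
        if placed:
            result.append(head + tail)
        else:
            result.append(head + new + tail)
            placed = True
    return result


print(rewrite_across_slots(["A", "N", "A", "S"], "ANAS", "OMAR"))
```

&lt;p&gt;The slot count never changes, so every array keeps its place in the stream and only the string operands are rewritten.&lt;/p&gt;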

&lt;p&gt;The whole pivot landed in one big commit. That was the day the Flask backend, the Tiptap editor, the HTML conversion pipeline, the Gemini integration, and a frankly embarrassing amount of UI scaffolding all got thrown away.&lt;/p&gt;

&lt;p&gt;What followed was eight weeks of what I came to think of as &lt;em&gt;"fixing the things we lied about."&lt;/em&gt; The upload flow still tried to ping the dead Flask backend. The &lt;code&gt;.env.example&lt;/code&gt; still listed Gemini keys. The editor toolbar clipped on mobile. And — my favorite — the upload dialog still said &lt;em&gt;"Deterministic engine first — receipts and tables don't need AI. We'll only call the model if confidence is low."&lt;/em&gt; That sentence was doubly false: the model was gone, and the receipt heuristic had been retired the week before.&lt;/p&gt;

&lt;p&gt;The instinct to &lt;strong&gt;delete every false claim from the UI the moment you notice it&lt;/strong&gt; became one of the policy decisions that shaped the rest of the product. That's why the Chrome Web Store listing still has a section called &lt;em&gt;"Things Folio does NOT do (yet)"&lt;/em&gt; right under the marketing copy: no whiteout, no image insert, no annotations, no page reorder, no scanned-PDF editing. People keep telling me that section is the most-read part of the page. Apparently "I can't do this" is the cheapest trust signal on the internet.&lt;/p&gt;

&lt;p&gt;Around the same time as the engine rewrite, I also tore down the entire UI and rebuilt it with Claude — forest-green accent on paper tone, a custom brand mark that holds together from 16-pixel favicon to marquee promo tile. But that's a different post.&lt;/p&gt;




&lt;p&gt;So that's where it landed. &lt;a href="https://pdf-editor.online" rel="noopener noreferrer"&gt;pdf-editor.online&lt;/a&gt;, a Chrome extension and web app. The extension does one thing: opens the web app in a new tab. One Chrome permission (&lt;code&gt;tabs&lt;/code&gt;), no host access, no content scripts, no telemetry. The web app does the work. You click any text in a rendered PDF, type a new word, save. For digital PDFs (invoices, contracts, lab reports, receipts) the output is byte-identical to the original except for the characters you changed. You can diff &lt;code&gt;xxd&lt;/code&gt; dumps of the two files if you don't believe me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjqw40k9z1epg66ilp3f.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjqw40k9z1epg66ilp3f.webp" alt="A screenshot of the Folio editor showing a PDF invoice with a click-to-edit popover anchored over the total amount. The popover shows the new value being typed and a green 'Fits original width' indicator." width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You'll need an account to upload a PDF, but the editing itself happens in your browser. Your file content doesn't go through our backend; it sits in your own private Supabase Storage bucket, only readable by you. No AI runs on the edit path. The substitution engine has 82 unit tests across all seven encoding shapes, and I add a new one every time someone sends me a PDF that breaks it.&lt;/p&gt;

&lt;p&gt;The fourth iteration is the one that shipped. The first three were the price.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you find a PDF that breaks it, send it. That's how pass 7 got built.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>learning</category>
      <category>pdflib</category>
      <category>vibecoding</category>
    </item>
  </channel>
</rss>
