I built a PDF editor… and realized most of them work in a way people wouldn’t expect
I’ve been working on a PDF editor recently, and I went down a rabbit hole I didn’t expect.
At first, I thought building a “simple” PDF tool would be straightforward: upload a file → edit → download.
But the deeper I went, the more I realized something surprising:
most online PDF tools don’t work the way users think they do.
The assumption everyone makes
When people upload a PDF to an online editor, most assume:
- the file is processed instantly
- maybe some light transformations happen
- then it’s sent back
In reality, a lot of tools:
- upload your file to a server
- store it (sometimes longer than expected)
- process it remotely
- sometimes pass it through multiple services
That’s not necessarily malicious, but it’s definitely not always transparent.
The real technical challenge
I tried to approach the problem differently.
Instead of focusing on features first, I asked:
what would a clean and predictable architecture for PDF editing look like?
That’s where things got interesting.
Problem #1: Rendering vs Editing
PDFs are not like HTML.
You don’t have:
- semantic structure
- editable DOM
- consistent fonts
What you get instead is:
- glyph positioning
- embedded fonts
- drawing instructions
So editing text is not “editing text”.
It’s more like:
hide the original glyphs and redraw something that looks identical
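To make “drawing instructions” concrete, here is a minimal sketch of what a page’s content stream looks like and how text can be pulled out of it. The operators (`BT`, `Tf`, `Td`, `Tj`, `ET`) are standard PDF text operators, but `extractTextRuns` and `TextRun` are hypothetical names, and the parser is deliberately simplified: it only understands plain literal strings and ignores escapes, hex strings, `TJ` arrays, and transformation matrices. A real editor would lean on a full PDF parser instead.

```typescript
// One text-showing operation recovered from a content stream.
interface TextRun {
  font: string; // font resource name, e.g. "F1"
  size: number; // font size in text-space units
  x: number;    // text position in PDF space (origin bottom-left, y grows up)
  y: number;
  text: string;
}

// Simplified sketch: handles only BT, Tf, Td, Tj, and ET.
// Every other token is treated as an operand and ignored.
function extractTextRuns(stream: string): TextRun[] {
  // A token is either a literal string "(...)" or a run of non-space characters.
  const tokens: string[] = stream.match(/\([^()\\]*\)|[^\s()]+/g) ?? [];
  const runs: TextRun[] = [];
  let operands: string[] = [];
  let font = "";
  let size = 0;
  let x = 0;
  let y = 0;

  for (const token of tokens) {
    switch (token) {
      case "BT": // begin text object: position resets to the origin
        x = 0;
        y = 0;
        break;
      case "Tf": { // "/F1 12 Tf" selects a font and size
        size = Number(operands.pop());
        font = (operands.pop() ?? "").replace(/^\//, "");
        break;
      }
      case "Td": { // "tx ty Td" moves relative to the start of the current line
        const ty = Number(operands.pop());
        const tx = Number(operands.pop());
        x += tx;
        y += ty;
        break;
      }
      case "Tj": // "(Hello) Tj" shows a string at the current position
        runs.push({ font, size, x, y, text: (operands.pop() ?? "").slice(1, -1) });
        break;
      case "ET": // end text object
        break;
      default:
        operands.push(token);
        continue; // keep collecting operands until an operator arrives
    }
    operands = []; // an operator consumes its operands
  }
  return runs;
}

// Two lines of text: note there is no paragraph, no word list, just positions.
const sampleStream = "BT /F1 12 Tf 72 720 Td (Hello) Tj 0 -14 Td (World) Tj ET";
const sampleRuns = extractTextRuns(sampleStream);
```

Even in this toy case, “Hello” and “World” are just two strings placed at coordinates. Nothing in the stream says they belong to the same paragraph.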
Problem #2: Fonts are a nightmare
This was by far the hardest part.
Even if you extract the font:
- it may be subsetted
- it may fail to load in the browser
- it may render differently than in the original PDF
So you end up dealing with:
- fallback strategies
- font matching heuristics
- visual inconsistencies
And yes… bold text randomly becoming non-bold is a real thing 😅
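One font matching heuristic that helps with the bold problem is to read the weight and style from the embedded font’s PostScript name and carry them into the fallback explicitly. The sketch below assumes names like `ABCDEF+Helvetica-Bold`, where a subset font carries a six-letter tag followed by `+`. `parseFontName`, `toCanvasFont`, and `FontHints` are hypothetical helpers, and the keyword matching is a rough heuristic rather than a reliable classifier.

```typescript
interface FontHints {
  family: string;  // best guess at the family name
  bold: boolean;
  italic: boolean;
  subset: boolean; // true if the name carried a subset tag
}

// Heuristic sketch: infer family, weight, and style from a PostScript font name.
function parseFontName(postScriptName: string): FontHints {
  // Subset fonts are prefixed with six uppercase letters and "+".
  const baseName = postScriptName.replace(/^[A-Z]{6}\+/, "");
  const subset = baseName !== postScriptName;

  // Names commonly look like "Family-Style" or "Family,Style".
  const separator = baseName.search(/[-,]/);
  const family = separator === -1 ? baseName : baseName.slice(0, separator);

  return {
    family,
    bold: /bold|black|heavy/i.test(baseName),
    italic: /italic|oblique/i.test(baseName),
    subset,
  };
}

// Build a CSS font shorthand (usable as a canvas `font` value) that keeps
// weight and style even when the browser falls back to another family.
function toCanvasFont(hints: FontHints, sizePx: number, fallback = "sans-serif"): string {
  const style = hints.italic ? "italic" : "normal";
  const weight = hints.bold ? "bold" : "normal";
  return `${style} ${weight} ${sizePx}px "${hints.family}", ${fallback}`;
}
```

With this, `toCanvasFont(parseFontName("ABCDEF+Helvetica-BoldOblique"), 12)` yields `italic bold 12px "Helvetica", sans-serif`, so a failed font load degrades to a bold fallback instead of silently dropping the weight.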
Problem #3: Pixel-perfect editing
If you want a clean UX, you can’t just slap a contenteditable div on top.
You need:
- canvas rendering for accuracy
- hidden input handling for typing
- custom caret drawing
- manual selection logic
Basically… rebuilding a mini text engine.
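As an illustration of the caret and selection part, here is a minimal sketch of how caret positions could be computed for a single line of text. It takes a measuring function as a parameter so the logic stays independent of rendering. In a browser that function could be `(s) => ctx.measureText(s).width` on a 2D canvas context. `caretStops`, `caretIndexAt`, and `MeasureFn` are hypothetical names, and the sketch ignores right-to-left text and characters outside the Basic Multilingual Plane.

```typescript
// Returns the rendered width of a string in pixels.
type MeasureFn = (text: string) => number;

// X offsets of every caret stop: before the first character,
// between each pair of characters, and after the last one.
// Measuring whole prefixes (instead of summing single characters)
// lets the measurer account for kerning.
function caretStops(text: string, measure: MeasureFn): number[] {
  const stops: number[] = [];
  for (let i = 0; i <= text.length; i++) {
    stops.push(measure(text.slice(0, i)));
  }
  return stops;
}

// Index of the caret stop closest to a click at horizontal offset x.
function caretIndexAt(x: number, stops: number[]): number {
  let best = 0;
  let bestDistance = Infinity;
  stops.forEach((stop, index) => {
    const distance = Math.abs(stop - x);
    if (distance < bestDistance) {
      bestDistance = distance;
      best = index;
    }
  });
  return best;
}
```

The caret itself is then a thin rectangle drawn at `stops[index]`, and a selection is the span between two stops.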
A different approach
After experimenting with multiple architectures, I ended up with something like this:
- render the PDF page as an image (canvas)
- detect text blocks
- mask original text when editing
- redraw text manually using canvas
- keep interaction logic separate from rendering
This avoids a lot of inconsistencies between HTML and PDF rendering.
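The “detect text blocks” step can be sketched as grouping positioned text items into lines. The example below is a simplified take, not a full layout analysis. It assumes each item already has a bounding box in canvas coordinates (origin top-left, y grows down) and that items on one line should be joined with a space. `TextItem`, `LineBox`, and `groupIntoLines` are hypothetical names.

```typescript
interface TextItem {
  text: string;
  x: number;      // left edge in canvas pixels
  y: number;      // top edge in canvas pixels
  width: number;
  height: number;
}

interface LineBox {
  text: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Group items whose top edges are within `tolerance` pixels into one line,
// then return each line's text and bounding box.
function groupIntoLines(items: TextItem[], tolerance = 2): LineBox[] {
  const sorted = [...items].sort((a, b) => a.y - b.y || a.x - b.x);
  const lines: { y: number; items: TextItem[] }[] = [];

  for (const item of sorted) {
    const current = lines[lines.length - 1];
    if (current !== undefined && Math.abs(current.y - item.y) <= tolerance) {
      current.items.push(item);
    } else {
      lines.push({ y: item.y, items: [item] });
    }
  }

  return lines.map((line) => {
    const ordered = [...line.items].sort((a, b) => a.x - b.x);
    const left = Math.min(...ordered.map((i) => i.x));
    const top = Math.min(...ordered.map((i) => i.y));
    const right = Math.max(...ordered.map((i) => i.x + i.width));
    const bottom = Math.max(...ordered.map((i) => i.y + i.height));
    return {
      text: ordered.map((i) => i.text).join(" "),
      x: left,
      y: top,
      width: right - left,
      height: bottom - top,
    };
  });
}
```

Each `LineBox` doubles as the mask rectangle: when a line enters edit mode, its box can be filled with the background color via `ctx.fillRect(x, y, width, height)` before the new text is drawn on top.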
Client-side vs Server-side
There’s also an interesting trade-off:
Client-side:
- better for privacy
- no upload needed
- limited by browser capabilities
Server-side:
- more powerful
- easier to control rendering
- raises privacy concerns
There’s no perfect solution — it depends on the use case.
What surprised me the most
Honestly?
How complex something as “simple” as editing a PDF actually is.
It’s one of those problems that looks trivial from the outside, but gets messy very quickly.
Curious to hear your thoughts
If you’ve worked with PDFs before:
- how did you handle font rendering?
- did you go client-side or server-side?
- any tricks for keeping visual consistency?
I’d love to learn how others approached this.