The problem nobody talks about
Copy text from an academic PDF. Paste it. Get $%&@#^*~.
That's not a bug. It's intentional, and it's called font obfuscation.
Publishers and teachers deliberately break the font's ToUnicode
mappings so copy-paste returns garbage while the PDF still looks perfect.
I couldn't find a clean open-source tool that fixes this. So I built one.
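To make the mechanism concrete, here is a toy model in Python. It is not VeilPDF's code, and every name in it (extract_text, glyph_shapes, the two tables) is made up for illustration. The point is only that what a glyph looks like and what copy-paste returns come from two separate tables, so one can be sabotaged without touching the other.

```python
# Toy model of font obfuscation. Hypothetical names, not VeilPDF's code.
# A PDF page stores glyph codes. The font decides what each code looks like;
# a separate ToUnicode table decides what copy-paste returns for it.

def extract_text(glyph_codes, to_unicode):
    """Mimic copy-paste: look up each glyph code in the ToUnicode table."""
    return "".join(to_unicode.get(code, "\ufffd") for code in glyph_codes)

# What the reader sees: glyph code -> shape drawn on the page.
glyph_shapes = {1: "P", 2: "D", 3: "F"}

# An honest table agrees with the shapes; an obfuscated one does not.
honest_to_unicode = {1: "P", 2: "D", 3: "F"}
obfuscated_to_unicode = {1: "$", 2: "%", 3: "&"}

page = [1, 2, 3]  # glyph codes as they appear in the page content

rendered = "".join(glyph_shapes[code] for code in page)
copied_honest = extract_text(page, honest_to_unicode)
copied_obfuscated = extract_text(page, obfuscated_to_unicode)

print(rendered)           # PDF
print(copied_honest)      # PDF
print(copied_obfuscated)  # $%&
```

Both versions render the same page. Only the lookup used for text extraction differs, which is why the damage is invisible until you try to copy.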
What VeilPDF does
recover: detects broken ToUnicode tables, reconstructs text,
and falls back to OCR automatically.
protect: lets YOU obfuscate your own PDFs, inject invisible
ownership watermarks, and encrypt with AES-256.
forensics: detects fake redactions (black boxes with text
still underneath), font fingerprints, and hidden metadata.
ai: summarizes and extracts structured data via Ollama.
Fully offline. No API keys. No cloud.
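As a rough idea of what an OCR fallback decision can look like, here is a minimal sketch in Python. It is an assumption on my part for illustration, not the heuristic VeilPDF ships: the function names and the 0.4 threshold are hypothetical. It scores extracted text by how much of it is not letters or digits and flags the page for OCR when that share is too high.

```python
import unicodedata

# Hypothetical heuristic for illustration; not VeilPDF's actual logic.

def garbage_ratio(text):
    """Fraction of non-whitespace characters that are neither letters nor digits."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0  # no text layer at all: treat as fully unusable
    bad = sum(1 for c in chars if unicodedata.category(c)[0] not in ("L", "N"))
    return bad / len(chars)

def needs_ocr(text, threshold=0.4):
    """Flag a page for OCR when too much of the extracted text looks like noise.

    The 0.4 threshold is an assumed value, not a tuned one.
    """
    return garbage_ratio(text) > threshold

print(needs_ocr("$%&@#^*~"))                         # True
print(needs_ocr("Copy text from an academic PDF."))  # False
```

A check like this only catches obfuscation that maps to symbols. A table that swaps letters for other letters would pass it, so a real detector would likely also need a dictionary or language-model score.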
The stack
Rust CLI → orchestrates everything, single binary
Python core → pypdf, pdfminer, tesseract
Zig crypto layer → AES-256, byte-level watermark injection
One command setup:
git clone https://github.com/Gaurav-x111/veilpdf
cd veilpdf
./veilpdf # auto-installs everything
Zero cloud. Zero cost. Runs entirely on your machine.
GitHub: https://github.com/Gaurav-x111/veilpdf
Feedback welcome — especially on the Zig crypto layer
and OCR fallback heuristics.