DEV Community

bruce huang

Building a Browser-Side Document Comparison Tool: Privacy-First .docx Diffing with JavaScript

1. The Problem: Why lawyers overpay for document comparison
2. Architecture: JSZip + LCS paragraph alignment + word-level diff
3. Implementation Details
   Extracting text from .docx (ZIP → XML → structured paragraphs)
   Fuzzy paragraph alignment (why LCS isn't enough)
   Word-level diff rendering (redline format)
4. Performance: 140ms for 20 pages, all client-side
5. Limitations & Next Steps: Formatting changes, tables, PDF
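
To make the "word-level diff rendering" step in the outline concrete, here is a minimal sketch of an LCS-based word diff with a plain-text redline output. It is not the tool's actual code: the function names (`tokenize`, `lcsTable`, `diffWords`, `renderRedline`) and the `[-deleted-]` / `{+inserted+}` markers are assumptions chosen for illustration, and it deliberately skips the .docx extraction and fuzzy paragraph alignment steps.

```javascript
// Hypothetical tokenizer: splits on whitespace only.
// Real documents would likely need smarter handling of punctuation.
function tokenize(text) {
  return text.split(/\s+/).filter(Boolean);
}

// Build the LCS length table, where table[i][j] is the length of the
// longest common subsequence of a.slice(i) and b.slice(j).
function lcsTable(a, b) {
  const table = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      table[i][j] =
        a[i] === b[j]
          ? table[i + 1][j + 1] + 1
          : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }
  return table;
}

// Walk the table from the start and emit equal / delete / insert
// operations in reading order.
function diffWords(oldText, newText) {
  const a = tokenize(oldText);
  const b = tokenize(newText);
  const table = lcsTable(a, b);
  const ops = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ type: "equal", word: a[i] });
      i++;
      j++;
    } else if (table[i + 1][j] >= table[i][j + 1]) {
      ops.push({ type: "delete", word: a[i] });
      i++;
    } else {
      ops.push({ type: "insert", word: b[j] });
      j++;
    }
  }
  // Whatever remains on either side is a pure deletion or insertion.
  while (i < a.length) ops.push({ type: "delete", word: a[i++] });
  while (j < b.length) ops.push({ type: "insert", word: b[j++] });
  return ops;
}

// Render the operations as a plain-text redline.
function renderRedline(ops) {
  return ops
    .map((op) => {
      if (op.type === "delete") return `[-${op.word}-]`;
      if (op.type === "insert") return `{+${op.word}+}`;
      return op.word;
    })
    .join(" ");
}

console.log(
  renderRedline(diffWords("the quick brown fox", "the slow brown fox jumps"))
);
// prints: the [-quick-] {+slow+} brown fox {+jumps+}
```

The table costs O(n·m) time and memory in the number of words, which is one reason to align paragraphs first and only run the word diff inside matched pairs. In a browser UI the markers would presumably become styled `<del>` and `<ins>` elements, with the words HTML-escaped before insertion.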