I got tired of redoing boilerplate every time I swapped document models. Adithya built Omnidocs: a unified inference layer for OCR, table extraction, layout analysis, and structured extraction. It runs locally or on GPU, with MLX/vLLM backends. Repo: https://github.com/adithya-s-k/omnidocs
Why this matters: one API to run 16+ models means you can A/B models without rewiring preprocessing and postprocessing for each one. You still get different outputs — but wiring time drops from days to hours, which changes how fast you iterate on prompts and schemas.
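The A/B idea can be sketched like this — note this is a minimal sketch with stub backends and a hypothetical `OcrResult` type, not Omnidocs' actual API, which you should check in the repo:

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical unified result type; the real Omnidocs API may differ.
@dataclass
class OcrResult:
    model: str
    text: str

# Stub backends standing in for two real models behind one interface.
def model_a(path: str) -> OcrResult:
    return OcrResult(model="model-a", text="Invoice #123")

def model_b(path: str) -> OcrResult:
    return OcrResult(model="model-b", text="Invoice  #123")

def ab_run(path: str, backends: List[Callable[[str], OcrResult]]) -> List[OcrResult]:
    # One call site for every model: no per-model rewiring to compare outputs.
    return [run(path) for run in backends]

results = ab_run("doc.png", [model_a, model_b])
for r in results:
    print(r.model, repr(r.text))
```

The point is the shape of the loop: once every model answers through the same interface, adding a third candidate is one list entry, not a new pipeline.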
Practical caveats: "unified" doesn't erase model quirks. You still need a tiny adapter per model for schema normalization, unit tests that assert on fields rather than raw strings, latency budgets (hosted VLMs spike), and a validation set per task. Iterate locally on vLLM/MLX, then move to your production backend once outputs stabilize.
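Here's what "a tiny adapter per model" plus "assert fields, not strings" looks like in practice — a minimal sketch with made-up model outputs and an assumed target schema of `{"number": str, "total": float}`:

```python
import json

# Two models emit differently shaped JSON for the same invoice.
raw_a = '{"invoice_number": "123", "total": "42.50"}'
raw_b = '{"fields": {"inv_no": 123, "amount_due": 42.5}}'

# One small adapter per model normalizes to the shared schema.
def adapt_a(raw: str) -> dict:
    d = json.loads(raw)
    return {"number": str(d["invoice_number"]), "total": float(d["total"])}

def adapt_b(raw: str) -> dict:
    f = json.loads(raw)["fields"]
    return {"number": str(f["inv_no"]), "total": float(f["amount_due"])}

a, b = adapt_a(raw_a), adapt_b(raw_b)

# Assert on fields, not raw strings: the raw JSON differs,
# but both models agree after normalization.
assert a == b == {"number": "123", "total": 42.5}
```

String-diffing the raw outputs would flag these two as disagreeing; field-level assertions let you swap models and only fail tests when the extracted values actually change.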
Takeaway: unified inference layers are worth the upfront work — they centralize the plumbing so you can iterate on prompts, schemas, and metrics. If you ship doc-processing, try Omnidocs and tell Adithya which model/integration or task you'd add first.