Design patent drawings present an underexplored constrained generation problem: produce compliant multi-view line-art illustrations from 3D inputs, where compliance is formally specified and verifiable.
The Task Definition
USPTO design patent applications conventionally call for seven views of the claimed design (front, rear, top, bottom, left, right, and at least one perspective view). Compliance constraints include:
- Line weight consistency across all views (rejections triggered by inter-view variance)
- Surface shading conventions derived from historical technical illustration standards (contour shading, stippling for transparency, oblique strokes for metallic surfaces)
- Broken line semantics: dashed lines indicate unclaimed subject matter; solid lines indicate claimed subject matter. The boundary must be unambiguous across all views simultaneously.
- Completeness: every geometric feature visible in any view must be consistently disclosed in all views where it would be visible
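Two of the constraints above are mechanically checkable. As a rough sketch (the function name, width units, and tolerance are my assumptions, not PatentFig's actual rules), line-weight consistency can be phrased as a bound on the relative spread of per-view stroke widths:

```python
# Hypothetical check: flag inter-view line-weight variance, assuming each
# view's dominant stroke width (in points) has already been measured.
from statistics import mean, pstdev

def line_weight_consistent(view_widths: dict[str, float], max_cv: float = 0.15) -> bool:
    """Return True when stroke widths across views stay within a relative
    tolerance (coefficient of variation) -- a stand-in for the examiner's
    'consistent line weight' expectation."""
    widths = list(view_widths.values())
    if len(widths) < 2:
        return True
    cv = pstdev(widths) / mean(widths)  # relative spread across views
    return cv <= max_cv

views = {"front": 0.35, "rear": 0.36, "top": 0.34, "bottom": 0.35,
         "left": 0.35, "right": 0.37, "perspective": 0.35}
print(line_weight_consistent(views))
```

The 15% tolerance here is arbitrary; in practice the threshold would be tuned against observed rejections.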
This is a structured prediction problem with a formal evaluation criterion: USPTO examiner acceptance, or equivalently the 35 U.S.C. § 112 rejection rate.
Why This Is Non-Trivial
Standard image-to-image or 3D rendering pipelines don't solve this directly. Challenges include:
- Cross-view consistency is a global constraint—not a per-image property. Each view must be checked against all others.
- Shading semantics are perceptual, not photorealistic. Patent shading communicates surface type to a human examiner, not lighting simulation.
- Broken line placement is a legal/strategic decision, not a visual one. The model must support human-in-the-loop control over claim boundary designation.
- Domain gap: training data for USPTO-compliant line art is limited compared to general CAD or technical illustration datasets.
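The first point above (cross-view consistency as a global constraint) can be made concrete with the completeness rule: a feature visible in a view but absent from that view's drawing is a violation that no single-image check can catch. A minimal sketch, with hypothetical data structures standing in for real geometry:

```python
# Illustrative completeness check: compare, per view, the features that the
# 3D geometry says should be visible against the features actually drawn.
# `visible` and `drawn` are assumed to come from upstream projection and
# vectorization stages, respectively (names are mine, not PatentFig's API).
def find_missing_features(visible: dict[str, set[str]],
                          drawn: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, for each view, the features geometry predicts as visible but
    absent from the drawing. A non-empty result is a completeness violation
    spanning the whole figure set, not any single image."""
    return {view: visible[view] - drawn.get(view, set())
            for view in visible
            if visible[view] - drawn.get(view, set())}
```

Note the check is pairwise between the geometric model and every view, so fixing one view can surface new disagreements elsewhere.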
PatentFig: Applied Approach
PatentFig is a production system we built to address this problem end to end. It accepts 3D models, CAD screenshots, or sketches as input and generates USPTO-compliant multi-view figures.
Key design decisions:
- Separate the geometric projection step (deterministic, rule-based) from the stylization step (learned)
- Human-controlled broken-line toggle rather than attempting to infer claim strategy from geometry
- Output validated against known rejection categories before delivery
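The second design decision, keeping broken-line designation under human control, reduces to a simple mapping at render time. A minimal sketch (function names and the per-feature flag representation are assumptions for illustration):

```python
def line_style(claimed: bool) -> str:
    """Broken-line semantics: solid strokes for claimed subject matter,
    dashed strokes for unclaimed environment. The `claimed` flag is set by
    a human toggle, never inferred from geometry."""
    return "solid" if claimed else "dashed"

def stylize_view(features: dict[str, bool]) -> dict[str, str]:
    """Map each feature's human-set claim flag to a stroke style. Which
    features appear in the view is decided upstream by the deterministic
    projection step; this stage only decides how they render."""
    return {name: line_style(claimed) for name, claimed in features.items()}
```

Because the same claim flags feed every view, the claimed/unclaimed boundary stays consistent across the figure set by construction.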
The system currently handles the full 7-view generation workflow and is live at patentfig.ai.
Open Questions
Genuinely curious whether anyone in this community has worked on related problems:
- Multi-view consistency as a training objective (beyond just single-image quality)
- Domain adaptation for technical illustration styles with small training sets
- Formal verification of structured visual output against rule-based compliance specs
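On the first question, one naive way to phrase multi-view consistency as a training objective (a sketch of the idea, not anything PatentFig ships) is to penalize pairwise disagreement between per-view style descriptors, e.g. mean stroke width and shading density:

```python
# Toy consistency objective: sum of squared pairwise distances between
# per-view style descriptor vectors. Zero iff every view reports identical
# style statistics; differentiable if the descriptors are.
from itertools import combinations

def consistency_loss(descriptors: list[list[float]]) -> float:
    """Each inner list is one view's descriptor (e.g. [stroke_width,
    shading_density]). Returns the total pairwise squared L2 disagreement."""
    return sum(sum((a - b) ** 2 for a, b in zip(d1, d2))
               for d1, d2 in combinations(descriptors, 2))
```

The open question is whether such a global term trains stably alongside per-image quality losses, or whether consistency is better enforced post hoc.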
Happy to discuss technical tradeoffs or share more about the architecture. Comments open.