GenCAD: Generating Editable Parametric CAD Models From Images

Most AI 3D generators hand you a mesh — a shell of triangles you can render and 3D-print, but barely edit. Open one in Fusion 360 or SolidWorks and you cannot change a fillet radius, move a hole, or adjust a sketch dimension. The design intent is gone; you are left with frozen surface geometry. For anyone building design automation, CAD plugins, or AI-assisted engineering tools, that limitation is the whole problem.

GenCAD, a research project on generative AI for computer-aided design, takes the harder route. Instead of generating geometry, it generates the program that builds the geometry — a sequence of parametric CAD operations you can re-open, re-run, and edit.

Why a mesh is not a CAD model

A mesh describes a surface: thousands of triangles pinned in 3D space. A parametric CAD model describes a process: draw a 2D sketch, extrude it 40 mm, cut a 6 mm hole, round an edge. Each step carries editable parameters, and the order of steps encodes design intent.

That difference decides what you can do next. Change "40 mm" to "55 mm" in a parametric model and the dependent geometry updates correctly. There is no equivalent edit on a mesh — you would be dragging vertices and hoping.

GenCAD outputs the process, not the surface. Its target representation is a CAD command sequence in the style of the DeepCAD dataset: 2D sketches composed of lines, arcs, and circles, followed by extrude operations. Because the result is a short program rather than a mesh, it drops into a feature tree and stays editable.
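To make "a short program" concrete, here is a minimal sketch of a sketch-and-extrude sequence as plain Python data. The class names, fields, and the `build_plate` helper are illustrative assumptions, not the DeepCAD or GenCAD schema. The plate follows the running example: a 40 mm extrusion with a 6 mm through-hole.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical command types; the real DeepCAD format differs in detail.
@dataclass(frozen=True)
class Line:
    x1: float
    y1: float
    x2: float
    y2: float


@dataclass(frozen=True)
class Circle:
    cx: float
    cy: float
    radius: float


@dataclass(frozen=True)
class Extrude:
    distance: float
    operation: str  # "new" adds material, "cut" removes it


Command = Union[Line, Circle, Extrude]


def rectangle(width: float, height: float) -> List[Line]:
    """Four lines forming a closed rectangular sketch loop."""
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    return [Line(*corners[i], *corners[(i + 1) % 4]) for i in range(4)]


def build_plate(height: float = 40.0, hole_diameter: float = 6.0) -> List[Command]:
    """Sketch, extrude, sketch a hole, cut. All dimensions in mm."""
    return [
        *rectangle(60.0, 30.0),
        Extrude(distance=height, operation="new"),
        Circle(cx=30.0, cy=15.0, radius=hole_diameter / 2),
        # The cut depth reuses `height`, so the hole stays a through-hole.
        Extrude(distance=height, operation="cut"),
    ]


if __name__ == "__main__":
    for command in build_plate(height=55.0):
        print(command)
```

Calling `build_plate(height=55.0)` re-runs the program, and the cut depth follows the new height. That is the edit a mesh has no equivalent for.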

How GenCAD generates a design

The published approach chains three components, and understanding them tells you where it will and will not fit your stack.

First, a CAD autoencoder. A transformer is trained to compress a CAD command sequence into a fixed-length latent vector and reconstruct it back into valid commands. This produces a continuous latent space where nearby points decode to buildable designs.
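Before a transformer can compress a command sequence, the sequence has to become a fixed-size grid of integers. The sketch below shows one plausible way to do that: quantize each parameter and pad to a maximum length. It is not the autoencoder itself, and every constant here (`MAX_COMMANDS`, `N_PARAMS`, `N_LEVELS`, the command IDs, the normalized range) is a small placeholder chosen for illustration, not a value taken from GenCAD or DeepCAD.

```python
from typing import List, Tuple

# Placeholder sizes, assumed for illustration only.
MAX_COMMANDS = 8   # fixed sequence length after padding
N_PARAMS = 4       # parameter slots per command
N_LEVELS = 256     # quantization levels per parameter
PAD = -1           # marks an unused parameter slot
COMMAND_IDS = {"line": 0, "arc": 1, "circle": 2, "extrude": 3, "end": 4}


def quantize(value: float, lo: float = -1.0, hi: float = 1.0) -> int:
    """Map a float in [lo, hi] to an integer bin in [0, N_LEVELS - 1]."""
    clipped = min(max(value, lo), hi)
    level = int((clipped - lo) / (hi - lo) * N_LEVELS)
    return min(level, N_LEVELS - 1)


def encode(program: List[Tuple[str, List[float]]]) -> List[List[int]]:
    """Turn (name, params) commands into MAX_COMMANDS rows of 1 + N_PARAMS ints."""
    end_row = [COMMAND_IDS["end"]] + [PAD] * N_PARAMS
    rows = []
    for name, params in program:
        if len(params) > N_PARAMS:
            raise ValueError(f"too many parameters for {name}")
        levels = [quantize(p) for p in params]
        rows.append([COMMAND_IDS[name]] + levels + [PAD] * (N_PARAMS - len(levels)))
    rows.append(list(end_row))
    if len(rows) > MAX_COMMANDS:
        raise ValueError("program longer than MAX_COMMANDS")
    while len(rows) < MAX_COMMANDS:
        rows.append(list(end_row))
    return rows


if __name__ == "__main__":
    grid = encode([("circle", [0.0, 0.0, 0.5]), ("extrude", [0.25])])
    for row in grid:
        print(row)
```

Every design becomes a grid of the same shape regardless of how many commands it has, which is what lets an encoder map it to a fixed-length vector.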

Second, contrastive image-to-CAD alignment. Borrowing the idea behind CLIP, an image encoder is trained so that a rendered picture of a part lands close to that part's CAD latent. The model learns a shared space for what a part looks like and how the part is built.
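The CLIP idea can be sketched as a symmetric contrastive loss over a batch of paired embeddings. The version below is dependency-free Python for readability; a real training loop would use a tensor library. The function names and the temperature value are assumptions, and GenCAD's exact loss may differ from this standard formulation.

```python
import math
from typing import List

Vector = List[float]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def diagonal_cross_entropy(logits: List[List[float]]) -> float:
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(
    image_emb: List[Vector], cad_emb: List[Vector], temperature: float = 0.07
) -> float:
    """Symmetric CLIP-style loss: pair i of images should match pair i of CAD latents."""
    images = [normalize(v) for v in image_emb]
    cads = [normalize(v) for v in cad_emb]
    logits = [
        [sum(a * b for a, b in zip(img, cad)) / temperature for cad in cads]
        for img in images
    ]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    print(contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]]))  # matched pairs
    print(contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]]))  # mismatched pairs
```

Matched pairs produce a loss near zero and mismatched pairs a large one, so minimizing it pulls each rendered image toward its own CAD latent and away from the others in the batch.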

Third, a latent diffusion model. Conditioned on an image embedding, it samples a CAD latent, and the autoencoder's decoder turns that latent back into a command sequence. Image in, editable CAD program out.
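The sampling step can be sketched as a standard DDPM reverse loop over a latent vector. This is a toy under loud assumptions: the noise predictor is an oracle that already knows the target latent (here, the conditioning vector itself) and stands in for the trained network, and the schedule and step count are arbitrary. It shows the shape of the loop, not GenCAD's implementation.

```python
import math
import random
from typing import Callable, List, Tuple

Vector = List[float]
# Hypothetical signature: (noisy latent, timestep, image embedding) -> predicted noise
NoisePredictor = Callable[[Vector, int, Vector], Vector]


def schedule(steps: int) -> Tuple[List[float], List[float], List[float]]:
    """Linear beta schedule plus the derived alpha and cumulative alpha-bar values."""
    betas = [1e-4 + (0.02 - 1e-4) * i / (steps - 1) for i in range(steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def make_oracle(steps: int) -> NoisePredictor:
    """Stand-in for a trained network: treats the condition as the true latent."""
    _, _, alpha_bars = schedule(steps)

    def predict(x: Vector, t: int, cond: Vector) -> Vector:
        ab = alpha_bars[t]
        return [(xi - math.sqrt(ab) * ci) / math.sqrt(1.0 - ab) for xi, ci in zip(x, cond)]

    return predict


def sample_latent(
    predict_noise: NoisePredictor, image_embedding: Vector, steps: int = 50, seed: int = 0
) -> Vector:
    """DDPM ancestral sampling: start from Gaussian noise, denoise step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = schedule(steps)
    x = [rng.gauss(0.0, 1.0) for _ in image_embedding]
    for t in reversed(range(steps)):
        eps = predict_noise(x, t, image_embedding)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added on the final step
    return x


if __name__ == "__main__":
    embedding = [0.5, -0.25, 1.0]
    print(sample_latent(make_oracle(50), embedding))
```

In the real pipeline the oracle is replaced by a learned denoiser, and the sampled latent is handed to the autoencoder's decoder to produce the command sequence.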

GenCAD builds on DeepCAD, a public dataset of over 170,000 CAD construction sequences sourced from Onshape, and on its command representation. That foundation is also a ceiling: the representation covers sketch-and-extrude modeling. Lofts, sweeps, revolves, and freeform surfacing sit outside its vocabulary, so the model reasons about prismatic, mechanical-looking parts rather than organic shapes.

What you can build on it today

GenCAD is research code with a paper, not a hosted API or an SDK. If you are scoping it into a product, five constraints matter.

The output needs a translation layer. A DeepCAD-style command sequence is not a STEP file. The sketch-and-extrude operations map cleanly onto Onshape FeatureScript or the Fusion 360 API, but you write that bridge yourself.
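One way to structure that bridge is to replay the command sequence against a small backend interface, then implement the interface once per CAD system. Everything below is hypothetical: the `CadBackend` protocol, the command names, and the `RecordingBackend` test double are illustrative, and no real Onshape or Fusion 360 call appears here.

```python
from typing import Dict, List, Protocol, Tuple

# Hypothetical command format: (name, named float parameters)
Command = Tuple[str, Dict[str, float]]


class CadBackend(Protocol):
    """Minimal surface a target CAD system would need to implement."""

    def add_line(self, x1: float, y1: float, x2: float, y2: float) -> None: ...
    def add_circle(self, cx: float, cy: float, radius: float) -> None: ...
    def extrude(self, distance: float, cut: bool) -> None: ...


class RecordingBackend:
    """Test double that logs calls instead of touching a CAD kernel."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def add_line(self, x1: float, y1: float, x2: float, y2: float) -> None:
        self.calls.append(f"line ({x1}, {y1}) -> ({x2}, {y2})")

    def add_circle(self, cx: float, cy: float, radius: float) -> None:
        self.calls.append(f"circle at ({cx}, {cy}) r={radius}")

    def extrude(self, distance: float, cut: bool) -> None:
        self.calls.append(f"{'cut' if cut else 'extrude'} {distance}")


def replay(program: List[Command], backend: CadBackend) -> None:
    """Dispatch each command to the backend; fail loudly on unknown operations."""
    for name, p in program:
        if name == "line":
            backend.add_line(p["x1"], p["y1"], p["x2"], p["y2"])
        elif name == "circle":
            backend.add_circle(p["cx"], p["cy"], p["radius"])
        elif name == "extrude_new":
            backend.extrude(p["distance"], cut=False)
        elif name == "extrude_cut":
            backend.extrude(p["distance"], cut=True)
        else:
            raise ValueError(f"unsupported command: {name}")


if __name__ == "__main__":
    backend = RecordingBackend()
    replay(
        [
            ("circle", {"cx": 0.0, "cy": 0.0, "radius": 3.0}),
            ("extrude_new", {"distance": 40.0}),
        ],
        backend,
    )
    print("\n".join(backend.calls))
```

Keeping the dispatcher separate from the backend means the generated program can be unit-tested without a CAD license, and an unsupported operation fails at the bridge rather than inside the kernel.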

The operation vocabulary is narrow. Brackets, plates, housings, and simple mechanical parts are in scope. Turbine blades and ergonomic grips are not.

It is image-conditioned. GenCAD's published work generates CAD from an input image. Text-to-CAD — typing a prompt and getting a model — is a parallel research thread, not what this project does. If you need natural-language input, you are building that front end or pairing GenCAD with a separate text-to-image step.

Reproducing it means training models. You need the dataset, GPU time, and the patience to retrain. Treat GenCAD as a reference architecture for your own system, not a dependency you import.

Generated CAD can be geometrically valid and still wrong. A model may close cleanly in a kernel yet violate wall-thickness rules, miss a tolerance, or simply not match the request. Any pipeline that turns AI output into manufacturable parts needs a validation stage — geometry checks, design-rule checks, and human review before anything reaches a machine.
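A validation stage can start with cheap checks on the command sequence itself, before any kernel is involved. The sketch below covers three: the outline closes, the extrude distance is positive, and a hole keeps a minimum wall to the nearest outline edge. The function names and the 1.5 mm default are assumptions for illustration, not a manufacturing standard, and checks like these complement human review rather than replace it.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def loop_is_closed(segments: List[Segment], tol: float = 1e-6) -> bool:
    """Each segment must end where the next begins, wrapping around to the first."""
    if not segments:
        return False
    following = segments[1:] + segments[:1]
    return all(math.dist(end, start) <= tol for (_, end), (start, _) in zip(segments, following))


def point_segment_distance(p: Point, seg: Segment) -> float:
    """Shortest distance from point p to a finite line segment."""
    (ax, ay), (bx, by) = seg
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.dist(p, (ax, ay))
    t = max(0.0, min(1.0, ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def validate_part(
    outline: List[Segment],
    extrude_mm: float,
    hole_center: Point,
    hole_radius_mm: float,
    min_wall_mm: float = 1.5,  # hypothetical design rule
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not loop_is_closed(outline):
        problems.append("sketch outline is not closed")
    if extrude_mm <= 0:
        problems.append("extrude distance must be positive")
    if outline:
        wall = min(point_segment_distance(hole_center, s) for s in outline) - hole_radius_mm
        if wall < min_wall_mm:
            problems.append(f"wall next to hole is {wall:.2f} mm, below {min_wall_mm} mm")
    return problems


if __name__ == "__main__":
    corners = [(0.0, 0.0), (60.0, 0.0), (60.0, 30.0), (0.0, 30.0)]
    rect = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    print(validate_part(rect, 40.0, (30.0, 15.0), 3.0))
    print(validate_part(rect, 40.0, (2.0, 15.0), 3.0))
```

A part that passes these checks is only cleared for the next stage. Tolerances, process-specific rules, and whether the part matches the request still need their own checks.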

If you want to clone the GenCAD repository and trace how the autoencoder, contrastive encoder, and diffusion model fit together, an AI-aware editor shortens the loop. It can summarize an unfamiliar training script, explain a tensor-shape mismatch, and help draft the export bridge to your CAD kernel.

For a developer audience, the takeaway is structural. The notable move in GenCAD is not the diffusion model — it is the decision to generate an editable CAD program instead of dead geometry. Any team building AI into an engineering workflow faces the same choice, and the parametric path is the one that survives contact with a real design review.

Originally published at pickuma.com. Subscribe to the RSS or follow @pickuma.bsky.social for new reviews.