Architectural drawings already contain everything needed to build cabinet layouts.
Dimensions.
Cabinet placements.
Appliance spacing.
But most of this information exists in PDF drawings, which are designed for humans, not machines.
When cabinet designs move into production, someone usually needs to manually:
- interpret the drawing
- rebuild the layout in CAD
- generate a 3D model
- verify measurements
That process is slow and repetitive.
We recently worked on a system designed to automate this workflow by converting cabinet drawings directly into structured 3D data.
The interesting part wasn’t the AI model itself.
It was building the pipeline that makes the automation reliable.
System Architecture
The system converts cabinet drawings into 3D models through a multi-stage pipeline.
PDF Drawing
↓
PDF → High Resolution Images
↓
View Region Detection
↓
Cabinet Detection (YOLO)
↓
Measurement Extraction (LLM + OCR)
↓
Coordinate Mapping
↓
3D Geometry Generation
↓
AutoCAD DWG Export
Each stage handles a specific problem in the workflow.
Breaking the system into modules made it easier to debug and improve accuracy.
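As a rough illustration of that modular structure, the pipeline can be modeled as a list of named stages run in order. This is a minimal sketch, not the project's code: `run_pipeline`, the stage names, and the placeholder lambdas are all hypothetical. Keeping every stage's output in a trace is one way to see exactly where a bad result first appeared.

```python
from typing import Any, Callable

# A stage is a (name, function) pair; each function takes the previous output.
Stage = tuple[str, Callable[[Any], Any]]


def run_pipeline(stages: list[Stage], data: Any) -> tuple[Any, dict[str, Any]]:
    """Run stages in order and keep each stage's output for debugging."""
    trace: dict[str, Any] = {}
    for name, stage in stages:
        data = stage(data)
        trace[name] = data
    return data, trace


# Placeholder stages standing in for the real ones (hypothetical).
stages: list[Stage] = [
    ("rasterize", lambda pdf: {"pages": [f"{pdf}#page1"]}),
    ("detect", lambda d: {**d, "cabinets": ["base", "upper"]}),
    ("count", lambda d: len(d["cabinets"])),
]

result, trace = run_pipeline(stages, "kitchen.pdf")
print(result)        # 2
print(list(trace))   # ['rasterize', 'detect', 'count']
```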
Step 1 — Converting PDF Drawings to Images
Architectural PDFs are not ideal inputs for computer vision models.
They contain vector data mixed with annotations, layers, and text.
To simplify processing, we convert each page into a high-resolution PNG image (300 DPI).
Higher resolution improves:
- text extraction accuracy
- detection performance
- line segmentation
Small dimension labels become unreadable at lower resolutions, so image quality matters more than expected.
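The arithmetic behind that is simple. PDF page sizes are expressed in points (72 per inch), so the pixel size of a rendered page and the nominal pixel height of a label both scale linearly with DPI. The sketch below assumes a 24 x 36 inch sheet and a 6 pt dimension label purely for illustration; neither figure comes from the project.

```python
POINTS_PER_INCH = 72  # PDF user-space units per inch


def page_pixels(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Pixel dimensions of a page rendered at the given DPI."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


def text_height_px(font_size_pt: float, dpi: int) -> float:
    """Nominal height in pixels of text set at font_size_pt."""
    return font_size_pt * dpi / POINTS_PER_INCH


# Assumed example: a 36 x 24 inch sheet is 2592 x 1728 points.
print(page_pixels(2592, 1728, 300))  # (10800, 7200)
print(text_height_px(6, 300))        # 25.0
print(text_height_px(6, 72))         # 6.0
```

At 72 DPI the assumed 6 pt label is only about 6 pixels tall, which leaves very little for OCR to work with. At 300 DPI the same label is about 25 pixels tall.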
Step 2 — View Region Detection
A single architectural sheet usually contains multiple views:
- floor plans
- elevations
- section views
- cabinet details
Processing the entire page creates too much noise.
Instead, we segment the sheet into visual regions and classify them.
The system prioritizes the base floor plan, which typically contains cabinet placement information.
This step reduces false detections later in the pipeline.
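One simple way to express that prioritization is a label priority list with a confidence tie-break. This is a sketch under assumptions: the `Region` type, the label strings, and the fallback order after the floor plan are hypothetical, and the real classifier's output format may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Region:
    label: str                       # e.g. "floor_plan", "elevation"
    confidence: float                # classifier score in [0, 1]
    box: tuple[int, int, int, int]   # x0, y0, x1, y1 in page pixels


# Floor plan first; the order of the fallbacks is an assumption.
PRIORITY = ["floor_plan", "elevation", "section", "detail"]


def pick_region(regions: list[Region]) -> Optional[Region]:
    """Return the most confident region of the highest-priority label present."""
    for label in PRIORITY:
        candidates = [r for r in regions if r.label == label]
        if candidates:
            return max(candidates, key=lambda r: r.confidence)
    return None


regions = [
    Region("elevation", 0.97, (0, 0, 4000, 3000)),
    Region("floor_plan", 0.81, (4000, 0, 9000, 3000)),
    Region("floor_plan", 0.64, (0, 3000, 4000, 6000)),
]
print(pick_region(regions).box)  # (4000, 0, 9000, 3000)
```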
Step 3 — Cabinet Detection Using YOLO
Once the relevant region is identified, we run an object detection model.
We trained a YOLO model to detect cabinet-related objects such as:
- base cabinets
- upper cabinets
- tall cabinets
- appliances
Each detection returns:
- bounding box coordinates
- confidence score
- object label
Low-confidence detections are filtered out before moving to the next stage.
This step establishes where cabinets exist in the layout.
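The filtering step can be sketched independently of any particular YOLO library. The `Detection` type mirrors the three fields listed above; the 0.5 threshold is an assumed placeholder, not the value the system uses.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str                               # e.g. "base_cabinet"
    confidence: float                        # model score in [0, 1]
    box: tuple[float, float, float, float]   # x0, y0, x1, y1 in pixels


def filter_detections(detections: list[Detection],
                      min_confidence: float = 0.5) -> list[Detection]:
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d.confidence >= min_confidence]


detections = [
    Detection("base_cabinet", 0.92, (120, 340, 480, 700)),
    Detection("appliance", 0.31, (500, 340, 800, 700)),
    Detection("upper_cabinet", 0.77, (120, 100, 480, 300)),
]
kept = filter_detections(detections)
print([d.label for d in kept])  # ['base_cabinet', 'upper_cabinet']
```

In practice the threshold is a trade-off: set it too high and real cabinets are dropped, set it too low and the later stages spend effort on false positives.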
Step 4 — Extracting Measurements
Detection tells us where cabinets are, but not their size.
Cabinet drawings include dimension annotations like:
30"
2'-6"
34 1/2"
These values may appear:
- rotated
- overlapping other text
- connected via leader lines
Traditional OCR struggles with rotated, overlapping text like this.
Instead, we combine OCR with a vision-enabled LLM.
For each detected cabinet, the system:
- Crops the region around the cabinet
- Sends the image to a vision model
- Requests structured measurements
Example output:
{
  "cabinet_type": "base",
  "width": 36,
  "height": 34.5,
  "depth": 24
}
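The output above uses plain numbers, while drawings annotate sizes as strings such as 2'-6" or 34 1/2". Whichever component reads the label, the value has to be normalized at some point. The parser below is a minimal sketch that assumes decimal inches as the target unit and covers only the three formats shown earlier; `parse_inches` is a hypothetical helper, not part of the described system.

```python
from fractions import Fraction


def parse_inches(text: str) -> float:
    """Convert a label like 30", 2'-6" or 34 1/2" to decimal inches."""
    s = text.strip()
    feet = 0
    if "'" in s:
        # Split off the feet part: 2'-6" -> feet=2, remainder=6"
        feet_part, s = s.split("'", 1)
        feet = int(feet_part)
        s = s.lstrip("- ")
    s = s.rstrip('"').strip()
    # The remainder is whole inches and/or a fraction: "34 1/2" -> 34 + 1/2
    inches = sum((Fraction(tok) for tok in s.split()), Fraction(0))
    return float(feet * 12 + inches)


print(parse_inches('30"'))      # 30.0
print(parse_inches("2'-6\""))   # 30.0
print(parse_inches('34 1/2"'))  # 34.5
```

Anything the parser does not recognize raises `ValueError`, which is a reasonable signal to route the label to manual review rather than guess.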
To prevent errors, we added validation rules.
If measurements fall outside expected cabinet ranges, the result is flagged for manual review.
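A range check of this kind might look like the sketch below. The numeric ranges are illustrative placeholders in inches, not the project's actual rules, and `validate` is a hypothetical name. It returns a list of problems, so an empty list means the result passes.

```python
# Illustrative ranges in inches (assumed values, not the system's real limits).
EXPECTED_RANGES = {
    "base":  {"width": (9, 48), "height": (30, 36), "depth": (12, 30)},
    "upper": {"width": (9, 48), "height": (12, 48), "depth": (10, 24)},
    "tall":  {"width": (12, 48), "height": (72, 108), "depth": (12, 30)},
}


def validate(measurement: dict) -> list[str]:
    """Return a list of problems; an empty list means no manual review needed."""
    cabinet_type = measurement.get("cabinet_type")
    ranges = EXPECTED_RANGES.get(cabinet_type)
    if ranges is None:
        return [f"unknown cabinet_type: {cabinet_type!r}"]
    problems = []
    for field, (low, high) in ranges.items():
        value = measurement.get(field)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field}: missing or non-numeric")
        elif not low <= value <= high:
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems


ok = {"cabinet_type": "base", "width": 36, "height": 34.5, "depth": 24}
bad = {"cabinet_type": "base", "width": 360, "height": 34.5, "depth": 24}
print(validate(ok))   # []
print(validate(bad))  # ['width: 360 outside 9-48']
```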
Step 5 — Coordinate and Scale Detection
Architectural drawings use scale references such as:
1/8" = 1'-0"
Without interpreting this scale, cabinet positions remain in pixel space.
The system identifies the scale marker and converts pixel distances into real-world coordinates.
Each cabinet receives an X/Y/Z position relative to the drawing origin.
This allows the layout to be reconstructed accurately in 3D space.
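The conversion itself is a ratio. A scale of 1/8" = 1'-0" means one drawn inch represents 96 real inches, and at 300 DPI one pixel covers 1/300 of a drawn inch. The sketch below covers only the planar X/Y part and makes several assumptions: the function names are hypothetical, the origin is a known pixel location, and the Y axis is flipped because image rows grow downward.

```python
from fractions import Fraction


def _to_inches(text: str) -> Fraction:
    """Parse one side of a scale note, e.g. 1/8" or 1'-0", into inches."""
    s = text.strip()
    feet = 0
    if "'" in s:
        feet_part, s = s.split("'", 1)
        feet = int(feet_part)
        s = s.lstrip("- ")
    s = s.rstrip('"').strip()
    return feet * 12 + sum((Fraction(tok) for tok in s.split()), Fraction(0))


def parse_scale(text: str) -> Fraction:
    """Real-world inches represented by one drawn inch."""
    paper, real = text.split("=")
    return _to_inches(real) / _to_inches(paper)


def pixel_to_world(px: float, py: float, origin_px: tuple[float, float],
                   scale: Fraction, dpi: int) -> tuple[float, float]:
    """Map a pixel position to real-world inches relative to the origin."""
    inches_per_pixel = scale / dpi
    x = (px - origin_px[0]) * inches_per_pixel
    y = (origin_px[1] - py) * inches_per_pixel  # flip: image y grows downward
    return float(x), float(y)


scale = parse_scale("1/8\" = 1'-0\"")
print(scale)                                                # 96
print(pixel_to_world(1500, 900, (1000, 1000), scale, 300))  # (160.0, 32.0)
```

At this scale and resolution each pixel represents 0.32 inches, so a detection box that is off by a few pixels already shifts a cabinet by about an inch. That is one reason the extracted dimension labels, rather than box sizes, are the better source for cabinet measurements.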
Step 6 — Generating the 3D Layout
Once we have:
- cabinet detections
- measurements
- real-world coordinates
we can generate 3D geometry.
We implemented a viewer using Three.js where each cabinet becomes a parametric 3D object.
This step is less about visualization and more about validating the pipeline.
Architects can quickly review the generated layout and correct any misdetections.
The goal isn’t perfect automation.
It’s reducing manual modeling work.
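To make "parametric 3D object" concrete, here is a minimal sketch of a cabinet as a box defined by a position and three dimensions. It is written in Python for consistency with the other examples and is not the Three.js viewer code. The `CabinetBox` type and the axis convention (X along the wall, Y as depth, Z up) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CabinetBox:
    # Minimum corner of the box, in real-world inches.
    x: float
    y: float
    z: float
    # Extents along X (width), Y (depth), and Z (height).
    width: float
    depth: float
    height: float

    def vertices(self) -> list[tuple[float, float, float]]:
        """The 8 corners, bottom face first, then top face."""
        return [
            (self.x + dx * self.width,
             self.y + dy * self.depth,
             self.z + dz * self.height)
            for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)
        ]


box = CabinetBox(x=0.0, y=0.0, z=0.0, width=36.0, depth=24.0, height=34.5)
corners = box.vertices()
print(len(corners))  # 8
print(corners[0])    # (0.0, 0.0, 0.0)
print(corners[-1])   # (36.0, 24.0, 34.5)
```

Because the geometry is derived entirely from the parameters, correcting a misread width in review means changing one number rather than remodeling the cabinet.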
Step 7 — Exporting to AutoCAD
The final stage converts the generated geometry into DWG files.
Using the AutoCAD SDK, the system exports:
- cabinet blocks
- correct dimensions
- real-world coordinates
If upstream data is correct, the export works reliably.
Interestingly, this stage turned out to be one of the simplest parts of the system.
Most of the complexity lies in interpreting drawings correctly.
Challenges We Encountered
1. Drawings Are Inconsistent
Architectural drawings vary widely depending on the designer.
Annotation styles, measurement formats, and layout conventions are rarely standardized.
The system needs to handle a wide range of variations.
2. Measurement Ambiguity
Dimension labels are not always placed directly next to objects.
They may refer to multiple cabinets or entire cabinet groups.
Resolving these relationships requires contextual reasoning.
3. Legacy Drawings
Older scanned drawings introduce additional problems:
- blurred lines
- noisy backgrounds
- overlapping annotations
These defects significantly reduce detection accuracy.
Current System Status
The system is currently an MVP under active development.
Performance is strong for:
- clean digital drawings
- modern architectural layouts
Edge cases remain for:
- scanned plans
- dense dimension annotations
- complex sheet compositions
Even with these limitations, the system already provides meaningful time savings.
Architects can start with a generated layout and correct it instead of creating it from scratch.
Final Thoughts
Projects like this highlight an important lesson about AI systems.
The hardest part usually isn’t the model.
It’s designing the workflow around it.
Solving this problem required combining:
- document processing
- computer vision
- LLM interpretation
- geometry generation
- CAD integration
Individually, these technologies are powerful.
Together, they create a system capable of automating a real engineering workflow.
