Turning a Single Image into a Production-Ready 3D Model for $0.01
Last week we shipped something our users have been asking for: image-to-3D generation with PBR textures on PixelAPI. Here's the engineering breakdown of how we got there, what we learned, and why we priced it at just $0.01 per model.
The Problem
Most image-to-3D APIs fall into two camps:
- Enterprise-only: Luma AI, CSM.ai — $0.10 to $0.50 per model, API access requires sales calls
- Subscription-locked: Meshy ($20/mo), Tripo ($12-140/mo) — you pay monthly whether you generate 1 model or 1000
Developers building e-commerce tools, game asset pipelines, or AR/VR apps need pay-per-use pricing with no commitments. And they need good quality — untextured meshes aren't useful for production.
Choosing the Right Model
We evaluated three open-source models:
| Model | Shape Quality (ULIP-T) | Textures | VRAM | License | Auth Required |
|---|---|---|---|---|---|
| TRELLIS (Microsoft) | 0.0769 | Basic | ~20GB | MIT | Yes (gated HF) |
| TripoSR | 0.0767 | Basic | ~8GB | MIT | No |
| Hunyuan3D 2.1 | 0.0774 | PBR | ~29GB | Apache 2.0 | No |
Hunyuan3D 2.1 won on the metrics that matter most for production use: best shape quality, full PBR texture support (albedo + normal + roughness maps), and model weights that download without authentication.
The tradeoff: it needs ~29GB VRAM, which means an RTX 6000 Ada (48GB). Our RTX 4070s (16GB) can't run it. We assigned our LLM3 machine (RTX 6000 Ada) to be the 3D worker.
The Architecture
User Upload
→ POST /v1/3d/generate (Gateway)
→ Image saved to storage
→ Job pushed to Redis queue (pixelapi:3d:jobs)
→ Worker picks up job
→ Shape generation (~45s)
→ PBR texture painting (~45s)
→ GLB uploaded to CDN
→ Result returned via API
Key decisions:
Standalone worker, not integrated: Hunyuan3D uses ~29GB VRAM continuously when loaded. Mixing it with our image generation workers (which use 12-16GB) would cause constant OOM kills. The 3D worker runs as a separate systemd service.
Blocking request over WebSockets: Generation takes ~90 seconds in total. The client makes one synchronous request and the endpoint blocks until the job completes, rather than opening a WebSocket. This keeps the architecture simple and works with every HTTP client.
Redis queue: Same pattern as our image generation — jobs in Redis, worker pops and processes. Allows easy horizontal scaling if we add more GPU machines.
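The pop-and-process pattern described above can be sketched roughly as follows. This is a minimal illustration, not our actual worker: the job payload shape, `process_job`, and `InMemoryQueue` are hypothetical stand-ins, and the only thing assumed about the Redis client is a redis-py style `blpop()` that returns a `(key, value)` pair or `None` on timeout.

```python
import json

QUEUE_KEY = "pixelapi:3d:jobs"  # queue name from the architecture above


class InMemoryQueue:
    """Stand-in for a Redis client, exposing only a redis-py style blpop()."""

    def __init__(self, payloads):
        self._items = list(payloads)

    def blpop(self, key, timeout=0):
        # Real Redis blocks up to `timeout` seconds; this stub returns immediately.
        return (key, self._items.pop(0)) if self._items else None


def process_job(job):
    # Hypothetical placeholder for shape generation, texture painting, and upload.
    return {"job_id": job["job_id"], "status": "completed"}


def run_worker(rdb, max_jobs):
    """Pop up to `max_jobs` jobs from the queue and process them in order."""
    results = []
    while len(results) < max_jobs:
        item = rdb.blpop(QUEUE_KEY, timeout=5)
        if item is None:  # nothing arrived before the timeout
            break
        _key, raw = item
        results.append(process_job(json.loads(raw)))
    return results


queue = InMemoryQueue([json.dumps({"job_id": "a"}), json.dumps({"job_id": "b"})])
results = run_worker(queue, max_jobs=5)
```

A long-running worker would loop indefinitely instead of stopping at `max_jobs`. Because each blocking pop hands a job to exactly one consumer, additional GPU machines can run the same loop against the same list.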
The Hard Parts
Building this was not smooth. Here's every bug we hit:
1. The target_reduction bug
Hunyuan3D's mesh simplification uses trimesh.simplify_quadric_decimation(). The code passed target_count (40000) as a positional argument, which Python bound to the first parameter, percent. So trimesh tried to simplify with percent=40000, which is far above 1.0. The fix: pass the value by keyword, face_count=target_count.
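To make the binding problem concrete, here is a small sketch using a stand-in function rather than trimesh itself. The parameter order (percent first, then face_count) follows the description above; the range check and its error message are assumptions added so the failure is visible, and the real library may behave differently.

```python
def simplify_quadric_decimation(percent=None, face_count=None):
    """Stand-in with the parameter order described above; NOT the trimesh function."""
    if percent is not None and not 0.0 < percent <= 1.0:
        # Assumed validation, added for illustration only.
        raise ValueError(f"percent must be in (0, 1], got {percent}")
    return {"percent": percent, "face_count": face_count}


target_count = 40000

# Buggy call: the positional value lands in `percent`, the first parameter.
try:
    simplify_quadric_decimation(target_count)
    positional_failed = False
except ValueError:
    positional_failed = True

# Fixed call: the keyword makes the intent explicit.
fixed = simplify_quadric_decimation(face_count=target_count)
```

Passing numeric arguments by keyword is cheap insurance whenever a function takes several optional numbers with different units.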
2. Missing C++ extensions
Two compiled modules needed building:
- mesh_inpaint_processor.cpp (pybind11): handles vertex inpainting for texture painting
- custom_rasterizer (CUDA): differentiable renderer for multi-view generation
Neither shipped pre-compiled. The compile script hardcoded python rather than python3, and building custom_rasterizer_kernel required LD_LIBRARY_PATH to point to PyTorch's lib directory.
3. The Redis connection issue
Our gateway uses aioredis (async Redis). The 3D endpoint imported rdb from the queue module at load time, but rdb is None until init_redis() runs during app startup. Solution: lazy get_3d_rdb() function that creates its own connection on first use.
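A minimal sketch of the lazy-getter idea, assuming a synchronous client factory. `make_client` is a hypothetical placeholder for whatever creates the async Redis client, and the real `get_3d_rdb()` may differ in detail.

```python
_rdb_3d = None  # stays None at import time; filled in on first use


def make_client():
    # Hypothetical factory: a real gateway would build its Redis client here.
    return object()


def get_3d_rdb(factory=make_client):
    """Return the shared client, creating it the first time it is requested."""
    global _rdb_3d
    if _rdb_3d is None:
        _rdb_3d = factory()
    return _rdb_3d


first = get_3d_rdb()
second = get_3d_rdb()
```

The key difference from importing `rdb` directly is timing: an import copies whatever the name holds when the module loads (here, None), while a getter looks the value up at request time, after startup has finished.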
4. The bpy (Blender) trap
Hunyuan3D imports bpy (Blender's Python module) in its mesh utilities. Ubuntu's blender package doesn't expose bpy as a Python module — you'd need to build Blender from source or use the standalone bpy pip package (which doesn't exist for Python 3.10). We made bpy import optional with a mock module, then fixed the actual code paths to not need it.
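The optional-import step can be sketched like this. It is an illustration of the general technique rather than our exact patch: if the real module is missing, an empty placeholder module is registered under the same name so later `import bpy` statements succeed.

```python
import sys
import types

try:
    import bpy  # real Blender module, if it happens to be installed
except ImportError:
    # Register an empty placeholder so downstream `import bpy` does not fail.
    bpy = types.ModuleType("bpy")
    sys.modules["bpy"] = bpy

# Any later import now resolves to the same object, real or placeholder.
import bpy as bpy_again
```

An empty placeholder only gets you past the import. Any attribute access on it still raises AttributeError, which is why the code paths that actually called into bpy had to be changed as well.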
Pricing Math
Our rule: 2x cheaper than the cheapest mainstream competitor.
- Tripo3D Pro: ~$0.0066/model ($19.90/mo for 3000 credits)
- Meshy Pro: ~$0.02/model ($20/mo for 1000 credits)
- PixelAPI: $0.01/model (10 credits)
We miss the 2x rule against Tripo3D: at $0.01 we are about 1.5x more expensive than their effective per-model price, which we consider a loss-leader subscription rate. We are still 2x cheaper than Meshy and 10-50x cheaper than Luma/enterprise options.
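The per-model figures come from dividing each plan's monthly price by its included credits, using the numbers listed above. A short sketch of that arithmetic, where a ratio above 1 means the competitor costs more per model than PixelAPI:

```python
PIXELAPI_USD = 0.01  # price per model

# (monthly price in USD, credits per month), as listed above
plans = {
    "Tripo3D Pro": (19.90, 3000),
    "Meshy Pro": (20.00, 1000),
}

per_model = {name: price / credits for name, (price, credits) in plans.items()}
ratio_vs_pixelapi = {name: cost / PIXELAPI_USD for name, cost in per_model.items()}
```

These are effective prices only for subscribers who use every credit. Anyone generating fewer models per month pays more per model on a subscription, which is the case pay-per-use pricing targets.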
Cost per model for us:
- GPU time: ~90s on RTX 6000 Ada → ~$0.001 electricity
- Storage: ~4-22MB GLB per model → negligible
- Bandwidth: ~5-25MB download → ~$0.0002
We're profitable on day one, even at $0.01.
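As a rough sanity check on the electricity figure, here is the arithmetic under two loudly labeled assumptions that are not from our billing data: an average draw of 300 W during generation and a rate of $0.12 per kWh. Plug in your own numbers.

```python
GPU_SECONDS = 90             # total generation time, from above
ASSUMED_POWER_KW = 0.30      # ASSUMPTION: average GPU draw of 300 W
ASSUMED_USD_PER_KWH = 0.12   # ASSUMPTION: electricity rate
BANDWIDTH_USD = 0.0002       # bandwidth estimate, from above
PRICE_USD = 0.01             # price per model

energy_kwh = ASSUMED_POWER_KW * GPU_SECONDS / 3600
electricity_usd = energy_kwh * ASSUMED_USD_PER_KWH
marginal_cost_usd = electricity_usd + BANDWIDTH_USD
margin_usd = PRICE_USD - marginal_cost_usd
```

Under these assumptions the electricity comes to roughly $0.0009 per model, in line with the ~$0.001 estimate above.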
What's Next
- GPU priority scheduler: Currently 3D shares LLM3 with video generation and our Mushika rendering service. We need intelligent queue management that preempts lower-priority work when revenue jobs arrive.
- Multi-model support: TripoSR for fast/cheap models (~10s), Hunyuan3D for quality.
- 3D model marketplace: Let users sell generated 3D assets.
Try It
curl -X POST https://api.pixelapi.dev/v1/3d/generate \
-H "Authorization: Bearer YOUR_KEY" \
-F "image=@product.jpg" \
-F "format=glb"
# Returns: {"status":"completed","output_url":"...glb","generation_time":88.5}
Sign up at pixelapi.dev — 100 free credits to start. No credit card.
If you're building anything with 3D APIs, I'd love to hear about it. Find me on X/Twitter or Discord.