Hi! I will share my experience building a POC that generates 2D and 3D images for a friend's project.
A friend asked me to attend a meeting at his house to define the user story for a project he is planning. This was my first in-person meeting since COVID.
The general idea is that, during the registration process:
- The user answers some predefined questions and generates a 2D dragon image with a predefined branding style.
- The image is generated with an AI image generation API (could be Nano Banana, GPT, or something else).
- A 3D version of the dragon is generated with Tripo AI to get the GLB file needed for Three.js.
- Some predefined animations are added to the dragon in Three.js.
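To make the last two steps more concrete, here is a minimal sketch of loading a GLB file in Three.js and playing one of the animation clips stored in it. It is not the code from my POC: `pickClip` and `loadAnimatedModel` are hypothetical helper names, and the sketch assumes the `three` package is installed and the code runs in a browser bundle.

```javascript
// Sketch only: assumes the `three` package is available at runtime.
// `pickClip` and `loadAnimatedModel` are hypothetical helper names.

// Pick a clip by name, falling back to the first clip in the file (or null if there are none).
function pickClip(animations, name) {
  return animations.find((clip) => clip.name === name) ?? animations[0] ?? null;
}

async function loadAnimatedModel(url, scene, clipName) {
  // Dynamic imports so that three is only resolved when this function is called.
  const THREE = await import('three');
  const { GLTFLoader } = await import('three/addons/loaders/GLTFLoader.js');

  // Load the GLB and add its root object to the scene.
  const gltf = await new GLTFLoader().loadAsync(url);
  scene.add(gltf.scene);

  // The mixer plays the clips that were exported with the model.
  const mixer = new THREE.AnimationMixer(gltf.scene);
  const clip = pickClip(gltf.animations, clipName);
  if (clip) mixer.clipAction(clip).play();

  // The caller advances the animation each frame with mixer.update(deltaSeconds).
  return { model: gltf.scene, mixer };
}
```

In the render loop you would call `mixer.update(deltaSeconds)` once per frame, where `deltaSeconds` is the time since the previous frame.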
The initial step was researching the available APIs and their pricing.
Image generation
First, I tried generating images, and there are several options. If you want to try them for free and compare the same prompt across different AI models, you can use this side-by-side tool: https://arena.ai/image/side-by-side.
I experimented with my logo, which was created some years ago. To improve the style, I wanted to remove the background, but the generated images had white and gray boxes painted in instead of actual PNG transparency, or they moved the wrong puzzle piece.
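A quick way to catch the fake checkerboard is to check whether the file even declares an alpha channel. The sketch below reads the color type byte from the PNG header (IHDR chunk). `pngHasAlphaChannel` is a hypothetical helper name, and the check is only a first filter: palette PNGs can carry transparency in a separate chunk, and a file with an alpha channel can still be fully opaque.

```javascript
// Sketch: check whether PNG bytes declare an alpha channel (hypothetical helper name).
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function pngHasAlphaChannel(bytes) {
  // 8 signature bytes + 8 bytes of IHDR length/type + 10 bytes up to the color type.
  if (bytes.length < 26) return false;
  const isPng = PNG_SIGNATURE.every((value, i) => bytes[i] === value);
  if (!isPng) return false;
  // Color type 4 is grayscale with alpha, color type 6 is RGB with alpha.
  const colorType = bytes[25];
  return colorType === 4 || colorType === 6;
}
```

In Node, you could pass it the result of `fs.readFileSync('logo.png')`, since a Buffer can be indexed like a byte array. If it returns false, the "transparent" squares are most likely painted pixels.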
You can generate an image from a text prompt or upload an image to edit. These are my experiments with Gemini.
From one of my photos I requested:
- Remove the background
- Change the color of my jacket to red
- Add a smile because I am always serious in photos
- Add a soccer stadium background
- Restyle the image as a Panini World Cup sticker.
I also created a banner for my LinkedIn. It took some experiments to get everything in place: no text or image overlapping the profile picture, and the logo in the correct position.
Another interesting experiment was generating an infographic to explain accessibility testing.
Gemini can also generate songs and videos; here is the link from my LinkedIn.
Sometimes, even though you describe the A-pose, the model returns a T-pose or overlapping images. Some tips that add context helped, for example:
- Add context/role: for example, "You are a graphic designer with experience in designing new characters."
- Small and specific tasks: don't try to modify an image with all the instructions at once. For my Panini sticker, I worked in steps: remove the background, change the jacket color, add a smile, change the background to a World Cup stadium, and apply the Panini style. I also shared a current image of the latest Mexican jersey.
- Add some quality criteria and a checklist.
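The tips above can be sketched as a small prompt builder that produces one focused prompt per editing step. This is only an illustration of the idea: `buildStepPrompt` is a hypothetical helper, and the checklist items are examples I made up for the sketch, not a recommended list.

```javascript
// Sketch: compose one small, focused prompt per editing step (all names are hypothetical).
function buildStepPrompt({ role, task, checklist = [] }) {
  const lines = [`Role: ${role}`, `Task: ${task}`];
  if (checklist.length > 0) {
    lines.push('Quality checklist:');
    for (const item of checklist) lines.push(`- ${item}`);
  }
  return lines.join('\n');
}

const role = 'You are a graphic designer with experience in designing new characters.';

// One instruction per request instead of one long prompt.
const steps = [
  'Remove the background.',
  'Change the jacket color to red.',
  'Add a smile.',
  'Add a soccer stadium background.',
  'Restyle the image as a Panini World Cup sticker.',
];

// Example criteria only; adapt them to your own image.
const checklist = ['Single subject, no duplicated figures', 'Face stays recognizable'];

const prompts = steps.map((task) => buildStepPrompt({ role, task, checklist }));
```

Each prompt is sent together with the image produced by the previous step, so the model only has to get one change right at a time.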
I got better results with GPT-Image-2.
3D Images
I asked Claude to suggest some options for the 3D images. Another alternative to Tripo is Meshy.
It is also important to understand a few key concepts to get better results with 3D images and animations. This is easier with a dragon than with a person.
- A-pose: the standard initial posture in which most characters are modeled: standing, facing forward, with the arms extended downward forming roughly a 45° angle with the torso (hence the "A" shape). This position is not random: it makes it easier to place the internal skeleton later, and it prevents the mesh from deforming oddly in delicate areas, such as the shoulders or armpits, when the character starts to move.
- Mesh: the visible surface of the model, a kind of "digital skin" made up of thousands of small polygons (usually triangles or quads) joined together by edges and vertices. The more polygons it has, the more detailed and smooth the character will look, but it will also be heavier to process.
- Rigging: placing a virtual skeleton made of bones and joints inside the model. Although the viewer will never see it, this skeleton is what allows the character to be animated: when a bone is moved (for example, the arm bone), the mesh surrounding it deforms in response to that movement, just as human skin follows the real bone.
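To see how "heavy" a mesh is, you can count its triangles. The sketch below works on objects shaped like a Three.js BufferGeometry (an optional `index` and a `position` attribute, each with a `count`), and it assumes the geometry is drawn as plain triangles. `countTriangles` and `totalTriangles` are hypothetical helper names.

```javascript
// Sketch: count triangles in a geometry shaped like a Three.js BufferGeometry.
function countTriangles(geometry) {
  // Indexed geometry: every 3 indices form one triangle.
  if (geometry.index) return Math.floor(geometry.index.count / 3);
  // Non-indexed geometry: every 3 vertices form one triangle.
  return Math.floor(geometry.attributes.position.count / 3);
}

// Sum the triangles over a list of geometries, e.g. all meshes of one model.
function totalTriangles(geometries) {
  return geometries.reduce((sum, geometry) => sum + countTriangles(geometry), 0);
}
```

With a loaded model, you could collect the geometries with `model.traverse((o) => { if (o.isMesh) geometries.push(o.geometry); })` and pass the list to `totalTriangles`.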
Both APIs are very similar in price: around 30 to 40 tokens per 3D image, whether it is generated from text or from an image. The model comes without a texture, and you can call another API to generate the texture or the rigging.
Both include animations, but they are designed more for people. I couldn't find a flying animation, but with Claude I was able to create one after a few attempts.
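This is not the animation Claude generated for me, but one common way to get a simple flying motion is to drive the wing bones with a sine wave. In this sketch, `wingAngle` and `applyFlap` are hypothetical helpers, the speed and angle values are arbitrary, and the rotation axis and bone names depend on how your model was rigged.

```javascript
// Sketch: a procedural wing flap driven by a sine wave (names and values are illustrative).
function wingAngle(timeSeconds, { flapsPerSecond = 1.5, maxAngle = Math.PI / 4 } = {}) {
  // One full sine cycle per flap, scaled to the maximum rotation in radians.
  return Math.sin(timeSeconds * flapsPerSecond * 2 * Math.PI) * maxAngle;
}

// `leftWing` and `rightWing` are assumed to be bone objects with a `rotation` property,
// as Three.js bones have. The right wing mirrors the left one.
function applyFlap(leftWing, rightWing, timeSeconds, options) {
  const angle = wingAngle(timeSeconds, options);
  leftWing.rotation.z = angle;
  rightWing.rotation.z = -angle;
}
```

In Three.js you could look up the bones with `model.getObjectByName(...)`, using whatever names your rig has, and call `applyFlap` on every frame with the elapsed time in seconds.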
You can download the models and import them into Mixamo or other software like Blender or Unity to use in your games.
You can also send the model to 3D printing services.
Tripo
The API and the web designer use separate credits. I tested with myself as the model and a dancing animation, but you need the A-pose, or else your jacket looks like one piece.
I tried a dragon and a girl, and the animations were bad, but as a tester, I was curious.
If you set an A-pose, you get better results, but my logo wasn't applied correctly after using the auto rig.
I tried to export the model and import it into Mixamo, but I got an error while exporting. With the other model, my arms were wrong.
For some images, the 3D model is incorrect, and most of the auto textures are metallic.
When I generated from images of only my face, I forgot to mention that I needed jeans, and some of the generated images came without pants.
Also, sometimes it didn't follow the A-pose and generated a T-pose or multiple figures inside the image.
The support is not user-friendly, and a solution can take 7 business days. You need to provide your email and model ID. I asked for more clarification about the issue with the image, but the answer was only that a single person is required and that the A-pose is mandatory.
This was my best result with some animations. It works better on laptops than on mobile:
https://studio.tripo3d.ai/3d-model/c449581f-5da7-4bf9-8be0-1d1c2b6f3e6b?invite_code=CTeDxH
Meshy
It's easier to use, and I think the rig option is better because you can mark the parts needed for rigging. The credits are shared between the API and the web designer.
Here I tried an image of myself as a QA.
I also tried a car, but the change of the car's color from metallic to matte wasn't clear.
With the manual rig, I added myself as a Lego figure, but I couldn't remove the knees.
Lego animation
The import from Meshy to Mixamo went smoothly, with no errors.
I asked support to remove the knees for Lego animations, and the request was added to the list of features to build if it becomes more popular. At least the answers were friendlier and faster, and I didn't need to provide my username or model ID.
This is one example with Meshy:
https://www.meshy.ai/s/EQNVGv
Mixamo
This is an Adobe product that lets you upload your models and add predefined animations. I like it because the animations are better than the AI-generated ones.
This is the import auto rigger.
The animation is better because the model has a cape and was animated correctly.
Claude POC demo
As an experiment, I built a Svelte app that generates 2D and 3D images of the dragons and the ninja, related to Abi's testing dojo, my website for practicing testing.
I got some errors with the animations; maybe Three.js animations are not very common. I also got some errors from the API. For Meshy, I needed to prepare the model for rigging, because if you don't limit the number of faces, you end up with a model that exceeds the Rigging API's maximum. This is in their API documentation, along with the API parameter that limits the faces:
When using input_task_id, models with more than 300,000 faces are not supported for rigging. Please use the Remesh API to reduce the face count before rigging.
The solution was to add should_remesh: true and target_polycount: 150000 directly to the API request:
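// MESHY_BASE and headers() are assumed to be defined elsewhere in the app.
const res = await fetch(`${MESHY_BASE}/openapi/v1/image-to-3d`, {
  method: 'POST',
  headers: headers(),
  body: JSON.stringify({
    image_url: imageUrl,
    model_type: 'standard',
    should_texture: true,
    // Remesh so the model stays under the 300,000-face rigging limit.
    should_remesh: true,
    target_polycount: 150000
  })
})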
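The 3D generation runs as a task, so after creating it you wait until it finishes before asking for the rigging. Below is a minimal polling sketch under some assumptions: `waitForTask` and `getTask` are hypothetical names, `getTask` stands for whatever request you use to read the task created above, and the status strings are my assumption of what the API returns, so check them against the API documentation.

```javascript
// Sketch: poll an asynchronous task until it finishes (all names are hypothetical).
// `getTask` is any async function that resolves to an object with a `status` field.
// The status strings below are assumptions; verify them in the API documentation.
const DONE_STATUSES = new Set(['SUCCEEDED']);
const FAILED_STATUSES = new Set(['FAILED', 'CANCELED']);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForTask(getTask, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const task = await getTask();
    if (DONE_STATUSES.has(task.status)) return task;
    if (FAILED_STATUSES.has(task.status)) {
      throw new Error(`Task ended with status ${task.status}`);
    }
    // Still pending or in progress: wait before asking again.
    await sleep(intervalMs);
  }
  throw new Error(`Task did not finish after ${maxAttempts} attempts`);
}
```

Passing `getTask` as a function keeps the polling logic independent of the HTTP details, and the attempt limit keeps a stuck task from spending time (and patience) forever.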
You can try it for a limited time, because I only bought a few credits: https://ninja-generator-910246220092.us-central1.run.app/
One thing I noticed is that most of the 2D and 3D example images were of white people, and some ninjas have the cap with holes. There may be some bias there.



















