DEV Community

abigail armijo

What I Learned Exploring AI-Generated 3D: A Hands-On Tour of Meshy, Tripo, and Three.js

Hi! I will share my experience building a POC to generate 2D and 3D images for a friend's project.

A friend asked me to attend a meeting at his house to define the user story for a project he is planning. This was my first in-person meeting since COVID.

The general idea is that, during the registration process:

  • The user answers some predefined questions and generates a 2D dragon image with a predefined branding style.
  • The image is generated through an AI image-generation API (it could be Nano Banana, GPT, or something else).
  • The user generates a 3D version of the dragon with Tripo AI and gets the GLB file needed for Three.js.
  • Some predefined animations are added to the dragon in Three.js.
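As a rough illustration of the first step, the registration answers could be mapped to a prompt that always carries the same branding style. This is only a sketch: the question names (`color`, `mood`, `element`), the `BRAND_STYLE` text, and the `buildDragonPrompt` helper are assumptions made up for the example, not part of the project.

```javascript
// Hypothetical branding style shared by every generated dragon.
const BRAND_STYLE = 'flat vector illustration, bold outlines, teal and orange palette';

// Turn the (hypothetical) registration answers into one image prompt.
function buildDragonPrompt(answers, brandStyle = BRAND_STYLE) {
  const traits = [
    answers.color && `${answers.color} scales`,
    answers.mood && `${answers.mood} expression`,
    answers.element && `${answers.element} breath`,
  ].filter(Boolean); // skip questions the user did not answer

  return [
    'A single dragon, full body, centered, plain background',
    ...traits,
    `Style: ${brandStyle}`,
  ].join('. ') + '.';
}

const dragonPrompt = buildDragonPrompt({ color: 'green', mood: 'friendly' });
console.log(dragonPrompt);
```

Keeping the style in one constant means every user gets a different dragon but the same look, which is the point of the predefined branding.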

The initial step was researching the available APIs and their pricing.

Image generation

First, I tried generating images, and there are several options. If you want to try them for free and compare the same prompt across different AI models, you can use this side-by-side tool: https://arena.ai/image/side-by-side.

I experimented with my logo, which was created some years ago. To improve the style, I wanted to remove the background, but the generated images either added white and gray boxes instead of real PNG transparency or moved the wrong puzzle piece.

Logo with a bad transparent PNG

Logo with the puzzle pieces moved incorrectly

You can generate an image from a text prompt or upload an image. These are my experiments with Gemini.

Starting from one of my photos, I requested:

  • Remove the background
  • Change the color of my jacket to red
  • Add a smile because I am always serious in photos
  • Add a soccer stadium background
  • Restyle the image as a Panini World Cup sticker.

Photo as Panini sticker

I also created a banner for my LinkedIn. It took some experiments to get everything in place: no text overlapping, nothing hidden behind the profile picture, and the logo in the correct position.

LinkedIn banner

Another interesting experiment was creating an infographic to explain accessibility testing.

Accessibility infographic

Gemini can also generate songs and videos. Here is the post from my LinkedIn:

Continuing my experiments with images, I explored the music option. Starting from the previous image, I asked to switch to the Mexican soccer jersey and requested some music related to the soccer World Cup.

Sometimes, even though you describe the A pose, the model returns a T pose or overlaps elements in the image. Some tips that helped me:

  • Add context/role: for example, "You are a graphic designer with experience in designing new characters."
  • Small and specific tasks: don't try to modify an image with all the instructions at the same time. For my Panini sticker, I went step by step: remove the background, change the jacket color, add a smile, change the background to a World Cup stadium, and apply the Panini style. I also shared a reference image of the latest Mexican jersey.
  • Add some quality criteria and a checklist.

I got better results with GPT-Image-2.
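The three tips can be combined in a small helper that produces one prompt per edit. This is a sketch under assumptions: `buildEditPrompts` is a hypothetical helper, and the checklist items are examples I made up, not the criteria used in the experiments.

```javascript
// Build one prompt per edit so the model only has to change one thing at a time.
function buildEditPrompts({ role, edits, checklist }) {
  const criteria = checklist.map((item) => `- ${item}`).join('\n');
  return edits.map((edit, i) =>
    [
      `You are ${role}.`,                          // tip 1: context/role
      `Step ${i + 1} of ${edits.length}: ${edit}.`, // tip 2: one small task
      'Keep everything else in the image unchanged.',
      'Before answering, check:',                  // tip 3: quality checklist
      criteria,
    ].join('\n')
  );
}

const prompts = buildEditPrompts({
  role: 'a graphic designer with experience in designing new characters',
  edits: [
    'remove the background',
    'change the jacket color to red',
    'add a smile',
    'add a soccer stadium background',
    'restyle the image as a Panini World Cup sticker',
  ],
  // Example criteria only; adapt them to your own image.
  checklist: ['only one person in the image', 'no overlapping text'],
});

console.log(prompts.length); // 5
console.log(prompts[0]);
```

Each prompt would be sent together with the image returned by the previous step, so a bad result only costs one step instead of the whole chain.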

3D Images

I asked Claude to suggest some options for generating the 3D images. Besides Tripo, another alternative is Meshy.

It's also important to understand a few key concepts to get better results with 3D images and animations. In this respect, a dragon is easier than a person.

  • A-pose: the standard initial posture in which most characters are modeled: standing, facing forward, with the arms extended downward at roughly a 45° angle to the torso (hence the "A" shape). This position is not random: it makes it easier to place the internal skeleton later, and it prevents the mesh from deforming oddly in delicate areas, such as the shoulders or armpits, when the character starts to move.
  • Mesh: the visible surface of the model, a kind of "digital skin" made up of thousands of small polygons (usually triangles or quads) joined together by edges and vertices. The more polygons it has, the more detailed and smooth the character looks, but the heavier it is to process.
  • Rigging: placing a virtual skeleton made of bones and joints inside the model. Although the viewer never sees it, this skeleton is what allows the character to be animated: when a bone moves (for example, the arm bone), the mesh surrounding it deforms in response, just as human skin follows the real bone.

3D Key concepts
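To make the mesh and rigging ideas concrete, here is a deliberately tiny sketch: a mesh with four vertices and two triangles, and a single "bone" that only translates. Real rigs use rotations and per-bone matrices, so treat the `applyBoneOffset` helper and the weights as a simplified assumption, not how Tripo or Meshy implement rigging.

```javascript
// A mesh: vertices (points) plus faces (triangles that index into the vertices).
const mesh = {
  vertices: [
    [0, 0, 0], // 0: shoulder
    [1, 0, 0], // 1: elbow
    [2, 0, 0], // 2: hand
    [1, 1, 0], // 3: upper-arm surface
  ],
  faces: [
    [0, 1, 3],
    [1, 2, 3],
  ],
};

// A rig reduced to one bone: each vertex has a weight (0..1)
// that says how strongly it follows that bone.
const armBoneWeights = [0, 0.5, 1, 0.5];

// Move the bone by `offset`; each vertex follows in proportion to its weight.
function applyBoneOffset(vertices, weights, offset) {
  return vertices.map(([x, y, z], i) => [
    x + weights[i] * offset[0],
    y + weights[i] * offset[1],
    z + weights[i] * offset[2],
  ]);
}

const posed = applyBoneOffset(mesh.vertices, armBoneWeights, [0, 2, 0]);
console.log(mesh.faces.length); // 2 polygons
console.log(posed[2]); // [ 2, 2, 0 ] the hand follows the bone fully
console.log(posed[0]); // [ 0, 0, 0 ] the shoulder stays put
```

The weights are why the starting pose matters: if the arms touch the torso in the source image, vertices that should follow the arm bone can end up attached to the body.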

Both APIs are very similar in price: around 30 to 40 tokens per 3D model, whether it is generated from text or from an image. The resulting model is untextured, and you call another API to generate the texture or the rigging.

Both also include preset animations, but they are mostly made for people. I couldn't find a flying animation, but with Claude I was able to create one after a few attempts.

You can download the models and export them to Mixamo or to other software such as Blender or Unity to use them in your games.
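For budgeting, the arithmetic is simple enough to script. In this sketch only the 30 to 40 range comes from my tests; `estimateCredits` is a hypothetical helper, and the texture and rigging costs default to 0 because they are placeholders you would need to fill in from each provider's pricing page.

```javascript
// Rough credit estimate for a batch of generated models.
// perTexture and perRig are placeholders (unknown here), so they default to 0.
function estimateCredits({ models, perModel, perTexture = 0, perRig = 0 }) {
  return models * (perModel + perTexture + perRig);
}

const low = estimateCredits({ models: 100, perModel: 30 });
const high = estimateCredits({ models: 100, perModel: 40 });
console.log(`${low} to ${high} credits for 100 untextured models`);
// prints "3000 to 4000 credits for 100 untextured models"
```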
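A flying animation can be approximated procedurally with a sine wave. The sketch below is not the animation Claude produced; it is a minimal example that assumes the dragon has separately addressable wings (for instance, wing bones from the rig), and the names `leftWing`, `rightWing`, `dragon`, and `baseY` in the comments are hypothetical.

```javascript
// Wing rotation (radians) at time t (seconds): a sine wave between -maxAngle and +maxAngle.
function wingAngle(t, flapsPerSecond = 2, maxAngle = Math.PI / 4) {
  return maxAngle * Math.sin(2 * Math.PI * flapsPerSecond * t);
}

// Gentle vertical bobbing of the body, at the same rhythm as the flaps.
function bodyHeight(t, flapsPerSecond = 2, amplitude = 0.1) {
  return amplitude * Math.sin(2 * Math.PI * flapsPerSecond * t - Math.PI / 2);
}

// In a Three.js render loop these values could drive the (hypothetical) objects:
//   leftWing.rotation.z = wingAngle(t);
//   rightWing.rotation.z = -wingAngle(t);
//   dragon.position.y = baseY + bodyHeight(t);
console.log(wingAngle(0)); // 0
console.log(wingAngle(0.125).toFixed(3)); // 0.785, a quarter of a flap: wings at their highest
```

Because the motion is a pure function of time, it can be tuned (speed, amplitude) without regenerating or re-rigging the model.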

You can also send the model to 3D printing services.

Tripo

The API and the web designer use separate credits. I tested with myself as the model and a dancing animation, but you need the A pose; otherwise your jacket looks like one solid piece.

Dancing

I tried a dragon and a girl. The animations were bad, but as a tester, I was curious.

Dragon and girl

If you set an A pose, you get better results, but my logo wasn't added correctly after using the auto rig.

Ninja kick animation

I tried to export the model and import it into Mixamo, but I got an error while exporting. With the other model, my arms were wrong.

Import image from Tripo to Mixamo

For some images, the 3D model is incorrect, and most of the auto textures are metallic.

2D Image

3D model

When I generated models from images showing only my face, I forgot to mention that I needed jeans, and some of the generated images had no pants.

Also, sometimes it didn't follow the A pose and generated a T pose or multiple images inside the image.

Image from image

The support is not user-friendly, and a solution can take 7 business days. You need to provide your email and model ID. I requested more clarification about the issue with the image, but the answer was only that a single person is required and that the A pose is mandatory.

This was my best result with some animations. It works better on a laptop than on mobile:
https://studio.tripo3d.ai/3d-model/c449581f-5da7-4bf9-8be0-1d1c2b6f3e6b?invite_code=CTeDxH

Meshy

Meshy is easier to use, and I think its rig option is better because you can set the parts needed for rigging. The credits are shared between the API and the web designer.

Here I tried an image of myself as a QA:

QA image

I also tried a car, but my request to change its paint to matte instead of metallic had no clear effect.

Sports car

I tried the manual rig with a Lego version of myself, but I couldn't remove the knees.

Rig steps

Lego animation

Lego animation

The import from Meshy to Mixamo went smoothly, with no errors.

I asked support how to remove the knees for Lego animations, and the request was added to their feature list in case it becomes more popular. At least the answer was friendlier and faster, and I didn't need to send my username or model ID.

This is one example with Meshy:
https://www.meshy.ai/s/EQNVGv

Mixamo

This is an Adobe product that lets you upload your models and add predefined animations. I like it because its animations are better than the AI-generated ones.

This is the auto-rigger shown on import.

Mixamo Auto-rigger

The animation is better: the model has a cape, and it was animated correctly.

Ninja dancing video

Claude POC demo

As an experiment, I built a Svelte app that generates 2D and 3D images of the dragons and the ninja, themed around Abi's testing dojo, my website for practicing testing.

I got some errors with the animations, maybe because Three.js animations are not very common. I also got some errors from the API. For Meshy, I needed to prepare the model for rigging: if you don't limit the face count, you end up with a model that exceeds the Rigging API's maximum. Their API documentation states the limit, and an API parameter lets you cap the number of faces:

When using input_task_id, models with more than 300,000 faces are not supported for rigging. Please use the Remesh API to reduce the face count before rigging.

The solution was to add should_remesh: true and target_polycount: 150000 directly to the API request:

// MESHY_BASE, headers() and imageUrl are defined elsewhere in the app.
const res = await fetch(`${MESHY_BASE}/openapi/v1/image-to-3d`, {
  method: 'POST',
  headers: headers(),
  body: JSON.stringify({
    image_url: imageUrl,
    model_type: 'standard',
    should_texture: true,
    // Remesh during generation so the model stays under the 300,000-face rigging limit.
    should_remesh: true,
    target_polycount: 150000
  })
})
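To avoid hitting the limit again, the request body can be built by a small function that rejects a polycount the Rigging API would refuse. This is a sketch: `buildImageTo3dBody` and `RIGGING_MAX_FACES` are hypothetical names, the example URL is a placeholder, and the fields are only the ones used in the request above.

```javascript
// Face limit quoted from the Meshy rigging documentation.
const RIGGING_MAX_FACES = 300000;

// Build the image-to-3d request body, refusing a polycount that rigging would reject.
function buildImageTo3dBody(imageUrl, targetPolycount = 150000) {
  if (!Number.isInteger(targetPolycount) || targetPolycount <= 0) {
    throw new RangeError('targetPolycount must be a positive integer');
  }
  if (targetPolycount > RIGGING_MAX_FACES) {
    throw new RangeError(`targetPolycount must be at most ${RIGGING_MAX_FACES} for rigging`);
  }
  return {
    image_url: imageUrl,
    model_type: 'standard',
    should_texture: true,
    should_remesh: true,
    target_polycount: targetPolycount,
  };
}

// Placeholder URL, for illustration only.
const body = buildImageTo3dBody('https://example.com/dragon.png');
console.log(JSON.stringify(body));
```

The result can be passed to `JSON.stringify` as the `body` of the fetch call, so the check runs before any credits are spent.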

Custom animation created by Claude

You can try it for a limited time, because I only bought a few credits: https://ninja-generator-910246220092.us-central1.run.app/

One thing I noticed is that most of the examples of 2D and 3D images were of white people, and some ninjas had the cap with holes. There may be some bias there.