AI Engine

Posted on • Originally published at ai-engine.net

Photo to Anime: API vs AnimeGAN v2/v3 (Open Source)

You want to add an anime or cartoon filter to your application. The open-source route is tempting: AnimeGAN has been the go-to model since 2020, with v2 (2021) and v3 (2022) available as pretrained weights on GitHub. It is free, runs locally, and generates decent anime portraits. But once you look at what it actually outputs and what it takes to run in production, the picture changes. This guide compares AnimeGAN v2 and v3 against the cloud Photo to Anime API on the same portrait.

Want to see the results on your own photos? Try the Photo to Anime API on a portrait.

Quick Comparison

| | Photo to Anime API | AnimeGAN v2/v3 |
| --- | --- | --- |
| Styles available | 14 (6 premium + 7 classic + Ghibli) | 5 total (face_paint, celeba_distill, paprika, Hayao, Shinkai) |
| Style variety | Cartoon, Pixar, Arcane, Comic, Caricature, Illustration, Anime, Sketch, Ghibli | Japanese anime only |
| Setup | API key | PyTorch (~2GB) or ONNX Runtime |
| First-run latency | 1-2s (network only) | ~60s (PyTorch import + model download) |
| Steady-state latency | 1-2s per image | 2-3s (v2, CPU) / ~700ms (v3, CPU) |
| Output resolution | Up to 1600x1344 | 512x512 fixed (v2 face_paint) |
| Style intensity control | Yes (0.0-1.0) | No (fixed per model) |
| License | Commercial | Open source |

What AnimeGAN Does

AnimeGAN is a GAN trained to transform photos into anime-style images. Each version ships with a handful of pretrained weights:

AnimeGANv2 (PyTorch):

  • face_paint_512_v2: portrait-focused, clean lines
  • celeba_distill: distilled from CelebA, softer colors
  • paprika: trained on Paprika aesthetic

AnimeGANv3 (ONNX):

  • AnimeGANv3_Hayao_36: Hayao Miyazaki style
  • AnimeGANv3_Shinkai_37: Makoto Shinkai style

All variants produce broadly similar results on the same input: a stylized image with flat, anime-like shading. Every model targets the same domain (Japanese anime).

What the Photo to Anime API Does

The API provides 3 endpoints with 14 total styles:

  • /cartoonize-pro: 6 premium styles (Cartoon Pro, Caricature, Arcane, Comic, Pixar, Illustration)
  • /cartoonize: 7 classic styles (Anime, 3D, Hand-drawn, Sketch, Art Style, Design, Illustration)
  • /ghibli: Studio Ghibli style with face enhancement

The premium styles produce distinct artistic aesthetics: the same portrait yields six different looks, none of which AnimeGAN can reach.
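The endpoint-to-style mapping above can be captured in a small lookup table. Here is a sketch in Python; the lowercase style identifiers are assumptions inferred from the request examples later in this post, not from official documentation:

```python
# Map each endpoint to the styles it serves (style IDs are assumed).
ENDPOINT_STYLES = {
    "/cartoonize-pro": ["cartoon_pro", "caricature", "arcane", "comic", "pixar", "illustration"],
    "/cartoonize": ["anime", "3d", "hand_drawn", "sketch", "art_style", "design", "illustration_classic"],
    "/ghibli": ["ghibli"],
}

def endpoint_for(style: str) -> str:
    """Return the endpoint that serves a given style name."""
    for endpoint, styles in ENDPOINT_STYLES.items():
        if style in styles:
            return endpoint
    raise ValueError(f"unknown style: {style}")
```

A lookup like this keeps routing logic in one place if your application exposes all 14 styles behind a single picker.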

See the side-by-side image comparison, code for all three tools, and the install pain in the complete guide.


The Style Coverage Gap

This is the fundamental difference. AnimeGAN was trained on anime datasets for a single target aesthetic per model. The API supports a broader set of style domains:

  • Japanese anime: both
  • Studio Ghibli: both (API has dedicated endpoint, AnimeGAN has v3 Hayao)
  • Arcane / dark fantasy: API only
  • Pixar / 3D animation: API only
  • Comic book: API only
  • Caricature: API only
  • Sketch / line drawing: API only
  • Style intensity control: API only (0.0 to 1.0)

Running AnimeGAN Locally

Here is how to install and run AnimeGANv2 on your machine. You will need Python 3.8+ and about 2.5GB of free disk space for dependencies.

AnimeGANv2 with PyTorch

import torch
from PIL import Image

# Load the generator and the face2paint helper from torch.hub
# (clones the repo and downloads pretrained weights on first run).
model = torch.hub.load(
    "bryandlee/animegan2-pytorch:main",
    "generator",
    pretrained="face_paint_512_v2",
    trust_repo=True,
)
face2paint = torch.hub.load(
    "bryandlee/animegan2-pytorch:main",
    "face2paint",
    size=512,
    trust_repo=True,
)

img = Image.open("portrait.jpg").convert("RGB")
with torch.no_grad():  # inference only, no gradients needed
    output = face2paint(model, img)
output.save("result.jpg", quality=90)

First run takes about 60 seconds: PyTorch imports, repo clone from GitHub, model weight download, initial inference. Subsequent runs take 2-3 seconds per image on CPU. Output is fixed at 512x512.
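To reproduce the cold-versus-warm numbers yourself, a generic timing helper is enough. This is a sketch: it wraps any callable, so the same function works for the PyTorch, ONNX, and API paths:

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return time.perf_counter() - start, result

# Usage sketch: the first call includes model load; later calls are warm.
# elapsed, styled = time_call(face2paint, model, img)
```

Time at least one throwaway call before recording steady-state numbers, since the first invocation includes one-time setup costs.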

AnimeGANv3 with ONNX Runtime

import numpy as np
import onnxruntime as ort
from PIL import Image

# Download the model from https://github.com/TachibanaYoshino/AnimeGANv3/releases
session = ort.InferenceSession("AnimeGANv3_Hayao_36.onnx")
img = Image.open("portrait.jpg").convert("RGB")

# Resize so the longer side is 512 and both sides are multiples of 8.
w, h = img.size
scale = 512 / max(w, h)
new_w = int((w * scale) // 8) * 8
new_h = int((h * scale) // 8) * 8
img_resized = img.resize((new_w, new_h), Image.LANCZOS)

# Normalize pixel values to [-1, 1] and add a batch dimension (NHWC).
arr = np.array(img_resized).astype(np.float32) / 127.5 - 1.0
arr = arr[np.newaxis, :, :, :]

output = session.run(None, {session.get_inputs()[0].name: arr})

# Map the output back from [-1, 1] to 0-255 and save.
result_arr = (output[0][0] + 1.0) * 127.5
result_arr = np.clip(result_arr, 0, 255).astype(np.uint8)
Image.fromarray(result_arr).save("result.jpg", quality=90)

v3 is faster than v2 on CPU (~700ms per image) thanks to ONNX Runtime, but you manage the preprocessing and postprocessing (resizing, normalization, clipping) manually.
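The resize logic in the snippet above (longest side capped at 512, both dimensions rounded down to a multiple of 8, which the network's downsampling layers appear to require) can be factored into a small helper:

```python
def target_size(w: int, h: int, max_side: int = 512) -> tuple:
    """Scale (w, h) so the longer side is max_side, rounding each
    dimension down to the nearest multiple of 8."""
    scale = max_side / max(w, h)
    return (int((w * scale) // 8) * 8, int((h * scale) // 8) * 8)
```

For a 1000x700 photo this yields 512x352, which is why v3's output "matches input" only approximately: the aspect ratio is preserved up to the rounding.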

Calling the Photo to Anime API

import requests

url = "https://phototoanime1.p.rapidapi.com/cartoonize-pro"
headers = {
    "x-rapidapi-host": "phototoanime1.p.rapidapi.com",
    "x-rapidapi-key": "YOUR_API_KEY",
}

# One request per premium style; style_degree controls intensity (0.0-1.0).
for style in ["arcane", "pixar", "comic", "caricature"]:
    with open("portrait.jpg", "rb") as f:
        response = requests.post(
            url,
            headers=headers,
            files={"image": f},
            data={"style": style, "style_degree": "0.7"},
        )
    result = response.json()
    print(f"{style}: {result['image_url']}")
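The loop above assumes every call succeeds. A defensive variant validates the HTTP status and the presence of `image_url` before use; the error shape here is an assumption, so check the API docs for the exact fields:

```python
def extract_image_url(status_code: int, payload: dict) -> str:
    """Validate an API response and return the hosted result URL."""
    if status_code != 200:
        raise RuntimeError(f"request failed with HTTP {status_code}: {payload}")
    if "image_url" not in payload:
        raise KeyError(f"no image_url in response: {payload}")
    return payload["image_url"]
```

Call it as `extract_image_url(response.status_code, response.json())` inside the loop so a failed style does not surface as a confusing `KeyError` later.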

Performance Benchmark

We ran both tools on the same portrait. Test environment: Intel Core i7-7700HQ @ 2.80GHz, 4 cores, no GPU.

| Tool | First call | Steady state | Output size | Notes |
| --- | --- | --- | --- | --- |
| AnimeGANv2 (face_paint) | ~60s | 2-3s | 512x512 | PyTorch import + repo clone + 8MB weight download on first run |
| AnimeGANv3 (Hayao ONNX) | ~3s | ~700ms | matches input | 4MB ONNX model, no PyTorch needed |
| Photo to Anime API | 1-2s | 1-2s | up to 1600x1344 | No local setup; figures include network time |

The API wins on cold start (no model to load), consistency (same latency every call), and output resolution. AnimeGANv3 wins on per-call latency once warm if you stay in the anime style. AnimeGANv2 is the slowest of the three on every metric except cost.


When to Choose AnimeGAN

  • Japanese anime style is enough. If you only need anime portraits, AnimeGAN handles that.
  • Offline processing. No network dependency.
  • Custom training. Fine-tune AnimeGAN on your own dataset with GPU time.
  • Zero per-call cost. Self-hosting wins on cost at high volume with existing GPU hardware.

When to Choose the API

  • Multiple artistic styles. Arcane, Pixar, Comic, Caricature, Illustration, Ghibli. AnimeGAN cannot produce any of these.
  • Higher output resolution. Up to 1600x1344 vs AnimeGAN's fixed 512x512.
  • Style intensity control. style_degree parameter (0.0-1.0) for subtle to strong effects.
  • No dependency management. No PyTorch (~2GB), no onnxruntime, no model weights to download.
  • Face-aware processing. Detects faces and applies stylization with smooth blending.


Read the full guide with side-by-side image comparison, all 6 premium styles demo, and JavaScript examples on ai-engine.net.
