AI Engine

Posted on • Originally published at ai-engine.net

Photo to Anime: API vs AnimeGAN v2/v3 (Open Source)

You want to add an anime or cartoon filter to your application. The open-source route is tempting: AnimeGAN has been the go-to model since 2020, with v2 (2021) and v3 (2022) available as pretrained weights on GitHub. It is free, runs locally, and generates decent anime portraits. But once you look at what it actually outputs and what it takes to run in production, the picture changes. This guide compares AnimeGAN v2 and v3 against the cloud Photo to Anime API on the same portrait.

Want to see the results on your own photos? Try the Photo to Anime API on a portrait.

Quick Comparison

| | Photo to Anime API | AnimeGAN v2/v3 |
| --- | --- | --- |
| Styles available | 14 (6 premium + 7 classic + Ghibli) | 5 total (face_paint, celeba_distill, paprika, Hayao, Shinkai) |
| Style variety | Cartoon, Pixar, Arcane, Comic, Caricature, Illustration, Anime, Sketch, Ghibli | Japanese anime only |
| Setup | API key | PyTorch (~2GB) or ONNX Runtime |
| First-run latency | 1-2s (network only) | ~60s (PyTorch import + model download) |
| Steady-state latency | 1-2s per image | 2-3s (v2, CPU) / ~700ms (v3, CPU) |
| Output resolution | Up to 1600x1344 | 512x512 fixed (v2 face_paint) |
| Style intensity control | Yes (0.0-1.0) | No (fixed per model) |
| License | Commercial | Open source |

What AnimeGAN Does

AnimeGAN is a GAN trained to transform photos into anime-style images. Each version ships with a handful of pretrained weights:

AnimeGANv2 (PyTorch):

  • face_paint_512_v2: portrait-focused, clean lines
  • celeba_distill: distilled from CelebA, softer colors
  • paprika: trained on Paprika aesthetic

AnimeGANv3 (ONNX):

  • AnimeGANv3_Hayao_36: Hayao Miyazaki style
  • AnimeGANv3_Shinkai_37: Makoto Shinkai style

All variants produce broadly similar results on the same input: a stylized image with flat, anime-like shading. Every model targets the same domain (Japanese anime).

What the Photo to Anime API Does

The API provides 3 endpoints with 14 total styles:

  • /cartoonize-pro: 6 premium styles (Cartoon Pro, Caricature, Arcane, Comic, Pixar, Illustration)
  • /cartoonize: 7 classic styles (Anime, 3D, Hand-drawn, Sketch, Art Style, Design, Illustration)
  • /ghibli: Studio Ghibli style with face enhancement

The premium styles produce distinct artistic aesthetics: the same portrait yields six different looks, none of which AnimeGAN can reach.
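The endpoint-to-style mapping above can be captured in a small lookup table. Here is a sketch in Python; the lowercase style identifiers are assumptions inferred from the request examples later in this post, not from official documentation:

```python
# Map each endpoint to the styles it serves (style IDs are assumed).
ENDPOINT_STYLES = {
    "/cartoonize-pro": ["cartoon_pro", "caricature", "arcane", "comic", "pixar", "illustration"],
    "/cartoonize": ["anime", "3d", "hand_drawn", "sketch", "art_style", "design", "illustration_classic"],
    "/ghibli": ["ghibli"],
}

def endpoint_for(style: str) -> str:
    """Return the endpoint that serves a given style name."""
    for endpoint, styles in ENDPOINT_STYLES.items():
        if style in styles:
            return endpoint
    raise ValueError(f"unknown style: {style}")
```

A lookup like this keeps routing logic in one place if your application exposes all 14 styles behind a single picker.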

See the side-by-side image comparison, code for all three tools, and the install pain in the complete guide.


The Style Coverage Gap

This is the fundamental difference. AnimeGAN was trained on anime datasets for a single target aesthetic per model. The API supports a broader set of style domains:

  • Japanese anime: both
  • Studio Ghibli: both (API has dedicated endpoint, AnimeGAN has v3 Hayao)
  • Arcane / dark fantasy: API only
  • Pixar / 3D animation: API only
  • Comic book: API only
  • Caricature: API only
  • Sketch / line drawing: API only
  • Style intensity control: API only (0.0 to 1.0)

Running AnimeGAN Locally

Here is how to install and run AnimeGANv2 on your machine. You will need Python 3.8+ and about 2.5GB of free disk space for dependencies.

AnimeGANv2 with PyTorch

import torch
from PIL import Image

# Load the generator and the face2paint helper from torch.hub
# (clones the repo and downloads pretrained weights on first run).
model = torch.hub.load(
    "bryandlee/animegan2-pytorch:main",
    "generator",
    pretrained="face_paint_512_v2",
    trust_repo=True,
)
face2paint = torch.hub.load(
    "bryandlee/animegan2-pytorch:main",
    "face2paint",
    size=512,
    trust_repo=True,
)

img = Image.open("portrait.jpg").convert("RGB")
with torch.no_grad():  # inference only, no gradients needed
    output = face2paint(model, img)
output.save("result.jpg", quality=90)

First run takes about 60 seconds: PyTorch imports, repo clone from GitHub, model weight download, initial inference. Subsequent runs take 2-3 seconds per image on CPU. Output is fixed at 512x512.
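To reproduce the cold-versus-warm numbers yourself, a generic timing helper is enough. This is a sketch: it wraps any callable, so the same function works for the PyTorch, ONNX, and API paths:

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return time.perf_counter() - start, result

# Usage sketch: the first call includes model load; later calls are warm.
# elapsed, styled = time_call(face2paint, model, img)
```

Time at least one throwaway call before recording steady-state numbers, since the first invocation includes one-time setup costs.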

AnimeGANv3 with ONNX Runtime

import numpy as np
import onnxruntime as ort
from PIL import Image

# Download the model from https://github.com/TachibanaYoshino/AnimeGANv3/releases
session = ort.InferenceSession("AnimeGANv3_Hayao_36.onnx")
img = Image.open("portrait.jpg").convert("RGB")

# Resize so the longer side is 512 and both sides are multiples of 8.
w, h = img.size
scale = 512 / max(w, h)
new_w = int((w * scale) // 8) * 8
new_h = int((h * scale) // 8) * 8
img_resized = img.resize((new_w, new_h), Image.LANCZOS)

# Normalize pixel values to [-1, 1] and add a batch dimension (NHWC).
arr = np.array(img_resized).astype(np.float32) / 127.5 - 1.0
arr = arr[np.newaxis, :, :, :]

output = session.run(None, {session.get_inputs()[0].name: arr})

# Map the output back from [-1, 1] to 0-255 and save.
result_arr = (output[0][0] + 1.0) * 127.5
result_arr = np.clip(result_arr, 0, 255).astype(np.uint8)
Image.fromarray(result_arr).save("result.jpg", quality=90)

v3 is faster than v2 on CPU (~700ms per image) thanks to ONNX Runtime, but you manage the preprocessing and postprocessing (resizing, normalization, clipping) manually.
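The resize logic in the snippet above (longest side capped at 512, both dimensions rounded down to a multiple of 8, which the network's downsampling layers appear to require) can be factored into a small helper:

```python
def target_size(w: int, h: int, max_side: int = 512) -> tuple:
    """Scale (w, h) so the longer side is max_side, rounding each
    dimension down to the nearest multiple of 8."""
    scale = max_side / max(w, h)
    return (int((w * scale) // 8) * 8, int((h * scale) // 8) * 8)
```

For a 1000x700 photo this yields 512x352, which is why v3's output "matches input" only approximately: the aspect ratio is preserved up to the rounding.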

Calling the Photo to Anime API

import requests

url = "https://phototoanime1.p.rapidapi.com/cartoonize-pro"
headers = {
    "x-rapidapi-host": "phototoanime1.p.rapidapi.com",
    "x-rapidapi-key": "YOUR_API_KEY",
}

# One request per premium style; style_degree controls intensity (0.0-1.0).
for style in ["arcane", "pixar", "comic", "caricature"]:
    with open("portrait.jpg", "rb") as f:
        response = requests.post(
            url,
            headers=headers,
            files={"image": f},
            data={"style": style, "style_degree": "0.7"},
        )
    result = response.json()
    print(f"{style}: {result['image_url']}")
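The loop above assumes every call succeeds. A defensive variant validates the HTTP status and the presence of `image_url` before use; the error shape here is an assumption, so check the API docs for the exact fields:

```python
def extract_image_url(status_code: int, payload: dict) -> str:
    """Validate an API response and return the hosted result URL."""
    if status_code != 200:
        raise RuntimeError(f"request failed with HTTP {status_code}: {payload}")
    if "image_url" not in payload:
        raise KeyError(f"no image_url in response: {payload}")
    return payload["image_url"]
```

Call it as `extract_image_url(response.status_code, response.json())` inside the loop so a failed style does not surface as a confusing `KeyError` later.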

Performance Benchmark

We ran both tools on the same portrait. Test environment: Intel Core i7-7700HQ @ 2.80GHz, 4 cores, no GPU.

| Tool | First call | Steady state | Output size | Notes |
| --- | --- | --- | --- | --- |
| AnimeGANv2 (face_paint) | ~60s | 2-3s | 512x512 | PyTorch import + repo clone + 8MB weight download on first run |
| AnimeGANv3 (Hayao ONNX) | ~3s | ~700ms | matches input | 4MB ONNX model, no PyTorch needed |
| Photo to Anime API | 1-2s | 1-2s | up to 1600x1344 | No local setup; figures include network time |

The API wins on cold start (no model to load), consistency (same latency every call), and output resolution. AnimeGANv3 wins on per-call latency once warm if you stay in the anime style. AnimeGANv2 is the slowest of the three on every metric except cost.


When to Choose AnimeGAN

  • Japanese anime style is enough. If you only need anime portraits, AnimeGAN handles that.
  • Offline processing. No network dependency.
  • Custom training. Fine-tune AnimeGAN on your own dataset with GPU time.
  • Zero per-call cost. Self-hosting wins on cost at high volume with existing GPU hardware.

When to Choose the API

  • Multiple artistic styles. Arcane, Pixar, Comic, Caricature, Illustration, Ghibli. AnimeGAN cannot produce any of these.
  • Higher output resolution. Up to 1600x1344 vs AnimeGAN's fixed 512x512.
  • Style intensity control. style_degree parameter (0.0-1.0) for subtle to strong effects.
  • No dependency management. No PyTorch (~2GB), no onnxruntime, no model weights to download.
  • Face-aware processing. Detects faces and applies stylization with smooth blending.


Read the full guide with side-by-side image comparison, all 6 premium styles demo, and JavaScript examples on ai-engine.net.
