DEV Community

Matias Affolter

How We Built an AI That Transforms Your Photos Into Pixel Art in 10 Seconds (And Why It Matters for Web3)

TL;DR: We built an AI-powered pixel art generator that preserves facial features while producing authentic retro art. The kicker: pixel art is roughly 100x lighter than a regular photo, which makes it practical to store permanently on the blockchain. Here's how we did it.


The Problem We're Solving

Picture this: You love NFTs, but uploading a 2MB photo to the blockchain costs you an arm and a leg. Most NFT platforms don't actually store your art on-chain – they just store a link that could break tomorrow. Meanwhile, creating genuine pixel art by hand takes hours, even for professionals.

We thought: What if we could convert any photo into authentic pixel art that's light enough to live permanently on-chain?

That's how Pixagram was born.

The Magic Behind the Curtain

Our AI pipeline is like a sophisticated assembly line, where each station does one specific job really well. Here's what happens when you upload a photo:

1. Face Detection & Enhancement (The Foundation)

First, we use InsightFace to detect faces in your image. But we don't just find the face – we actually understand it. The system extracts:

  • 512-dimensional facial embeddings (basically a mathematical fingerprint of your face)
  • Facial keypoints (eyes, nose, mouth positions)
  • Age and gender estimates
  • Facial structure data

Here's where it gets interesting: we crop the face with 30% padding around it (for context), then enhance it through multiple stages:

  • Resize to optimal dimensions
  • Sharpen features (1.5x boost)
  • Enhance contrast (1.1x)
  • Adjust brightness (1.05x)

Think of this as giving the AI the best possible view of your face before the transformation begins.
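
Here's a minimal sketch of how the padded crop box could be computed. The function name, the `(x1, y1, x2, y2)` box layout, and the reading of "30% padding" as 30% of the box width and height on each side are assumptions for illustration, not the production code.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def pad_face_box(box: Box, image_size: Tuple[int, int], padding: float = 0.30) -> Box:
    """Expand a face box by `padding` of its width/height on every side,
    clamped so it never leaves the image."""
    x1, y1, x2, y2 = box
    width, height = image_size
    pad_x = int(round((x2 - x1) * padding))
    pad_y = int(round((y2 - y1) * padding))
    return (
        max(0, x1 - pad_x),
        max(0, y1 - pad_y),
        min(width, x2 + pad_x),
        min(height, y2 + pad_y),
    )


# A 100x200 face box near the right edge of a 220x400 image:
# the right side is clamped to the image width.
print(pad_face_box((100, 100, 200, 300), image_size=(220, 400)))
# -> (70, 40, 220, 360)
```

The cropped region could then be run through Pillow's `ImageEnhance.Sharpness`, `ImageEnhance.Contrast`, and `ImageEnhance.Brightness` with the factors listed above (1.5, 1.1, 1.05).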

2. Dual Embedding System (The Secret Sauce)

Most AI art generators use one type of embedding. We use two working together:

CLIP Embeddings:

  • These understand the semantic meaning of your face
  • They know "this is a smiling woman with blue eyes"
  • Perfect for maintaining identity at a conceptual level

InsightFace Embeddings:

  • These understand the geometric structure of your face
  • They know exact distances between features
  • Perfect for preserving facial accuracy

By combining both, we get the best of semantic understanding AND structural precision.

3. The Transformation Pipeline

Now the real magic happens. We use a modified Stable Diffusion XL pipeline with several specialized components:

a) Custom Pixel Art LoRA
We trained a specialized LoRA (Low-Rank Adaptation) on authentic pixel art. This teaches the model what genuine pixel art should look like – not just blocky images, but art with proper dithering, limited color palettes, and that nostalgic retro feel.
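
If you haven't met Low-Rank Adaptation before, the idea fits in one line. This is the standard formulation from the LoRA paper, not anything specific to this model: the pretrained weight $W_0$ stays frozen and only two small matrices $A$ and $B$ are trained.

$$
W' = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

Here $r$ is the rank and $\alpha$ a scaling constant. Because $r$ is small, the adapter adds only $r(d + k)$ trainable parameters per layer instead of $d \cdot k$.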

b) InstantID ControlNet
This maintains facial structure using the keypoints we extracted earlier. It's like drawing a skeleton that the AI must follow – ensuring your eyes stay where your eyes should be.

c) Zoe Depth ControlNet
This preserves the 3D structure of your image. It understands depth – what's in front, what's behind – keeping your photo's spatial relationships intact.

d) IP-Adapter Integration
This injects our dual embeddings directly into the diffusion process through cross-attention. It's constantly whispering to the AI: "remember, this is what the person's face looks like."
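
For reference, the IP-Adapter paper describes this injection as decoupled cross-attention: the image features get their own key and value projections, and their attention output is added to the text branch. Whether this pipeline matches the paper term for term is an assumption.

$$
Z = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d}}\right) V \;+\; \lambda\, \mathrm{softmax}\!\left(\frac{Q K'^{\top}}{\sqrt{d}}\right) V'
$$

$K, V$ come from the text features, $K', V'$ from the projected image embeddings, and $\lambda$ scales how strongly the image condition is applied.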

4. Post-Processing Polish

After generation, we apply color matching in LAB color space. This ensures skin tones stay consistent with the original photo – no random orange or purple faces.
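
The post doesn't spell out the matching algorithm, so here's a sketch of one common approach: shifting and scaling each channel so its mean and standard deviation match the original photo (Reinhard-style statistics transfer). The function name is hypothetical, and the conversion to and from LAB is left out; the idea is that this runs once per L, a, and b channel.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def match_channel(generated: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Shift and scale one channel of the generated image so its mean and
    standard deviation match the same channel of the reference photo."""
    gen_mean, gen_std = fmean(generated), pstdev(generated)
    ref_mean, ref_std = fmean(reference), pstdev(reference)
    # A flat channel has no spread to rescale, so only shift it.
    scale = ref_std / gen_std if gen_std > 0 else 1.0
    return [(value - gen_mean) * scale + ref_mean for value in generated]


# Toy channel values: the generated channel is pulled onto the
# reference's mean (50) while keeping its relative ordering.
print(match_channel([10.0, 20.0, 30.0], [40.0, 50.0, 60.0]))
```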

For faces, we create soft masks with feathered edges, allowing seamless blending between the pixelated transformation and facial features.
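
As a rough illustration of feathered blending, the sketch below ramps the mask's alpha linearly from 0 at the mask edge to 1 a few pixels inside, then mixes the two pixel values. The linear ramp, the function names, and the per-pixel scalar form are assumptions; a real implementation would more likely blur a binary mask over the whole image.

```python
def feather_alpha(distance_to_edge: float, feather_px: float) -> float:
    """Alpha is 0 at the mask edge and reaches 1 at `feather_px` pixels inside it."""
    if feather_px <= 0:
        return 1.0 if distance_to_edge > 0 else 0.0
    return min(1.0, max(0.0, distance_to_edge / feather_px))


def blend(face_value: float, background_value: float, alpha: float) -> float:
    """Linear mix: alpha=1 keeps the face pixel, alpha=0 keeps the background."""
    return alpha * face_value + (1.0 - alpha) * background_value


# Halfway through a 10-pixel feather, the two values are mixed 50/50.
alpha = feather_alpha(distance_to_edge=5, feather_px=10)
print(alpha, blend(200.0, 100.0, alpha))
# -> 0.5 150.0
```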

The Technical Stack

Here's what powers the system:

- Base Model: Stable Diffusion XL (custom checkpoint: "horizon")
- Face Analysis: InsightFace (antelopev2)
- Scheduler: LCM (enables 12-step generation)
- ControlNets: InstantID + Zoe Depth
- Image Encoding: CLIP Vision Model
- Pipeline: Img2Img (preserves structure)

Key Parameters:

  • Steps: 12 (thanks to LCM scheduler)
  • CFG Scale: 1.0-1.5 (LCM sweet spot)
  • Img2Img Strength: 0.55 (balances transformation vs. fidelity)
  • Identity Preservation: 1.3x (boosted for maximum face accuracy)
  • Resolution: Auto-optimized to 896×1152 or 832×1216
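
The selection rule behind "auto-optimized" isn't stated here. One plausible version, sketched below, picks whichever of the two listed resolutions has the aspect ratio closest to the input photo. The function name and the rule itself are assumptions.

```python
from typing import Tuple

# The two portrait resolutions named above, as (width, height).
PORTRAIT_BUCKETS = [(896, 1152), (832, 1216)]


def pick_resolution(width: int, height: int) -> Tuple[int, int]:
    """Return the bucket whose width/height ratio is closest to the input's."""
    ratio = width / height
    return min(PORTRAIT_BUCKETS, key=lambda wh: abs(wh[0] / wh[1] - ratio))


print(pick_resolution(3000, 4000))  # 3:4 photo  -> (896, 1152)
print(pick_resolution(1080, 1920))  # 9:16 photo -> (832, 1216)
```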

Why This Matters: The Blockchain Advantage

Here's where things get really interesting. Let's talk numbers:

A typical photo:

  • Average size: 2-3 MB
  • On IPFS: Still requires external hosting
  • Link can break if hosting fails
  • Expensive to store on-chain

Pixagram pixel art:

  • Average size: 15-20 KB
  • That's roughly 100-200x smaller (2-3 MB versus 15-20 KB)
  • Can be embedded directly on-chain
  • Stored forever in the blockchain itself
  • No external dependencies

Our Blockchain: Built for This

Pixagram runs on a fork of Hive/Steem, optimized for social media and NFTs:

  • 72 KB maximum post capacity – perfect for pixel art
  • 3-second block time – near-instant confirmation
  • Proof-of-Brain reward model – rewards creators
  • 24 elected witnesses – truly decentralized

When you create pixel art on Pixagram, it's not just stored on-chain – it becomes part of the chain. Your art will exist as long as the blockchain exists. No broken links. No lost hosting. Forever.
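
To make the size budget concrete, here's a small check of whether an image would fit inside a single post. It assumes the image is embedded as base64 text, that "72 KB" means 72 × 1024 bytes, and that about 1 KB is reserved for other post fields. All three are assumptions for illustration, not documented chain parameters.

```python
import base64

MAX_POST_BYTES = 72 * 1024  # assumption: "72 KB" read as 72 KiB


def fits_in_post(image_bytes: bytes, overhead_bytes: int = 1024) -> bool:
    """Check whether a base64-encoded image plus some post metadata
    fits inside the maximum post size."""
    encoded = base64.b64encode(image_bytes)  # base64 grows data by about 4/3
    return len(encoded) + overhead_bytes <= MAX_POST_BYTES


print(fits_in_post(bytes(20 * 1024)))        # 20 KB pixel art -> True
print(fits_in_post(bytes(2 * 1024 * 1024)))  # 2 MB photo      -> False
```

Even with base64's 4/3 overhead, a 20 KB image encodes to about 27 KB, which leaves plenty of room under the limit.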

How It Actually Performs

Let's be real about what this AI can and can't do:

What it excels at:

  • Portraits with clear facial features
  • Photos with good lighting
  • Maintaining facial identity (80-95% similarity)
  • Creating authentic pixel art aesthetic
  • Fast generation (10 seconds average)

What's challenging:

  • Very low-light photos
  • Extremely complex scenes
  • Photos where faces are tiny or obscured
  • Multiple faces (focuses on largest)

We've tested thousands of photos, and the system achieves 80-90% average face similarity, with best cases hitting 95%+.
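
The similarity metric isn't named above. A common choice for face embeddings, and the one assumed in this sketch, is cosine similarity between the embedding of the original face and the embedding of the face in the generated image.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Toy 2-D vectors standing in for 512-dimensional face embeddings.
print(cosine_similarity([3.0, 4.0], [4.0, 3.0]))  # -> 0.96
```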

Try It Yourself

Want to see it in action? We've made it dead simple:

  1. Pixagram.io – Our full platform with blockchain integration
  2. Hugging Face Space – Try the AI without signup

The Hugging Face space lets you experiment with all the parameters:

  • Adjust identity preservation strength
  • Control pixel art intensity
  • Fine-tune depth and structure preservation
  • Enable/disable color matching

The Bigger Picture

We're not just building an AI art generator. We're building a Web3 social network where:

  • Every post rewards creators
  • Art lives forever on-chain
  • No corporation owns your content
  • Lightweight pixel art enables true decentralization

The NFT market has 11.6 million users worldwide, but most platforms are either:

  • Exciting but centralized (like Instagram)
  • Decentralized but boring (like traditional NFT marketplaces)

Pixagram aims to be both: exciting AND decentralized.

Technical Deep Dive: The Code

For the developers reading this, here's how the core generation works:

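import torch
from diffusers import StableDiffusionXLControlNetImg2ImgPipeline

# Core pipeline initialization.
# model_path, both ControlNets, face_app (InsightFace), image_proj_model,
# prompt and the input/control images are loaded or prepared elsewhere.
pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_single_file(
    model_path,
    controlnet=[instantid_controlnet, depth_controlnet],
    torch_dtype=torch.float16,
)

# Face analysis: bail out if nothing is found, otherwise keep the largest face
faces = face_app.get(image_array)
if not faces:
    raise ValueError("no face detected")
face = max(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))
insightface_embeds = face.normed_embedding  # 512-dim

# IP-Adapter projection
image_embeds = image_proj_model(insightface_embeds)

# Generation with dual ControlNets (face keypoints + depth)
result = pipe(
    prompt=prompt,  # text prompt that guides the pixel art style
    image=input_image,
    control_image=[face_keypoints, depth_map],
    controlnet_conditioning_scale=[0.85, 0.75],
    added_cond_kwargs={"image_embeds": image_embeds},
    strength=0.55,
    num_inference_steps=12,
    guidance_scale=1.45,
)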

The key insight? Multi-modal conditioning. We're hitting the model from multiple angles simultaneously:

  • Text prompts guide style
  • Depth maps preserve structure
  • Keypoints maintain facial geometry
  • Embeddings preserve identity
  • Img2Img keeps overall composition

What's Next?

We're currently in testnet alpha, with mainnet beta coming in October 2025. Our roadmap includes:

  • Enhanced multi-face support
  • More art styles beyond pixel art
  • Advanced blockchain governance
  • Creator monetization tools
  • NFT marketplace integration

Join the Revolution

Whether you're a:

  • Developer – Fork our code, build on our blockchain
  • Artist – Create pixel art that lives forever
  • Crypto enthusiast – Early adopter advantages
  • Curious human – Just want to pixelate your selfie

We'd love to have you.


Final Thoughts

The internet promised to make information free and accessible forever. But in practice, links break, servers die, and companies disappear. Blockchain offers a second chance at that promise – but only if we can make it practical.

Pixel art isn't just nostalgia. At roughly 100x smaller than a typical photo, it makes permanent, decentralized storage feasible. It's small enough to embed in a blockchain, distinctive enough to be valuable as art, and our AI makes it accessible to everyone.

That's the magic formula: AI accessibility + artistic value + blockchain permanence.

"Create artworks lasting forever on the blockchain while getting rewarded." That's not just our tagline – it's the future we're building.

Come help us build it.


What do you think? Would you want your photos turned into permanent blockchain pixel art? Let me know in the comments!


Tags: #ai #blockchain #web3 #nft #pixelart #machinelearning #stablediffusion #opensource

About Pixagram:
Pixagram SA is a Swiss company (Zug) developing Web3.0 social media infrastructure. Our technology combines AI art generation with lightweight blockchain storage, creating the first truly on-chain social network for pixel art NFTs.
