Matias Affolter

Posted on Oct 28

How We Built an AI That Transforms Your Photos Into Pixel Art in 10 Seconds (And Why It Matters for Web3)

#ai #blockchain #webdev #javascript

TL;DR: We created an AI-powered pixel art generator that preserves facial features while creating authentic retro art. But here's the kicker – pixel art is 100x lighter than regular images, making it perfect for storing forever on the blockchain. Let me show you how we did it.

The Problem We're Solving

Picture this: You love NFTs, but uploading a 2MB photo to the blockchain costs you an arm and a leg. Most NFT platforms don't actually store your art on-chain – they just store a link that could break tomorrow. Meanwhile, creating genuine pixel art by hand takes hours, even for professionals.

We thought: What if we could convert any photo into authentic pixel art that's light enough to live permanently on-chain?

That's how Pixagram was born.

The Magic Behind the Curtain

Our AI pipeline is like a sophisticated assembly line, where each station does one specific job really well. Here's what happens when you upload a photo:

1. Face Detection & Enhancement (The Foundation)

First, we use InsightFace to detect faces in your image. But we don't just find the face – we actually understand it. The system extracts:

- 512-dimensional facial embeddings (basically a mathematical fingerprint of your face)
- Facial keypoints (eyes, nose, mouth positions)
- Age and gender estimates
- Facial structure data

Here's where it gets interesting: we crop the face with 30% padding around it (for context), then enhance it through multiple stages:

- Resize to optimal dimensions
- Sharpen features (1.5x boost)
- Enhance contrast (1.1x)
- Adjust brightness (1.05x)

Think of this as giving the AI the best possible view of your face before the transformation begins.
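To make the cropping step concrete, here is a minimal sketch of a padded crop box. It assumes the 30% padding is added on every side, relative to the face's width and height, and that the box is clamped to the image bounds. The name `padded_crop_box` and its parameters are illustrative and not taken from the Pixagram codebase.

```python
def padded_crop_box(bbox, image_size, padding=0.30):
    """Expand a face box (x1, y1, x2, y2) by `padding` per side, clamped to the image.

    Assumption: padding is a fraction of the face width/height on each side.
    """
    x1, y1, x2, y2 = bbox
    img_w, img_h = image_size
    pad_x = (x2 - x1) * padding
    pad_y = (y2 - y1) * padding
    left = max(0, int(round(x1 - pad_x)))
    top = max(0, int(round(y1 - pad_y)))
    right = min(img_w, int(round(x2 + pad_x)))
    bottom = min(img_h, int(round(y2 + pad_y)))
    return left, top, right, bottom


# A 200x200 face in a 1024x1024 image gains 60 px of context on each side.
print(padded_crop_box((100, 100, 300, 300), (1024, 1024)))  # (40, 40, 360, 360)
```

The enhancement factors listed above would map naturally onto Pillow's `ImageEnhance.Sharpness`, `ImageEnhance.Contrast`, and `ImageEnhance.Brightness`, though the post does not say which library is actually used.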

2. Dual Embedding System (The Secret Sauce)

Most AI art generators use one type of embedding. We use two working together:

CLIP Embeddings:

- These understand the semantic meaning of your face
- They know "this is a smiling woman with blue eyes"
- Perfect for maintaining identity at a conceptual level

InsightFace Embeddings:

- These understand the geometric structure of your face
- They know exact distances between features
- Perfect for preserving facial accuracy

By combining both, we get the best of semantic understanding AND structural precision.

3. The Transformation Pipeline

Now the real magic happens. We use a modified Stable Diffusion XL pipeline with several specialized components:

a) Custom Pixel Art LoRA
We trained a specialized LoRA (Low-Rank Adaptation) on authentic pixel art. This teaches the model what genuine pixel art looks like: not just blocky images, but art with proper dithering, limited color palettes, and that nostalgic retro feel.

b) InstantID ControlNet
This maintains facial structure using the keypoints we extracted earlier. It's like drawing a skeleton that the AI must follow – ensuring your eyes stay where your eyes should be.

c) Zoe Depth ControlNet
This preserves the 3D structure of your image. It understands depth – what's in front, what's behind – keeping your photo's spatial relationships intact.

d) IP-Adapter Integration
This injects our dual embeddings directly into the diffusion process through cross-attention. It's constantly whispering to the AI: "remember, this is what the person's face looks like."

4. Post-Processing Polish

After generation, we apply color matching in LAB color space. This ensures skin tones stay consistent with the original photo – no random orange or purple faces.
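The post does not spell out the color-matching algorithm. One common approach is a per-channel mean and standard deviation transfer, sketched below for a single channel. It assumes the pixel values have already been converted to LAB elsewhere (for example with OpenCV's `cv2.cvtColor` and `cv2.COLOR_RGB2LAB`), and `match_channel` is a hypothetical helper name.

```python
from statistics import mean, pstdev


def match_channel(generated, reference):
    """Shift and scale one channel so its mean/std match the reference channel.

    Assumption: both inputs are flat lists of values from the same LAB channel.
    """
    g_mean, g_std = mean(generated), pstdev(generated)
    r_mean, r_std = mean(reference), pstdev(reference)
    # Guard against a flat channel, where the std is zero.
    scale = r_std / g_std if g_std > 0 else 1.0
    return [(v - g_mean) * scale + r_mean for v in generated]


# Toy L-channel values: the generated image is darker and more contrasty.
generated_l = [40.0, 50.0, 60.0]
reference_l = [55.0, 60.0, 65.0]
matched_l = match_channel(generated_l, reference_l)
```

Applying the same transfer to the L, A, and B channels independently pulls the generated skin tones toward the statistics of the original photo.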

For faces, we create soft masks with feathered edges, allowing seamless blending between the pixelated transformation and facial features.
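Feathered blending can be sketched in one dimension: a mask that is 1.0 over the face and ramps linearly down to 0.0, then a per-pixel alpha blend. This is an illustration under simplifying assumptions (a linear ramp on a single row); a real pipeline would more likely blur a 2-D binary mask. Both function names are hypothetical.

```python
def feathered_mask_row(width, start, end, feather):
    """1-D soft mask: 1.0 inside [start, end), fading linearly over `feather` pixels."""
    row = []
    for x in range(width):
        if start <= x < end:
            row.append(1.0)
        else:
            # Distance in pixels from the nearest edge of the face region.
            dist = (start - x) if x < start else (x - end + 1)
            row.append(max(0.0, 1.0 - dist / (feather + 1)))
    return row


def blend(face_px, background_px, alpha):
    """Standard alpha blend: alpha=1 keeps the face pixel, alpha=0 the background."""
    return alpha * face_px + (1.0 - alpha) * background_px


mask = feathered_mask_row(8, 3, 5, 1)
print(mask)  # [0.0, 0.0, 0.5, 1.0, 1.0, 0.5, 0.0, 0.0]
```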

The Technical Stack

Here's what powers the system:

- Base Model: Stable Diffusion XL (custom checkpoint: "horizon")
- Face Analysis: InsightFace (antelopev2 model pack)
- Scheduler: LCM (enables 12-step generation)
- ControlNets: InstantID + Zoe Depth
- Image Encoding: CLIP Vision Model
- Pipeline: Img2Img (preserves structure)

Key Parameters:

- Steps: 12 (thanks to LCM scheduler)
- CFG Scale: 1.0-1.5 (LCM sweet spot)
- Img2Img Strength: 0.55 (balances transformation vs. fidelity)
- Identity Preservation: 1.3x (boosted for maximum face accuracy)
- Resolution: Auto-optimized to 896×1152 or 832×1216
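How the resolution is "auto-optimized" is not described. A plausible rule is to pick whichever of the two listed sizes has the aspect ratio closest to the input. The sketch below does that in log space so that ratios above and below the target are treated symmetrically. The selection rule and the name `choose_resolution` are assumptions.

```python
import math

# (width, height) pairs taken from the parameter list above.
CANDIDATE_RESOLUTIONS = [(896, 1152), (832, 1216)]


def choose_resolution(width, height, candidates=CANDIDATE_RESOLUTIONS):
    """Return the candidate whose aspect ratio is closest to the input's."""
    target = math.log(width / height)
    return min(candidates, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


print(choose_resolution(3000, 4000))  # (896, 1152) for a 3:4 photo
print(choose_resolution(1080, 1920))  # (832, 1216) for a 9:16 photo
```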

Why This Matters: The Blockchain Advantage

Here's where things get really interesting. Let's talk numbers:

A typical photo:

- Average size: 2-3 MB
- On IPFS: Still requires external hosting
- Link can break if hosting fails
- Expensive to store on-chain

Pixagram pixel art:

- Average size: 15-20 KB
- That's roughly 100-200x smaller
- Can be embedded directly on-chain
- Stored forever in the blockchain itself
- No external dependencies
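Taking the size ranges quoted above at face value, the ratio is simple arithmetic. The snippet assumes decimal units (1 MB = 1000 KB); using 1024 shifts the result only slightly.

```python
KB_PER_MB = 1000  # assumption: decimal units


def size_ratio(photo_mb, pixel_art_kb):
    """How many times larger the photo is than the pixel art."""
    return photo_mb * KB_PER_MB / pixel_art_kb


low = size_ratio(2, 20)   # smallest photo vs. largest pixel art
high = size_ratio(3, 15)  # largest photo vs. smallest pixel art
print(f"{low:.0f}x to {high:.0f}x smaller")  # 100x to 200x smaller
```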

Our Blockchain: Built for This

Pixagram runs on a fork of HIVE/STEEM, optimized for social media and NFTs:

- 72 KB maximum post capacity – perfect for pixel art
- 3-second block time – near-instant confirmation
- Proof-of-Brain reward distribution – rewards creators
- 24 elected witnesses – truly decentralized

When you create pixel art on Pixagram, it's not just stored on-chain – it becomes part of the chain. Your art will exist as long as the blockchain exists. No broken links. No lost hosting. Forever.
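As a rough sanity check on the 72 KB limit, the sketch below tests whether an image would fit in a single post. It assumes the image is embedded as base64 text and that 1 KB means 1024 bytes. The post does not describe the actual on-chain encoding, so `fits_in_post` and `overhead_bytes` are hypothetical.

```python
import base64

MAX_POST_BYTES = 72 * 1024  # 72 KB post limit quoted above (assumed 1024-byte KB)


def fits_in_post(image_bytes, overhead_bytes=0):
    """True if the base64-encoded image plus any other post data fits the limit."""
    encoded = base64.b64encode(image_bytes)
    return len(encoded) + overhead_bytes <= MAX_POST_BYTES


print(fits_in_post(bytes(20 * 1024)))  # True: a 20 KB pixel-art image
print(fits_in_post(bytes(2_000_000)))  # False: a 2 MB photo
```

Under this assumption base64 inflates the payload by about a third, so the largest raw image that fits is around 54 KB, which leaves comfortable headroom for 15-20 KB pixel art.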

How It Actually Performs

Let's be real about what this AI can and can't do:

What it excels at:

- Portraits with clear facial features
- Photos with good lighting
- Maintaining facial identity (80-95% similarity)
- Creating authentic pixel art aesthetic
- Fast generation (10 seconds average)

What's challenging:

- Very low-light photos
- Extremely complex scenes
- Photos where faces are tiny or obscured
- Multiple faces (focuses on largest)
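"Focuses on largest" can be read as picking the detection with the biggest bounding box. Here is a minimal sketch of that rule, using plain dicts with a `bbox` key as stand-ins for detector output (InsightFace face objects expose a similar `bbox` attribute). The helper name `largest_face` is an assumption.

```python
def largest_face(faces):
    """Return the detection with the biggest bounding-box area, or None if empty."""
    def area(face):
        x1, y1, x2, y2 = face["bbox"]
        return max(0.0, x2 - x1) * max(0.0, y2 - y1)

    return max(faces, key=area, default=None)


detections = [
    {"name": "background person", "bbox": (10, 10, 60, 70)},
    {"name": "main subject", "bbox": (200, 150, 520, 560)},
]
print(largest_face(detections)["name"])  # main subject
```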

We've tested thousands of photos, and the system achieves 80-90% average face similarity, with best cases hitting 95%+.
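The post does not say how the similarity percentage is computed. A common way to compare two faces is the cosine similarity of their embeddings, sketched here in plain Python. Treating the quoted percentages as cosine similarity is an assumption, and the toy vectors stand in for real 512-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length embedding")
    return dot / (norm_a * norm_b)


original_embedding = [3.0, 4.0]
generated_embedding = [4.0, 3.0]
score = cosine_similarity(original_embedding, generated_embedding)
print(f"{score:.0%}")  # 96%
```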

Try It Yourself

Want to see it in action? We've made it dead simple:

- Pixagram.io – Our full platform with blockchain integration
- Hugging Face Space – Try the AI without signup

The Hugging Face space lets you experiment with all the parameters:

- Adjust identity preservation strength
- Control pixel art intensity
- Fine-tune depth and structure preservation
- Enable/disable color matching

The Bigger Picture

We're not just building an AI art generator. We're building a Web3 social network where:

- Every post rewards creators
- Art lives forever on-chain
- No corporation owns your content
- Lightweight pixel art enables true decentralization

The NFT market has 11.6 million users worldwide, but the platforms competing for their attention tend to be either:

- Exciting but centralized (like Instagram)
- Decentralized but boring (like traditional NFT marketplaces)

Pixagram aims to be both: exciting AND decentralized.

Technical Deep Dive: The Code

For the developers reading this, here's how the core generation works:

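import torch
from diffusers import StableDiffusionXLControlNetImg2ImgPipeline

# model_path, face_app, image_proj_model, the two ControlNets and the
# input/control images are set up elsewhere; this excerpt shows the core flow.

# Core pipeline initialization: SDXL checkpoint plus both ControlNets
pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_single_file(
    model_path,
    controlnet=[instantid_controlnet, depth_controlnet],
    torch_dtype=torch.float16,
)

# Face analysis: keep the largest detected face
faces = face_app.get(image_array)
face = max(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))
insightface_embeds = face.normed_embedding  # 512-dim identity embedding

# IP-Adapter projection: map the face embedding into the cross-attention space
image_embeds = image_proj_model(insightface_embeds)

# Generation with dual ControlNets
result = pipe(
    image=input_image,
    control_image=[face_keypoints, depth_map],    # one control image per ControlNet
    controlnet_conditioning_scale=[0.85, 0.75],   # InstantID, Zoe Depth
    added_cond_kwargs={"image_embeds": image_embeds},  # handled by our modified pipeline
    strength=0.55,
    num_inference_steps=12,
    guidance_scale=1.45,
)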

The key insight? Multi-modal conditioning. We're hitting the model from multiple angles simultaneously:

- Text prompts guide style
- Depth maps preserve structure
- Keypoints maintain facial geometry
- Embeddings preserve identity
- Img2Img keeps overall composition
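One detail worth knowing about the `strength` and `num_inference_steps` values used above: in stock diffusers img2img pipelines, as far as I know, only a fraction of the scheduled steps is actually run, because the input image is noised part of the way and denoised from there. The sketch below mirrors that rule. A modified pipeline may behave differently, and `effective_denoising_steps` is a hypothetical helper name.

```python
def effective_denoising_steps(num_inference_steps, strength):
    """Denoising steps actually run by a typical img2img pipeline.

    Mirrors min(int(steps * strength), steps), the rule used by stock
    diffusers img2img pipelines (an assumption for any custom pipeline).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)


print(effective_denoising_steps(12, 0.55))  # 6
```

With 12 scheduled steps and a strength of 0.55, roughly half of the steps are executed, which helps explain the short generation time.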

What's Next?

We're currently in testnet alpha, with mainnet beta coming in October 2025. Our roadmap includes:

- Enhanced multi-face support
- More art styles beyond pixel art
- Advanced blockchain governance
- Creator monetization tools
- NFT marketplace integration

Join the Revolution

Whether you're a:

- Developer – Fork our code, build on our blockchain
- Artist – Create pixel art that lives forever
- Crypto enthusiast – Early adopter advantages
- Curious human – Just want to pixelate your selfie

We'd love to have you.

Try it now:

- Production platform: pixagram.io
- Experiment freely: Hugging Face Space
- Get in touch: omnibus@pixagram.io

Final Thoughts

The internet promised to make information free and accessible forever. But in practice, links break, servers die, and companies disappear. Blockchain offers a second chance at that promise – but only if we can make it practical.

Pixel art isn't just nostalgia. It's a 100x compression that makes permanent, decentralized storage actually feasible. It's small enough to embed in a blockchain, distinctive enough to be valuable as art, and our AI makes it accessible to everyone.

That's the magic formula: AI accessibility + artistic value + blockchain permanence.

"Create artworks lasting forever on the blockchain while getting rewarded." That's not just our tagline – it's the future we're building.

Come help us build it.

What do you think? Would you want your photos turned into permanent blockchain pixel art? Let me know in the comments!

Tags: #ai #blockchain #web3 #nft #pixelart #machinelearning #stablediffusion #opensource

About Pixagram:
Pixagram SA is a Swiss company (Zug) developing Web3.0 social media infrastructure. Our technology combines AI art generation with lightweight blockchain storage, creating the first truly on-chain social network for pixel art NFTs.
