Introduction
This project is a real-time image color palette extractor built with Python, OpenCV, and Streamlit.
The application analyzes an uploaded image, extracts dominant colors using K-means clustering, and matches them with the closest Tailwind CSS colors in the LAB color space using the ΔE2000 perceptual difference formula.
🔗 Live Demo: Real-time Palette Extractor
💻 Source Code: GitHub Repository
👤 Author: Ertuğrul Mutlu
1. System Architecture
📂 project_root
├── main.py              # Streamlit app entry point
├── core
│   ├── extractor.py     # K-means clustering & palette extraction
│   ├── tailwind.py      # Tailwind color data & matching functions
│   ├── color_ops.py     # Color conversion & ΔE2000 calculations
│   └── ui.py            # UI rendering components for Streamlit
└── requirements.txt     # Python dependencies
Key benefits of this modular design:
- Maintainability: Easy to swap algorithms or UI components.
- Reusability: Core logic can be reused in CLI tools or APIs.
- Performance Tuning: Optimization in one module won’t affect others.
2. Color Extraction Pipeline
2.1 Image Preprocessing
We read the image with Pillow, convert it to RGB, and downscale it so the longest side is at most max_side pixels, which speeds up clustering with little effect on the extracted palette.
from PIL import Image
import cv2
import numpy as np

def preprocess_image(file, max_side=1024):
    img = Image.open(file).convert("RGB")
    arr = np.array(img)
    h, w, _ = arr.shape
    if max(h, w) > max_side:
        scale = max_side / max(h, w)
        arr = cv2.resize(arr, (int(w*scale), int(h*scale)), interpolation=cv2.INTER_AREA)
    return arr
Highlights:
- Keeps multi-megabyte uploads responsive by capping the longest side at max_side pixels.
- Uses INTER_AREA interpolation, which averages source pixels when shrinking and so limits aliasing artifacts.
2.2 Dominant Color Extraction with K-means
The app limits k-means input to roughly 400k sampled pixels to avoid memory bottlenecks while preserving accuracy. The excerpt below shows only the clustering step.
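def extract_palette(image_array, k=6):
    # Flatten to an (N, 3) float32 matrix, the input type cv2.kmeans expects
    pixels = image_array.reshape(-1, 3).astype(np.float32)
    # Stop after 40 iterations or once centers move by less than 0.2
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 40, 0.2)
    _, labels, centers = cv2.kmeans(
        pixels, k, None, criteria, attempts=3, flags=cv2.KMEANS_PP_CENTERS
    )
    # Cluster centers are floats; clamp and cast back to 8-bit RGB
    centers = np.clip(centers, 0, 255).astype(np.uint8)
    return centers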
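The excerpt above clusters every pixel it is given. The subsampling step mentioned earlier could look like the following sketch. The helper name subsample_pixels, the max_samples default, and the use of np.random.default_rng are assumptions for illustration, not the project's actual code.

```python
import numpy as np

def subsample_pixels(image_array, max_samples=400_000, seed=42):
    """Return at most `max_samples` RGB rows, drawn uniformly without replacement."""
    pixels = image_array.reshape(-1, 3)
    if len(pixels) <= max_samples:
        return pixels.astype(np.float32)
    rng = np.random.default_rng(seed)  # fixed seed keeps the sample repeatable
    idx = rng.choice(len(pixels), size=max_samples, replace=False)
    return pixels[idx].astype(np.float32)
```

The returned array already has the (N, 3) float32 layout that cv2.kmeans expects, so it could be passed in place of the full pixel matrix.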
Why K-means?
- Stable clustering: k-means++ seeding plus multiple attempts reduces run-to-run variation.
- Efficient with OpenCV’s optimizations.
- Easily adjustable cluster count.
3. Tailwind Color Matching
3.1 LAB Conversion
We convert both extracted colors and Tailwind palette entries from RGB to LAB for perceptually accurate distance measurement.
from skimage.color import rgb2lab
# tw_rgb is assumed to be an (N, 3) array of Tailwind RGB values in the 0-255 range
lab_palette = rgb2lab(tw_rgb / 255.0)
3.2 ΔE2000 Matching
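import numpy as np
from skimage.color import deltaE_ciede2000, rgb2lab

def nearest_tailwind(rgb):
    # Scale the 8-bit color to 0-1 and convert it to a single LAB row
    rgb_arr = np.array([rgb], dtype=np.uint8) / 255.0
    lab1 = rgb2lab(rgb_arr.reshape(1, 1, 3)).reshape(-1, 3)
    # Broadcast against every palette row; dists has one entry per Tailwind color
    dists = deltaE_ciede2000(lab1, lab_palette)
    idx = np.argmin(dists)
    return tailwind_entries[idx], dists[idx]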
Advantages:
- ΔE2000 (CIEDE2000) is the CIE's current color-difference formula and is widely used in print, branding, and design work.
- Matches track how the human eye judges color differences.
4. UI/UX Features
Built in Streamlit, the UI includes:
- Sidebar with adjustable K value.
- Toggle for Tailwind match overlay.
- Color swatches with HEX/RGB.
- Export to JSON, CSS vars, Tailwind config.
import streamlit as st

def display_color_block(hex_value, label):
    st.markdown(
        f"""
        <div style='width:100%; height:50px; border-radius:8px; background:{hex_value}; border:1px solid #ccc;'></div>
        <p>{label}</p>
        """,
        unsafe_allow_html=True
    )
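The export options listed above could be produced along these lines. This is a minimal sketch: the function names, the --palette-N variable naming, and the JSON layout are assumptions, not the app's actual output format.

```python
import json

def rgb_to_hex(rgb):
    """Format an (r, g, b) triple of 0-255 values as '#rrggbb'."""
    r, g, b = (int(v) for v in rgb)
    return f"#{r:02x}{g:02x}{b:02x}"

def palette_to_css_vars(palette, prefix="palette"):
    """Render the palette as CSS custom properties inside a :root block."""
    lines = [f"  --{prefix}-{i}: {rgb_to_hex(c)};" for i, c in enumerate(palette, start=1)]
    return ":root {\n" + "\n".join(lines) + "\n}"

def palette_to_json(palette):
    """Render the palette as a JSON list of {hex, rgb} objects."""
    items = [{"hex": rgb_to_hex(c), "rgb": [int(v) for v in c]} for c in palette]
    return json.dumps(items, indent=2)
```

The int() casts let the helpers accept NumPy uint8 values directly from the k-means centers.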
5. Applications
This tool is relevant for:
- Design systems: Extracting brand colors from assets.
- Web dev: Tailwind CSS color mapping.
- Marketing: Ensuring campaign color consistency.
- Art projects: Creating palettes from images.
6. Theory & Metrics (Deeper)
6.1 K-means, Initialization, and Complexity
- Initialization: OpenCV uses k-means++ when flags=cv2.KMEANS_PP_CENTERS, which spreads the initial centers and reduces poor local minima.
- Iterations: Each iteration alternates assignment (nearest center) and update (mean of assigned points).
- Complexity: Roughly O(N × K × I), where N = sampled pixels, K = clusters, I = iterations. This is why we downscale and subsample.
Practical tip: sampling ~200k–400k pixels is usually indistinguishable from using all pixels for palette purposes but much faster.
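To make the assignment/update alternation and the N × K cost concrete, here is a single k-means iteration written in plain NumPy. It is an illustrative sketch only; the app itself relies on cv2.kmeans, and the name lloyd_step is hypothetical.

```python
import numpy as np

def lloyd_step(pixels, centers):
    """One k-means iteration on float arrays: assign, then re-average."""
    # Squared distances, shape (N, K): this is the N x K work done per iteration.
    d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    new_centers = centers.copy()
    for j in range(len(centers)):
        members = pixels[labels == j]
        if len(members):  # an empty cluster keeps its previous center
            new_centers[j] = members.mean(axis=0)
    return labels, new_centers
```

Repeating this step I times gives the O(N × K × I) total, so halving N through subsampling halves the work of every iteration.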
6.2 RGB vs HSV vs LAB
| Space | What it represents | Pros | Cons |
|---|---|---|---|
| RGB | Device primaries | Native for images, simple | Not perceptually uniform |
| HSV | Hue/Saturation/Value | Intuitive UI knobs | Still not uniform; hue wrap tricky |
| LAB | Lightness + opponent axes | Approx. perceptual uniformity | Conversion overhead |
Why LAB? Distance in LAB correlates better with human perception, making nearest-color matching far more reliable.
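For readers who want to see the conversion itself, the sketch below implements sRGB to CIELAB for a single color using the standard D65 formulas. It is a scalar illustration, not the vectorized routine the app uses, and the function name srgb_to_lab is an assumption.

```python
import math

def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(v) for v in rgb)
    # Linear sRGB -> XYZ, each channel normalized by the D65 reference white
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

# Compare the same color pair in both spaces
c1, c2 = (0, 0, 255), (0, 0, 200)
rgb_distance = math.dist(c1, c2)
lab_distance = math.dist(srgb_to_lab(c1), srgb_to_lab(c2))
```

Running pairs of colors through both distances is a quick way to see where RGB distance and perceived difference disagree.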
6.3 ΔE Variants
- ΔE76: Euclidean distance in LAB (simple, fast, but less accurate).
- ΔE94: Adds weighting for chroma/lightness differences.
- ΔE2000: Current standard; includes weighting + a rotation term R_T to handle blue region non-linearities.
Interpretation (rules of thumb):
- ΔE < 1: nearly imperceptible
- 1–2: perceptible through close observation
- 2–10: perceptible at a glance
- > 10: large difference
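As a small illustration, ΔE76 and the rule-of-thumb bands above can be written in a few lines. The function names are hypothetical, and how the boundary values 2 and 10 are assigned is a choice made for this sketch.

```python
import math

def delta_e76(lab1, lab2):
    """ΔE76: plain Euclidean distance between two (L, a, b) triples."""
    return math.dist(lab1, lab2)

def describe_delta_e(de):
    """Map a ΔE value to the rule-of-thumb bands listed above."""
    if de < 1:
        return "nearly imperceptible"
    if de <= 2:
        return "perceptible through close observation"
    if de <= 10:
        return "perceptible at a glance"
    return "large difference"
```

The same describe_delta_e helper works for ΔE2000 values, since the bands are stated in ΔE units rather than tied to one formula.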
6.4 Our Vectorized ΔE2000
We use a fully vectorized ΔE2000 implementation so a single LAB color can be compared to the entire Tailwind palette efficiently:
# color_ops.py (excerpt)
import numpy as np

def deltaE2000(lab1: np.ndarray, lab2: np.ndarray) -> np.ndarray:
    """Pairwise ΔE2000 between lab1 (M, 3) and lab2 (N, 3); returns shape (M, N)."""
    L1, a1, b1 = lab1[:, 0:1], lab1[:, 1:2], lab1[:, 2:3]              # columns, (M, 1)
    L2, a2, b2 = lab2[None, :, 0], lab2[None, :, 1], lab2[None, :, 2]  # rows, (1, N)

    # Chroma and the a* rescaling factor G
    C1 = np.sqrt(a1**2 + b1**2); C2 = np.sqrt(a2**2 + b2**2)
    C_bar = (C1 + C2) / 2.0
    C_bar7 = C_bar**7
    G = 0.5 * (1 - np.sqrt(C_bar7 / (C_bar7 + (25.0**7))))
    a1p = (1 + G) * a1; a2p = (1 + G) * a2
    C1p = np.sqrt(a1p**2 + b1**2); C2p = np.sqrt(a2p**2 + b2**2)

    def _atan2(y, x):
        # Hue angle mapped to [0, 2π)
        ang = np.arctan2(y, x)
        return np.where(ang < 0, ang + 2*np.pi, ang)

    h1p = _atan2(b1, a1p); h2p = _atan2(b2, a2p)

    # All differences are taken as (color 2 - color 1) so the signs of dCp and
    # dHp agree in the RT cross-term below.
    dLp = L2 - L1; dCp = C2p - C1p
    dhp = h2p - h1p
    dhp = np.where(dhp > np.pi, dhp - 2*np.pi, dhp)
    dhp = np.where(dhp < -np.pi, dhp + 2*np.pi, dhp)
    dHp = 2.0 * np.sqrt(C1p * C2p) * np.sin(dhp / 2.0)

    # Mean lightness, chroma, and hue (hue averaged across the 0/2π wrap)
    Lp_bar = (L1 + L2) / 2.0; Cp_bar = (C1p + C2p) / 2.0
    hp_bar = (h1p + h2p) / 2.0
    hp_bar = np.where(np.abs(h1p - h2p) > np.pi, hp_bar + np.pi, hp_bar)
    hp_bar = np.where(hp_bar >= 2*np.pi, hp_bar - 2*np.pi, hp_bar)

    # Weighting functions and the rotation term R_T
    T = (1 - 0.17*np.cos(hp_bar - np.deg2rad(30))
         + 0.24*np.cos(2*hp_bar)
         + 0.32*np.cos(3*hp_bar + np.deg2rad(6))
         - 0.20*np.cos(4*hp_bar - np.deg2rad(63)))
    SL = 1 + (0.015 * (Lp_bar - 50)**2) / np.sqrt(20 + (Lp_bar - 50)**2)
    SC = 1 + 0.045 * Cp_bar
    SH = 1 + 0.015 * Cp_bar * T
    delta_theta = np.deg2rad(30) * np.exp(- ((np.rad2deg(hp_bar) - 275) / 25)**2)
    RC = 2 * np.sqrt(Cp_bar**7 / (Cp_bar**7 + 25.0**7))
    RT = -np.sin(2 * delta_theta) * RC

    dE = np.sqrt((dLp / SL)**2 + (dCp / SC)**2 + (dHp / SH)**2 + RT * (dCp / SC) * (dHp / SH))
    return dE.astype(np.float32)
7. Performance & Caching
7.1 Sampling & Downscaling
- Downscale to max_side=1024 (configurable) to cap pixels.
- Subsample to ~400k pixels for k-means input.
7.2 Streamlit Caching
Use @st.cache_resource to store the Tailwind entries and their LAB matrix:
@st.cache_resource
def _load_tailwind_cache():
    entries, lab = build_tailwind_entries_and_lab_remote()
    return entries, lab
This avoids re-downloading and re-converting on every rerun.
7.3 Vectorization Everywhere
- Distance computations are pure NumPy (no Python loops), which is typically far faster than looping over colors in Python.
- Weight computation assigns each (downscaled) pixel to a centroid in one vectorized pass.
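The weight computation can be sketched with np.bincount, assuming labels is the per-pixel cluster index array returned by k-means. The helper name cluster_weights is hypothetical.

```python
import numpy as np

def cluster_weights(labels, k):
    """Fraction of pixels assigned to each of the k clusters."""
    # ravel() accepts both flat labels and the (N, 1) column cv2.kmeans returns
    counts = np.bincount(np.asarray(labels).ravel(), minlength=k)
    return counts / counts.sum()

# Example: order cluster indices from most to least frequent
weights = cluster_weights(np.array([[0], [0], [1], [2]]), k=4)
order = np.argsort(weights)[::-1]
```

minlength=k keeps a slot for clusters that received no pixels, so the weights always line up with the centers array.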
8. Tailwind Palette — Remote Fetch + Fallback
We try the jsDelivr CDN, UNPKG, and GitHub raw in that order; if all of them fail, we fall back to a local subset.
# tailwind_remote-like approach (concept)
CDN_CANDIDATES = [
    "https://cdn.jsdelivr.net/npm/tailwindcss@3.4.10/src/public/colors.js",
    "https://unpkg.com/tailwindcss@3.4.10/src/public/colors.js",
    "https://raw.githubusercontent.com/tailwindlabs/tailwindcss/master/src/public/colors.js",
]
For portability (and Streamlit Cloud), ensure requests and json5 are in requirements.txt.
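The try-in-order behavior can be sketched as follows. To keep the example dependency-free it uses urllib from the standard library rather than requests, and it returns the raw file text without parsing it. The function name and the injectable fetch parameter are assumptions made for illustration.

```python
from urllib.request import urlopen

def fetch_first_available(urls, fetch=None, timeout=5.0):
    """Return (url, text) for the first URL that loads, or (None, None) if all fail."""
    def _default_fetch(url):
        with urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")

    fetch = fetch or _default_fetch
    for url in urls:
        try:
            return url, fetch(url)
        except Exception:  # network error, HTTP error, bad encoding: try the next source
            continue
    return None, None
```

A caller would switch to the local color subset whenever the result is (None, None). Passing a custom fetch function also makes the fallback path easy to exercise in tests without network access.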
9. Accessibility Math (WCAG 2.x)
We compute relative luminance and contrast ratio to choose the ideal text color (black/white) per swatch.
# color_ops.py (excerpt)
def relative_luminance(rgb):
    def _srgb_to_lin(c):
        c = c/255.0
        return c/12.92 if c <= 0.04045 else ((c + 0.055)/1.055)**2.4
    r,g,b = rgb
    R,G,B = _srgb_to_lin(r), _srgb_to_lin(g), _srgb_to_lin(b)
    return 0.2126*R + 0.7152*G + 0.0722*B

def contrast_ratio(rgb1, rgb2):
    L1 = relative_luminance(rgb1); L2 = relative_luminance(rgb2)
    L1, L2 = (L1, L2) if L1 >= L2 else (L2, L1)
    return (L1 + 0.05) / (L2 + 0.05)
Badge mapping used in the UI:
- ≥ 7.0 → AAA
- ≥ 4.5 → AA
- ≥ 3.0 → AA (Large)
- else → N/A
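Put together, the badge thresholds and the black/white text choice might look like the sketch below. The luminance and contrast helpers are repeated in compact form so the block runs on its own; wcag_badge and best_text_color are hypothetical names, not necessarily those used in the app.

```python
def relative_luminance(rgb):
    """WCAG relative luminance of an 8-bit sRGB triple."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (to_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(rgb1, rgb2):
    """WCAG contrast ratio, always >= 1 regardless of argument order."""
    hi, lo = sorted((relative_luminance(rgb1), relative_luminance(rgb2)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

def wcag_badge(ratio):
    """Map a contrast ratio to the badge labels listed above."""
    if ratio >= 7.0:
        return "AAA"
    if ratio >= 4.5:
        return "AA"
    if ratio >= 3.0:
        return "AA (Large)"
    return "N/A"

def best_text_color(bg_rgb):
    """Pick black or white text, whichever contrasts more with the background."""
    black, white = (0, 0, 0), (255, 255, 255)
    return black if contrast_ratio(bg_rgb, black) >= contrast_ratio(bg_rgb, white) else white
```

For a swatch, wcag_badge(contrast_ratio(swatch, best_text_color(swatch))) gives the label for the chosen text color.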
10. Sorting & Semantics
Beyond frequency (weight), we support hue and luminance sorting to make palettes more semantically meaningful.
import colorsys

def rgb_to_hsv_deg(rgb):
    r,g,b = [v/255.0 for v in rgb]
    h,s,v = colorsys.rgb_to_hsv(r,g,b)
    return h*360.0, s, v
This makes it quick to reorder UI tokens: primary → secondary → accent, etc.
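A sort helper built on the same idea could look like this. It is a sketch under assumptions: the function name, the by parameter, and the luminance shortcut (weights applied directly to the 0-255 values, without linearizing) are illustrative choices.

```python
import colorsys

def sort_palette(palette, by="hue"):
    """Return the palette sorted by hue (0-1, ascending) or approximate luminance."""
    def hue(rgb):
        r, g, b = (v / 255.0 for v in rgb)
        return colorsys.rgb_to_hsv(r, g, b)[0]

    def luminance(rgb):
        r, g, b = rgb
        # Approximation: Rec. 709 weights on gamma-encoded values
        return 0.2126 * r + 0.7152 * g + 0.0722 * b

    keys = {"hue": hue, "luminance": luminance}
    return sorted(palette, key=keys[by])
```

Sorting by frequency would instead use the cluster weights as the key, which this helper deliberately leaves out.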
11. Reproducibility & Determinism
- Fix RNG seeds for sampling (seed=42).
- K-means may still vary slightly; consider locking specific centroids in future iterations (advanced feature) for fully deterministic outputs.
12. Edge Cases & Pitfalls
- Grayscale/monochrome images → many centers converge; expect similar hex values. Consider decreasing k automatically when variance is low.
- Highly compressed JPEGs → block artifacts may create false small clusters; slight Gaussian blur can help.
- Overexposed/underexposed images → weights biased toward very light/dark tones; consider histogram clipping.
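One simple guard for the monochrome case is to cap k at the number of distinct colors actually present. The sketch below covers only that extreme case. It does not implement a variance threshold, and the name effective_k is hypothetical.

```python
import numpy as np

def effective_k(pixels, k):
    """Cap k at the number of distinct colors so clusters cannot outnumber them."""
    rows = np.asarray(pixels).reshape(-1, 3)
    distinct = len(np.unique(rows, axis=0))
    return max(1, min(k, distinct))
```

With a flat gray image this returns 1, so k-means is not asked for six centers that would all converge to the same value.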
13. Benchmarking & Profiling
- Use timeit to compare ΔE76 vs ΔE2000 matching speed.
- Profile with cProfile or line_profiler specifically around k-means and distance functions.
- Cache Tailwind LAB and reuse across sessions.
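A small timing helper built on timeit might look like the following. The name time_call and its defaults are assumptions; any distance function can be passed in.

```python
import math
import timeit

def time_call(fn, *args, repeat=5, number=1000):
    """Best-of-`repeat` wall time in seconds for `number` calls of fn(*args)."""
    return min(timeit.repeat(lambda: fn(*args), repeat=repeat, number=number))

# Example: time a plain Euclidean (ΔE76-style) distance on one color pair
elapsed = time_call(math.dist, (50.0, 0.0, 0.0), (50.0, 3.0, 4.0), repeat=3, number=100)
```

Taking the minimum over repeats reduces noise from other processes, which matters when the two functions being compared differ by a small factor.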
14. Deployment Notes (Streamlit Cloud)
- Ensure requirements.txt includes: streamlit, numpy, Pillow, opencv-python-headless, requests, json5.
- No secrets needed. App is fully stateless.
- Test on mobile; the grid is responsive (auto-fit min 220px).
15. Links
- Live App: Real-time Palette Extractor
- GitHub Repo: View on GitHub
- LinkedIn: Ertuğrul Mutlu