From 8 to 13 modules, 4 languages, and a fully responsive mobile layout — the interactive ML playground keeps growing.
What Happened
The last update added 3 modules (Gradient Descent, Confusion Matrix, Overfitting). This time I went bigger: 5 more modules covering logistic regression, PCA, SVM, Naive Bayes, and t-SNE. Plus cross-cutting features: 4-language internationalization and mobile touch support.
The project now has 13 interactive modules, each with 5 progressive levels, a DefinitionGuide, LiveHint messages, and auto-detected completion.
The 5 New Modules
1. Probability Seesaw — Logistic Regression
The metaphor: A seesaw balances two classes. The sigmoid (S-curve) smoothly transitions from "class 0" to "class 1" across the decision boundary.
How it works: The model fits logistic regression using gradient descent with configurable steps and learning rate. The 2D canvas shows:
- A colored probability heatmap (blue = p<0.5, pink = p≥0.5)
- The sigmoid decision boundary (p=0.5 contour) as a dashed yellow line
- Two blob-shaped classes that overlap slightly near the decision boundary
The gradient descent optimizer minimizes log-loss:
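function logisticGradient(points, w0, w1, b) {
  const n = points.length; // was undefined in the original snippet
  let dw0 = 0, dw1 = 0, db = 0;
  for (const p of points) {
    const z = w0 * p.x + w1 * p.y + b;
    const pred = sigmoid(z); // sigmoid(z) = 1 / (1 + Math.exp(-z))
    const error = pred - p.label; // derivative of log-loss with respect to z
    dw0 += error * p.x;
    dw1 += error * p.y;
    db += error;
  }
  // Average the accumulated gradients over all points
  return { dw0: dw0 / n, dw1: dw1 / n, db: db / n };
}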
What you learn: Accuracy alone isn't enough. Log-loss also penalizes predictions that are correct but hesitant, and punishes confident mistakes heavily. Level 5 requires perfect classification and near-zero log-loss, meaning every point is classified correctly with high confidence.
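To make the relationship between the gradient step and log-loss concrete, here is a minimal sketch of a full training loop. The names trainLogistic and logLoss, the default step count and learning rate, and the demo points are illustrative assumptions, not the project's actual code.

```javascript
// Sketch only: batch gradient descent for 2D logistic regression.
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

function logisticGradient(points, w0, w1, b) {
  let dw0 = 0, dw1 = 0, db = 0;
  for (const p of points) {
    const error = sigmoid(w0 * p.x + w1 * p.y + b) - p.label;
    dw0 += error * p.x;
    dw1 += error * p.y;
    db += error;
  }
  const n = points.length;
  return { dw0: dw0 / n, dw1: dw1 / n, db: db / n };
}

// Mean negative log-likelihood; eps keeps Math.log away from 0.
function logLoss(points, w0, w1, b, eps = 1e-12) {
  let total = 0;
  for (const p of points) {
    const raw = sigmoid(w0 * p.x + w1 * p.y + b);
    const pred = Math.min(1 - eps, Math.max(eps, raw));
    total -= p.label * Math.log(pred) + (1 - p.label) * Math.log(1 - pred);
  }
  return total / points.length;
}

// Hypothetical defaults; the playground exposes steps and learning rate as controls.
function trainLogistic(points, steps = 500, learningRate = 0.5) {
  let w0 = 0, w1 = 0, b = 0;
  for (let i = 0; i < steps; i++) {
    const g = logisticGradient(points, w0, w1, b);
    w0 -= learningRate * g.dw0;
    w1 -= learningRate * g.dw1;
    b -= learningRate * g.db;
  }
  return { w0, w1, b, loss: logLoss(points, w0, w1, b) };
}

// Made-up, linearly separable demo data.
const demoPoints = [
  { x: -2, y: -1, label: 0 }, { x: -1, y: -2, label: 0 }, { x: -1.5, y: -0.5, label: 0 },
  { x: 2, y: 1, label: 1 }, { x: 1, y: 2, label: 1 }, { x: 1.5, y: 0.5, label: 1 },
];
const model = trainLogistic(demoPoints);
console.log(model);
```

With all weights at zero every prediction is 0.5, so the starting loss is ln 2 ≈ 0.693. Training should push the loss well below that baseline on separable data like this.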
2. Shadow Puppets — PCA
The metaphor: Each data point casts a "shadow" (perpendicular projection) onto a line. As you rotate the line, the shadows spread out or bunch up. The direction with the most spread is the first principal component.
How it works: PCA is computed analytically via the covariance matrix:
function computePCA(points) {
  // Center the data, compute 2x2 covariance matrix
  // Eigen-decompose: θ = 0.5 * atan2(2*vxy, vxx - vyy)
  // Explained variance = λ1 / (λ1 + λ2)
}
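The outline above can be filled in along these lines. This is a sketch that assumes points are { x, y } objects and uses the closed-form eigenvalues of a symmetric 2×2 matrix; the returned field names are illustrative, not the project's.

```javascript
// Sketch only: analytic PCA for 2D points.
function computePCA(points) {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;

  // Entries of the 2x2 covariance matrix [[vxx, vxy], [vxy, vyy]]
  let vxx = 0, vyy = 0, vxy = 0;
  for (const p of points) {
    const dx = p.x - meanX;
    const dy = p.y - meanY;
    vxx += dx * dx;
    vyy += dy * dy;
    vxy += dx * dy;
  }
  vxx /= n;
  vyy /= n;
  vxy /= n;

  // Angle of the first principal axis
  const theta = 0.5 * Math.atan2(2 * vxy, vxx - vyy);

  // Eigenvalues of a symmetric 2x2 matrix: half-trace plus/minus the spread
  const halfTrace = (vxx + vyy) / 2;
  const spread = Math.hypot((vxx - vyy) / 2, vxy);
  const lambda1 = halfTrace + spread;
  const lambda2 = halfTrace - spread;
  const total = lambda1 + lambda2;

  return {
    mean: { x: meanX, y: meanY },
    theta,
    lambda1,
    lambda2,
    explained: total > 0 ? lambda1 / total : 0,
  };
}

// Points on the line y = x: all variance lies along the 45° direction.
const diagonal = [{ x: 0, y: 0 }, { x: 1, y: 1 }, { x: 2, y: 2 }, { x: 3, y: 3 }];
const pca = computePCA(diagonal);
console.log(pca.theta, pca.explained);
```

For the diagonal example, theta comes out as π/4 and the explained variance as 1, since the second eigenvalue is zero.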
The user drags a yellow handle on the PC line to rotate it. Real-time feedback:
- Projection lines (green dashed) connect each point to its shadow
- Projected points (green circles) slide along the line
- Variance bar fills up as the user approaches the true PC1
- Error angle shows degrees away from the optimal direction
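The feedback values in the list above could be computed roughly as follows. This is a sketch under assumptions: projectOntoLine and errorAngleDegrees are hypothetical helper names, and the line is assumed to pass through the data mean.

```javascript
// Sketch only: shadows and feedback metrics for a user-chosen line angle.
function projectOntoLine(points, angle) {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  const ux = Math.cos(angle);
  const uy = Math.sin(angle);

  let totalVariance = 0;
  let projectedVariance = 0;
  const shadows = points.map((p) => {
    const dx = p.x - meanX;
    const dy = p.y - meanY;
    const t = dx * ux + dy * uy; // 1D coordinate of the shadow along the line
    totalVariance += dx * dx + dy * dy;
    projectedVariance += t * t;
    return { t, x: meanX + t * ux, y: meanY + t * uy };
  });

  return {
    shadows,
    varianceRatio: totalVariance > 0 ? projectedVariance / totalVariance : 0,
  };
}

// A line at angle a is the same line at a + 180°, so fold the difference into [0°, 90°].
function errorAngleDegrees(angle, optimalAngle) {
  let diff = Math.abs(angle - optimalAngle) % Math.PI;
  if (diff > Math.PI / 2) diff = Math.PI - diff;
  return (diff * 180) / Math.PI;
}

const shadowDemo = [{ x: 0, y: 0 }, { x: 1, y: 1 }, { x: 2, y: 2 }, { x: 3, y: 3 }];
console.log(projectOntoLine(shadowDemo, Math.PI / 4).varianceRatio);
console.log(errorAngleDegrees(0, Math.PI / 4));
```

Each shadow's t value is the single number that stands in for the original 2D point, which is the dimensionality reduction the module is illustrating.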
What you learn: PCA finds the direction of maximum variance. The "shadow" visualization makes dimensionality reduction concrete — you're literally seeing how a 2D point can be represented by a single number (its projection coordinate).
3. Tug-of-War — SVM
The metaphor: The decision boundary is like a rope in a tug-of-war. The outermost points (support vectors) are the teams pulling on the rope — they determine where the boundary lies. Remove a support vector and the line shifts. Add a point near the boundary and the margin shrinks.
How it works: The SVM solver randomly samples pairs of points from opposite classes and checks if the perpendicular bisector separates all the data. Among all separating lines, it picks the one with the maximum margin:
function computeSVM(points) {
  // Randomly sample {class0, class1} pairs as candidate boundaries
  // For each pair: compute perpendicular bisector
  // Check if it separates all points (with tolerance)
  // Pick the one with the largest minimum distance to any point
  // Return { wx, wy, b, margin, supportVectors }
}
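This is a randomized approximation of the true hard-margin SVM. For small, cleanly separable 2D datasets it usually settles on a boundary at or near the optimum within a few hundred iterations. It is not guaranteed to be exact, because the perpendicular bisector of two points cannot always match the true maximum-margin line (for example, when three support vectors pin the boundary).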
Support vectors are identified as points within 1.15× the margin distance — they glow with a yellow ring that makes them stand out visually.
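A minimal sketch of the randomized search described above. It assumes points are { x, y, label } objects with label 0 or 1, leaves out the separation tolerance the project mentions, and uses an illustrative default of 300 iterations; the rng parameter is a hypothetical addition for testing.

```javascript
// Sketch only: randomized perpendicular-bisector search for a max-margin line.
function computeSVM(points, iterations = 300, rng = Math.random) {
  const class0 = points.filter((p) => p.label === 0);
  const class1 = points.filter((p) => p.label === 1);
  if (class0.length === 0 || class1.length === 0) return null;

  let best = null;
  for (let i = 0; i < iterations; i++) {
    const a = class0[Math.floor(rng() * class0.length)];
    const c = class1[Math.floor(rng() * class1.length)];
    const length = Math.hypot(c.x - a.x, c.y - a.y);
    if (length === 0) continue;

    // Perpendicular bisector: unit normal points from a to c, line passes through the midpoint.
    const wx = (c.x - a.x) / length;
    const wy = (c.y - a.y) / length;
    const b = -(wx * (a.x + c.x) / 2 + wy * (a.y + c.y) / 2);

    // Smallest signed distance; it is positive only if every point is on its own side.
    let margin = Infinity;
    for (const p of points) {
      const side = p.label === 1 ? 1 : -1;
      margin = Math.min(margin, side * (wx * p.x + wy * p.y + b));
    }
    if (margin > 0 && (best === null || margin > best.margin)) {
      best = { wx, wy, b, margin };
    }
  }
  if (best === null) return null; // no sampled bisector separated the classes

  // Points within 1.15x the margin distance count as support vectors.
  const supportVectors = points.filter(
    (p) => Math.abs(best.wx * p.x + best.wy * p.y + best.b) <= 1.15 * best.margin
  );
  return { ...best, supportVectors };
}

// Made-up demo: two vertical columns of points, four units apart.
const svmDemo = [
  { x: 0, y: -1, label: 0 }, { x: 0, y: 0, label: 0 }, { x: 0, y: 1, label: 0 },
  { x: 4, y: -1, label: 1 }, { x: 4, y: 0, label: 1 }, { x: 4, y: 1, label: 1 },
];
const svm = computeSVM(svmDemo);
console.log(svm);
```

In this demo the best possible line is x = 2 with a margin of 2. The search finds it whenever it samples a pair of points at the same height, which is very likely over 300 draws but not guaranteed.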
Interactive features:
- Click empty space → add a point (alternates between classes for balance)
- Click a point → remove it
- The margin and support vector count update instantly
What you learn: Only support vectors matter for the decision boundary. Non-support-vector points can be moved freely without changing the boundary, as long as they stay on their own side and outside the margin.
4. Spam or Not — Naive Bayes
The metaphor: You're a mail sorter. Each email has some words written on it. You drag it into the "Spam" or "Not Spam" bin. Every time you sort an email, the probability of each word given spam/not-spam updates — that's Bayes' Theorem in action.
How it works: Each email is generated with 2-5 words from a vocabulary of 12 words (mix of spammy words like "free", "win", "cash" and benign ones like "hello", "meeting", "report"). The user drags cards into bins using D3 drag-and-drop.
Word probabilities use Laplace smoothing:
wordProbs[word] = {
  givenSpam: (countInSpam + 1) / (totalSpam + 2),
  givenNot: (countInNot + 1) / (totalNot + 2),
};
The sidebar shows the top "spam words" ranked by the ratio P(word|spam) / P(word|not-spam). This ratio is the core of Naive Bayes — words with high ratios are strong spam indicators.
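Here is a sketch of how the smoothed probabilities and the ranking could fit together. It assumes totalSpam and totalNot are the numbers of emails sorted into each bin (a presence/absence model per word), which the snippet above does not state explicitly. The function names and sample emails are made up for illustration.

```javascript
// Sketch only: Laplace-smoothed word probabilities and spam-ratio ranking.
function computeWordProbs(emails, vocabulary) {
  const spam = emails.filter((e) => e.label === "spam");
  const notSpam = emails.filter((e) => e.label === "not");
  const wordProbs = {};
  for (const word of vocabulary) {
    const countInSpam = spam.filter((e) => e.words.includes(word)).length;
    const countInNot = notSpam.filter((e) => e.words.includes(word)).length;
    wordProbs[word] = {
      givenSpam: (countInSpam + 1) / (spam.length + 2),
      givenNot: (countInNot + 1) / (notSpam.length + 2),
    };
  }
  return wordProbs;
}

// Highest P(word|spam) / P(word|not-spam) first.
function rankSpamWords(wordProbs) {
  return Object.entries(wordProbs)
    .map(([word, p]) => ({ word, ratio: p.givenSpam / p.givenNot }))
    .sort((a, b) => b.ratio - a.ratio);
}

const sortedEmails = [
  { words: ["free", "win"], label: "spam" },
  { words: ["free", "cash"], label: "spam" },
  { words: ["hello", "meeting"], label: "not" },
  { words: ["report", "hello", "free"], label: "not" },
];
const vocabulary = ["free", "win", "cash", "hello", "meeting", "report"];
const wordProbs = computeWordProbs(sortedEmails, vocabulary);
const ranked = rankSpamWords(wordProbs);
console.log(ranked);
```

Because of the +1 in the numerator, a word that has never appeared in a bin still gets a nonzero probability, so the ratio never divides by zero. In this sample, "free" appears in both bins and ends up with a ratio of 1.5, below "win" and "cash" at 2.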
What you learn: Naive Bayes assumes word independence (which is naive — "free" and "win" often co-occur). Despite this, the classifier works because it only needs the relative ordering of probabilities, not exact calibration.
5. Unfolding Origami — t-SNE / UMAP
The metaphor: High-dimensional data is like a crumpled piece of origami paper. t-SNE unfolds it into 2D, revealing the hidden structure — but the unfolding can look different depending on how you hold it (the perplexity parameter).
How it works: This is a simplified t-SNE implementation that runs end to end in the browser:
- Generate 6D data with 4 distinct clusters
- Compute pairwise Euclidean distances between all points
- Binary search for σ per point to achieve target perplexity (Gaussian kernel)
- Initialize random 2D positions
- Optimize KL divergence via gradient descent with momentum
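The per-point σ search in the steps above can be sketched like this. It searches over beta = 1 / (2σ²), which is equivalent to searching over σ, and measures perplexity as exp(entropy) in nats. All function names, the tolerance, and the iteration cap are illustrative assumptions.

```javascript
// Sketch only: bisection search for the Gaussian bandwidth that hits a target perplexity.
function pairwiseSquaredDistances(data) {
  return data.map((a) => data.map((b) => a.reduce((sum, v, k) => sum + (v - b[k]) ** 2, 0)));
}

// Conditional probabilities p(j|i) for one row, with p(i|i) = 0.
function conditionalProbabilities(sqDistances, i, beta) {
  const row = sqDistances[i];
  let nearest = Infinity;
  for (let j = 0; j < row.length; j++) {
    if (j !== i) nearest = Math.min(nearest, row[j]);
  }
  const p = new Array(row.length).fill(0);
  let sum = 0;
  for (let j = 0; j < row.length; j++) {
    if (j === i) continue;
    // Subtracting the nearest distance avoids underflow without changing the ratios.
    p[j] = Math.exp(-beta * (row[j] - nearest));
    sum += p[j];
  }
  return p.map((v) => v / sum);
}

function perplexityOf(p) {
  let entropy = 0;
  for (const v of p) {
    if (v > 0) entropy -= v * Math.log(v);
  }
  return Math.exp(entropy);
}

function findBeta(sqDistances, i, targetPerplexity, tolerance = 1e-4, maxIterations = 50) {
  let lo = 0;
  let hi = Infinity;
  let beta = 1;
  let p = conditionalProbabilities(sqDistances, i, beta);
  for (let iter = 0; iter < maxIterations; iter++) {
    const perplexity = perplexityOf(p);
    if (Math.abs(perplexity - targetPerplexity) < tolerance) break;
    if (perplexity > targetPerplexity) {
      // Too many effective neighbors: narrow the kernel (raise beta).
      lo = beta;
      beta = hi === Infinity ? beta * 2 : (lo + hi) / 2;
    } else {
      // Too few effective neighbors: widen the kernel (lower beta).
      hi = beta;
      beta = (lo + hi) / 2;
    }
    p = conditionalProbabilities(sqDistances, i, beta);
  }
  return { beta, sigma: Math.sqrt(1 / (2 * beta)), p };
}

// Ten evenly spaced 1D points; tune point 4 to a perplexity of 3.
const tsneLine = Array.from({ length: 10 }, (_, k) => [k]);
const fit = findBeta(pairwiseSquaredDistances(tsneLine), 4, 3);
console.log(fit.sigma, perplexityOf(fit.p));
```

The search works because perplexity falls monotonically as beta grows. The target has to sit below the number of other points (the perplexity of a uniform distribution), otherwise no bandwidth can reach it.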
The KL divergence gradient:
// For each pair (i, j):
// p_ij = high-D affinity (Gaussian, normalized)
// q_ij = low-D affinity (Cauchy, normalized)
// gradient = 4 * Σ_j (p_ij - q_ij) * (y_i - y_j) * (1 + ||y_i - y_j||²)⁻¹
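As a rough illustration, the gradient formula above translates into code like the following. It assumes P is a symmetric matrix of joint affinities that sums to 1 with a zero diagonal, and Y is an array of [x, y] positions. The names tsneGradient and tsneStep, the learning rate, and the momentum value are assumptions for the sketch.

```javascript
// Sketch only: KL divergence and its gradient for a 2D t-SNE embedding.
function tsneGradient(P, Y) {
  const n = Y.length;
  // Unnormalized heavy-tailed weights w_ij = 1 / (1 + ||y_i - y_j||^2)
  const W = Array.from({ length: n }, () => new Array(n).fill(0));
  let sumW = 0;
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const dx = Y[i][0] - Y[j][0];
      const dy = Y[i][1] - Y[j][1];
      const w = 1 / (1 + dx * dx + dy * dy);
      W[i][j] = w;
      W[j][i] = w;
      sumW += 2 * w;
    }
  }

  const grad = Y.map(() => [0, 0]);
  let kl = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      if (i === j) continue;
      const q = Math.max(W[i][j] / sumW, 1e-12); // q_ij, clamped for the log
      const scale = 4 * (P[i][j] - q) * W[i][j];
      grad[i][0] += scale * (Y[i][0] - Y[j][0]);
      grad[i][1] += scale * (Y[i][1] - Y[j][1]);
      if (P[i][j] > 0) kl += P[i][j] * Math.log(P[i][j] / q);
    }
  }
  return { grad, kl };
}

// One momentum update; mutates Y and velocity in place and returns the KL before the step.
function tsneStep(P, Y, velocity, learningRate = 1, momentum = 0.8) {
  const { grad, kl } = tsneGradient(P, Y);
  for (let i = 0; i < Y.length; i++) {
    for (let d = 0; d < 2; d++) {
      velocity[i][d] = momentum * velocity[i][d] - learningRate * grad[i][d];
      Y[i][d] += velocity[i][d];
    }
  }
  return kl;
}

// Tiny made-up example: three points, symmetric P summing to 1.
const demoP = [[0, 0.2, 0.1], [0.2, 0, 0.2], [0.1, 0.2, 0]];
const demoY = [[0, 0], [1, 0], [0, 2]];
const demoVelocity = demoY.map(() => [0, 0]);
for (let step = 0; step < 50; step++) tsneStep(demoP, demoY, demoVelocity);
console.log(tsneGradient(demoP, demoY).kl);
```

The (1 + ||y_i - y_j||²)⁻¹ factor appears twice in practice: once inside q_ij and once as the explicit multiplier in the gradient, which is why the sketch caches it in W.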
The animation runs at ~20ms per iteration using requestAnimationFrame, completing 300 iterations in about 6 seconds. Users can adjust perplexity from 5 to 50 and watch how the embedding changes.
What you learn: Perplexity controls the balance between local and global structure. Low perplexity (5) shows fine-grained clusters. High perplexity (50) shows the overall layout. There's no single "correct" t-SNE — different perplexities reveal different aspects of the data.
Cross-Cutting Features
Multi-Language i18n (4 Languages)
The project now supports English, Spanish, French, and Chinese. The implementation uses:
- src/utils/i18n.js: A dictionary of translation strings organized by namespace (nav.*, levels.*, guide.*, controls.*). The t(key, locale) function does dot-notation lookup with graceful fallback to the key itself.
- src/utils/LanguageContext.jsx: A React context that stores the current locale and syncs it to localStorage. Wraps the entire app tree in App.jsx.
- src/components/LanguageSwitcher.jsx: Four pill-buttons (EN, ES, FR, 中文) in the sidebar footer. Active language is highlighted with the accent color.
- Component integration: LevelSystem.jsx and DefinitionGuide.jsx use useLanguage() to translate their labels. All 13 workbenches pass t("levels.title") etc. through the shared components.
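The dot-notation lookup with fallback to the key might look something like this. It is a sketch, not the contents of src/utils/i18n.js: the translations object, its sample strings, and the lookup helper are made up for illustration.

```javascript
// Sketch only: dot-notation translation lookup that falls back to the key.
// Sample dictionary; the real one covers four locales and more namespaces.
const translations = {
  en: { nav: { home: "Home" }, levels: { title: "Levels" } },
  es: { levels: { title: "Niveles" } },
};

// Walk "a.b.c" through nested objects; returns undefined if any part is missing.
function lookup(dictionary, key) {
  return key
    .split(".")
    .reduce(
      (node, part) => (node != null && typeof node === "object" ? node[part] : undefined),
      dictionary
    );
}

function t(key, locale = "en") {
  const value = lookup(translations[locale], key);
  // Missing locale, missing key, or a namespace object all fall back to the key itself.
  return typeof value === "string" ? value : key;
}

console.log(t("levels.title", "es"));
console.log(t("nav.home", "es"));
```

Falling back to the key means a missing translation shows up as visible text like nav.home rather than a blank label, which makes gaps easy to spot while translating.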
The translation process for each locale took about 30 minutes — mostly getting technical ML terms right in each language ("support vectors", "decision boundary", "overfitting").
Mobile Touch Support
The previous fixed-width 700×500 canvas layout was unusable on mobile. The changes:
- Responsive sidebar: Switches from a fixed left panel to a slide-over drawer triggered by a hamburger icon. A semi-transparent backdrop overlay closes the sidebar on tap.
- Touch-action CSS: touch-action: none on SVG elements prevents browser default gestures (scroll, zoom) from interfering with drag interactions.
- Range slider styling: Custom -webkit-appearance: none styles with larger thumb targets (16px) for finger-friendly interaction.
- Overflow handling: Canvas containers use overflow-x: auto on mobile with max-width: 100% on the SVG. Users can scroll horizontally if the canvas exceeds the viewport.
D3's d3.drag() already handles touch events natively — it translates touch coordinates to the same event format as mouse coordinates, so all existing drag interactions (PCA rotation, sorting machine cards, K-Means magnet placement) work without modification.
What 13 Modules Looks Like
workbenches/
├── LinearRegression/    Stretchy Rope        (D3 + OLS)
├── KMeans/              Magnetic Clusters    (D3 drag + forces)
├── DecisionTrees/       20 Questions         (D3 interactive tree)
├── Ensemble/            Jury Room            (D3 decision surface)
├── NeuralNetworks/      Lego Blocks          (Three.js 3D)
├── GradientDescent/     Roller Coaster       (D3 contour + animation)
├── ConfusionMatrix/     Sorting Machine      (D3 drag-drop)
├── Overfitting/         Emperor's Tailor     (D3 polynomial fit)
├── LogisticRegression/  Probability Seesaw   (D3 + GD optimizer)
├── PCA/                 Shadow Puppets       (D3 drag-rotate)
├── SVM/                 Tug-of-War           (D3 interactive)
├── NaiveBayes/          Spam or Not          (D3 drag-sort)
└── TSNE/                Unfolding Origami    (D3 + KL optimization)
Every module follows the same structural pattern: a single self-contained JSX file, D3 for rendering, useLevelSystem for progression, DefinitionGuide + LiveHint for inline education.
Reflections
What Worked
The t-SNE module was the hardest but most rewarding. Implementing KL divergence optimization from scratch in the browser was genuinely challenging — binary searching per-point σ, handling numerical stability in the gradient, making the animation smooth at 50fps. But watching real 6D → 2D unfolding in real-time is spectacular.
The SVM "click to add/remove" interaction is surprisingly addictive. There's something satisfying about placing a point right on the margin boundary and watching the line recoil. It makes the concept of support vectors visceral in a way that static diagrams never could.
i18n barely increased the bundle size. The translation dictionaries add about 3KB gzipped. The LanguageContext and LanguageSwitcher add negligible overhead. Worth doing early.
What I'd Improve
t-SNE performance. At 48 points with 300 iterations, the optimization takes ~6 seconds. For a production-quality experience, I'd want 100+ points in under 2 seconds. Using a Web Worker for the optimization loop would keep the UI responsive.
Mobile canvas sizing. The fixed 700×500 canvas works on desktop but requires horizontal scroll on phones. A better approach would use SVG viewBox with responsive width/height that scales to fit the viewport.
SVM solver robustness. The randomized solver works for well-separated data but can fail with overlapping classes (no perfect linear separator exists). A fallback to a soft-margin formulation would make it more robust.
What's Next?
The Ideas for New Features table in the README now has 13 new suggestions:
- K-Nearest Neighbors — click a point, see its K nearest neighbors vote
- DBSCAN — adjust density radius, watch clusters form
- Random Forest — a forest of stumps voting together
- Gradient Boosting — each tree fixes the previous one's mistakes
- Reinforcement Learning — a rat learning a maze
- Attention Mechanism — spotlight on words
- GANs — forger vs detective
Plus activation functions, regularization, cross-validation, feature engineering, batch normalization, and collaborative filtering.
The door is always open for contributions.
Try It
git clone https://github.com/harishkotra/play-learn-ml.git
cd play-learn-ml
npm install && npm run dev
The project is free, open-source, and runs entirely in the browser. No backend, no API keys, no sign-up.
Screenshots
Try it here: https://play-learn-ml.vercel.app/
Code & more: https://www.dailybuild.xyz/project/163-play-learn-ml














