<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tomohisa Iura</title>
    <description>The latest articles on DEV Community by Tomohisa Iura (@tomoiura).</description>
    <link>https://dev.to/tomoiura</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3883635%2Fea5dc161-f2af-456c-8d7d-298d7c673a71.png</url>
      <title>DEV Community: Tomohisa Iura</title>
      <link>https://dev.to/tomoiura</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tomoiura"/>
    <language>en</language>
    <item>
      <title>Draw a Digit and Watch the Neural Network Think in Real Time</title>
      <dc:creator>Tomohisa Iura</dc:creator>
      <pubDate>Fri, 17 Apr 2026 04:56:35 +0000</pubDate>
      <link>https://dev.to/tomoiura/draw-a-digit-and-watch-the-neural-network-think-in-real-time-3oe8</link>
      <guid>https://dev.to/tomoiura/draw-a-digit-and-watch-the-neural-network-think-in-real-time-3oe8</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;"A neural network can recognize digits" — but what's actually happening inside?&lt;/p&gt;

&lt;p&gt;I built a tool where you &lt;strong&gt;draw a digit with your finger or mouse, and watch the CNN (Convolutional Neural Network) recognize it in real time&lt;/strong&gt;, with the internal signal flow visualized as it happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://tomoiura.github.io/digit_recognizer/" rel="noopener noreferrer"&gt;Try the Demo&lt;/a&gt;&lt;/strong&gt; (runs in your browser — no install needed)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u5y3vradaxau75tyx5d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u5y3vradaxau75tyx5d.png" alt="screenshot" width="800" height="864"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/Tomoiura/digit_recognizer" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is This?
&lt;/h2&gt;

&lt;p&gt;A tool that lets you see &lt;strong&gt;how a neural network makes its decisions&lt;/strong&gt; as you draw handwritten digits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three Visualizations
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dial-style Heatmap&lt;/strong&gt; — Digits 0–9 arranged like a phone dial, with color intensity showing confidence in real time. As you draw, you can see the network thinking: "looks like an 8... wait, now it's a 3."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Network Diagram&lt;/strong&gt; — Input → Conv1 → Conv2 → FC → Output nodes and links light up orange based on signal strength. You can trace exactly which pathways the signal took to reach the answer.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CNN Input Preview&lt;/strong&gt; — Shows how your drawing gets downscaled to 28×28 pixels. This is what the network actually "sees."&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Not an Emulation — The Real Thing
&lt;/h3&gt;

&lt;p&gt;This is not a simulation or replay. A &lt;strong&gt;real CNN with 27,690 parameters&lt;/strong&gt; is running in your browser. Every time you draw a stroke, actual convolutions, ReLU activations, max-pooling, and fully-connected layer computations are executed, and the intermediate values are visualized directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Built This
&lt;/h2&gt;

&lt;p&gt;My previous project, &lt;a href="https://github.com/Tomoiura/transformer-emulator" rel="noopener noreferrer"&gt;Transformer Emulator&lt;/a&gt;, visualized the internals of a Transformer. But that was a "watch" experience — replaying pre-computed results.&lt;/p&gt;

&lt;p&gt;This time, I wanted a &lt;strong&gt;"touch" experience&lt;/strong&gt;. You draw a digit, and the network reacts instantly. The probabilities shift as you draw. The moment when "I'm drawing a 3 but the network thinks it's an 8" — that's something no textbook can give you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happens While You Draw
&lt;/h2&gt;

&lt;p&gt;On every &lt;code&gt;pointermove&lt;/code&gt; event while drawing, the following pipeline runs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Canvas → 28×28 downscale&lt;/strong&gt; — Bounding box detection, center-of-mass alignment. Same preprocessing as MNIST.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CNN inference (JavaScript)&lt;/strong&gt; — Conv → ReLU → MaxPool → Conv → ReLU → MaxPool → FC → Softmax. Pure matrix operations in vanilla JavaScript.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visualization update&lt;/strong&gt; — Intermediate activations from each layer drive the dial colors and network diagram node/link brightness.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For a CNN this size on 28×28 input, inference completes in &lt;strong&gt;a few milliseconds&lt;/strong&gt; — fast enough to run on every stroke without dropping frames.&lt;/p&gt;
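The ReLU and softmax steps in the pipeline above reduce to a few lines of arithmetic. A minimal Python sketch of just that math (the project itself runs the equivalent in vanilla JavaScript, so this is illustrative, not the actual implementation):

```python
# Illustrative Python versions of the element-wise math in step 2.
import math

def relu(xs):
    # Zero out negative activations.
    return [max(0.0, x) for x in xs]

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(relu([2.0, -1.0, 0.5]))
print(sum(probs))  # the class probabilities always sum to 1
```

The max-subtraction in `softmax` matters even at this scale: exponentiating raw logits can overflow, while shifted logits cannot.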

&lt;h2&gt;
  
  
  Things I Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. CNNs "Hesitate"
&lt;/h3&gt;

&lt;p&gt;Watching the probability shift while drawing a "3":&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Drawing stage&lt;/th&gt;
&lt;th&gt;Prediction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Drew a vertical line&lt;/td&gt;
&lt;td&gt;1: 30%, 7: 25%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Closed the top curve&lt;/td&gt;
&lt;td&gt;8: 55%, 9: 20%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opened the bottom&lt;/td&gt;
&lt;td&gt;3: 60%, 8: 22%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Finished drawing&lt;/td&gt;
&lt;td&gt;3: 92%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The intermediate states genuinely look like an 8. The CNN's "hesitation" matches human intuition — its partial guesses are reasonable given the strokes on the canvas so far.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Preprocessing Makes or Breaks Accuracy
&lt;/h3&gt;

&lt;p&gt;In the first version, a drawn "4" kept getting classified as a "7". The cause was missing preprocessing: MNIST data is center-of-mass aligned, but I was naively downscaling the canvas to 28×28. Adding MNIST-compliant preprocessing (bounding box detection → center alignment → fit into a 20×20 region) fixed it immediately.&lt;/p&gt;
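A minimal sketch of that preprocessing in NumPy, assuming a 2-D grayscale array with ink values greater than zero. The helper name and the nearest-neighbor resize are mine, chosen for brevity, not the project's actual code:

```python
# Crop to the ink's bounding box, resize the crop so it fits a 20x20
# region, then paste into a 28x28 canvas with the center of mass at (14, 14).
import numpy as np

def center_like_mnist(img):
    ys, xs = np.nonzero(img)
    crop = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # Scale so the longer side of the crop becomes 20 pixels
    # (nearest-neighbor resampling, for brevity).
    scale = 20.0 / max(crop.shape)
    h = max(1, round(crop.shape[0] * scale))
    w = max(1, round(crop.shape[1] * scale))
    rows = (np.arange(h) / scale).astype(int).clip(0, crop.shape[0] - 1)
    cols = (np.arange(w) / scale).astype(int).clip(0, crop.shape[1] - 1)
    small = crop[np.ix_(rows, cols)]

    # Shift so the ink's center of mass lands on the center pixel.
    ys2, xs2 = np.nonzero(small)
    weights = small[ys2, xs2].astype(float)
    cy = (ys2 * weights).sum() / weights.sum()
    cx = (xs2 * weights).sum() / weights.sum()
    top = min(max(int(round(14 - cy)), 0), 28 - h)
    left = min(max(int(round(14 - cx)), 0), 28 - w)

    out = np.zeros((28, 28), dtype=img.dtype)
    out[top:top + h, left:left + w] = small
    return out
```

With something like this in place, a digit drawn in a corner of the canvas reaches the network in the same centered framing as the MNIST training data.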

&lt;h3&gt;
  
  
  3. 27,690 Parameters, 98% Accuracy
&lt;/h3&gt;

&lt;p&gt;GPT-4 reportedly has ~1.8 trillion parameters. This CNN is about &lt;strong&gt;1/65,000,000th&lt;/strong&gt; of that size. Yet it achieves 98.04% test accuracy. "Choose the right architecture (convolutions) and you can get high accuracy with minimal parameters" — this is the essence of CNNs, and here you can feel it firsthand.&lt;/p&gt;
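The 27,690 figure checks out from the layer sizes (weights plus biases per layer, for the architecture described in the Model Architecture section; the 400 is the flattened 5×5×16 feature map left after the second pooling):

```python
# Parameter count: weights + biases for each trainable layer.
conv1 = 5 * 5 * 1 * 8 + 8       # 208: 5x5 kernels over 1 input channel, 8 filters
conv2 = 3 * 3 * 8 * 16 + 16     # 1,168: 3x3 kernels over 8 channels, 16 filters
fc1   = 400 * 64 + 64           # 25,664: flattened 5x5x16 feature map -> 64 units
fc2   = 64 * 10 + 10            # 650: 64 units -> 10 digit classes
print(conv1 + conv2 + fc1 + fc2)  # -> 27690
```

Notice that the small fully-connected layer still holds over 90% of the parameters; the convolutions do most of the feature extraction almost for free.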

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Training&lt;/td&gt;
&lt;td&gt;Python / pure NumPy&lt;/td&gt;
&lt;td&gt;No PyTorch; all backpropagation implemented from scratch, for educational purposes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inference&lt;/td&gt;
&lt;td&gt;Vanilla JavaScript&lt;/td&gt;
&lt;td&gt;Runs entirely in the browser. No external libraries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Visualization&lt;/td&gt;
&lt;td&gt;SVG + Canvas + CSS&lt;/td&gt;
&lt;td&gt;Network diagram in SVG, drawing and preview in Canvas&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;Single HTML file (~620KB)&lt;/td&gt;
&lt;td&gt;Trained weights embedded as JSON. Easy to distribute&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Model Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Conv(5x5, 8ch) → ReLU → MaxPool(2)    # Detect 8 types of features from the image
Conv(3x3, 16ch) → ReLU → MaxPool(2)   # Combine into 16 higher-level features
Flatten(400) → FC(64) → ReLU          # Integrate all features for judgment
FC(10) → Softmax                       # Output probabilities for 0–9
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
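Assuming "valid" convolutions (stride 1, no padding) and non-overlapping 2×2 pooling, the spatial sizes work out to exactly the 400-dimensional flatten above — a quick sanity check:

```python
def conv_out(size, kernel):
    # Output side length of a "valid" convolution with stride 1.
    return size - kernel + 1

def pool_out(size, window=2):
    # Output side length of non-overlapping max-pooling.
    return size // window

side = 28                           # the downscaled input
side = pool_out(conv_out(side, 5))  # Conv(5x5) -> MaxPool(2): 24 -> 12
side = pool_out(conv_out(side, 3))  # Conv(3x3) -> MaxPool(2): 10 -> 5
print(side * side * 16)             # -> 400 (16 channels of 5x5 maps)
```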



&lt;h3&gt;
  
  
  Network Diagram Implementation
&lt;/h3&gt;

&lt;p&gt;Nodes for each layer are placed in SVG, with &lt;code&gt;&amp;lt;line&amp;gt;&lt;/code&gt; elements connecting adjacent layers. During inference, activation values update each node's &lt;code&gt;fill&lt;/code&gt; and each link's &lt;code&gt;stroke-opacity&lt;/code&gt;, making the signal flow visible.&lt;/p&gt;

&lt;p&gt;There are 552 links total, but most have &lt;code&gt;opacity&lt;/code&gt; near 0 — visually, only the active pathways light up.&lt;/p&gt;
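One way to get that "only active pathways light up" effect is to normalize each link's activation by its layer's maximum and snap small values to zero. A hypothetical helper (the names and the floor value are mine, not the project's):

```python
def link_opacity(activation, layer_max, floor=0.05):
    # Normalize by the strongest activation in the layer so opacities
    # span [0, 1]; anything below the floor becomes fully transparent.
    if layer_max <= 0:
        return 0.0
    opacity = max(0.0, activation) / layer_max
    return opacity if opacity >= floor else 0.0

print(link_opacity(0.9, 1.8))   # strong pathway -> 0.5
print(link_opacity(0.01, 1.8))  # weak pathway -> 0.0
```

Flooring to exactly zero, rather than leaving a faint trace, is what keeps a diagram with hundreds of links readable.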

&lt;h2&gt;
  
  
  Multilingual Support
&lt;/h2&gt;

&lt;p&gt;A toggle button next to the title switches between Japanese and English. The initial language is auto-detected from the browser's language setting, and can also be set via URL parameter (&lt;code&gt;?lang=en&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Since there are few text elements, a JS dictionary holds both languages and a button click swaps all text instantly — even mid-drawing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Live Demo
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://tomoiura.github.io/digit_recognizer/" rel="noopener noreferrer"&gt;https://tomoiura.github.io/digit_recognizer/&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Just open it in your browser.&lt;/p&gt;

&lt;h3&gt;
  
  
  Build from Source
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Tomoiura/digit_recognizer.git
&lt;span class="nb"&gt;cd &lt;/span&gt;digit_recognizer
pip &lt;span class="nb"&gt;install &lt;/span&gt;numpy
python main.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First run downloads MNIST data and trains the model (takes a few minutes). Subsequent runs use cached weights and complete in seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;My previous Transformer Emulator was about "watching AI learn." This project is about "drawing with your own hand and feeling AI react in real time."&lt;/p&gt;

&lt;p&gt;Instead of formulas or diagrams, the answer to "what is a neural network doing?" comes through &lt;strong&gt;touching, seeing, and feeling&lt;/strong&gt;. That's the experience I was aiming for.&lt;/p&gt;

&lt;p&gt;If you find technical errors or have suggestions, &lt;a href="https://github.com/Tomoiura/digit_recognizer" rel="noopener noreferrer"&gt;Issues and PRs&lt;/a&gt; are welcome.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Related&lt;/strong&gt;: &lt;a href="https://github.com/Tomoiura/transformer-emulator" rel="noopener noreferrer"&gt;Transformer Emulator&lt;/a&gt; — Visualize the internals of a Transformer decoder, also running in the browser.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>neuralnetwork</category>
      <category>visualization</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
