<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ben Kemp | Python/SQL/PowerBI/Excel Tutorials</title>
    <description>The latest articles on DEV Community by Ben Kemp | Python/SQL/PowerBI/Excel Tutorials (@benardkemp).</description>
    <link>https://dev.to/benardkemp</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3742241%2Ff6accef2-5eef-468b-b661-3bdb2bba91e0.png</url>
      <title>DEV Community: Ben Kemp | Python/SQL/PowerBI/Excel Tutorials</title>
      <link>https://dev.to/benardkemp</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/benardkemp"/>
    <language>en</language>
    <item>
      <title>Sparse Neural Networks in Python — From Pruning to Dynamic Rewiring</title>
      <dc:creator>Ben Kemp | Python/SQL/PowerBI/Excel Tutorials</dc:creator>
      <pubDate>Sat, 31 Jan 2026 10:24:25 +0000</pubDate>
      <link>https://dev.to/benardkemp/sparse-neural-networks-in-python-from-pruning-to-dynamic-rewiring-3o00</link>
      <guid>https://dev.to/benardkemp/sparse-neural-networks-in-python-from-pruning-to-dynamic-rewiring-3o00</guid>
      <description>&lt;p&gt;Deep learning has followed a predictable pattern for years:&lt;/p&gt;

&lt;p&gt;Add more layers. Add more parameters. Add more GPUs.&lt;/p&gt;

&lt;p&gt;Dense scaling works — but it’s expensive, wasteful, and increasingly impractical outside hyperscale environments.&lt;/p&gt;

&lt;p&gt;Sparse neural networks offer a different direction:&lt;/p&gt;

&lt;p&gt;Keep the capacity. Reduce the computation.&lt;/p&gt;

&lt;p&gt;And you don’t need trillion-parameter models to understand how.&lt;/p&gt;

&lt;p&gt;In this series, I implemented sparse neural networks step-by-step in PyTorch — starting from scratch and moving toward dynamic sparse training.&lt;/p&gt;

&lt;p&gt;Here’s what sparse actually means in practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is a Sparse Neural Network?
&lt;/h2&gt;

&lt;p&gt;A neural network is sparse when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Many weights are exactly zero&lt;/li&gt;
&lt;li&gt;Only a fraction of neurons activate per input&lt;/li&gt;
&lt;li&gt;Only parts of the network are used conditionally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of computing everything, you compute only what matters.&lt;/p&gt;

&lt;p&gt;That changes the scaling equation.&lt;/p&gt;

&lt;p&gt;Dense layer compute: FLOPs ≈ input_dim × output_dim&lt;br&gt;
Sparse layer compute: FLOPs ≈ (1 − sparsity) × input_dim × output_dim&lt;/p&gt;

&lt;p&gt;At 80% sparsity, you keep 20% of the compute.&lt;/p&gt;

&lt;p&gt;That’s not compression — that’s architectural efficiency.&lt;/p&gt;
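&lt;p&gt;As a quick sanity check, the estimate above can be computed directly (a minimal pure-Python sketch; the function names are mine):&lt;/p&gt;

```python
def dense_flops(d_in, d_out):
    # One multiply-accumulate per weight
    return d_in * d_out

def sparse_flops(d_in, d_out, sparsity):
    # Only the surviving fraction of weights is computed
    return int((1 - sparsity) * d_in * d_out)

print(dense_flops(512, 512))        # 262144
print(sparse_flops(512, 512, 0.8))  # 52428
```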
&lt;h2&gt;
  
  
  The Python-First Sparse Series
&lt;/h2&gt;

&lt;p&gt;This isn’t theory-heavy.&lt;/p&gt;

&lt;p&gt;Each article builds sparse models directly in PyTorch.&lt;/p&gt;

&lt;p&gt;1️⃣ Dense vs Sparse (Masking)&lt;/p&gt;

&lt;p&gt;We start with a normal MLP and introduce a binary weight mask:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sparse_weight = weight * mask&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;That’s it.&lt;/p&gt;

&lt;p&gt;You immediately control structural sparsity.&lt;/p&gt;
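&lt;p&gt;To make the masking idea concrete, here is a minimal PyTorch sketch (the &lt;code&gt;MaskedLinear&lt;/code&gt; class and its random mask are illustrative, not the article's exact code):&lt;/p&gt;

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Linear layer whose weights are gated by a fixed binary mask."""
    def __init__(self, in_features, out_features, sparsity=0.8):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Keep roughly (1 - sparsity) of the weights active
        mask = (torch.rand(out_features, in_features) > sparsity).float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # sparse_weight = weight * mask, exactly as above
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

layer = MaskedLinear(16, 8)
out = layer(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 8])
```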

&lt;p&gt;2️⃣ Magnitude-Based Pruning&lt;/p&gt;

&lt;p&gt;Train dense → remove smallest weights:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;threshold = torch.quantile(weights.abs(), pruning_ratio)
mask = weight.abs() &amp;gt; threshold
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can often prune 80–90% of weights with surprisingly small degradation.&lt;/p&gt;

&lt;p&gt;This is the simplest form of structural sparsity.&lt;/p&gt;
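&lt;p&gt;Putting those two lines into a runnable sketch (the layer size and pruning ratio here are arbitrary):&lt;/p&gt;

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(32, 16)
pruning_ratio = 0.8  # drop the smallest 80% of weights by magnitude

with torch.no_grad():
    weight = layer.weight
    threshold = torch.quantile(weight.abs(), pruning_ratio)
    mask = (weight.abs() > threshold).float()
    weight *= mask  # zero out the pruned connections

sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.0%}")  # sparsity: 80%
```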

&lt;p&gt;3️⃣ Activation Sparsity (k-WTA)&lt;/p&gt;

&lt;p&gt;Instead of removing weights, restrict which neurons fire:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;topk_vals, topk_idx = torch.topk(x, k, dim=1)
mask.scatter_(1, topk_idx, 1.0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now only k neurons activate per sample.&lt;/p&gt;

&lt;p&gt;Compute drops. Structure stays intact.&lt;/p&gt;

&lt;p&gt;4️⃣ Sparse Training From Scratch&lt;/p&gt;

&lt;p&gt;Why train dense at all?&lt;/p&gt;

&lt;p&gt;Initialize sparse and train only active connections.&lt;/p&gt;

&lt;p&gt;Weights that are masked never receive gradient updates.&lt;/p&gt;

&lt;p&gt;You eliminate wasted early compute.&lt;/p&gt;
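&lt;p&gt;A minimal sketch of that idea: freeze masked weights at zero by masking their gradients (a gradient hook is one simple way to do this, not necessarily the article's exact approach):&lt;/p&gt;

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(16, 8)
mask = (torch.rand_like(layer.weight) > 0.8).float()  # ~20% of connections active

with torch.no_grad():
    layer.weight *= mask  # initialize sparse

# Masked weights never receive gradient updates
layer.weight.register_hook(lambda grad: grad * mask)

opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, target = torch.randn(4, 16), torch.randn(4, 8)
loss = ((layer(x) - target) ** 2).mean()
loss.backward()
opt.step()

# The masked weights never moved off zero
print(bool((layer.weight[mask == 0] == 0).all()))  # True
```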

&lt;p&gt;5️⃣ Dynamic Sparse Training&lt;/p&gt;

&lt;p&gt;Static masks can be limiting.&lt;/p&gt;

&lt;p&gt;So we rewire during training:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prune weak connections&lt;/li&gt;
&lt;li&gt;Regrow new ones&lt;/li&gt;
&lt;li&gt;Keep total sparsity constant&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now the network doesn’t just optimize weights.&lt;/p&gt;

&lt;p&gt;It optimizes connectivity.&lt;/p&gt;

&lt;p&gt;This is conceptually close to modern sparse research (RigL-style approaches).&lt;/p&gt;
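&lt;p&gt;A simplified prune-and-regrow step might look like this (random regrowth for brevity; RigL itself regrows by gradient magnitude):&lt;/p&gt;

```python
import torch

torch.manual_seed(0)

def rewire(weight, mask, prune_frac=0.1):
    """Prune the weakest active connections and regrow the same
    number elsewhere, keeping total sparsity constant."""
    active = mask.bool()
    n = max(1, int(prune_frac * active.sum().item()))

    # Prune: the weakest active weights (inactive ones masked to +inf)
    scores = weight.abs().masked_fill(~active, float("inf"))
    _, drop_idx = torch.topk(scores.flatten(), n, largest=False)
    mask.view(-1)[drop_idx] = 0.0

    # Regrow: pick n currently-inactive connections at random
    inactive = (mask.view(-1) == 0).nonzero().flatten()
    grow_idx = inactive[torch.randperm(len(inactive))[:n]]
    mask.view(-1)[grow_idx] = 1.0
    weight.view(-1)[grow_idx] = 0.0  # new connections start at zero
    return mask

w = torch.randn(8, 8)
m = (torch.rand(8, 8) > 0.8).float()
before = m.sum().item()
m = rewire(w, m)
print(m.sum().item() == before)  # True
```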

&lt;h2&gt;
  
  
  Why Developers Should Care
&lt;/h2&gt;

&lt;p&gt;Sparse networks aren’t just research experiments.&lt;/p&gt;

&lt;p&gt;They matter because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compute is expensive&lt;/li&gt;
&lt;li&gt;Edge devices need efficiency&lt;/li&gt;
&lt;li&gt;Model size ≠ model cost&lt;/li&gt;
&lt;li&gt;Modern MoE architectures are sparse&lt;/li&gt;
&lt;li&gt;Conditional execution is becoming standard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re building models beyond toy datasets, efficiency becomes real very quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dense Scaling vs Sparse Scaling
&lt;/h2&gt;

&lt;p&gt;Dense scaling: More parameters → more compute&lt;/p&gt;

&lt;p&gt;Sparse scaling: More capacity → controlled compute&lt;/p&gt;

&lt;p&gt;That shift changes architecture design decisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where This Leads
&lt;/h2&gt;

&lt;p&gt;The next logical step is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sparse attention&lt;/li&gt;
&lt;li&gt;Mixture of Experts&lt;/li&gt;
&lt;li&gt;Conditional token routing&lt;/li&gt;
&lt;li&gt;Fair dense vs sparse benchmarking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because sparsity isn’t about shrinking models.&lt;/p&gt;

&lt;p&gt;It’s about scaling smarter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;If you want to understand sparse neural networks, don’t start with theory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://solvewithpython.com/sparse-neural-networks/" rel="noopener noreferrer"&gt;Start with code.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you see how much you can remove — and still learn — you’ll realize dense is just one point in the design space.&lt;/p&gt;

&lt;p&gt;Sparse networks open the rest of it.&lt;/p&gt;

</description>
      <category>machinelearning</category>
    </item>
    <item>
      <title>The Neural Network Lexicon: Understand Neural Networks Without the Black Box</title>
      <dc:creator>Ben Kemp | Python/SQL/PowerBI/Excel Tutorials</dc:creator>
      <pubDate>Fri, 30 Jan 2026 14:40:17 +0000</pubDate>
      <link>https://dev.to/benardkemp/the-neural-network-lexicon-understand-neural-networks-without-the-black-box-lea</link>
      <guid>https://dev.to/benardkemp/the-neural-network-lexicon-understand-neural-networks-without-the-black-box-lea</guid>
      <description>&lt;p&gt;Neural networks power modern AI — but for many developers, they still feel like magic.&lt;/p&gt;

&lt;p&gt;Not because the math is impossible, but because most explanations are either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;too theoretical, or&lt;/li&gt;
&lt;li&gt;hidden behind high-level libraries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I built the Neural Network Lexicon to fix that.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is the Neural Network Lexicon?
&lt;/h2&gt;

&lt;p&gt;It’s a concept-by-concept reference for neural networks, explained from first principles.&lt;/p&gt;

&lt;p&gt;One concept per page.&lt;br&gt;
Clear definitions.&lt;br&gt;
No framework lock-in.&lt;/p&gt;

&lt;p&gt;Each entry answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is this concept?&lt;/li&gt;
&lt;li&gt;Why does it matter?&lt;/li&gt;
&lt;li&gt;How does it work conceptually?&lt;/li&gt;
&lt;li&gt;What usually goes wrong?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And yes — every concept includes a minimal Python example to make the computation visible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Python (and Why Minimal)?
&lt;/h2&gt;

&lt;p&gt;The Python snippets are intentionally small.&lt;/p&gt;

&lt;p&gt;Not to build full models — but to show that:&lt;/p&gt;

&lt;p&gt;neural networks are just computations.&lt;/p&gt;

&lt;p&gt;Seeing a neuron as a weighted sum or a loss function as a number you can print changes how you think about ML.&lt;/p&gt;
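&lt;p&gt;For instance, a single neuron really is just a few lines of plain Python (the numbers here are made up for illustration):&lt;/p&gt;

```python
# A neuron: a weighted sum of inputs, plus a bias, through an activation
inputs  = [0.5, -1.2, 3.0]
weights = [0.4,  0.7, -0.2]
bias = 0.1

z = sum(x * w for x, w in zip(inputs, weights)) + bias
relu = max(0.0, z)  # ReLU activation

print(round(z, 2))  # -1.14
print(relu)         # 0.0
```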

&lt;h2&gt;
  
  
  Runnable Examples on GitHub
&lt;/h2&gt;

&lt;p&gt;To keep the lexicon readable, full runnable examples live in GitHub:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One idea per file&lt;/li&gt;
&lt;li&gt;No frameworks&lt;/li&gt;
&lt;li&gt;Edit → run → observe&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read the concept, run the code, tweak a value, and learn faster.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Does It Cover?
&lt;/h2&gt;

&lt;p&gt;The lexicon is complete, not just introductory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Core foundations (neurons, activations, loss)&lt;/li&gt;
&lt;li&gt;Training &amp;amp; optimization&lt;/li&gt;
&lt;li&gt;CNNs, RNNs, Transformers&lt;/li&gt;
&lt;li&gt;Generalization &amp;amp; robustness&lt;/li&gt;
&lt;li&gt;Explainability, uncertainty, fairness&lt;/li&gt;
&lt;li&gt;Deployment &amp;amp; model lifecycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In total: 100 structured entries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who Is This For?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Developers using ML libraries who want real understanding&lt;/li&gt;
&lt;li&gt;Students overwhelmed by fragmented explanations&lt;/li&gt;
&lt;li&gt;Engineers who want to debug models, not just train them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you believe understanding comes before optimization, this is for you.&lt;/p&gt;

&lt;p&gt;📘 &lt;a href="https://github.com/Benard-Kemp/Neural-Network-Lexicon/wiki" rel="noopener noreferrer"&gt;Neural Network Lexicon (GitHub Wiki)&lt;/a&gt;&lt;br&gt;
Built as part of &lt;a href="https://solvewithpython.com/" rel="noopener noreferrer"&gt;SolveWithPython&lt;/a&gt; — learning by understanding, not memorizing.&lt;/p&gt;

&lt;p&gt;Neural networks aren’t magic.&lt;br&gt;
Once you understand what they compute, everything else follows.&lt;/p&gt;

</description>
      <category>python</category>
      <category>neural</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
