Table of Contents
- Neuron: The Basic Unit
- Layer: Many Neurons Working Together
- Block: Layers Grouped into Units
- Network: The Complete Model
- Why Blocks Matter in Modern AI
- Evolution Toward Large Language Models
1. Neuron: The Basic Unit
At the smallest scale, a neuron takes numbers as input and produces one number as output.
- Inputs: array of values from previous layer
- Weights: one weight per input
- Bias: a trainable constant
- Activation function: shapes the final output
Formula:
output = f( Σ (w_i * a_i) + b )
A neuron is nothing more than a weighted sum plus bias, passed through an activation function.
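To make this concrete, here is a minimal sketch of a single neuron in Python with NumPy. The names (`neuron`, `sigmoid`) and the input values are illustrative assumptions, not from any particular library.

```python
import numpy as np

def sigmoid(x):
    # One common activation function: squashes any number into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, plus bias, passed through the activation.
    z = np.dot(weights, inputs) + bias
    return sigmoid(z)

# Example: 4 inputs from the previous layer, 4 weights, 1 bias.
a_prev = np.array([0.5, -1.0, 0.25, 2.0])
w = np.array([0.1, 0.4, -0.2, 0.3])
b = 0.05
print(neuron(a_prev, w, b))  # a single output number between 0 and 1
```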
2. Layer: Many Neurons Working Together
A layer is just a group of neurons running in parallel.
Example: input vector of size 4, output layer of 3 neurons.
- Weights: matrix of shape (4 × 3)
- Bias: array of size 3
- Output: array of size 3
Computation:
z = a^(L-1) * W + b
a^(L) = f(z)
So a layer transforms an input array into a new array.
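The same computation, vectorized for the example above (input of size 4, layer of 3 neurons). This is only a sketch: the numbers are random and sigmoid is just one possible activation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

a_prev = rng.normal(size=4)   # input array of size 4
W = rng.normal(size=(4, 3))   # weight matrix of shape (4, 3)
b = np.zeros(3)               # bias array of size 3

z = a_prev @ W + b            # z = a^(L-1) * W + b
a = sigmoid(z)                # a^(L) = f(z)

print(a.shape)                # (3,) -- the layer turned 4 numbers into 3
```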
3. Block: Layers Grouped into Units
A block is several layers packaged together. Blocks are used because the same pattern of layers often repeats.
Example block:
- Input: size 20
- Layer 1: 20 → 12
- Layer 2: 12 → 6
- Layer 3: 6 → 4
- Output: size 4
At a glance:
20 → 12 → 6 → 4
The whole block is just “input 20, output 4.”
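A block, then, is just a few of these layer transformations chained together. Here is a minimal sketch of the 20 → 12 → 6 → 4 block above, assuming NumPy and a sigmoid activation for every layer:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dense(a, W, b):
    # One layer: weighted sum + bias, then activation.
    return sigmoid(a @ W + b)

rng = np.random.default_rng(0)
sizes = [20, 12, 6, 4]  # 20 -> 12 -> 6 -> 4

# One (W, b) pair per layer in the block.
params = [
    (rng.normal(size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

def block(a, params):
    for W, b in params:
        a = dense(a, W, b)
    return a

x = rng.normal(size=20)
print(block(x, params).shape)  # (4,) -- "input 20, output 4"
```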
4. Network: The Complete Model
A neural network is the full stack of blocks that solves a task.
- A small network may be one block.
- Larger networks are many blocks stacked together (sketched below).
- The difference is one of scope:
- Block = a part of the network
- Network = the entire model
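Under the same assumptions as the previous sketch, a network is just blocks applied one after another. The block sizes here are made up for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_block(a, params):
    # A block is a list of (W, b) layer parameters applied in order.
    for W, b in params:
        a = sigmoid(a @ W + b)
    return a

rng = np.random.default_rng(1)

def make_block(sizes):
    return [
        (rng.normal(size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

# Two blocks stacked into one tiny network: 20 -> 4, then 4 -> 2.
blocks = [make_block([20, 12, 6, 4]), make_block([4, 3, 2])]

def network(a, blocks):
    for params in blocks:        # each block is part of the network...
        a = run_block(a, params)
    return a                     # ...the whole stack is the model

x = rng.normal(size=20)
print(network(x, blocks).shape)  # (2,)
```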
5. Why Blocks Matter in Modern AI
- CNNs (for images) use convolutional blocks.
- ResNets use residual blocks.
- Transformers (used in LLMs) use transformer blocks: each block has self-attention, feedforward layers, normalization, and residual connections (see the simplified sketch after this list).
- By repeating the same block many times, networks scale to billions of parameters.
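To make the transformer bullet concrete, here is a heavily simplified, single-head sketch in NumPy. It is not how production LLMs are implemented (real blocks use multi-head attention, masking, trained weights, and give each block its own parameters); every name and size here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # model (embedding) dimension, made up for illustration

def layer_norm(x, eps=1e-5):
    # Normalize each token's vector to zero mean and unit variance.
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

# Parameters of one block (random here; learned by training in practice).
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
W1, W2 = rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d))

def transformer_block(x):
    # 1. Self-attention: every token mixes information from every other token.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d)) @ v
    x = layer_norm(x + attn)          # residual connection + normalization
    # 2. Feedforward layers applied to each token independently.
    ff = np.maximum(0, x @ W1) @ W2   # two layers with a ReLU in between
    return layer_norm(x + ff)         # residual connection + normalization

tokens = rng.normal(size=(5, d))      # a "sentence" of 5 token vectors
out = tokens
for _ in range(3):                    # repeat the same block pattern
    out = transformer_block(out)      # (real models give each repeat its own weights)
print(out.shape)                      # (5, 8) -- shape preserved, so blocks stack
```

Because the block's output has the same shape as its input, you can stack as many of them as your budget allows, which is exactly how these models grow to billions of parameters.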
6. Evolution Toward Large Language Models
- Start: single neurons → learn simple mappings.
- Next: layers → capture richer transformations.
- Then: blocks → reusable patterns that go deeper.
- Finally: networks made of hundreds of blocks → capable of handling language, vision, and more.
Large Language Models (LLMs) are just very large stacks of transformer blocks. The principle is the same as the tiny neuron: weighted sum + bias → activation.
Takeaway
Neural networks scale by abstraction:
Neuron → Layer → Block → Network → LLM