RNNs From Scratch: a Network With a Memory of the Past

#ai #machinelearning #deeplearning #beginners

"not good" and "good not" mean opposite things — order and history matter. A plain network can't capture that; an RNN can, because it carries a hidden state forward. Build a sequence and watch the memory update.

🔁 Step through a sequence: https://dev48v.infy.uk/dl/day10-rnn.html

The trick: a hidden state

const hiddenSize = 4;                     // assumed size, for illustration only
let h = new Array(hiddenSize).fill(0);    // the "memory", a summary of everything seen so far

Before reading a token, h summarizes everything seen so far; after reading, it is updated to include that token.

The recurrence

// Pseudocode: "·" is a matrix-vector product, and tanh is applied to each entry.
for (const x of sequence) {
  h = tanh(Wx·x + Wh·h + b);   // mix this token with the running memory
}

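The Wh·h term feeds memory forward; tanh keeps every entry of h between -1 and 1, so the state can't grow without bound. Since each h is computed from the previous h, the output at any step depends on the whole history, not just the current token.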
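
To make the recurrence concrete, here is a minimal runnable sketch in plain JavaScript. The helper names (matVec, rnnStep, runRnn), the 2-unit hidden state, and the weight values are all assumptions chosen for illustration; nothing here is trained, and it is not necessarily how the demo is implemented.

```javascript
// Multiply a matrix (array of rows) by a vector.
function matVec(W, v) {
  return W.map((row) => row.reduce((sum, w, j) => sum + w * v[j], 0));
}

// One recurrence step: hNew = tanh(Wx·x + Wh·h + b), applied entry by entry.
function rnnStep(params, h, x) {
  const fromInput = matVec(params.Wx, x);   // contribution of the current token
  const fromMemory = matVec(params.Wh, h);  // contribution of the running memory
  return params.b.map((bias, i) => Math.tanh(fromInput[i] + fromMemory[i] + bias));
}

// Run the cell over a whole sequence and return the hidden state after each token.
function runRnn(params, sequence) {
  let h = new Array(params.b.length).fill(0); // memory starts empty
  const states = [];
  for (const x of sequence) {
    h = rnnStep(params, h, x);
    states.push(h);
  }
  return states;
}

// Arbitrary, untrained weights (hypothetical values, illustration only).
const params = {
  Wx: [[0.5, -0.3], [0.8, 0.2]],
  Wh: [[0.1, 0.4], [-0.2, 0.3]],
  b: [0.0, 0.1],
};

// Two one-hot "tokens", fed in both orders.
const tokenA = [1, 0];
const tokenB = [0, 1];
const statesAB = runRnn(params, [tokenA, tokenB]);
const statesBA = runRnn(params, [tokenB, tokenA]);

console.log("A then B:", statesAB[statesAB.length - 1]);
console.log("B then A:", statesBA[statesBA.length - 1]);
```

The two runs see the same tokens in a different order and end in different hidden states, which is the order sensitivity the recurrence is meant to provide.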

One shared cell, over time

There aren't separate networks per step. It's ONE small cell with one set of weights (Wx, Wh, b), applied again and again down the sequence. Just as a conv kernel shares weights across space, an RNN shares them across time. Because the weights don't depend on position, the same cell handles sequences of any length.
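
Weight sharing can be sketched with a scalar cell: three numbers, reused at every position, whether the sequence has 3 tokens or 1000. The names (cell, step, finalState) and the values are hypothetical, picked only to keep the example small.

```javascript
// One cell: a single set of weights (scalars here; values are arbitrary).
const cell = { wx: 0.7, wh: 0.5, b: 0.0 };

// Apply the cell once: combine the current input with the previous state.
function step(cell, h, x) {
  return Math.tanh(cell.wx * x + cell.wh * h + cell.b);
}

// Apply the same cell at every position and return the final state.
function finalState(cell, xs) {
  return xs.reduce((h, x) => step(cell, h, x), 0);
}

const shortSeq = [1, -1, 1];
const longSeq = Array.from({ length: 1000 }, (_, t) => Math.sin(t));

const hShort = finalState(cell, shortSeq);
const hLong = finalState(cell, longSeq);

// The number of parameters is fixed by the cell, not by the sequence length.
const parameterCount = Object.keys(cell).length;

console.log({ hShort, hLong, parameterCount });
```

Nothing about the cell changes between the two calls; only the number of times it is applied does.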

Memory in action

In the demo, "not" sets a context so the NEXT token's effect flips — "not good" ends negative, "not bad" positive. Same token, different history → different result. That's memory.
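
The same effect can be reproduced with a tiny hand-wired RNN. This is a sketch under loud assumptions: the three-word vocabulary, the hand-picked (not trained) weights, the gain A, and the readout are all invented for illustration, and the demo may well work differently. One hidden unit acts as a "just saw not" flag; the other two read the current word's polarity plus that flag.

```javascript
const VOCAB = ["not", "good", "bad"];
const oneHot = (token) => VOCAB.map((w) => (w === token ? 1 : 0));
const matVec = (W, v) => W.map((row) => row.reduce((s, w, j) => s + w * v[j], 0));

// Hidden state h = [flag, u1, u2] (hypothetical layout).
//   flag : close to 1 right after reading "not", 0 otherwise
//   u1,u2: both see z = polarity + 2 * previousFlag, where polarity is
//          +1 for "good", -1 for "bad", 0 for "not"; u2 is shifted by a bias
const A = 3; // gain that pushes tanh toward -1 / +1 (assumed value)

const Wx = [
  [A, 0, 0],  // flag fires on "not"
  [0, A, -A], // u1 reads polarity
  [0, A, -A], // u2 reads polarity
];
const Wh = [
  [0, 0, 0],     // flag does not depend on the past
  [2 * A, 0, 0], // u1 adds the previous step's flag
  [2 * A, 0, 0], // u2 adds the previous step's flag
];
const b = [0, 0, -2 * A];

function sentiment(tokens) {
  let h = [0, 0, 0];
  for (const token of tokens) {
    const fromInput = matVec(Wx, oneHot(token));
    const fromMemory = matVec(Wh, h);
    h = b.map((bias, i) => Math.tanh(fromInput[i] + fromMemory[i] + bias));
  }
  // Linear readout: positive when z is near 1, negative when z is near -1 or 3.
  return h[1] - h[2] - 1;
}

for (const phrase of [["good"], ["bad"], ["not", "good"], ["not", "bad"]]) {
  console.log(phrase.join(" "), "=>", sentiment(phrase) > 0 ? "positive" : "negative");
}
```

The token "good" enters through the same weights every time; what changes the outcome is the flag left in the hidden state by the previous step.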

The flaw

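Training unrolls the loop and backprops through every step (backpropagation through time). Each step back multiplies the gradient by Wh and by tanh's derivative, so over long sequences the gradient shrinks toward zero (vanishes) or blows up (explodes). When it vanishes, early tokens get almost no learning signal, so vanilla RNNs forget the distant past. The fix is gates: LSTM/GRU, next.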
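
A scalar version of the recurrence shows where the trouble comes from. With h_t = tanh(w * h_(t-1) + x_t), the chain rule gives the sensitivity of the last state to the first as a product of one factor per step, w * (1 - h_t^2). The sketch below just multiplies those factors; the function name and the values of w are assumptions for illustration, and a real RNN multiplies Jacobian matrices rather than scalars.

```javascript
// Scalar RNN: h_t = tanh(w * h_{t-1} + x_t), starting from h_0 = 0.
// Returns d h_T / d h_0, the product over t of w * (1 - h_t^2).
function gradientThroughTime(w, xs) {
  let h = 0;
  let grad = 1;
  for (const x of xs) {
    h = Math.tanh(w * h + x);
    grad *= w * (1 - h * h); // tanh'(a) = 1 - tanh(a)^2
  }
  return grad;
}

const steps = 50;

// |w| < 1: every factor is at most 0.5, so the product collapses toward zero.
const vanishing = gradientThroughTime(0.5, new Array(steps).fill(0.1));

// |w| > 1 with zero inputs: h stays at 0, tanh' is 1, and the product is 1.5^50.
const exploding = gradientThroughTime(1.5, new Array(steps).fill(0));

console.log({ vanishing, exploding });
```

The exploding case is deliberately idealized: with nonzero inputs, tanh saturates and damps the factors, which is one reason vanishing is the more common failure for tanh RNNs in practice.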

Run a sequence.
