Rijul Rajesh
Understanding LSTMs: A Better Recurrent Neural Network

In the previous article, we saw how the feedback loop in a recurrent neural network lets us unroll the network to handle different amounts of sequential data.

However, we also saw that when we unroll the network many times with a feedback weight greater than 1, the math ends up multiplying the input by that weight again and again.

The resulting large numbers cause the gradient to explode.

On the other hand, if the weight on the feedback loop is less than 1, it leads to the vanishing gradient problem.

So, basic RNNs are hard to train because the gradients can either explode or vanish.
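To make this concrete, here is a tiny sketch (the weight values and step count are illustrative) of how the gradient's scale factor behaves when the same recurrent weight is multiplied in at every unrolled step:

```python
# Sketch: backpropagating through an unrolled RNN multiplies the
# gradient by the recurrent weight w once per time step, so after
# `steps` steps the gradient is scaled by roughly w ** steps.

def gradient_scale(w: float, steps: int) -> float:
    """Scale factor the gradient picks up across `steps` unrolled steps."""
    scale = 1.0
    for _ in range(steps):
        scale *= w  # one multiplication per unrolled time step
    return scale

print(gradient_scale(1.5, 50))  # weight > 1: huge number, gradient explodes
print(gradient_scale(0.5, 50))  # weight < 1: tiny number, gradient vanishes
```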

To address this issue, we can extend the basic vanilla recurrent neural network without making drastic changes.

In the coming articles, including this one, we will discuss Long Short-Term Memory (LSTM), a type of recurrent neural network designed to avoid the exploding and vanishing gradient problems.

Main Idea Behind How LSTM Works

The main idea behind Long Short-Term Memory is that instead of using a single feedback connection for both events that happened long ago and events that happened just yesterday, it separates these responsibilities.

Long Short-Term Memory uses two separate paths to make predictions about tomorrow: one path is responsible for long-term memory, and the other for short-term memory.
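As a rough sketch of this idea (scalar states and made-up weight names, not the full matrix form a real implementation uses), an LSTM step keeps the two memories in separate variables: a cell state `c` for long-term memory and a hidden state `h` for short-term memory:

```python
import math

def sigmoid(x: float) -> float:
    """Squash any real number into (0, 1) so it can act as a soft gate."""
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x: float, h_prev: float, c_prev: float, p: dict):
    """One LSTM step with scalar states; p holds illustrative weights."""
    # Forget gate: how much of the long-term memory c to keep.
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])
    # Input gate and candidate: what new information to store.
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])
    c_tilde = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"])
    # Long-term path: an additive update, not repeated multiplication,
    # which is what keeps gradients from exploding or vanishing.
    c = f * c_prev + i * c_tilde
    # Output gate and short-term path.
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])
    h = o * math.tanh(c)
    return h, c
```

The key design choice is that the long-term memory `c` is updated by addition, while the gates (all sigmoids) only scale how much flows in and out.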

Now that we have a basic idea of what LSTM is, let us go into the details.

LSTM – The Details

Compared to the basic RNN we discussed earlier, the LSTM version is more complex.

Unlike the networks we used earlier, this one uses sigmoid functions.
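For reference, the sigmoid function squashes any real input into the range (0, 1), which is what lets it act as a soft "gate" inside the LSTM (the sample inputs below are arbitrary):

```python
import math

def sigmoid(x: float) -> float:
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(-5.0))  # near 0: gate almost closed
print(sigmoid(0.0))   # exactly 0.5
print(sigmoid(5.0))   # near 1: gate almost open
```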

In the next article, we will cover this in more detail.

Looking for an easier way to install tools, libraries, or entire repositories?
Try Installerpedia: a community-driven, structured installation platform that lets you install almost anything with minimal hassle and clear, reliable guidance.

Just run:

ipm install repo-name

… and you’re done! 🚀


🔗 Explore Installerpedia here
