In the previous article, we stopped at the concept of the context vector.
In this article, we will start by decoding the context vector.
Connecting the Decoder
The first thing we need to do is connect the long-term and short-term memories (the cell states and hidden states) that form the context vector to a new set of LSTMs.
Just like the encoder, the decoder will also have two layers, and each layer will have two LSTM cells.
The LSTMs in the decoder are different from the ones in the encoder and have their own separate weights and biases.
Using the Context Vector
The context vector is used to initialize the long-term and short-term memories (the cell states and hidden states) in the LSTMs of the decoder.
This is important because it allows the decoder to start with the information learned from the input sentence.
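The handoff described above can be sketched in PyTorch. This is a minimal, hypothetical illustration (the layer count and tiny sizes are illustrative choices, not values from the article): the encoder's final hidden and cell states are passed directly as the decoder's initial states, and the decoder has its own separate weights.

```python
import torch
import torch.nn as nn

# Illustrative sizes: 2 layers, 2 units per layer, matching the
# "two layers, two LSTM cells each" picture described above.
num_layers, hidden_size, embed_size = 2, 2, 2

encoder = nn.LSTM(embed_size, hidden_size, num_layers)
decoder = nn.LSTM(embed_size, hidden_size, num_layers)  # separate weights & biases

# Run the encoder over a toy input sequence (3 tokens, batch of 1).
src = torch.randn(3, 1, embed_size)
_, (hidden, cell) = encoder(src)  # context vector: final hidden & cell states

# Initialize the decoder with the encoder's final states and feed it
# its first input token.
first_token = torch.randn(1, 1, embed_size)
out, (hidden, cell) = decoder(first_token, (hidden, cell))
print(out.shape)  # one output step, batch of 1, hidden size 2
```

Note that nothing is copied or transformed in between: the decoder simply starts from the memory state the encoder ended with.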
Goal of the Decoder
The ultimate goal of the decoder is to convert the context vector into the output sentence.
In simple terms, the encoder understands the input, and the decoder generates the output based on that understanding.
Decoder Inputs
Just like in the encoder, the input to the LSTM cells in the first layer comes from an embedding layer.
However, in this case, the embedding layer creates embedding values for Spanish words, such as:
- ir
- vamos
- y
- the EOS (End of Sentence) symbol
Each of these words is treated as a token, and the embedding layer converts them into numbers that the neural network can process.
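This token-to-number step can be sketched with an embedding layer. The vocabulary and embedding size below are hypothetical choices for illustration; in a real model both would be much larger and the vectors would be learned during training.

```python
import torch
import torch.nn as nn

# Hypothetical vocabulary for the Spanish-side tokens mentioned above.
vocab = {"ir": 0, "vamos": 1, "y": 2, "<EOS>": 3}

# The embedding layer maps each token id to a learned vector
# (embedding_dim=2 is an illustrative size).
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=2)

# Convert two tokens to their ids, then to embedding vectors.
token_ids = torch.tensor([vocab["vamos"], vocab["<EOS>"]])
vectors = embedding(token_ids)
print(vectors.shape)  # 2 tokens, each a 2-dimensional vector
```

These vectors are what the first decoder layer actually receives as input at each step.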
We will explore the details of how the decoder generates the output sentence in the next article.