Rijul Rajesh

Understanding Transformers Part 11: How Decoding Begins

In the previous article we wrapped up the encoder. In this article, we will start building the second part of the transformer: the decoder.

Just like the encoder, the decoder also begins with word embeddings.

However, this time the embeddings are created for the output vocabulary, which consists of Spanish words such as:

  • ir
  • vamos
  • y
  • the <EOS> (End of Sentence) token
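As a rough sketch, the output vocabulary and its embedding table can be represented like this. The embedding values here are illustrative placeholders (in a real model they are learned during training), except for the <EOS> row, which uses the values from this article's example:

```python
# Minimal sketch of an embedding table for the output vocabulary.
# Each token maps to a 2-dimensional embedding vector. All values
# except the <EOS> row are hypothetical placeholders; in practice
# these numbers are learned during training.
output_vocab = ["ir", "vamos", "y", "<EOS>"]

embedding_table = {
    "ir":    [0.91, -0.45],   # hypothetical values
    "vamos": [-0.12, 1.08],   # hypothetical values
    "y":     [0.33, 0.27],    # hypothetical values
    "<EOS>": [2.70, -1.34],   # values from the article's example
}

# Looking up a token returns its embedding vector.
print(embedding_table["<EOS>"])
```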

Starting the Decoding Process

To begin decoding, we use the <EOS> token as the input.

This is a common way to initialize the decoding process for an encoded sentence.

In some cases, people use a <SOS> (Start of Sentence) token instead.


Creating the Initial Input

We represent the <EOS> token as a vector by assigning:

  • 1 to <EOS>
  • 0 to all other words in the vocabulary

Passing this one-hot vector through the embedding layer selects the row of the weight matrix that corresponds to <EOS>. In our example, the values 2.70 and -1.34 are the numbers that represent the <EOS> token.
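The lookup above can be sketched as a vector-matrix product: multiplying the one-hot vector by the embedding weight matrix zeroes out every row except the one for <EOS>. The weight values below are hypothetical, apart from the <EOS> row, which matches the 2.70 and -1.34 from the example:

```python
# Sketch: a one-hot vector times the embedding weight matrix
# selects exactly one row of the matrix.
vocab = ["ir", "vamos", "y", "<EOS>"]

# Each row is one token's 2-dimensional embedding. All rows are
# hypothetical values except <EOS>, which uses the article's numbers.
weights = [
    [0.91, -0.45],   # ir
    [-0.12, 1.08],   # vamos
    [0.33, 0.27],    # y
    [2.70, -1.34],   # <EOS>
]

# One-hot vector: 1 at the <EOS> position, 0 for every other word.
one_hot = [1 if tok == "<EOS>" else 0 for tok in vocab]

# Vector-matrix product: rows scaled by the one-hot entries and
# summed, which leaves only the <EOS> row.
embedding = [
    sum(one_hot[i] * weights[i][d] for i in range(len(vocab)))
    for d in range(2)
]

print(embedding)
```

This is why embedding layers are implemented as simple table lookups rather than actual matrix multiplications: with a one-hot input, the product reduces to picking one row.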

Now that we have the initial input for the decoder, the next step is to add positional encoding.

We will explore this in the next article.


Looking for an easier way to install tools, libraries, or entire repositories?
Try Installerpedia: a community-driven, structured installation platform that lets you install almost anything with minimal hassle and clear, reliable guidance.

Just run:

ipm install repo-name

… and you’re done! 🚀


🔗 Explore Installerpedia here
