In the previous article, we introduced the sequence-to-sequence problem and the challenge of handling variable-length inputs and outputs.
We already know how to use Long Short-Term Memory (LSTM) units to deal with variable-length inputs and outputs.
For example, if the input sentence is “Let’s go”, we first put “Let’s” into the input of the LSTM.
Then we unroll the LSTM and plug “go” into the second input.
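The unrolling above can be sketched as a loop that feeds one token at a time into the same recurrent unit, carrying the hidden state forward. This is a minimal sketch in plain Python: the toy `recurrent_step` function and the token vectors are made up for illustration and stand in for a real LSTM cell and its learned weights.

```python
def recurrent_step(hidden, token_vector):
    """Toy stand-in for an LSTM step: mixes the previous hidden
    state with the current input (element-wise average)."""
    return [(h + x) / 2.0 for h, x in zip(hidden, token_vector)]

# Hypothetical 2-dimensional vectors for each input token.
token_vectors = {
    "Let's": [0.5, -0.3],
    "go":    [0.1,  0.8],
}

hidden = [0.0, 0.0]  # initial hidden state
for token in ["Let's", "go"]:  # the same cell, unrolled once per token
    hidden = recurrent_step(hidden, token_vectors[token])

print(hidden)  # final hidden state after processing the whole sentence
```

The point of the sketch is the loop structure: one cell, applied repeatedly, so the same weights handle a sentence of any length.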
However, we can’t just jam words directly into a neural network.
Instead, we use an embedding layer to convert the words into numbers.
To keep the example relatively simple, the English vocabulary for our Encoder–Decoder model only contains three words: “Let’s”, “to”, and “go.”
It also contains the symbol EOS, which stands for End of Sentence.
Because the vocabulary contains a mix of words and symbols, we refer to the individual elements in a vocabulary as tokens (Let’s, to, go, <EOS>).
In this example, we are creating two embedding values per token, instead of hundreds or thousands, to keep things simple.
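An embedding layer like this is essentially a lookup table: each token index maps to a small learned vector. Here is a minimal sketch with our four-token vocabulary and two values per token; the numbers are made up for illustration, whereas in a real model they would be learned during training.

```python
vocab = ["Let's", "to", "go", "<EOS>"]
token_to_index = {tok: i for i, tok in enumerate(vocab)}

# One 2-value embedding per token (a 4 x 2 table).
embedding_table = [
    [ 0.12, -0.41],  # Let's
    [-0.30,  0.25],  # to
    [ 0.77,  0.05],  # go
    [ 0.00,  0.90],  # <EOS>
]

def embed(token):
    """Look up the embedding vector for a token."""
    return embedding_table[token_to_index[token]]

print(embed("go"))  # the 2-value vector for "go"
```

With hundreds of embedding values per token instead of two, this is the same mechanism used in full-scale Encoder–Decoder models.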
Now that we have an embedding layer for the input vocabulary, we need to connect it to the LSTM.
We will explore this in the next article.