Hello, I'm Ganesh. I'm building git-lrc, an AI code reviewer that runs on every commit. It is free, unlimited, and source-available on GitHub. Star us to help other devs discover the project, and give it a try and share your feedback so we can improve it.
In a previous article, I explained the embedding and preprocessing of the input.
In this article, we will discuss step 2 of the transformer model: positional encoding.
What is Positional Encoding?
Positional encoding is a technique used to encode the position of each word in a sentence. The transformer processes all tokens in parallel rather than one after another, so by itself it has no notion of word order; positional encoding adds that order information back.
Whenever input text is given to the transformer model, say:

The Lion attacked Deer.
The Deer attacked Lion.

the tokens generated look roughly like this:
791, 33199, 18855, 64191, 627, 791, 64191, 18855, 33199, 13
The meanings are different, but the tokens produced are the same, only in a different order. This affects the vector embeddings: identical tokens get identical embeddings, regardless of where they appear in the sentence.
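You can check this yourself with a minimal sketch, assuming the tiktoken library and its cl100k_base encoding (the exact IDs depend on which tokenizer you use, so your numbers may differ):

```python
# Minimal sketch: tokenize the two sentences and compare the IDs.
# Assumes the tiktoken library and its cl100k_base encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

tokens_a = enc.encode("The Lion attacked Deer.")
tokens_b = enc.encode("The Deer attacked Lion.")

print(tokens_a)  # IDs for sentence A
print(tokens_b)  # same IDs as A, just in a different order

# The two sentences share the same multiset of token IDs; only the order
# differs, so without position information the model cannot tell them apart.
print(sorted(tokens_a) == sorted(tokens_b))  # True
```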
How Does Positional Encoding Solve This?
The tokens for "Lion" and "attacked" (and every other word) are identical in both sentences, so on their own they carry no position information. To overcome this, positional encoding assigns a vector to each position in the sequence and adds it to the corresponding word embedding, so the same token at different positions ends up with a different final vector.
This is the formula for positional encoding (the standard sinusoidal formulation from the original Transformer paper, "Attention Is All You Need"):
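```
PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
```

Here `pos` is the position of the token in the sequence, `i` indexes pairs of embedding dimensions, and `d_model` is the embedding size. Even dimensions use sine and odd dimensions use cosine, so every position gets a unique vector of the same size as the word embedding.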
By computing these values for every position and adding them to the word embeddings, each word carries its position information without losing anything from the original embedding.
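As a rough illustration (the function name below is my own, not code from any particular library), here is how the sinusoidal encoding can be computed with NumPy and combined with the word embeddings:

```python
# Sketch of sinusoidal positional encoding; not library code.
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of positional encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]   # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]         # (1, d_model)
    # 10000^(2i / d_model), where i = dims // 2 pairs each sin/cos dimension
    angle_rates = 1.0 / np.power(10000, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                  # (seq_len, d_model)

    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions -> sine
    encoding[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions  -> cosine
    return encoding

# Example: 10 tokens, embedding size 512 (the size used in the original paper).
pe = sinusoidal_positional_encoding(seq_len=10, d_model=512)
print(pe.shape)  # (10, 512)

# The positional vectors are simply added to the word embeddings:
# embeddings = token_embeddings + pe
```

Because no two positions produce the same vector across all dimensions, "Lion" at position 1 and "Lion" at position 6 now get different final vectors, which is exactly the information that was missing before.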
So, we completed step 2 of the transformer model.
Conclusion
We saw that sentences with different meanings can produce the same tokens, which means the position of each word in the sentence is lost.
To overcome this, we use positional encoding, which adds a position-dependent vector to every word embedding.
Feedback and contributors are welcome! It's online, source-available, and ready for anyone to use.
⭐ Star it on GitHub: https://github.com/HexmosTech/git-lrc



