Hello, I'm Ganesh. I'm building git-lrc, an AI code reviewer that runs on every commit. It's free, unlimited, and source-available on GitHub. Star us to help other devs discover the project, and do give it a try and share your feedback so we can improve it.
In a previous article, we discussed the limitations of RNNs: they struggle to capture long-range dependencies and cannot process input in parallel.
How are words converted to numerical values?
The input text is split into small pieces called tokens, and each token is mapped to a numerical ID. How the text gets split depends entirely on the tokenizer's vocabulary.
For example:
"Hey, How are you?"
This sentence is split into 6 tokens.

Those tokens are: 2, 6750, 235269, 2250, 708, 692
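The IDs above come from a real subword tokenizer with a large learned vocabulary. As a minimal sketch of the same idea, here is a toy word-level tokenizer with a made-up vocabulary (the words and IDs below are assumptions for illustration only):

```python
import re

# A made-up word-level vocabulary. Real tokenizers (BPE, SentencePiece)
# learn subword vocabularies with tens of thousands of entries.
vocab = {"<bos>": 2, "hey": 10, ",": 11, "how": 12, "are": 13, "you": 14, "?": 15}

def tokenize(text):
    # Split the text into words and punctuation, then look up each
    # piece's numerical ID, prefixed with a beginning-of-sequence token.
    pieces = re.findall(r"\w+|[^\w\s]", text.lower())
    return [vocab["<bos>"]] + [vocab[p] for p in pieces]

print(tokenize("Hey, How are you?"))
# [2, 10, 11, 12, 13, 14, 15]
```

The real tokenizer merges some pieces into single subword tokens, which is why it produces 6 tokens where this toy version produces 7.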
Once the text is tokenized, the next step is embedding.
Vector Embedding
Each token ID is then mapped to an embedding vector: a row of real numbers in an embedding matrix.
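The lookup itself is just row indexing into that matrix. A minimal sketch, with toy sizes and random values standing in for learned weights:

```python
import random

random.seed(0)
vocab_size, dim = 16, 4  # toy sizes; real models use far larger ones

# The embedding matrix has one row of `dim` real numbers per token ID.
# In a real model these values start random and are learned during training.
embedding = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

def embed(token_ids):
    # Embedding lookup: pick the row for each token ID.
    return [embedding[i] for i in token_ids]

vectors = embed([2, 10, 12])
print(len(vectors), len(vectors[0]))
# 3 4  -> three tokens, each mapped to a 4-dimensional vector
```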
For example:
cat -> "Milk"
dog -> "Bone"
So, if we take the vector offset between cat and dog, we can relate the two words to each other and infer other information.
In the image, we map cat and dog, then follow the embedding vector from cat to the food that the cat likes, milk.
Applying that same offset from dog points us to the dog's favorite food, bone.
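This offset trick can be sketched in code. The 2-D vectors below are hand-picked assumptions for illustration; real embeddings are high-dimensional and learned from data:

```python
import math

# Hand-picked toy vectors (an assumption for illustration).
vecs = {
    "cat":  [1.0, 2.0],
    "milk": [1.5, 3.0],
    "dog":  [4.0, 2.0],
    "bone": [4.5, 3.0],
    "car":  [9.0, 0.5],
}

def closest(target, exclude):
    # Return the word whose vector is nearest the target point.
    best, best_d = None, float("inf")
    for word, v in vecs.items():
        if word in exclude:
            continue
        d = math.dist(target, v)
        if d < best_d:
            best, best_d = word, d
    return best

# Offset from cat to its favorite food, applied to dog:
offset = [m - c for m, c in zip(vecs["milk"], vecs["cat"])]  # [0.5, 1.0]
guess = [d + o for d, o in zip(vecs["dog"], offset)]         # [4.5, 3.0]
print(closest(guess, exclude={"dog", "cat", "milk"}))
# bone
```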

This process is part of input preprocessing.
Conclusion
We saw how input text is converted into tokens and then into vectors, completing the first stage of the transformer model.
Feedback and contributors are welcome! It's online, source-available, and ready for anyone to use.
⭐ Star it on GitHub: https://github.com/HexmosTech/git-lrc

