Hello, I'm Ganesh. I'm building git-lrc, an AI code reviewer that runs on every commit. It's free, unlimited, and source-available on GitHub. Star us to help other devs discover the project, give it a try, and share your feedback so we can improve it.
In the previous article, we discussed the limitations of RNNs: they struggled to capture long-range dependencies and could not process input in parallel.
How was translation done earlier?
There are many ways to approach translation, but in the early days it was done with seq2seq (sequence-to-sequence) models.
For example:
Translation: "The cat sat on the mat" -> "Le chat s'est assis sur le tapis"
These models used RNNs under the hood. Let's see how.
Input: many words (the source sentence)
Output: many words (the translated sentence)
An encoder RNN reads the source sentence, and a decoder RNN generates the translation from it.
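Here's a minimal sketch of such an encoder-decoder, assuming PyTorch (the classic papers used various RNN flavors; a GRU is used here for brevity). The `Seq2Seq` class name and the vocabulary/hidden sizes are illustrative placeholders, not from any specific paper:

```python
# A minimal seq2seq (encoder-decoder) sketch, assuming PyTorch.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=1000, tgt_vocab=1000, hidden=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, hidden)
        self.tgt_embed = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # The encoder reads the whole source sentence...
        _, context = self.encoder(self.src_embed(src_ids))
        # ...and compresses it into ONE fixed-size vector (`context`).
        # The decoder must generate the entire translation from that alone.
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), context)
        return self.out(dec_out)  # logits over the target vocabulary

model = Seq2Seq()
src = torch.randint(0, 1000, (1, 6))  # "The cat sat on the mat" as token ids
tgt = torch.randint(0, 1000, (1, 8))  # French tokens (teacher forcing)
logits = model(src, tgt)              # shape: (1, 8, 1000)
```

Note the single `context` vector in the middle: that design choice is exactly where the trouble starts.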
This had a major flaw: when the input sequence was long, the output quality dropped sharply, making the model's accuracy very poor for long sentences.
The root cause was that the encoder had to squeeze the entire sentence into a single fixed-size context vector. For long inputs, that vector became a lossy summary, so the decoder got confused and couldn't predict the correct output.
How was the long-sentence problem solved?
To overcome this, the decoder needed additional context about the input.
That's where the attention mechanism comes into the picture, as another improvement over plain seq2seq models.
Instead of one fixed context vector, the decoder computes a fresh context vector at every output step: a weighted combination of all the encoder's hidden states. This gives it access to the full input sequence throughout decoding, as sketched below.
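As a rough illustration, here's what one attention step might look like, again assuming PyTorch. This is simplified dot-product attention in the spirit of the Bahdanau/Luong mechanisms, and `attention_context` is a hypothetical helper name:

```python
# One attention step: build a context vector from all encoder states.
import torch
import torch.nn.functional as F

def attention_context(decoder_state, encoder_states):
    # decoder_state:  (batch, hidden)          - decoder's current state
    # encoder_states: (batch, src_len, hidden) - one vector PER source word
    scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2))  # (batch, src_len, 1)
    weights = F.softmax(scores.squeeze(2), dim=1)                   # attention weights
    # Context = weighted sum over ALL encoder states, recomputed at every
    # decoding step, so no single vector must hold the whole sentence.
    context = torch.bmm(weights.unsqueeze(1), encoder_states)       # (batch, 1, hidden)
    return context.squeeze(1), weights

enc = torch.randn(1, 6, 256)  # encoder states for a 6-word sentence
dec = torch.randn(1, 256)     # decoder state at the current step
ctx, w = attention_context(dec, enc)
print(w)  # the weights show which source words the decoder is "looking at"
```

The key design point: the weights are recomputed for every output word, so translating a long sentence no longer hinges on one compressed vector.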
What's next?
In this article, we discussed how translation was done with seq2seq models and how the attention mechanism improved it for long sentences.
Feedback and contributors are welcome! It's online, source-available, and ready for anyone to use.
⭐ Star it on GitHub: https://github.com/HexmosTech/git-lrc