📝 Introduction
I wanted to build a machine learning model that generates emojis for my writing.
When typing sentences, I often feel they need relevant emojis, since that makes them more expressive, right?
Currently I search the web to pick emojis (using sites like Emoji Finder, which match on keywords only).
So I decided to develop an LSTM model that automatically generates emojis for my sentences.
🎥 First Demo
🧑‍🏫 Explanation
When the user types a paragraph, the model predicts an appropriate emoji for each sentence in it.
For example, if a paragraph has 5 sentences, the model will generate 5 emojis.
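To make that concrete, here is a minimal sketch of per-sentence inference, assuming a trained PyTorch classifier and assuming the label order matches the emoji list in the Technical Details section below. The `encode` helper (text to a padded tensor of token ids), the regex sentence splitter, and the function names are illustrative placeholders, not details from the original post.

```python
# Sketch: per-sentence emoji prediction with a trained model (assumed PyTorch).
# `encode` (text -> (1, seq_len) tensor of token ids) is a placeholder.
import re
import torch

EMOJIS = ["❤️", "😍", "😂", "💕", "🔥", "😊", "😎", "✨", "💙", "😘",
          "📷", "🇺🇸", "☀️", "💜", "😉", "💯", "😁", "🎄", "📸", "😜"]

def emojify(paragraph, model, encode):
    # Naive sentence splitting on ., !, ? followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    out = []
    for sentence in sentences:
        with torch.no_grad():
            logits = model(encode(sentence))  # shape: (1, 20)
        out.append(sentence + " " + EMOJIS[logits.argmax(dim=1).item()])
    return " ".join(out)
```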
⚙️ Technical Details
The model currently supports 20 emojis in total: ❤️, 😍, 😂, 💕, 🔥, 😊, 😎, ✨, 💙, 😘, 📷, 🇺🇸, ☀️, 💜, 😉, 💯, 😁, 🎄, 📸, 😜
I used a dataset from Hugging Face and trained on Google Colab.
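The post doesn't name the dataset, but these 20 labels match the emoji subset of TweetEval on Hugging Face, so the loading sketch below assumes that dataset.

```python
# Sketch: loading an emoji-labelled dataset from Hugging Face.
# "tweet_eval"/"emoji" is an assumption: the post doesn't name the dataset,
# but its 20-emoji label set matches this one.
from datasets import load_dataset

dataset = load_dataset("tweet_eval", "emoji")
print(dataset["train"][0])  # e.g. {'text': '...', 'label': 0}; label 0 maps to ❤️
```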
🎯 Model Selection
I decided to use a BiLSTM (bidirectional LSTM, where LSTM stands for Long Short-Term Memory).
A plain unidirectional LSTM would also have worked, since this is not a seq2seq task: we are not translating English into emojis, just classifying sentences.
But here the BiLSTM improves performance by at least 10%,
so I decided to use it.
Note:
We could also have used a BiLSTM with an attention mechanism, or even fine-tuned a pre-trained model like BERT.
But I decided to stick with the plain BiLSTM.
🏗️ Architecture
- First, an embedding layer
- Then a BiLSTM with 2 layers
- Dropout for regularization
- Finally, a linear layer with 20 output units, one per emoji
- Note:
- Since we are using a BiLSTM, it produces hidden states from both directions
- So, before passing to the linear layer, we must concatenate the two final hidden states (see the sketch below)
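Here is a minimal sketch of that architecture, assuming PyTorch (the post doesn't name the framework). The dropout rate of 0.5, the padding index, and the class/variable names are illustrative assumptions; the dimensions follow the hyperparameters in the next section.

```python
# Sketch of the architecture described above, assuming PyTorch.
import torch
import torch.nn as nn

class EmojiBiLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_layers=2, num_classes=20, dropout=0.5):
        super().__init__()
        # padding_idx=0 and dropout=0.5 are assumptions, not from the post
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                              batch_first=True, bidirectional=True,
                              dropout=dropout)
        self.dropout = nn.Dropout(dropout)
        # Forward and backward hidden states are concatenated -> 2 * hidden_dim
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):                    # x: (batch, seq_len)
        embedded = self.embedding(x)         # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.bilstm(embedded)  # h_n: (num_layers*2, batch, hidden_dim)
        # Last layer's forward (h_n[-2]) and backward (h_n[-1]) hidden states
        h = torch.cat((h_n[-2], h_n[-1]), dim=1)  # (batch, 2*hidden_dim)
        return self.fc(self.dropout(h))      # (batch, num_classes)
```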
🎲 Hyperparameters
- Maximum sequence length: 50
- Embedding dimension: 128
- Hidden dimension of the LSTM states: 128
- Number of layers: 2, as stated before
- Learning rate α: 0.0005
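Putting these together, the training setup could look like the following sketch. The optimizer choice (Adam) and the vocabulary size are assumptions, as the post doesn't specify them.

```python
# Sketch: training setup with the hyperparameters above, assuming PyTorch.
import torch.nn as nn
import torch.optim as optim

MAX_LEN = 50                            # maximum sequence length
model = EmojiBiLSTM(vocab_size=20_000)  # EmojiBiLSTM from the sketch above
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=5e-4)  # α = 0.0005
```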
🚨 Accuracy and an Important Lesson
Regarding accuracy: it is not particularly high even with the BiLSTM, and it is even lower with the unidirectional LSTM.
I kept tweaking, trying to improve performance, and only later realized that the data has a class imbalance problem.
Roughly 21% of the data is labelled ❤️, which dominates, followed by about 10% labelled 😍, and so on, down to the least frequent emojis, which are barely represented.
⚠️ Important Action
🥇 First Rule: Data in the training set, development set, and test set should follow the same distribution
After that, maybe we can augment the data synthetically, or perhaps try some other architecture, and so on.
Yes, we can also increase accuracy with the same data, but for now I have saved that for later; a sketch of these ideas follows below.
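As an illustration of the points above, here is a sketch assuming the TweetEval emoji dataset, scikit-learn, and PyTorch. Note that TweetEval already ships with train/validation/test splits, so the stratified split shown is only to illustrate the "same distribution" rule, and the inverse-frequency loss weights are one common way to get more out of the same data.

```python
# Sketch: inspecting class imbalance and two common mitigations.
from collections import Counter

import torch
import torch.nn as nn
from datasets import load_dataset
from sklearn.model_selection import train_test_split

dataset = load_dataset("tweet_eval", "emoji")
labels = dataset["train"]["label"]

# 1. Inspect the imbalance (❤️ dominates at ~21%).
counts = Counter(labels)
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"label {label}: {n / total:.1%}")

# 2. Keep train/dev distributions identical via stratified splitting
#    (illustrative only; TweetEval already provides splits).
texts = dataset["train"]["text"]
X_train, X_dev, y_train, y_dev = train_test_split(
    texts, labels, test_size=0.1, stratify=labels, random_state=42)

# 3. Down-weight frequent classes in the loss (one way to use the same data).
weights = torch.tensor([total / counts[i] for i in range(20)], dtype=torch.float)
criterion = nn.CrossEntropyLoss(weight=weights / weights.sum())
```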
📜 A Little History
I first used a regular LSTM and saw that the model was overfitting.
After tweaking various hyperparameters (it was still overfitting), switching to the BiLSTM along with a few more tweaks showed a small improvement, so I continued with it, and it seems to work fine.
Originally posted: https://ajithraghavan.github.io/blog/emoji-generator/
Thanks for reading