<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shambhavi Mishra</title>
    <description>The latest articles on DEV Community by Shambhavi Mishra (@shambhavicodes).</description>
    <link>https://dev.to/shambhavicodes</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F458583%2F54996e4e-0f80-408b-a01c-da70e1fa5f19.jpeg</url>
      <title>DEV Community: Shambhavi Mishra</title>
      <link>https://dev.to/shambhavicodes</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shambhavicodes"/>
    <language>en</language>
    <item>
      <title>All About Autoencoders!</title>
      <dc:creator>Shambhavi Mishra</dc:creator>
      <pubDate>Sat, 24 Oct 2020 09:56:11 +0000</pubDate>
      <link>https://dev.to/shambhavicodes/all-about-autoencoders-3a36</link>
      <guid>https://dev.to/shambhavicodes/all-about-autoencoders-3a36</guid>
      <description>

</description>
    </item>
    <item>
      <title>Introduction to Generative Modeling</title>
      <dc:creator>Shambhavi Mishra</dc:creator>
      <pubDate>Sat, 24 Oct 2020 09:55:08 +0000</pubDate>
      <link>https://dev.to/shambhavicodes/introduction-to-generative-modeling-57jm</link>
      <guid>https://dev.to/shambhavicodes/introduction-to-generative-modeling-57jm</guid>
      <description>

</description>
    </item>
    <item>
      <title>Paying Attention Again!</title>
      <dc:creator>Shambhavi Mishra</dc:creator>
      <pubDate>Sat, 24 Oct 2020 09:54:42 +0000</pubDate>
      <link>https://dev.to/shambhavicodes/paying-attention-again-3c1c</link>
      <guid>https://dev.to/shambhavicodes/paying-attention-again-3c1c</guid>
      <description>&lt;p&gt;The Transformer Architecture [1] introduced by Vaswani et al, is based on attention mechanism and overcomes the challenges faced in recurrence. In continuation to the last blog 'Let's pay some Attention!', let's delve deeper into Attention Mechanism.&lt;/p&gt;

&lt;p&gt;Recurrent Neural Networks (RNNs) were introduced to handle sequential data, but optimisation tends to take longer for RNNs because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The number of gradient-descent iterations or steps is higher with recurrence.&lt;/li&gt;
&lt;li&gt;There are several sequential operations which cannot be parallelised easily.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Multi-Head Attention
&lt;/h2&gt;

&lt;p&gt;Let's do a quick recap of the attention mechanism we covered in the last blog. &lt;br&gt;
We have some key-value pairs and a query. We compare the query to each key, and the key with the highest similarity score is assigned the highest weight. To generate an output, we then take a weighted combination of the corresponding values. &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Gf0Wxozy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/23oavy7fci5j71gkpob0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Gf0Wxozy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/23oavy7fci5j71gkpob0.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the flow chart above, we feed the Value (V), Key (K) and Query (Q) into linear layers that produce, say, 3 projections each of V, K and Q. We then compute &lt;em&gt;scaled dot-product attention&lt;/em&gt; for each projection, one head per set of projections, giving us 3 heads of scaled dot-product attention. We concatenate these heads and feed the concatenated output into a linear layer, which then outputs the &lt;strong&gt;multi-head attention&lt;/strong&gt;.&lt;br&gt;
In the case of multi-head attention, &lt;em&gt;we compute multiple attentions per query with different weights&lt;/em&gt;. &lt;/p&gt;
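&lt;p&gt;To make this concrete, here is a minimal NumPy sketch of multi-head attention (not code from the original post); the head count, dimensionality and random projection weights are purely illustrative assumptions:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]                      # dimensionality of each key
    scores = Q @ K.T / np.sqrt(d_k)        # similarity of every query with every key
    return softmax(scores) @ V             # weighted combination of the values

def multi_head_attention(Q, K, V, num_heads=3):
    d_model = Q.shape[-1]
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # one linear projection of Q, K and V per head (random weights, for illustration)
        Wq = rng.standard_normal((d_model, d_head))
        Wk = rng.standard_normal((d_model, d_head))
        Wv = rng.standard_normal((d_model, d_head))
        heads.append(scaled_dot_product_attention(Q @ Wq, K @ Wk, V @ Wv))
    Wo = rng.standard_normal((num_heads * d_head, d_model))
    return np.concatenate(heads, axis=-1) @ Wo   # concatenate heads, then a final linear layer

X = rng.standard_normal((5, 12))           # 5 positions, embedding size 12
print(multi_head_attention(X, X, X).shape) # self-attention output: (5, 12)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Passing the same matrix as Q, K and V, as in the last line, gives the self-attention used inside the Transformer.&lt;/p&gt;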

&lt;h2&gt;
  
  
  Masked Multi-Head Attention
&lt;/h2&gt;

&lt;p&gt;When decoding the output from the encoder, an output value should depend only on previous outputs and not on future outputs. Thus, to ensure that future values receive zero attention, we mask them. We define masked multi-head attention as multi-head attention where some values are masked, and the probabilities of the masked values are nullified to prevent them from being selected.&lt;br&gt;
Mathematically, as illustrated below, masked multi-head attention is calculated by adding 'M', a mask matrix of zeroes and negative infinities, to the scaled dot products of the queries and keys before the softmax.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--10DaTfxI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/j3bvppfefpdq4wiczzyf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--10DaTfxI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/j3bvppfefpdq4wiczzyf.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Q&lt;sup&gt;T&lt;/sup&gt; : Transpose of Query&lt;br&gt;
K : Key&lt;br&gt;
V : Value&lt;br&gt;
dk : Dimensionality of each key&lt;br&gt;
M : Mask Matrix of 0s and -∞&lt;/p&gt;
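&lt;p&gt;To make the formula concrete, here is a small NumPy sketch (an illustration, not code from the original post) that builds M and adds it to the scaled scores before the softmax; it uses the row-vector convention QK&lt;sup&gt;T&lt;/sup&gt; rather than the Q&lt;sup&gt;T&lt;/sup&gt;K form shown above:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

def causal_mask(n):
    # M: 0 where a position may attend, negative infinity for future positions
    M = np.zeros((n, n))
    M[np.triu_indices(n, k=1)] = -np.inf
    return M

def masked_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k) + causal_mask(Q.shape[0])  # add the mask before softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # masked entries get probability 0
    return weights @ V
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
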

&lt;h2&gt;
  
  
  Layer Normalisation
&lt;/h2&gt;

&lt;p&gt;Layer normalisation [2], introduced by Lei Ba et al., plays a key role in the performance of the Transformer.&lt;br&gt;
Layer normalisation removes the interdependency between weights and their constant change during computation. Normalisation ensures that, regardless of how we set the weights, the outputs of a layer have a mean of 0 and a variance of 1. Since the scale of these outputs is then the same, convergence is faster.&lt;/p&gt;

&lt;p&gt;In &lt;strong&gt;layer normalisation&lt;/strong&gt;, normalisation is performed across the units of a layer for a single input, whereas in &lt;strong&gt;batch normalisation&lt;/strong&gt; it is performed for each hidden unit by normalising across a batch of inputs.&lt;/p&gt;
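&lt;p&gt;The difference can be seen in a few lines of NumPy (a minimal sketch, ignoring the learnable gain and bias parameters):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

x = np.random.randn(4, 8)   # a batch of 4 inputs, each with 8 hidden units

# layer normalisation: statistics per input, across its 8 units
layer_norm = (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + 1e-5)

# batch normalisation: statistics per hidden unit, across the batch of 4 inputs
batch_norm = (x - x.mean(axis=0, keepdims=True)) / np.sqrt(x.var(axis=0, keepdims=True) + 1e-5)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
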

&lt;h2&gt;
  
  
  Understanding Complexities
&lt;/h2&gt;

&lt;p&gt;In a self-attention network, a layer consists of 'n' positions, and each position has dimensionality 'd'. Computation in one layer is of order 'n&lt;sup&gt;2&lt;/sup&gt;' because every position attends to every other position, and for each such pair we compute an embedding of dimensionality 'd'. Thus, the complexity of every layer is 'n&lt;sup&gt;2&lt;/sup&gt;d'. &lt;/p&gt;
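&lt;p&gt;For example, with n = 100 positions and dimensionality d = 512, a single self-attention layer performs on the order of 100&lt;sup&gt;2&lt;/sup&gt; × 512 ≈ 5 million operations, whereas, per the comparison in [1], a recurrent layer of the same width needs on the order of n·d&lt;sup&gt;2&lt;/sup&gt; = 100 × 512&lt;sup&gt;2&lt;/sup&gt; ≈ 26 million operations, performed sequentially.&lt;/p&gt;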

&lt;p&gt;In this blog we covered multiple topics associated with the concept of Attention. &lt;/p&gt;

&lt;p&gt;References :&lt;br&gt;
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17). Curran Associates Inc., Red Hook, NY, USA, 6000–6010.&lt;br&gt;
[2] J. L. Ba, J. R. Kiros, and G. E. Hinton. Layer normalization. arXiv preprint arXiv:1607.06450, 2016.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>deeplearning</category>
      <category>neuralnetwork</category>
    </item>
    <item>
      <title>Let's pay some Attention!</title>
      <dc:creator>Shambhavi Mishra</dc:creator>
      <pubDate>Fri, 28 Aug 2020 18:05:12 +0000</pubDate>
      <link>https://dev.to/shambhavicodes/let-s-pay-some-attention-33d0</link>
      <guid>https://dev.to/shambhavicodes/let-s-pay-some-attention-33d0</guid>
      <description>&lt;p&gt;Before discussing a new technology or methodology, we should try to understand the need of it. And so, let us know what gave path to the Transformer Networks. &lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges with Recurrent Neural Networks
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--znbvsl7P--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ptwajkhdgj3i1zhgsqgv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--znbvsl7P--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ptwajkhdgj3i1zhgsqgv.png" alt="Image from mc.ai"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;(Image Source : mc.ai)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Gradients are simply vectors pointing in the direction of the highest rate of increase of a function. During backpropagation, gradients go through matrix multiplication many times via the chain rule. Small gradients get smaller and smaller until they vanish, making it harder to train the weights. This is called the vanishing gradient problem.&lt;br&gt;
Conversely, if a gradient is large, repeated multiplication makes it grow and results in very large updates to the network. This is known as the exploding gradient problem. &lt;/p&gt;

&lt;p&gt;Another challenge with RNNs is that of recurrence itself: recurrence prevents parallel computation.&lt;br&gt;
Also, a large number of training steps is required to train an RNN.&lt;/p&gt;

&lt;p&gt;The solution to all these problems is &lt;strong&gt;Transformers&lt;/strong&gt;!&lt;br&gt;
As the title says, &lt;em&gt;Attention is all you need&lt;/em&gt; by &lt;a href="https://arxiv.org/abs/1706.03762"&gt;Vaswani et al. (2017)&lt;/a&gt; is the paper that introduced the concept of Transformers.&lt;br&gt;
Let us first understand the &lt;strong&gt;Attention Mechanism&lt;/strong&gt;.&lt;br&gt;
Below is an image from my notes of &lt;a href="https://www.youtube.com/channel/UC7ZVvEo7-B7lA6LY2MVX72A"&gt;Prof. Pascal Poupart's&lt;/a&gt; lecture on Transformers.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Uf3YvhWU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/nvhmz014asq2ygtwip42.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Uf3YvhWU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/nvhmz014asq2ygtwip42.jpg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The attention mechanism mimics the retrieval of a value (v) for a query (q) based on a key (k) in a database. &lt;br&gt;
Given a query and some keys (k1, k2, k3, k4), we aim to produce an output that is a linear combination of the values, where the weights come from the similarity between our query and the keys. &lt;br&gt;
In the diagram above, the first layer consists of the keys (vectors). We generate the next layer by comparing these keys with the query (q), so the second layer consists of similarities (s).&lt;/p&gt;

&lt;p&gt;We take the softmax of these similarities to yield another layer (a). The product of the weights in (a) with the values (v) gives us the attention value. &lt;/p&gt;
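&lt;p&gt;Here is a tiny NumPy sketch of those steps (the numbers are made up purely for illustration, not taken from the lecture notes):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

q = np.array([1.0, 0.0, 1.0])              # query
K = np.array([[1.0, 0.0, 1.0],             # keys k1..k4
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
V = np.array([[1.0, 2.0],                  # values v1..v4
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])

s = K @ q                                  # similarities (s) between the query and each key
a = np.exp(s) / np.exp(s).sum()            # softmax layer (a)
print(a @ V)                               # attention value: weighted combination of the values
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
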

&lt;p&gt;So far we have understood what gave rise to the need for &lt;em&gt;Attention&lt;/em&gt; and what exactly the &lt;em&gt;Attention Mechanism&lt;/em&gt; is.&lt;br&gt;
What more will we cover?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multihead Attention&lt;/li&gt;
&lt;li&gt;Masked Multihead Attention&lt;/li&gt;
&lt;li&gt;Layer Normalisation &lt;/li&gt;
&lt;li&gt;Positional Embedding&lt;/li&gt;
&lt;li&gt;Comparison of Self Attention and Recurrent Layers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's cover all this in the next blog! &lt;br&gt;
You can follow me on &lt;a href="https://twitter.com/ShambhaviCodes"&gt;twitter&lt;/a&gt; where I share all the good content and blogs! &lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>neuralnetwork</category>
      <category>transformers</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Text Classification using ELMo Embedding</title>
      <dc:creator>Shambhavi Mishra</dc:creator>
      <pubDate>Thu, 27 Aug 2020 18:05:40 +0000</pubDate>
      <link>https://dev.to/shambhavicodes/text-classification-using-elmo-embedding-4o79</link>
      <guid>https://dev.to/shambhavicodes/text-classification-using-elmo-embedding-4o79</guid>
      <description>&lt;p&gt;When we talk about supervised learning, a much exploited task is &lt;em&gt;'Text or Image Classification'&lt;/em&gt;. Today we will discuss Text Classification on BBC News Dataset.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dataset
&lt;/h3&gt;

&lt;p&gt;We’ll use a public dataset from the &lt;strong&gt;BBC&lt;/strong&gt; comprising 2225 articles, each labelled under one of 5 categories: business, entertainment, politics, sport or tech.&lt;br&gt;
The dataset is split into 1490 records for training and 735 for testing. The goal is to build a system that can accurately classify previously unseen news articles into the right category.&lt;/p&gt;
&lt;h3&gt;
  
  
  Preprocessing
&lt;/h3&gt;

&lt;p&gt;We cannot feed raw, human-readable text directly to our model. Preprocessing text involves multiple tasks, such as stemming (reducing a word to its root) and stopword removal (eliminating repetitive, redundant words, or simply stopwords). Preprocessing of text cannot be generalised and is very specific to the task and the domain of the data. Since our dataset is fairly simple and this is a beginner-focused tutorial, we will remove stopwords to preprocess our data.&lt;/p&gt;

&lt;p&gt;We load our data using &lt;em&gt;Pandas&lt;/em&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
data = pd.read_csv('Filename.csv')
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We use &lt;strong&gt;NLTK&lt;/strong&gt;, the Natural Language Toolkit, a Python library for working with text. Stop words are, for example, repetitive words, articles and conjunctions which do not add value to the text from an NLP perspective. The NLTK library makes our task easy by providing a list of commonly occurring stopwords.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from nltk.corpus import stopwords
stop_words = stopwords.words( ' english ' )
print(stop_words)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;&lt;br&gt;
['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', 'your', 'yours',&lt;br&gt;
'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', 'her', 'hers',&lt;br&gt;
'herself', 'it', 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves',&lt;br&gt;
'what', 'which', 'who', 'whom', 'this', 'that', 'these', 'those', 'am', 'is', 'are',&lt;br&gt;
'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does',&lt;br&gt;
'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until',&lt;br&gt;
'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into',&lt;br&gt;
'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down',&lt;br&gt;
'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here',&lt;br&gt;
'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more',&lt;br&gt;
'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so',&lt;br&gt;
'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', 'should', 'now', 'd',&lt;br&gt;
'll', 'm', 'o', 're', 've', 'y', 'ain', 'aren', 'couldn', 'didn', 'doesn', 'hadn',&lt;br&gt;
'hasn', 'haven', 'isn', 'ma', 'mightn', 'mustn', 'needn', 'shan', 'shouldn', 'wasn',&lt;br&gt;
'weren', 'won', 'wouldn']&lt;/p&gt;

&lt;p&gt;We also encode our labels for the classification task.&lt;/p&gt;
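&lt;p&gt;For instance, assuming the dataframe loaded earlier has 'text' and 'category' columns (the column names here are an assumption, not taken from the original notebook), stopword removal and label encoding might look like this sketch:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

# drop stopwords from each article (column names are hypothetical)
data['text'] = data['text'].apply(
    lambda doc: ' '.join(w for w in doc.split() if w.lower() not in stop_words))

# encode the 5 category names as integers, then as one-hot vectors
encoder = LabelEncoder()
labels = to_categorical(encoder.fit_transform(data['category']))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
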
&lt;h3&gt;
  
  
  Embedding
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Word Embedding Model&lt;/em&gt; was a key breakthrough for learning representations for text where similar words have a similar representation in the vector space. &lt;br&gt;
&lt;em&gt;ELMo is a deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). These word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. They can be easily added to existing models and significantly improve the state of the art across a broad range of challenging NLP problems, including question answering, textual entailment and sentiment analysis.&lt;/em&gt; - ELMo was developed by the &lt;a href="https://allennlp.org/elmo"&gt;Allen Institute for AI&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A contextualized word representation is a representation of a word that depends heavily on the surrounding words. ELMo takes the entire input text into account before generating an embedding, so as to capture the semantics of the text.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is polysemy?&lt;/strong&gt;&lt;br&gt;
Polysemy is the capability of a word to possess more than one meaning.&lt;br&gt;
&lt;em&gt;Bright&lt;/em&gt; means 'shining' as well as 'intelligent'.&lt;/p&gt;

&lt;p&gt;ELMo addresses these problems of text data modeling.&lt;/p&gt;

&lt;p&gt;I shall discuss different types of SOTA embeddings in more detail in another post.&lt;br&gt;
A pre-trained ELMo embedding model, trained on the 1 Billion Word Benchmark, is available on &lt;a href="https://tfhub.dev/google/elmo/1"&gt;Tensorflow-Hub&lt;/a&gt;.&lt;br&gt;
Let's code!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import tensorflow as tf
import tensorflow_hub as hub
embed = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
def ELMoEmbedding(x):
    return embed(tf.squeeze(tf.cast(x, tf.string)), signature="default", as_dict=True)["default"]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Training our model, we achieve an accuracy of 0.91 and a categorical cross-entropy loss of 0.28.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;input_text = Input(shape=(1,), dtype=tf.string)
embedding = Lambda(ELMoEmbedding, output_shape=(1024, ))(input_text)
dense = Dense(256, activation='relu')(embedding)
pred = Dense(5, activation='softmax')(dense)
model = Model(inputs=[input_text], outputs=pred)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
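
&lt;p&gt;A training call might then look like the sketch below; x_train and y_train (arrays of article strings and one-hot labels) are assumed to come from the preprocessing above, and the TF1-style session setup is needed because of hub.Module:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from tensorflow.keras import backend as K

# Hypothetical x_train / y_train: arrays of article strings and one-hot labels
with tf.Session() as session:
    K.set_session(session)
    session.run(tf.global_variables_initializer())  # initialise the ELMo module's variables
    session.run(tf.tables_initializer())
    model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;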



&lt;p&gt;You can view the notebook &lt;a href="https://github.com/ShambhaviCodes/Text-Classification-using-ELMo-Embedding/blob/master/Text_Classification.ipynb"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>beginners</category>
      <category>tutorial</category>
      <category>naturallanguageprocessing</category>
    </item>
    <item>
      <title>Python to Neural Networks : A Guide for Beginners</title>
      <dc:creator>Shambhavi Mishra</dc:creator>
      <pubDate>Wed, 26 Aug 2020 17:56:05 +0000</pubDate>
      <link>https://dev.to/shambhavicodes/python-to-neural-networks-a-guide-for-beginners-5ahg</link>
      <guid>https://dev.to/shambhavicodes/python-to-neural-networks-a-guide-for-beginners-5ahg</guid>
      <description>&lt;p&gt;Last Month, I completed a year of my exploration with Machine Learning and Data Science and thus, I decided to pen down the resources I have followed till now. &lt;/p&gt;

&lt;p&gt;I don’t claim this is the path to be followed; this is just my share of experiences and mistakes which landed me here. This summary is for all my peers who, like me, find themselves lost on the ‘how-tos’ of Data Science. &lt;/p&gt;

&lt;h2&gt;
  
  
  Python 101
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Video Resources I followed :
&lt;/h4&gt;

&lt;p&gt;Telusko’s Python playlist: I started learning Python with &lt;a href="https://www.youtube.com/playlist?list=PLsyeobzWxl7poL9JTVyndKe62ieoN-MZ3"&gt;this&lt;/a&gt; series by Navin Reddy, which built my foundations for this journey.&lt;br&gt;
I have also followed lectures by Charles Severance; you can find them &lt;a href="https://www.py4e.com/"&gt;here&lt;/a&gt;.&lt;br&gt;
&lt;a href="https://www.youtube.com/user/sentdex"&gt;Sentdex&lt;/a&gt; is a much-recommended YouTube channel; I came across it pretty late.&lt;/p&gt;

&lt;h4&gt;
  
  
  Books I followed :
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Automate the Boring Stuff with Python&lt;/li&gt;
&lt;li&gt;Fluent Python&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Machine Learning
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Video Resources I followed :
&lt;/h4&gt;

&lt;p&gt;I started with a Udemy course that gave an overview of all the algorithms and their implementation: &lt;a href="https://www.udemy.com/course/machinelearning/"&gt;Machine Learning A-Z&lt;/a&gt; by Kirill Eremenko.&lt;br&gt;
While exploring project ideas after the course, I realised I was missing something deeper and more mathematical when I read that everyone was doing the Machine Learning course by Andrew Ng. Truly, the mathematical concepts built in this course helped me sail smoothly through the next phase. &lt;a href="https://www.youtube.com/watch?v=PPLop4L2eGk&amp;amp;list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN"&gt;Here’s&lt;/a&gt; the link to the course. &lt;br&gt;
You can find it on Coursera too. &lt;/p&gt;

&lt;h4&gt;
  
  
  Books I followed :
&lt;/h4&gt;

&lt;p&gt;I have relied mainly on Machine Learning Mastery with Python by Jason Brownlee.&lt;br&gt;
I also followed the O’Reilly Python Data Science Handbook for a few things.&lt;/p&gt;

&lt;h2&gt;
  
  
  Delving Deeper to the Neural Nets
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Deeplearning.ai’s Specialisation Courses: again, I finished them on YouTube (I didn’t know about financial aid on Coursera a year back). &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cs230.stanford.edu/"&gt;Stanford cs230&lt;/a&gt;: another course taught by Andrew Ng, based on Deep Learning and its applications.&lt;/li&gt;
&lt;li&gt;
&lt;a href="http://web.stanford.edu/class/cs224n/"&gt;Stanford cs224n&lt;/a&gt;: a course on Natural Language Processing; it was my stepping stone to NLP. I learnt by doing the assignments and following up with books and blogs (say, Olah’s blog for LSTMs).&lt;/li&gt;
&lt;li&gt;
&lt;a href="http://cs231n.stanford.edu/"&gt;Stanford cs231n&lt;/a&gt;: while learning Computer Vision, this course has been my guide. Solving the assignments and trying out related projects was enough to substantiate the theory. I can also share my handwritten notes on this series, if required. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;I can’t emphasize enough the importance of trying out self-paced projects to implement whatever you learn! Through the year I have finished many projects and some internship assignments that helped me learn so much.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
      <category>beginners</category>
      <category>computerscience</category>
    </item>
  </channel>
</rss>
