In this blog we will cover the basics of neural networks and generative models, two fundamental concepts in artificial intelligence and machine learning.
I have created a YouTube video along similar lines before; I would recommend going through it to start with:
Before I move ahead, here are one-line definitions of these terms:
💡Artificial Intelligence (AI): It's the simulation of human intelligence in machines that are programmed to think and act like humans.
💡Machine Learning (ML): It's a subset of artificial intelligence, which is broadly defined as the capability of a machine to imitate intelligent human behavior.
💡Deep Learning (DL): It's a subset of machine learning that uses multilayered neural networks, called deep neural networks, to simulate the complex decision-making power of the human brain.
💡Neural Networks: It's a method in artificial intelligence that teaches computers to process data in a way that is inspired by the human brain.
💡Generative Models: These models focus on understanding how the data is generated. They aim to learn the distribution of the data itself. For instance, if we're looking at pictures of boys and girls, a generative model would try to understand what makes a boy look like a boy and a girl look like a girl.
💡Generative Adversarial Network (GAN): It's a deep learning architecture that trains two neural networks to compete against each other to generate more authentic new data from a given training dataset. A GAN is called adversarial because it trains two different networks and pits them against each other.
Artificial Neural Networks (ANNs) are at the very core of deep learning. They are versatile, powerful, and scalable, making them ideal for tackling large and highly complex ML tasks.
Applications of ANNs include classifying billions of images (Google Images), powering speech recognition services (like Apple's Siri), recommending the best videos to watch to millions of users per day (YouTube), and learning to beat the world champion at the game of Go by examining millions of past games and then playing against itself (e.g., DeepMind's AlphaGo).
Artificial neural networks were first introduced in 1943, when Warren McCulloch, a neurophysiologist, and Walter Pitts, a young mathematician, developed the first models of neural networks.
They presented a model of how biological neurons might work together in animal brains to perform complex computations. This was the first ANN architecture.
ANNs then entered a dark era due to a lack of resources. Around 1980 there was a revival of interest in ANNs, as new network architectures were invented and better training techniques were developed.
But by the 1990s, powerful alternative ML techniques such as SVMs (Support Vector Machines) were favoured by most researchers, as they seemed to offer better results and stronger theoretical foundations.
Now the question arises: why are ANNs relevant today?
The answer to it is:
✔️ Firstly, a huge quantity of data is available to train neural networks.
✔️ Secondly, hardware and software resources have improved enormously, especially more powerful GPUs. This tremendous increase in computing power has made it possible to train large neural networks in a reasonable amount of time.
✔️ Thirdly, ANNs frequently outperform other ML techniques on very large and complex problems.
Warren McCulloch and Walter Pitts' model of the biological neuron
The first computational model of a neuron was proposed by Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943. It may be divided into two parts. The first part, g, takes the inputs and performs an aggregation; based on the aggregated value, the second part, f, makes a decision.
The artificial neuron activates its output when more than a certain number of its inputs are active.
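To make this concrete, here is a minimal Python sketch of the McCulloch-Pitts idea: g aggregates the binary inputs, and f fires when the aggregate reaches a threshold. The threshold value is just an illustrative choice.

```python
def g(inputs):
    """Aggregation: sum the binary (0/1) inputs."""
    return sum(inputs)

def f(aggregate, threshold):
    """Decision: fire (1) if the aggregate reaches the threshold, else stay off (0)."""
    return 1 if aggregate >= threshold else 0

# With a threshold of 2, the neuron fires only when at least 2 inputs are active.
print(f(g([1, 0, 1]), threshold=2))   # 1
print(f(g([1, 0, 0]), threshold=2))   # 0
```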
In a simple neural network, every node in one layer is connected to every node in the next layer. There is only a single hidden layer.
Deep learning systems, by contrast, have several hidden layers, which is what makes them deep 😊
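In code, "every node in one layer connected to every node in the next" is just a matrix multiplication. Here is a minimal NumPy sketch of a network with a single hidden layer; the layer sizes and random weights are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal(4)            # 4 input nodes
W1 = rng.standard_normal((3, 4))      # each of 3 hidden nodes connects to all 4 inputs
W2 = rng.standard_normal((1, 3))      # the output node connects to all 3 hidden nodes

hidden = np.tanh(W1 @ x)              # the single hidden layer
output = W2 @ hidden                  # the output layer
print(output)
```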
There are two main types of deep learning systems with differing architectures - convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
The main differences between CNNs and RNNs include the following:
✔️CNNs are commonly used to solve problems involving spatial data, such as images.
✔️RNNs are better suited to analyzing temporal and sequential data, such as text or videos. They are good at natural language tasks like language modeling, speech recognition, and sentiment analysis.
CNNs and RNNs have different architectures.
Convolutional neural networks (CNNs) are one of the most popular models used today. This computational model uses a variation of multilayer perceptrons and contains one or more convolutional layers, which may be followed by pooling or fully connected layers.
RNNs use other data points in a sequence to make better predictions. They do this by taking in an input and reusing the activations of previous nodes (or, in bidirectional RNNs, later nodes) in the sequence to influence the output.
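To make the architectural difference concrete, here is a rough NumPy sketch of the core operation of each: a convolution slides one shared filter across positions, while a recurrent step feeds the previous activation back in at every step. The shapes, values, and tanh nonlinearity are illustrative assumptions, not a full network.

```python
import numpy as np

# CNN core idea: slide one shared filter across the input (1-D for brevity).
def conv1d(signal, kernel):
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])

# RNN core idea: reuse the previous step's activation at every step of a sequence.
def rnn_last_state(sequence, W_x, W_h):
    h = np.zeros(W_h.shape[0])
    for x in sequence:
        h = np.tanh(W_x @ x + W_h @ h)   # current input plus previous activation
    return h

print(conv1d(np.array([1.0, 2.0, 3.0, 4.0]), np.array([0.5, -0.5])))

rng = np.random.default_rng(0)
seq = [rng.standard_normal(3) for _ in range(4)]   # 4 time steps, 3 features each
W_x, W_h = rng.standard_normal((2, 3)), rng.standard_normal((2, 2))
print(rnn_last_state(seq, W_x, W_h))
```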
What is a Perceptron?
It is one of the simplest ANN architectures, invented in 1957 by Frank Rosenblatt. It is based on a slightly different artificial neuron called a Linear Threshold Unit (LTU).
In an LTU, the inputs and outputs are numbers instead of binary on/off values, and each input connection is associated with a weight.
The LTU computes a weighted sum of its inputs, then applies a step function to that sum and outputs the result.
A single LTU can be used for simple linear binary classification: it computes a linear combination of the inputs and, if the result exceeds a threshold, it outputs the positive class; otherwise it outputs the negative class. Just like the logistic regression classifier or linear SVM models in ML.
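Here is a small NumPy sketch of an LTU acting as a linear binary classifier; the weights and threshold are made-up values for illustration, not trained ones.

```python
import numpy as np

def ltu(x, weights, threshold):
    """Linear Threshold Unit: weighted sum, then a step function."""
    z = weights @ x                       # weighted sum of the inputs
    return 1 if z >= threshold else 0     # step: positive (1) or negative (0) class

# Illustrative rule: positive class when x1 + x2 exceeds 1.
w = np.array([1.0, 1.0])
print(ltu(np.array([0.8, 0.9]), w, threshold=1.0))   # 1: positive class
print(ltu(np.array([0.1, 0.2]), w, threshold=1.0))   # 0: negative class
```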
In 2010, Xavier Glorot and Yoshua Bengio published a paper titled "Understanding the difficulty of training deep feedforward neural networks" at the International Conference on Artificial Intelligence and Statistics (AISTATS). They suggested that the vanishing gradient problem in neural networks (gradients become vanishingly small in the early layers, which leads to slow training) can be mitigated by using better activation functions.
In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as back-propagation.
Activation functions make the back-propagation possible since the gradients are supplied along with the error to update the weights and biases.
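As a toy illustration (one made-up training example and learning rate, not the full algorithm), here is a single back-propagation update for one sigmoid neuron in NumPy:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, y_true = np.array([0.5, -1.0]), 1.0      # one made-up training example
w, b, lr = np.array([0.1, 0.2]), 0.0, 0.5   # initial weights, bias, learning rate

y_pred = sigmoid(w @ x + b)                          # forward pass
grad_z = (y_pred - y_true) * y_pred * (1 - y_pred)   # error times sigmoid gradient
w -= lr * grad_z * x                                 # update weights from the output error
b -= lr * grad_z                                     # update bias the same way
```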
Thus we can say that a neural network is just a BIG mathematical function. You can use different activation functions for different neurons in the same or different layers. Different activation functions allow for different non-linearities, which might work better for approximating a specific function.
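To see why better activation functions matter here, this sketch multiplies activation derivatives across layers, as back-propagation effectively does: the sigmoid's derivative never exceeds 0.25, so the product shrinks toward zero, while ReLU's derivative of 1 (for positive inputs) preserves the gradient. The 10-layer depth and the pre-activation value are illustrative.

```python
import numpy as np

def sigmoid_derivative(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1 - s)               # never exceeds 0.25

def relu_derivative(z):
    return 1.0 if z > 0 else 0.0     # exactly 1 for positive inputs

z, layers = 0.5, 10                  # illustrative pre-activation and depth
print(sigmoid_derivative(z) ** layers)  # ~5e-7: the gradient has all but vanished
print(relu_derivative(z) ** layers)     # 1.0: the gradient passes through intact
```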
Before we end this blog (leaving further learning for the next post), let's talk about:
How can AWS help with your deep learning requirements?
Amazon Web Services (AWS) has several Deep Learning offerings. Some examples of AWS services you can use to fully manage specific deep learning applications include:
Amazon Augmented AI (Amazon A2I) enables you to build the workflows required for human review of ML predictions. Amazon A2I brings human review to all developers, removing the undifferentiated heavy lifting associated with building human review systems or managing large numbers of human reviewers.
Amazon CodeGuru Security tracks, detects, and fixes code security vulnerabilities across the entire development cycle.
Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document.
Amazon DevOps Guru uses ML-powered cloud operations to make it easy for developers and operators to improve the performance and availability of their applications.
Amazon Forecast is a deep learning service for time-series forecasting; it uses ML to forecast sales, operations, and inventory needs for millions of items.
Amazon Fraud Detector detects online fraud with ML, enhancing business security practices.
Amazon Translate is powered by deep learning technologies and delivers fast, high-quality, and affordable language translation. It provides highly accurate and continually improving translations with a single API call.
Hope you enjoyed going through this article.
Thanks
Sumbul