Sailee Das

The story of how machines went from mimicking humans to creating ideas of their own.

Did we ever imagine that a machine could one day write poetry, diagnose diseases, or chat like a human?

Surprisingly, machines are now capable of all of this and more.

Here, we will walk through

· how machines are trained with large language models (LLMs) to perform several tasks

· their evolution

· how they work

· how a generative system should be trained to mitigate risks.

Artificial Intelligence (term coined in 1956; machines mimicking the human brain by exploring many possibilities) -> Machine Learning (term coined in 1959; processes structured data using manually derived features) -> Deep Learning (breakthrough around 2012; processes larger unstructured datasets, explores vast possibility spaces, and can suggest promising moves) -> Generative AI (mainstream since 2022; generates creative content, e.g. ChatGPT, Bard).


Evolution of large language models

Large language models (LLMs) are deep learning algorithms trained on massive datasets such as websites, manuals, provider directories, and wikis. GPUs (graphics processing units) are the hardware LLMs use to perform the enormous number of computations this requires.

LLMs employ neural networks, which are loosely inspired by the human brain. Each network has several layers that pass information forward to progressively narrow down the decision space.

1) Recurrent neural network

The input arrives as a sequence; each element is combined with the previous step's output and passed through the processing logic to generate the next output.

Use cases: Language modelling, speech recognition where sequential information is important.

We use this kind of network every day through Alexa or Google's voice search feature.
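As a minimal sketch of the recurrence above, one step can be shown with a single scalar hidden state; the weights `w_x`, `w_h`, and bias `b` here are made-up values for illustration, not learned ones (a real RNN uses learned weight matrices):

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: the new hidden state combines the current
    input with the previous hidden state (all scalars for clarity)."""
    return math.tanh(w_x * x + w_h * h_prev + b)

# Process a sequence one token at a time; each step sees a
# compressed summary of everything that came before it.
h = 0.0
for x in [0.5, -1.0, 0.25]:
    h = rnn_step(x, h, w_x=0.8, w_h=0.5, b=0.0)
```

Because the hidden state is threaded through every step, the final `h` depends on the whole sequence, which is exactly why RNNs suit tasks where order matters.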

2) Convolutional neural network

Convolution layers detect patterns and edges, pooling layers keep only the most important features, and finally the fully connected layer, trained on known categories, classifies the result.

Use cases: Facial recognition, text classification where local patterns are important.

We see this at airports, where DigiYatra screening machines recognise our faces.
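The convolution-then-pooling pipeline can be sketched in one dimension; the kernel `[-1, 1]` below is a hand-picked edge detector chosen for illustration, not a learned filter:

```python
def conv1d(signal, kernel):
    """Valid convolution: slide the kernel along the signal and take
    dot products, responding wherever the local pattern appears."""
    k = len(kernel)
    return [sum(s * w for s, w in zip(signal[i:i + k], kernel))
            for i in range(len(signal) - k + 1)]

def max_pool(features, size=2):
    """Pooling: keep only the strongest response in each window."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

edges = conv1d([0, 0, 1, 1, 0, 0], kernel=[-1, 1])  # rising/falling edges
pooled = max_pool(edges)
```

The convolution fires where the signal rises or falls, and pooling keeps the strongest responses while shrinking the feature map, which is the same pattern-then-compress idea a 2-D CNN applies to images.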

3) GAN (generative adversarial network)

Generator: Generates fake images, which are then validated by the discriminator.

Discriminator: Compares them with real images and produces a score, which is fed back to both the generator and the discriminator.
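The adversarial feedback loop can be sketched with a one-parameter generator chasing a fixed toy discriminator. Everything here (`REAL_MEAN`, the scoring function, the learning rate) is a made-up value for illustration; in a real GAN both networks are neural networks trained together:

```python
REAL_MEAN = 4.0   # centre of the "real data" (made-up value)
gen_param = 0.0   # the generator's single learnable parameter

def discriminator(x):
    """Fixed toy discriminator: scores 1 near the real data, 0 far away.
    (In a real GAN this is itself trained alongside the generator.)"""
    return max(0.0, 1.0 - abs(x - REAL_MEAN) / 4.0)

# Adversarial loop: the generator never sees the real data directly,
# only the discriminator's score, and climbs its gradient
# (estimated here by finite differences).
for _ in range(250):
    eps = 0.01
    grad = (discriminator(gen_param + eps)
            - discriminator(gen_param - eps)) / (2 * eps)
    gen_param += 0.1 * grad
```

After enough rounds the generator's output drifts toward the region the discriminator scores as "real", which is the core mechanic GANs use to produce realistic images.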

4) VAE (variational autoencoder)

The encoder maps input data points to a latent space following a probability distribution, compressing the data. Random points are then sampled from this space and reconstructed by the decoder to produce the output.
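A minimal sketch of that encode/sample/decode cycle, assuming a made-up one-dimensional "learned" mapping (a real VAE learns the encoder and decoder as neural networks):

```python
import math
import random

random.seed(0)

def encode(x):
    """Toy encoder: maps a data point to the mean and log-variance of a
    latent Gaussian. Halving is a made-up stand-in for a learned mapping."""
    return x / 2.0, math.log(0.1)

def decode(z):
    """Toy decoder: inverts the compression."""
    return z * 2.0

mu, log_var = encode(3.0)
# Sample a random point near the latent mean (the "reparameterisation
# trick"): z = mu + sigma * noise.
z = mu + math.exp(0.5 * log_var) * random.gauss(0, 1)
reconstruction = decode(z)
```

Because the latent point is sampled rather than fixed, decoding different samples yields different but plausible reconstructions, which is what makes VAEs generative rather than simple compressors.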

5) Transformer neural network: consists of an input layer, hidden layers, and an output layer. The input passes through a series of hidden layers, including an embedding layer, self-attention layers, and feed-forward layers, to produce the output.


Use cases: Summarization, translation, question answering where long range dependencies and context understanding are essential.
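The self-attention step at the heart of a transformer can be sketched in plain Python. This simplification uses the input vectors directly as queries, keys, and values; real transformers learn separate projection matrices for each:

```python
import math

def softmax(xs):
    """Normalise raw scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = input.
    Each output is a weighted mix of ALL input vectors, which is how
    transformers capture long-range dependencies in one step."""
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, vectors))
                    for i in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token embeddings
mixed = self_attention(tokens)
```

Unlike an RNN, no step waits on the previous one here: every token attends to every other token directly, which is why transformers handle long-range context so well.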

Modern generative language models are built on transformers, which have two parts passing information over the network.

· Encoder: Consists of multiple layers that extract context by capturing long-range dependencies

· Decoder: Generates the output

(Figure: transformer network layers)


For example, when the Generative Pre-trained Transformer (GPT) predicts the next word, it converts every word of the input into a vector: a multi-dimensional representation of the data where each element represents the magnitude along one dimension. Each vector is simply a list of numbers in a column.

Consider a sentence that has to be completed, with every word represented as a point on an X-Y plane.

The next word could be "experience" or "process", depending on the context.

Words with similar meanings end up as closely spaced vectors, much like neighbours living in the same area on a map. Positional encoding is then applied to preserve the order of the words. The vectors pass through an attention block, which weights each word according to its relevance in the context, and then through a feed-forward layer, which applies the same learned transformation to every word, building a broader context. This repeats layer after layer until the model produces the output: a vector of probabilities over all possible tokens that could come next.
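The idea that similar words sit close together can be sketched with cosine similarity over toy embeddings; the 2-D values below are invented for illustration, not real learned embeddings:

```python
import math

# Toy 2-D word embeddings (hypothetical values for illustration;
# real models use hundreds or thousands of dimensions).
embeddings = {
    "king":  [0.9, 0.8],
    "queen": [0.85, 0.82],
    "apple": [-0.7, 0.1],
}

def cosine_similarity(a, b):
    """Closely spaced vectors (similar meanings) score near 1;
    unrelated ones score near 0 or below."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

sim_related = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
```

Related words score close to 1 while unrelated words score far lower, which is the geometric "neighbours on a map" intuition in action.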

LLMs can be further classified by the direction of attention, for example bidirectional models like BERT versus left-to-right models like GPT.

So far, we have covered how generative systems work and make our lives easier.

But even amid this brilliance, there are still shortcomings to overcome.

· Because GenAI is trained on human-generated data, it can inherit human biases, which may result in inequalities.

· Models may hallucinate, confidently producing plausible but false information.

· GPT may not always provide accurate information.

· GPT has a fixed context size.

· There are privacy issues: inputting or revealing confidential business data, private individuals' information, or intellectual property can carry legal penalties.

As these systems continue to evolve, the real challenge is not just teaching machines to imagine, but teaching them to do so responsibly.

Ideally, a generative AI model should be designed so that these risks are mitigated, following a few approaches:

· The model should only be fed data whose copyright and confidentiality have been considered.

· The output should be watermarked to validate its authenticity.

· The output should comply with risk policies to avoid enabling cybersecurity attacks.

· It should be retrained on updated datasets at regular intervals.

The story of Generative AI is still being written. The question is: how do we, as humans, guide it?

What do you think — should we let machines imagine for us, or should imagination remain a human gift? Share your thoughts below — I’d love to hear your perspective.
