DEV Community

Abu Tahir

Posted on • Originally published at dev.to

Excited About GitHub Copilot? Use It at Your Own Risk!

A thorough analysis of how GitHub Copilot works

So recently I was surfing the web when I came across a YouTube video on GitHub Copilot. It amazed me to see how AI is transforming the lives of programmers all around the globe. But something felt off: the presenter was praising it very heavily, which didn't seem right for a test version of the software, so I decided to take a deep dive into how the system works.

If you don't know what GitHub Copilot is, let me tell you: it is an AI system released by GitHub and OpenAI that suggests code as you type and can even generate an entire function from the comments you provide! That is why it is also called an AI pair programmer.

Think of it like this: when we get stuck while writing code, we usually go online, search Stack Overflow, and paste what we find or use it as a suggestion, right? Copilot takes away much of that pain, because we write a neat comment and get code suggested right in the editor!
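
To make this concrete, here is a small hand-written illustration of that workflow: a descriptive comment, followed by the kind of function an AI pair programmer might propose for it. The function name is_palindrome and its body are my own example, not actual Copilot output.

```python
# Check whether a string is a palindrome, ignoring case, spaces and punctuation.
def is_palindrome(text: str) -> bool:
    # Keep only letters and digits, lowercased, so formatting does not matter.
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    # A palindrome reads the same forwards and backwards.
    return cleaned == cleaned[::-1]


print(is_palindrome("A man, a plan, a canal: Panama"))
print(is_palindrome("GitHub Copilot"))
```

The more precisely the comment states the rules (here: ignore case, spaces and punctuation), the less room there is for a suggestion that does something slightly different.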

So what will companies hire for next? Good comment writers? Just kidding!

How does GitHub Copilot work?

Well, to understand this, let's first understand a similar system developed by OpenAI.

As I mentioned before, Copilot was released by GitHub and OpenAI. OpenAI had also developed a similar AI system called Codex, which converts natural language into programming code. Codex is based on GPT, a transformer language model that is pre-trained on billions of lines of text and uses deep learning to produce human-like text. Codex itself is trained on publicly available code from GitHub, and it produces code that you can treat as a suggestion or use directly.

Check out the live demonstration of Codex

Codex was first designed to test how well an AI could write Python code. It did well for a first attempt, but its accuracy was quite low: on a benchmark called HumanEval, where each problem is described in a docstring, it solved 28.8% of the problems. That's fine, though, because generating code that no person has written is something quite unique. The results improved with repeated sampling: when 100 samples were generated per problem, about 70% of the problems were solved by at least one of them.
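
The idea behind repeated sampling can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the three "samples" are hand-written lambdas standing in for generated code, and the helper names passes_tests and solved_by_sampling are hypothetical, not part of Codex or HumanEval.

```python
from typing import Callable, List, Tuple

Candidate = Callable[[int], int]
TestCase = Tuple[int, int]  # (argument, expected result)


def passes_tests(candidate: Candidate, cases: List[TestCase]) -> bool:
    """Return True only if the candidate is correct on every test case."""
    try:
        return all(candidate(arg) == expected for arg, expected in cases)
    except Exception:
        # A candidate that crashes counts as a failure.
        return False


def solved_by_sampling(candidates: List[Candidate], cases: List[TestCase]) -> bool:
    """A problem counts as solved if at least one sampled candidate passes."""
    return any(passes_tests(c, cases) for c in candidates)


# Hypothetical task: "return the square of n", with three stand-in samples.
samples = [
    lambda n: n * 2,   # wrong: doubles instead of squaring
    lambda n: n ** 3,  # wrong: cubes
    lambda n: n * n,   # correct
]
unit_tests = [(0, 0), (2, 4), (3, 9), (-4, 16)]

first_try = solved_by_sampling(samples[:1], unit_tests)
with_three = solved_by_sampling(samples, unit_tests)
print(first_try, with_three)  # prints: False True
```

Notice that the improvement depends on having unit tests to pick the winner. In the editor there are no such tests, so you are the one who has to tell the good sample from the bad ones.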

How is it related to GitHub Copilot?

Codex serves as the underlying foundation for GitHub Copilot. However, Copilot uses a different variant of OpenAI Codex: unlike the original Codex, which was first built around Python, Copilot can work with a wide range of programming languages, which makes it very powerful.

It can offer around 10 suggestions to choose from, and you can use one directly or change it according to your needs!

The high-level design of GitHub Copilot

Let's look at the design in more detail. At a high level, public code is fed to the model as input; the model processes the information and routes the results through an intermediate API, in our case the GitHub Copilot service, which returns the suggestions.

One cool thing is that Copilot records whether you accept or reject each suggested piece of code, and it learns from that feedback.
Put simply, I would say it follows the strategy of any basic chatbot: understand the intent, then respond with matching patterns.

Going one step further, it uses something like context-tracking techniques to keep track of the comments you provide. In this case, however, only a part of your code is sent to the GitHub Copilot service for processing, not the whole codebase, so you don't have to worry too much.

But what's the risk?

GitHub Copilot is powered by OpenAI Codex in the background, and the data it learned from is public code from various sources on the internet, including public GitHub repos. This code might be full of bugs, and it might also contain sensitive information, so Copilot can give unexpected results that may not always work. It's up to you as the programmer to accept or reject them.

The comments you provide must be very specific to get the most out of your pair programmer; otherwise, it may give undesirable results.

According to the most recent official results from the Copilot team, the model got the answer right around 43% of the time on the first try and 57% of the time when allowed more attempts, so you can see it's still in a learning phase.

The code generated is based on the data the model learned from. It can be faulty code as well as the best code available, and the system does not actually test it, so use it at your own risk and double-check before using it! Also, because it learned from billions of lines of code, some of the syntax, libraries, or APIs it suggests may be deprecated by now, so we must be very careful. This is why it is a copilot: you are the actual pilot, and you are responsible for accepting or rejecting the code.
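
Here is a minimal sketch of what double-checking can look like in practice. Both functions are my own hypothetical examples, not real Copilot output, and returning 0.0 for an empty list is just one possible design choice. The point is that a few quick checks catch a bug that is easy to miss when reading the code.

```python
def suggested_average(values):
    # Hypothetical AI suggestion: looks fine, but divides by zero on an empty list.
    return sum(values) / len(values)


def reviewed_average(values):
    # Version after human review: the empty-list case is handled explicitly.
    if not values:
        return 0.0
    return sum(values) / len(values)


def check(average):
    """Run a few quick checks and report whether the function survives them."""
    try:
        assert average([2, 4, 6]) == 4
        assert average([5]) == 5
        assert average([]) == 0.0
    except (AssertionError, ZeroDivisionError):
        return False
    return True


print(check(suggested_average))  # prints: False
print(check(reviewed_average))   # prints: True
```

Writing the checks before accepting a suggestion keeps you in the pilot's seat: the suggestion has to meet your expectations, not the other way around.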

Conclusion

I personally believe AI is booming in this century. We can see AI-powered systems everywhere and the entire world is turning into an AI playground. 

In this article, we saw a marvelous use case of AI. GitHub Copilot is a tremendous piece of work, and I believe it will improve in ways we can't even imagine. For now, however, especially from a beginner's perspective, one has to be careful when using AI-generated code.

With that said, thank you for reading. Happy coding!

Top comments (2)

pdfreviewer

It was useful, but I uninstalled the IDE plugin once they announced to charge users.

Krishna Agarwal

Sometimes the code CoPilot suggests is not related but it's helpful for sure.