Michael Jolley for Deepgram

Posted on • Originally published at dpgr.am

How Does GPT-3 Work?

GPT-3 is a large language model (LLM), and it’s been making headlines in nearly every industry. From the release of the seemingly self-aware ChatGPT to copywriting and coding AI apps, GPT-3 is spreading like wildfire, first through the tech press and now through mainstream media.

Originally released in 2020, GPT-3 was developed by OpenAI, an artificial intelligence research lab in San Francisco. It’s a pre-trained universal language model that uses deep learning transformers to generate human-like text—and it’s pretty good at it. 

ChatGPT and the GPT-3 API family have been used to write poetry and fiction, code websites, respond to customer reviews, suggest better grammar, translate languages, generate dialogue, find tax deductions, and automate A/B testing. The use cases are seemingly endless and its results are surprisingly high-quality. 

While this large language model can do some incredible, useful things, it still has its flaws. We’ll cover that later, though. First, let’s cover the basics like “What is GPT-3?” and “How does GPT-3 work?”

What is GPT-3?

GPT-3 stands for Generative Pre-trained Transformer 3, the third iteration of OpenAI’s GPT architecture. It’s a transformer-based language model that can generate human-like text. 

This deep learning model has 175 billion parameters, making it one of the largest language models in production today. Because of its size, GPT-3 is much better than earlier models at producing long-form and specialized text.

Is GPT-3 open source?

GPT-3 is not open source. OpenAI reasons that GPT-3 could be misused and therefore shouldn’t be available for open-source use. Additionally, Microsoft acquired an exclusive license to GPT-3 in September 2020.

Microsoft, which backed OpenAI with a $1 billion investment, is the only entity aside from OpenAI with access to the underlying GPT-3 model. Everyone else can still use GPT-3 through the public API and through ChatGPT during its testing phase.

The closest open-source alternative to GPT-3 is GPT-JT, released by Together in November 2022. It was trained using a decentralized approach on 3.53 billion tokens, and Together claims it can outperform other models. You can find GPT-JT on Hugging Face.

Why was GPT-3 created?

Before GPT-3, the largest trained language model was Microsoft’s Turing NLG, which had 17 billion parameters. Previously, most generative language models could only produce simple sentences or yes-or-no answers.

What is GPT-3.5?

GPT-3 was released in 2020. GPT-4’s anticipated release could happen as soon as 2023. In the meantime, OpenAI quietly released GPT-3.5—which has a better grasp of relationships between words, parts of words, and sentences—with no formal announcement in November 2022. 

After complaints that GPT-3 was generating toxic and biased text, OpenAI began experimenting. GPT-3.5 is similar to InstructGPT, a version of GPT-3 that was re-trained to better align with users’ intentions.

OpenAI trained GPT-3.5 on a corpus of code and text sourced through a crawl of open web content published through 2021, so its knowledge of events and developments after 2021 is limited. GPT-3.5 is only available through OpenAI’s APIs and ChatGPT.

How does GPT-3 work?

When a user inputs text, known as a prompt, the model acts as a text predictor: it analyzes the prompt and generates the text most likely to follow it.

GPT-3 draws on patterns stored in its billions of parameters, learned from over 570GB of internet-sourced text, to predict the most useful output. It predicts the next appropriate token in a given sequence of tokens, even for sequences it hasn’t been trained on.
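To make "predicting the next token" concrete, here is a minimal sketch. The scores (`toy_logits`) and the function names are made up for illustration; a real model computes scores like these over tens of thousands of tokens, and usually samples from the probabilities rather than always taking the top one.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

def predict_next_token(logits):
    """Greedy decoding: pick the single most probable next token."""
    probs = softmax(logits)
    return max(probs, key=probs.get), probs

# Hypothetical scores a model might assign after the prompt "The cat sat on the".
toy_logits = {"mat": 4.0, "sofa": 2.5, "moon": 0.5}
token, probs = predict_next_token(toy_logits)
print(token, probs)
```

Generating a whole reply is this step in a loop: append the chosen token to the prompt, score again, and repeat.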

GPT-3 is a meta learner, meaning it’s been taught to learn. To an extent, it can pick up a new task from instructions or a handful of examples in the prompt, much as a human would. The original baseline GPT-3 isn’t trained for any specific task; it works out the task from the prompt. This makes it a powerfully versatile model.
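In practice, this kind of learning usually happens inside the prompt itself. The sketch below builds a "few-shot" prompt: a short instruction, a couple of worked examples, and a new input for the model to complete. The `Input:`/`Output:` layout and the function name are illustrative choices, not a format OpenAI requires.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # Leave the final output blank so the model fills it in.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("good morning", "bonjour")],
    "thank you",
)
print(prompt)
```

No weights change here. The model infers the task from the pattern in the examples and continues it.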

Let’s dive deeper. First, GPT-3 goes through unsupervised training on its massive, internet-harvested dataset. Then, it’s ready to accept text inputs, a.k.a. prompts. It splits the input into tokens and converts each one into a vector, a list of numbers that represents it. Those vectors pass through a stack of transformer decoder layers, 96 of them to be exact, which compute a prediction for the next token. Finally, the predictions are converted back into words.
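The flow from words to vectors, through the layer stack, and back to a word can be sketched in a few lines. Everything here is a toy stand-in: the five-word vocabulary, the random 4-number embeddings, and especially `toy_layer`, which just averages earlier positions where a real decoder layer would use attention and a feed-forward network. Only the shape of the pipeline and the layer count of 96 come from the description above.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]  # hypothetical tiny vocabulary
EMBED_DIM = 4                               # real models use thousands of dimensions
NUM_LAYERS = 96                             # GPT-3's decoder layer count

rng = random.Random(0)
EMBEDDINGS = {w: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in VOCAB}

def toy_layer(vectors):
    """Stand-in for a decoder layer: blend each position with the average
    of itself and earlier positions (never later ones)."""
    mixed = []
    for i in range(len(vectors)):
        context = vectors[: i + 1]
        avg = [sum(col) / len(context) for col in zip(*context)]
        mixed.append([0.5 * v + 0.5 * a for v, a in zip(vectors[i], avg)])
    return mixed

def predict_next_word(prompt_words):
    vectors = [EMBEDDINGS[w] for w in prompt_words]  # words -> vectors
    for _ in range(NUM_LAYERS):                      # pass through the stack
        vectors = toy_layer(vectors)
    final = vectors[-1]                              # last position drives the prediction
    # Score every vocabulary word by its dot product with the final vector.
    scores = {w: sum(a * b for a, b in zip(final, e)) for w, e in EMBEDDINGS.items()}
    return max(scores, key=scores.get)               # vector -> word

print(predict_next_word(["the", "cat", "sat"]))
```

Because the embeddings are random and nothing is trained, the predicted word is meaningless. The point is only the sequence of steps.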

How is GPT-3 trained? 

OpenAI used internet data to train GPT-3 to generate any type of text. It was trained in a generative, unsupervised manner: rather than learning from human-labeled examples, it learned by predicting what comes next in raw text. In simple terms, it was taught to turn prompts into large amounts of appropriate text without supervision.
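One way to picture training without human labels: the text supplies its own answers. The sketch below slides a window over a sentence so that each chunk of context is paired with the token that actually follows it, then scores a prediction with cross-entropy, the standard loss for this kind of objective. The function names, the whitespace "tokenizer", and the two-token context are simplifications for illustration.

```python
import math

def make_training_pairs(tokens, context_size):
    """Pair each window of context with the token that follows it."""
    pairs = []
    for i in range(context_size, len(tokens)):
        pairs.append((tokens[i - context_size:i], tokens[i]))
    return pairs

def cross_entropy(predicted_probs, target):
    """Low when the model gives the true next token a high probability."""
    return -math.log(predicted_probs[target])

tokens = "the cat sat on the mat".split()
pairs = make_training_pairs(tokens, context_size=2)
for context, target in pairs:
    print(context, "->", target)

# A confident correct guess is penalized less than an unsure one.
confident = cross_entropy({"sat": 0.9, "ran": 0.1}, "sat")
unsure = cross_entropy({"sat": 0.2, "ran": 0.8}, "sat")
print(confident < unsure)
```

Training nudges the parameters to lower this loss across the whole dataset, which is why no one has to label anything by hand.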

Is GPT-3 conscious?

While OpenAI designed GPT-3 to sound like a human and to learn, it’s not sentient. GPT-3 can reason logically, though not as well as the average adult. It is, however, about as close to human-like verbal output as any model yet released. Some people did wonder if GPT-3 was self-aware. In reality, unlike its predecessors, the model was trained to have a degree of common sense.

How to use GPT-3

Getting started with GPT-3 is simple. Head to OpenAI’s ChatGPT to try it out. If you want more control, you can experiment in the playground. Developers can incorporate GPT-3 into their applications through OpenAI’s API.
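For developers, a call to the API is an authenticated HTTP request. The sketch below only builds a request for OpenAI’s text completions endpoint and does not send it. The API key is a placeholder, and the model name `text-davinci-003` is an assumption based on what was offered around the time of writing, so check OpenAI’s current documentation for available models before relying on it.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/completions"

def build_completion_request(api_key, model, prompt, max_tokens=64):
    """Build (but do not send) a POST request for a text completion."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

request = build_completion_request(
    "sk-placeholder",      # placeholder, not a real key
    "text-davinci-003",    # assumed model name; may no longer be available
    "Write a haiku about transformers.",
)
print(request.get_method(), request.full_url)
# With a real key, urllib.request.urlopen(request) would send it and return JSON.
```

OpenAI also publishes official client libraries that wrap this request, which is usually the more convenient route in a real application.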

What can GPT-3 do?

GPT-3 is used for generating realistic human text. It’s been used to write articles, essays, stories, poetry, news reports, dialogue, humor, advertisements, and social media copy. It can even philosophize, albeit badly. GPT-3 can mimic the styles of specific writers, generate memes, write recipes, and produce comic strips. 

GPT-3 is capable of generating code, too. It’s written working code, created mock-up websites, and designed UI prototypes from descriptions just a few sentences long. Users can also create plots, charts, and Excel functions with GPT-3.

In addition to producing convincingly human text and code, it can generate automated conversations. GPT-3, as implemented in ChatGPT, responds to any text that a user types in with a new piece of contextual text. This feature is often implemented to add realistic dialogue to games and provide customer service through chatbots. 

GPT-3’s predecessors, aptly named GPT-1 and GPT-2, were criticized for not actually “knowing” anything and not having common sense. This generation of GPT was trained to follow a sequence of events and predict what’s next. It has at least some common sense, the ability to learn, and even the capability of philosophizing. 

Examples of GPT-3

Here are the top projects and applications powered by GPT-3 that you need to know. 

ChatGPT

From OpenAI, ChatGPT is a chatbot. It answers questions, remembers what a user said earlier in a conversation, answers follow-up questions, admits mistakes, challenges incorrect premises, and rejects inappropriate requests. 

Since it’s based on GPT-3, it’s pretrained. OpenAI has also fine-tuned it with supervised and reinforcement learning techniques. Its research release in November 2022 sparked headlines in mainstream news outlets and tech sources alike.

Copilot

Available as an extension for Visual Studio Code, Copilot autocompletes code snippets. It was released by GitHub and OpenAI and is based on Codex, a descendant of GPT-3. Both arrived in 2021: Copilot launched as a technical preview that June, and OpenAI opened Codex up through its API in August.

Debuild

Debuild creates code for web apps on demand using GPT-3 as its base model. It asks for a description of the user’s app and use cases. Then, it generates React components and SQL code and helps assemble interfaces visually. Currently, Debuild is waitlisted.

A/BTesting  

A/BTesting is an automated A/B testing provider. It uses GPT-3 to generate multiple versions of the title, copy, and call to action, then tests them for you on its own using a JavaScript snippet or plugin. When a test reaches statistical significance, it mixes the variants up, mutates them, and runs another batch to ensure its customers have the highest-converting copy possible.
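"Statistical significance" has a concrete meaning here. One common way to check it for two versions of a page is a two-proportion z-test, sketched below. This is a generic textbook method, not necessarily what A/BTesting runs internally, and the visitor and conversion counts are invented for the example.

```python
import math

def two_proportion_z_test(conv_a, visits_a, conv_b, visits_b):
    """Return (z, two-sided p-value) comparing two conversion rates."""
    rate_a = conv_a / visits_a
    rate_b = conv_b / visits_b
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (rate_b - rate_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical results: version A converts 100 of 1,000 visitors, version B 150 of 1,000.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(round(z, 2), p_value < 0.05)
```

A p-value below a chosen threshold (0.05 is a common convention) suggests the gap between the two versions is unlikely to be chance, which is the signal to keep the winner and generate the next batch of variants.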

Replier

Replier responds to customer reviews automatically. It learns from previous review responses to create tailored, unique answers. It uses GPT-3 and then cleans the output with a monitoring system that detects bad behaviors and improves its replies.

Jasper

Jasper focuses on content writing, primarily blog posts. The service offers templates, art, content snippets in 25 languages, emails, reports, and stories. They also have a chatbot feature to help with brainstorming. 

Lex

Lex, created by Every, is an AI-powered word processor. It allows writers to generate essays, articles, stories, and optimized headers with help from GPT-3. While Lex can write full articles, copy, and more, it differs from other AI copy generators by offering a standalone word processor that can also complete a partially written paragraph. Lex is in public beta with a waitlist. 

Duolingo   

Duolingo, the gamified language learning app, uses GPT-3 APIs to provide grammar suggestions. Currently, they’re only using it for French, but this could lead to the other 35 languages they offer in the future.

Keeper Tax

Keeper Tax helps users file their taxes and find more deductions. They market to 1099 contractors and freelancers. GPT-3 is used at Keeper Tax to find tax-deductible expenses in bank statements. 

Quickchat

Another chatbot, Quickchat is a multilingual AI assistant that can automate customer support, online applications, and knowledge base searches. Users can upload product descriptions, FAQs, and internal documentation as training information, and the widget then uses it to hold automated live chat conversations with their customers.

Algolia

Algolia is a vertical search engine offering recommendations and suggested searches for websites, mobile, and voice applications. Developers can use Algolia to implement site-specific search, digital content discovery, enterprise search, SaaS application search, customized content and product discovery, and more. 

Lexion

Lexion started as a contract management system. Now, they’re using GPT-3 to help lawyers with their contracts. Their Microsoft Word plugin suggests edits and writes text summaries. Currently, the plugin is only intended to assist lawyers and improve efficiency, not to completely write contracts on its own. 

GPT-3 is imperfect, but it is revolutionizing artificial intelligence.

GPT-3 can power a variety of tools and services in a way that AI couldn’t before. It can speed up professional writing, software engineering, and customer service tasks. It’s making editing, language learning, and coding tools more interactive and accessible to their users. The extent to which GPT-3 has already expanded beyond its predecessors is promising, and it implies an exciting future for GPT-4 and beyond. 

Despite its usefulness, debates over the power and influence of generative AI are already flaring up. Samples have shown GPT-3 can sometimes regurgitate some of the racist, sexist points it learned from its internet training data; some are justifiably wary of the toxic language that occasionally arises in its automated answers. 

OpenAI has adjusted the way GPT-3 learns after instances of lies, sexism, and racism were found in its answers. While the newest version, GPT-3.5, contains fixes, it’s still imperfect. GPT-3 has quickly become a provocative topic that will continue to change as new versions are released and new use cases are found. 

Will it replace programmers, lawyers, and writers? No. So far, GPT-3 can’t produce completely error-free or convincing work independently. It can, however, be a supportive tool for people in these professions and many others.

Top comments (5)

Chris Gustin

I’ve been blown away by how powerful GPT is as a coding tool. I spun my wheels for an hour working through an app idea the other night, made some progress, and was about to turn in for the night. On a whim, I pulled up GPT, gave it the specifics of the project, and asked it for the code. 15 minutes later, I had a working version of my idea, with features and implementations I hadn’t considered. The code was rough in spots and it took a little massaging to get it working, but I couldn’t believe how much faster it was.

To me, AI won’t replace programmers anytime soon, but I do think it’s possible that programmers who embrace AI as part of their toolset will eventually push out the programmers who don’t embrace it, the same as has happened to countless jobs throughout history (e.g. the shift from hand tools to electric tools).

Paul Riviera

Great summary. It’s worth noting that GPT-3.5 is now available through Azure OpenAI Services, making it more easily accessible to Azure developers: azure.microsoft.com/en-us/blog/gen...

Oskar • Edited

OpenAI reasons that GPT-3 could be misused and therefore shouldn’t be available for open-source use. Additionally, Microsoft acquired an exclusive license to GPT-3 in September 2020.

Either you are concerned about misuse of a technology or you sell it to a corporation like Microsoft.

It's a XOR gate.

You don't sell it to Microsoft IF you are concerned about misuse.
That's a semantic error that should be easily caught by your semantic analysis step in compilation of the article ;)

Bernard Bado

Great breakdown of GPT model!

But I'm a bit sad my AI code-review assistant didn't make the list.

Juan F Gonzalez

Now we gotta get ready for GPT-4.