A quick introduction to language models

#machinelearning #chatgpt #python #ai

A language model is a computer program capable of understanding text written in natural language. This understanding lets it perform tasks such as predicting the sentiment of a text, correcting grammar, translating between languages, or generating new text. Text generation is the capability currently capturing everyone's interest.

Text generation also covers the other tasks, because a task such as translation or sentiment prediction can be phrased as text for the model to complete. The performance of a language model depends heavily on its size: in general, the larger the model, the better it performs. Language models are not new; examples include autocomplete on our phones and spam filters. However, these earlier models were basic, and each could perform only one task.

The models that are currently astonishing the world are large, hence the name Large Language Models.

Prompting

A compiler and a language model aren't so different; they both work with text. However, a compiler takes in structured text in the form of computer programs, while a language model works with unstructured text in the form of human language. The text a language model takes in is called a prompt.

The output of most language models is text. Let's work on a simple example using the following prompt:

What is the capital of Nigeria?

The language model will analyze the prompt and generate the response that best completes it. The output for this prompt is:

The capital of Nigeria is Abuja.

The output essentially completes the input prompt. We can demonstrate this by providing an incomplete prompt:

Nigeria is a ....

This will produce the output:

Nigeria is a country located in West Africa.
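
To make the idea of completion concrete, here is a minimal sketch of a toy model that extends a prompt one word at a time by always picking the word that most often followed the previous one in its training text. This is not how a large language model works internally; the three-sentence corpus and the names train_bigrams and complete are made up for illustration. It only shows the "continue the text" behaviour on a tiny scale.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it in the corpus."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers


def complete(prompt, followers, max_words=5):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        candidates = followers.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in the corpus
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = [
    "nigeria is a country located in west africa",
    "ghana is a country located in west africa",
    "abuja is the capital of nigeria",
]
model = train_bigrams(corpus)
print(complete("Nigeria is a", model))
```

The toy model can only repeat word sequences it has already seen, which is why size and training data matter so much for real models.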

OpenAI Completion API

OpenAI offers an API endpoint that allows us to interact with their language models, appropriately named the Completion API. It is an HTTP endpoint, so all we need is an HTTP client and an API key.

curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "text-davinci-003",
    "prompt": "What is the capital of Nigeria?"
  }'

I used curl as my HTTP client, stored my API key in the environment variable $OPENAI_API_KEY, and passed it as a Bearer token in the Authorization header. The response to this request should resemble the following:

{
  "id": "cmpl-7jjcu1cmQqUZy7qawoQ3rlUBgmrqh",
  "object": "text_completion",
  "created": 1691134504,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nThe capital of Nigeria is Abuja.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 10,
    "total_tokens": 17
  }
}
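
Since any HTTP client will do, the curl request can also be sketched in Python using only the standard library. The helper names build_completion_request and send are my own, and the model name is simply the one from the curl example, so substitute whichever completion model your account offers. The script builds the request and prints its method and URL; the actual network call is left commented out.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(prompt, api_key, model="text-davinci-003"):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def send(request):
    """Send the request and decode the JSON response (needs network and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Read the key from the same environment variable the curl example uses.
# The fallback value is a placeholder, not a working key.
api_key = os.environ.get("OPENAI_API_KEY", "placeholder-key")
request = build_completion_request("What is the capital of Nigeria?", api_key)
print(request.get_method(), request.full_url)

# Uncomment to actually call the API:
# print(send(request))
```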

The response contains several pieces of helpful information, but we are primarily interested in the choices field.

"choices": [
  {
    "text": "\n\nThe capital of Nigeria is Abuja.",
    "index": 0,
    "logprobs": null,
    "finish_reason": "stop"
  }
]

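The choices field contains an array of possible completions. Here it holds only one, because we did not ask for more. We can obtain our completed text from the text field of that entry; note that it begins with two newline characters, which we will usually want to strip.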
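
Pulling the completed text out of the response takes only a few lines. The sketch below parses the sample response shown earlier; the function name first_completion_text is my own, and a real client would pass in the decoded API response instead of a hard-coded string.

```python
import json

# The sample response from above, kept as a raw string so the \n escapes
# stay as JSON escapes rather than becoming literal newlines.
raw_response = r"""
{
  "id": "cmpl-7jjcu1cmQqUZy7qawoQ3rlUBgmrqh",
  "object": "text_completion",
  "created": 1691134504,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nThe capital of Nigeria is Abuja.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 7, "completion_tokens": 10, "total_tokens": 17}
}
"""


def first_completion_text(response):
    """Return the text of the first choice with surrounding whitespace removed."""
    choices = response.get("choices", [])
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["text"].strip()


response = json.loads(raw_response)
print(first_completion_text(response))
print("tokens used:", response["usage"]["total_tokens"])
```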

This was a concise introduction to language models. Their primary function is to complete the preceding text they receive. If you're keen on delving deeper into the concept of prompting, you can explore this site. Additionally, for detailed information about OpenAI's completions endpoint, take a look at their documentation.
