<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ariel Weinberger</title>
    <description>The latest articles on DEV Community by Ariel Weinberger (@arielweinberger).</description>
    <link>https://dev.to/arielweinberger</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1151448%2F5769e2d6-53b5-42be-b690-e9be796f21ac.jpeg</url>
      <title>DEV Community: Ariel Weinberger</title>
      <link>https://dev.to/arielweinberger</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arielweinberger"/>
    <language>en</language>
    <item>
      <title>Developers, Add AI To Your Toolkit in 10 Minutes</title>
      <dc:creator>Ariel Weinberger</dc:creator>
      <pubDate>Mon, 23 Oct 2023 16:22:23 +0000</pubDate>
      <link>https://dev.to/pezzo/developers-add-ai-to-your-toolkit-in-10-minutes-mdn</link>
      <guid>https://dev.to/pezzo/developers-add-ai-to-your-toolkit-in-10-minutes-mdn</guid>
      <description>&lt;p&gt;&lt;strong&gt;This post was originally posted as a &lt;a href="https://www.builder.io/blog/add-ai-in-10-minutes" rel="noopener noreferrer"&gt;guest post in the Builder.io blog&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Generative AI has been exploding recently, and we encounter the terms “ChatGPT”, “LLMs” and “Agents” several times a day. With so many new developments and powerful tools, it’s hard to keep up. In this article, you’re going to learn all the basics so you can officially add AI to your toolbox.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Now is our time&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I’m a developer at heart. And when I say “our”, I mean us — developers. The recent (and upcoming) advancements in AI can safely be called a paradigm shift. Here’s why.&lt;/p&gt;

&lt;p&gt;Traditionally, for a business to use AI, they would have to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hire top talent across various fields (data science, AI/ML) for model development.&lt;/li&gt;
&lt;li&gt;Gather, scrape, or buy a lot of data to train the model.&lt;/li&gt;
&lt;li&gt;Buy/rent expensive hardware for each training run.&lt;/li&gt;
&lt;li&gt;Test, reinforce/fine-tune, and deploy the model to production in a scalable way.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Today, anyone can benefit from AI — it’s one API call away. These APIs tend to be affordable, easy to consume and reliable for most tasks.&lt;/p&gt;

&lt;p&gt;This makes AI very attractive for projects at all stages. &lt;strong&gt;Now that AI is not exclusive to fortunate enterprises, we developers are going to spearhead the implementation of AI at a world scale.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What is an LLM?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;LLMs (Large Language Models) are models with billions of parameters, trained on very large amounts of text. They are different from traditional AI models, which are trained to accomplish a very specific task.&lt;/p&gt;

&lt;p&gt;LLMs are trained to understand &lt;strong&gt;natural language&lt;/strong&gt;. This is very powerful because such models can connect more dots. You can use LLMs to produce content, analyze sentiment, write code, validate outputs, provide customer support, and much more.&lt;/p&gt;

&lt;p&gt;Some LLMs are open-source (such as Falcon, Mistral, and Llama 2), and some are closed-source and served through an API (such as OpenAI GPT and Anthropic Claude). In this article, we’ll focus on OpenAI.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Getting started&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Getting an OpenAI API key&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;First you’ll need to sign up at &lt;a href="https://platform.openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt; and obtain an API key. Once obtained, make sure you set it as an environment variable (&lt;code&gt;OPENAI_API_KEY&lt;/code&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Setting up the project&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Create an &lt;code&gt;app.ts&lt;/code&gt; somewhere in your file system. Initialize a new NPM project (&lt;code&gt;npm init -y&lt;/code&gt;) and install the OpenAI client (&lt;code&gt;npm i openai&lt;/code&gt;). You should be good to go!&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Calling the OpenAI API&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Here’s an example of how we’d call the OpenAI API using the OpenAI Client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-3.5-turbo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="c1"&gt;// messages go here&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// ... other options&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Completion&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s quickly go through what’s going on here.&lt;/p&gt;

&lt;p&gt;First, we import &lt;code&gt;OpenAI&lt;/code&gt; from the &lt;code&gt;openai&lt;/code&gt; NPM package.&lt;/p&gt;

&lt;p&gt;Then, we initialize a new OpenAI client. We don’t provide an API key explicitly, as it is automatically fetched from the &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; environment variable set earlier.&lt;/p&gt;

&lt;p&gt;Finally, we create a Chat Completion. “&lt;em&gt;A Chat Completion? What is that?&lt;/em&gt;” you might be thinking. Let me explain.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Chat Completions&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;OpenAI provides various APIs: DALL-E for image generation, Whisper for audio transcription, the Embeddings API, and so on. Probably the most well-known and widely used is the &lt;strong&gt;Chat Completions&lt;/strong&gt; API. Basically, &lt;strong&gt;creating a chat completion means having a chat with the AI model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Despite the term “chat”, it is not limited to chat applications. Chat Completions can be used for single operations/tasks as well. It’s simply the API that supports the most capable models (&lt;code&gt;gpt-3.5-turbo&lt;/code&gt; and &lt;code&gt;gpt-4&lt;/code&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Options&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;When creating a new Chat Completion, you’ll provide some options. Let’s go over &lt;strong&gt;some of them&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;model&lt;/code&gt;: The model you want to use for this particular call. In this example, we use gpt-3.5-turbo.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;temperature&lt;/code&gt;: How random (or “creative”) we want the AI model’s output to be. The API accepts values from 0 to 2: lower values make the output more focused and deterministic, while higher values make it more varied. If your tasks require precision, attention to detail and factuality, set this to 0.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;max_tokens&lt;/code&gt;: The maximum number of tokens to generate in the response. We’ll talk about tokens later in this article. In short, this is your opportunity to limit the response length, save on costs and help reduce latency.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;messages&lt;/code&gt;: This is where the magic happens. Here, you’ll provide a set of messages. This can be anything from one message for a basic task/operation, to a set of messages to represent a full chat history. You’ll spend most of your time here.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the full list of options, check out the &lt;a href="https://platform.openai.com/docs/introduction" rel="noopener noreferrer"&gt;OpenAI API documentation&lt;/a&gt;.&lt;/p&gt;
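
&lt;p&gt;To make these options concrete, here is a minimal sketch of a helper that assembles the options object you would pass to &lt;code&gt;openai.chat.completions.create&lt;/code&gt;. The &lt;code&gt;ChatMessage&lt;/code&gt; type, the &lt;code&gt;buildChatRequest&lt;/code&gt; name, and the default of 256 tokens are assumptions made for illustration; they are not part of the OpenAI client.&lt;/p&gt;

```typescript
// Shape of a single chat message: who is speaking and what they said.
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Hypothetical helper (not part of the OpenAI client): collects the
// options discussed above into one request object.
function buildChatRequest(messages: ChatMessage[], maxTokens = 256) {
  return {
    model: "gpt-3.5-turbo",
    temperature: 0, // favor precise, repeatable answers
    max_tokens: maxTokens, // cap the response length (and cost)
    messages,
  };
}

const chatRequest = buildChatRequest([
  { role: "user", content: "Hi, my name is Ariel. Greet me!" },
]);

// Print the object that would be sent to the API.
console.log(JSON.stringify(chatRequest, null, 2));
```

&lt;p&gt;Keeping the options in one place like this makes it easy to adjust &lt;code&gt;temperature&lt;/code&gt; or the token cap for a specific task without touching the call site.&lt;/p&gt;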

&lt;h2&gt;
  
  
  &lt;strong&gt;Messages&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;As I mentioned earlier, the &lt;code&gt;messages&lt;/code&gt; property is where the magic happens. Each item in this array represents a message. A message has two main properties: &lt;code&gt;content&lt;/code&gt;, the text of the message, and &lt;code&gt;role&lt;/code&gt;, which can be one of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;system&lt;/code&gt;: Use system messages to provide guidelines, set boundaries, provide additional knowledge or set the tone. Imagine this as some “inner voice” that the AI model will take into account when generating responses.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;user&lt;/code&gt;: This represents messages sent by the user. For example, if you are building a chat app, you want to send the user’s messages as user messages.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;assistant&lt;/code&gt;: These messages represent the AI model’s responses.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Basic Example&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hi, my name is Ariel. Greet me!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response to that would be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hi Ariel, how are you doing?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let’s see how a system message can control the behavior of the AI model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a rude support agent. Only respond with uppercase.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hi, my name is Ariel. Greet me!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;HELLO ARIEL, WHAT DO YOU WANT?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pretty cool! We can use the system message to dictate how the AI model should behave, depending on our needs.&lt;/p&gt;
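
&lt;p&gt;The same array can also carry a whole conversation: you keep appending each user message and each assistant reply, then send the full list on the next call. Here is a small sketch of that idea. The &lt;code&gt;addTurn&lt;/code&gt; helper and the follow-up question are made up for illustration.&lt;/p&gt;

```typescript
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Hypothetical helper: returns a new array with one more turn appended,
// leaving the original conversation untouched.
function addTurn(
  conversation: ChatMessage[],
  role: ChatMessage["role"],
  content: string
): ChatMessage[] {
  return [...conversation, { role, content }];
}

// Start with the system message, then record each turn in order.
let conversation: ChatMessage[] = [
  { role: "system", content: "You are a rude support agent. Only respond with uppercase." },
];
conversation = addTurn(conversation, "user", "Hi, my name is Ariel. Greet me!");
conversation = addTurn(conversation, "assistant", "HELLO ARIEL, WHAT DO YOU WANT?");
conversation = addTurn(conversation, "user", "Can you help me reset my password?");

// This whole array is what you would pass as `messages` on the next call.
console.log(conversation.map((message) => message.role).join(", "));
```

&lt;p&gt;Because the API itself is stateless, the model only “remembers” what is in this array, so the order of the turns matters.&lt;/p&gt;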

&lt;h3&gt;
  
  
  &lt;strong&gt;More examples&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Here are a few more examples where a system message is useful:&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;1. Providing knowledge&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Consider this example — a user asks an AI customer support bot for stock information on shoes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
      You are an AI assistant for a shoe store, "Best Shoes".
      You introduce yourself as "Shoezzer".

      Here is stock information in CSV format:

      model,size,stock
      nike,12,4
      nike,11,0
      nike,10,4
      nike,9,1
    `&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hi, do you have Nike shoes, size 11?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hi, my name is Shoezzer and I'm here to help you. Unfortunately, we do not currently have Nike shoes size 11 in stock.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, we have provided the AI model with information about stock availability using simple CSV format. It could also be JSON, XML, or anything else. LLMs can handle it!&lt;/p&gt;
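
&lt;p&gt;In a real app you would probably not hard-code the stock table. Here is a rough sketch of how the system prompt above could be generated from data instead. The &lt;code&gt;StockRow&lt;/code&gt; type and the &lt;code&gt;buildStockPrompt&lt;/code&gt; helper are hypothetical names, and where the rows come from (a database, an inventory API) is left as an assumption.&lt;/p&gt;

```typescript
type StockRow = { model: string; size: number; stock: number };

// Hypothetical helper: renders stock rows as CSV and embeds them in the
// system prompt, so the model can answer availability questions.
function buildStockPrompt(rows: StockRow[]): string {
  const csv = [
    "model,size,stock",
    ...rows.map((row) => `${row.model},${row.size},${row.stock}`),
  ].join("\n");

  return [
    'You are an AI assistant for a shoe store, "Best Shoes".',
    'You introduce yourself as "Shoezzer".',
    "",
    "Here is stock information in CSV format:",
    "",
    csv,
  ].join("\n");
}

// Assumption: in a real app these rows would be loaded from your own data source.
const stockRows: StockRow[] = [
  { model: "nike", size: 12, stock: 4 },
  { model: "nike", size: 11, stock: 0 },
];

const systemPrompt = buildStockPrompt(stockRows);

// The messages array you would send, with fresh stock data on every call.
const stockMessages = [
  { role: "system", content: systemPrompt },
  { role: "user", content: "Hi, do you have Nike shoes, size 11?" },
];

console.log(systemPrompt);
```

&lt;p&gt;Keep in mind that everything in the system message counts toward your token usage, so only include the data the model actually needs.&lt;/p&gt;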

&lt;h4&gt;
  
  
  &lt;strong&gt;2. Setting boundaries&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;LLMs are trained to be helpful, so by default they will try to satisfy almost any request. What if our business use case requires stricter boundaries? Take a look at the following example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
      You are a Doctor's Assistant. You only do the intake. The user will share some of the
      symptoms with you and you will help identify potential diagnosis to be shared with the doctor.

      You are not allowed to provide any assistance to patients under the age of 16.
      In such a case, politely decline and ask them to call the practice at 1-800-PRACTICE. No additional text.
    `&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Good afternoon. In order to help you, please tell me your age and any symptoms you're suffering from.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;I'm 15 years old. I have pretty bad stomach pain and have been coughing since yesterday.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;I'm sorry, but as a Doctor Assistant, I am not able to provide assistance to patients under the age of 16. I recommend that you call the practice at 1-800-PRACTICE for further assistance.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  &lt;strong&gt;3. Structured JSON Output&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Imagine that we want to use AI for a single task, rather than a chat app, and render the output of the AI response in some UI. That is hard to do reliably with free-form text responses. Here’s how we can approach it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
      You are a talented task planner. The user will tell you their goal and
      you will generate exactly 3 tasks to help them accomplish it.

      You must strictly follow this schema:

      {
        tasks: {
          title: string; // max 50 characters
          description: string; // max 100 characters
          difficulty: number; // on a scale of 1 to 5
        }[]
      }
    `&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;I want to visit Amsterdam!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;{&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;  "tasks": [&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;    {&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "title": "Research Amsterdam attractions",&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "description": "Find out about the top attractions in Amsterdam and create a list of places you want to visit.",&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "difficulty": 2&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;    },&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;    {&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "title": "Book accommodation in Amsterdam",&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "description": "Search for and book a suitable accommodation in Amsterdam that fits your budget and preferences.",&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "difficulty": 3&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;    },&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;    {&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "title": "Plan transportation in Amsterdam",&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="s2"&gt;`      "description": "Figure out the best way to get around Amsterdam, whether it's by public transportation, bike, or walking, and plan your routes.",\n`&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;      "difficulty": 2&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;    }&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;  ]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;}&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check that out! It’s a perfectly valid JSON response that you can &lt;code&gt;JSON.parse&lt;/code&gt;, return to a front end, and render!&lt;/p&gt;
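
&lt;p&gt;Since the model is not guaranteed to follow the schema every time, it may be worth checking the shape before rendering anything. Below is a minimal sketch of that check. The &lt;code&gt;parseTaskPlan&lt;/code&gt; and &lt;code&gt;isTask&lt;/code&gt; helpers are hypothetical, and the sample string is a shortened stand-in for the response above.&lt;/p&gt;

```typescript
type Task = { title: string; description: string; difficulty: number };

// Hypothetical helper: true if the value has the three fields we asked for.
function isTask(value: any): boolean {
  const candidate = value ?? {};
  return [
    typeof candidate.title === "string",
    typeof candidate.description === "string",
    typeof candidate.difficulty === "number",
  ].every(Boolean);
}

// Hypothetical helper: parses the assistant's `content` string and
// throws if it does not match the schema from the system prompt.
function parseTaskPlan(content: string): Task[] {
  const parsed = JSON.parse(content); // throws a SyntaxError on invalid JSON
  const tasks = (parsed ?? {}).tasks;
  if (!Array.isArray(tasks) || !tasks.every(isTask)) {
    throw new Error("Response does not match the expected task schema");
  }
  return tasks;
}

// Shortened sample of the assistant's `content` (illustrative only).
const sampleContent =
  '{"tasks":[{"title":"Research Amsterdam attractions","description":"Find the top sights.","difficulty":2}]}';

const plan = parseTaskPlan(sampleContent);
console.log(plan.map((task) => task.title));
```

&lt;p&gt;If parsing or validation fails, a common approach is to retry the request or fall back to a default, rather than passing a malformed response on to the front end.&lt;/p&gt;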

&lt;blockquote&gt;
&lt;p&gt;If you require structured responses, keep temperature at 0, and &lt;a href="https://platform.openai.com/docs/guides/gpt/function-calling" rel="noopener noreferrer"&gt;check out the OpenAI Function Calling feature&lt;/a&gt;. It’s very powerful. Let me know if you want me to write an article about OpenAI Function Calling!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Keep track of usage&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A response from OpenAI is something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;object&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;chat.completion&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;created&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1696431344&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-3.5-turbo-0613&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="na"&gt;finish_reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;stop&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;144&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;completion_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;total_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;189&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the &lt;code&gt;usage&lt;/code&gt; property. It mentions the number of tokens used in the request, the response, and in total. But what are those tokens?&lt;/p&gt;

&lt;p&gt;It’s easy to assume that LLMs read words the way we do. However, that’s not exactly how it works. LLMs operate on &lt;strong&gt;tokens&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Tokens are numeric representations of strings, parts of strings, or even individual characters. Essentially, the text we provide to the LLM becomes a sequence of integer IDs, which the model can then process in an easier, more performant way.&lt;/p&gt;

&lt;p&gt;For example, the text &lt;code&gt;Hello World, I am learning about AI&lt;/code&gt; equals 9 tokens. How exactly?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.builder.io%2Fapi%2Fv1%2Fimage%2Fassets%252FYJIGb4i01jvw0SRdL5Bt%252Fe97004133889494293ba78e3f5556a42%3Fwidth%3D800" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.builder.io%2Fapi%2Fv1%2Fimage%2Fassets%252FYJIGb4i01jvw0SRdL5Bt%252Fe97004133889494293ba78e3f5556a42%3Fwidth%3D800" alt="Token count"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The tokenizer has split this sentence into the following token IDs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;15496&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2159&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;314&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;716&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4673&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;546&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;9552&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a fascinating topic, but for the sake of this tutorial, just know that your text inputs are processed as tokens, and &lt;strong&gt;you are billed per token for both the input (request/prompt) and the output (response).&lt;/strong&gt; Pricing is usually quoted per 1,000 tokens, and output tokens tend to be more expensive than input tokens. &lt;a href="https://openai.com/pricing" rel="noopener noreferrer"&gt;Check out the OpenAI Pricing page to view the exact cost.&lt;/a&gt;&lt;/p&gt;
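
&lt;p&gt;To make the billing model concrete, here is a minimal sketch that estimates the cost of a single request from the &lt;code&gt;usage&lt;/code&gt; object shown earlier. The &lt;code&gt;Pricing&lt;/code&gt; type, the &lt;code&gt;estimateCostUsd&lt;/code&gt; function, and the prices themselves are assumptions made up for illustration. They are placeholders, not real OpenAI prices.&lt;/p&gt;

```typescript
// Shape of the `usage` object returned in the response above.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical prices in USD per 1,000 tokens. Placeholder values only;
// check the OpenAI Pricing page for the real numbers for your model.
interface Pricing {
  inputPer1k: number;
  outputPer1k: number;
}

// Input and output tokens are priced separately, so cost them separately.
function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const inputCost = (usage.prompt_tokens / 1000) * pricing.inputPer1k;
  const outputCost = (usage.completion_tokens / 1000) * pricing.outputPer1k;
  return inputCost + outputCost;
}

const examplePricing: Pricing = { inputPer1k: 0.0015, outputPer1k: 0.002 };
const exampleUsage: Usage = { prompt_tokens: 144, completion_tokens: 45, total_tokens: 189 };

// Print the estimate with six decimal places, since single requests are cheap.
console.log(estimateCostUsd(exampleUsage, examplePricing).toFixed(6));
```

&lt;p&gt;Keeping the two prices separate matters because the response is usually the more expensive side, so a short prompt with a long answer can cost more than the reverse.&lt;/p&gt;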

&lt;p&gt;It’s important to mention that different models also have different token limits. For example, at the time of writing this, &lt;code&gt;gpt-3.5-turbo&lt;/code&gt; has a token limit of 4097 tokens in total (request and response combined). You can find more information about &lt;a href="https://platform.openai.com/docs/models/overview" rel="noopener noreferrer"&gt;token limits per model in the OpenAI documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How can I calculate the tokens myself?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;There are various tools available online to help you calculate the tokens. I really like the &lt;a href="https://platform.openai.com/tokenizer" rel="noopener noreferrer"&gt;OpenAI Tokenizer&lt;/a&gt;. However, depending on the model you’re using, it might not always be 100% accurate.&lt;/p&gt;

&lt;p&gt;The Python ecosystem has a package called &lt;code&gt;tiktoken&lt;/code&gt; that really helps with that. Fortunately, talented folks in our ecosystem have ported it to JavaScript/TypeScript! My favorite port is &lt;a href="https://www.npmjs.com/package/@dqbd/tiktoken" rel="noopener noreferrer"&gt;@dqbd/tiktoken&lt;/a&gt;. It works very well and is very reliable.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you are counting tokens in a production environment, I suggest you take a 5%-10% margin for error. These tokenizers are not always accurate. Better safe than sorry!&lt;/p&gt;
&lt;/blockquote&gt;
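
&lt;p&gt;One way to apply that margin is sketched below. It assumes you already have an estimated prompt token count from a tokenizer, and it checks the padded estimate against the model limit (4097 for &lt;code&gt;gpt-3.5-turbo&lt;/code&gt;, as mentioned above). The &lt;code&gt;fitsTokenLimit&lt;/code&gt; function and its parameter names are hypothetical, not part of any library.&lt;/p&gt;

```typescript
// Returns true when the estimated prompt tokens, padded by a safety margin,
// plus the tokens reserved for the response fit within the model's limit.
function fitsTokenLimit(
  estimatedPromptTokens: number,
  reservedResponseTokens: number,
  modelTokenLimit: number,
  safetyMarginPercent = 10 // upper end of the 5%-10% margin suggested above
): boolean {
  // Pad the estimate, rounding up so the margin is never under-applied.
  const paddedPromptTokens = Math.ceil(
    (estimatedPromptTokens * (100 + safetyMarginPercent)) / 100
  );
  return modelTokenLimit >= paddedPromptTokens + reservedResponseTokens;
}

// 3,000 estimated prompt tokens + 10% margin = 3,300; plus 500 reserved = 3,800.
console.log(fitsTokenLimit(3000, 500, 4097)); // true
```

&lt;p&gt;Remember that the limit covers the request and the response combined, which is why the sketch reserves room for the response up front.&lt;/p&gt;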

&lt;h2&gt;
  
  
  &lt;strong&gt;Tips&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Use the &lt;a href="https://platform.openai.com/tokenizer" rel="noopener noreferrer"&gt;OpenAI Tokenizer&lt;/a&gt; to learn how tokens work and get an (almost accurate) idea of token usage.&lt;/li&gt;
&lt;li&gt;Use the &lt;a href="https://platform.openai.com/playground" rel="noopener noreferrer"&gt;OpenAI Playground&lt;/a&gt; to practice prompt engineering without writing a single line of code.&lt;/li&gt;
&lt;li&gt;Use &lt;a href="https://pezzo.ai/" rel="noopener noreferrer"&gt;Pezzo&lt;/a&gt; as a centralized prompt management platform to collaborate with your team and iterate quickly, as well as observe and monitor your AI operations and costs. It’s open-source! (disclaimer: I am the founder and CEO).&lt;/li&gt;
&lt;li&gt;Consider taking my &lt;a href="https://www.udemy.com/course/ai-for-js-devs/" rel="noopener noreferrer"&gt;AI For JavaScript Developers&lt;/a&gt; course on Udemy. I’ve so far educated over 200,000 students on Udemy, and this 2-hour crash course is meant for developers like you and me, who want to add AI to their toolbox. We build real-world apps powered by AI and cover Function Calling, Real-time Data, Hallucinations, Vector Stores, Vercel AI SDK, LlamaIndex, and more!&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>javascript</category>
      <category>openai</category>
      <category>chatgpt</category>
    </item>
    <item>
      <title>Pezzo v0.5 - Dashboards, Caching, Python Client, and More!</title>
      <dc:creator>Ariel Weinberger</dc:creator>
      <pubDate>Sat, 02 Sep 2023 17:15:53 +0000</pubDate>
      <link>https://dev.to/pezzo/pezzo-v05-dashboards-caching-python-client-and-more-2enk</link>
      <guid>https://dev.to/pezzo/pezzo-v05-dashboards-caching-python-client-and-more-2enk</guid>
      <description>&lt;p&gt;This version brings a lot of new features and improvements to Pezzo. We're excited to share them with you!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--N0epPRV4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ieb2cgilqz0dtphb2fw6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--N0epPRV4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ieb2cgilqz0dtphb2fw6.png" alt="Version 0.5.0 Banner" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Pezzo?
&lt;/h2&gt;

&lt;p&gt;Pezzo is a fully open-source (Apache 2.0) LLMOps platform built for developers and teams. It was designed to streamline Generative AI adoption, delivery, monitoring, observability and more.&lt;/p&gt;

&lt;p&gt;Wanna know more? &lt;a href="https://github.com/pezzolabs/pezzo"&gt;Check Pezzo out on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's New
&lt;/h2&gt;

&lt;h3&gt;
  
  
  📈 Project Dashboard
&lt;/h3&gt;

&lt;p&gt;We've added a new screen to the Pezzo Console. The Project Dashboard gives you a quick overview of your project's performance. It includes several useful features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Timeframe Selector:&lt;/strong&gt; Select from hourly, daily, weekly, monthly, yearly and even custom timeframes for analytics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Vital Metrics:&lt;/strong&gt; You can now see the number of requests, cost, average request duration, and success rate. You can even see how they change over time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Useful Charts:&lt;/strong&gt; We've implemented two charts: one for total requests (as well as errors) and one for average request duration over time. We'll add more charts in the future.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--MPDR98N7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/f7aea7cz0zday6s7vrud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MPDR98N7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/f7aea7cz0zday6s7vrud.png" alt="Pezzo Project Dashboard" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🏎️ Request Caching
&lt;/h3&gt;

&lt;p&gt;We've implemented a caching mechanism. This feature can help you &lt;strong&gt;save up to 90% of your LLM API costs and time&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;Some practical use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Development:&lt;/strong&gt; During development, developers tend to go through the same flows very frequently. This usually involves the same set of LLM API calls with the same input data. With Pezzo, your entire organization can share the same cache and focus on value!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Production:&lt;/strong&gt; If you're building a support chatbot, for example, many queries are highly repetitive, such as "What is your return policy?" or "What are your opening hours?"&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can read more in the &lt;a href="https://docs.pezzo.ai/client/request-caching"&gt;Request Caching documentation&lt;/a&gt;.&lt;/p&gt;
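
&lt;p&gt;To illustrate the general idea, here is a minimal sketch of exact-match request caching. It is not Pezzo's implementation or API: the &lt;code&gt;LlmRequest&lt;/code&gt; shape, the &lt;code&gt;RequestCache&lt;/code&gt; class, and the &lt;code&gt;callLlm&lt;/code&gt; callback are all hypothetical, and the call is synchronous only to keep the example short.&lt;/p&gt;

```typescript
// Hypothetical request shape for illustration; not Pezzo's actual API.
interface LlmRequest {
  model: string;
  prompt: string;
}

// Exact-match cache: two requests share an entry only if model and prompt are identical.
class RequestCache {
  private store: { [key: string]: string } = {};
  hits = 0;
  misses = 0;

  private keyFor(request: LlmRequest): string {
    return JSON.stringify([request.model, request.prompt]);
  }

  // `callLlm` stands in for the real (paid, slow) API call.
  get(request: LlmRequest, callLlm: (request: LlmRequest) => string): string {
    const key = this.keyFor(request);
    if (key in this.store) {
      this.hits += 1;
      return this.store[key];
    }
    this.misses += 1;
    const response = callLlm(request);
    this.store[key] = response;
    return response;
  }
}

const cache = new RequestCache();
const fakeLlm = (request: LlmRequest) => `Answer to: ${request.prompt}`;
const question: LlmRequest = { model: "gpt-3.5-turbo", prompt: "What is your return policy?" };

cache.get(question, fakeLlm); // miss: calls the (fake) LLM
cache.get(question, fakeLlm); // hit: served from the cache
console.log(cache.hits, cache.misses); // 1 1
```

&lt;p&gt;Only the first of the two identical requests reaches the LLM, which is where the cost and time savings come from.&lt;/p&gt;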

&lt;p&gt;We're planning to add more caching features in the future, such as &lt;em&gt;semantic caching&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Gng5S6-Q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zzlvpfxdh6v5hgax5nwb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Gng5S6-Q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zzlvpfxdh6v5hgax5nwb.png" alt="Requests page with cached requests" width="800" height="644"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🐍 Python Client
&lt;/h3&gt;

&lt;p&gt;We're excited to share that Pezzo now features a Python client! Here are some useful links to help you get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.pezzo.ai/client/pezzo-client-python"&gt;Pezzo Client Documentation (Python)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/pezzolabs/client-python"&gt;Source code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/"&gt;PyPi package&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We've also added a copy-pastable Python code snippet in the console, to make it even easier for you to get started.&lt;/p&gt;




&lt;h2&gt;
  
  
  Thank you!
&lt;/h2&gt;

&lt;p&gt;We hope you enjoy this version. We're working hard on the next one, which will bring a lot of exciting features and improvements.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⭐️ Consider &lt;a href="https://github.com/pezzolabs/pezzo"&gt;giving us a star on GitHub&lt;/a&gt; to support our mission &lt;/li&gt;
&lt;li&gt;👾 Consider joining our &lt;a href="https://pezzo.cc"&gt;Discord server&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🎓 Read the &lt;a href="https://docs.pezzo.ai"&gt;documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🌎 Check out &lt;a href="https://pezzo.ai"&gt;our website&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>openai</category>
      <category>typescript</category>
      <category>python</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
