TL;DR:
Did you know that the Gemini API can be accessed from Google in 2 different places?!? That may surprise some people. Of course they provide an API QuickStart sample from each platform to help you get started. One is a multimodal example dumped into your lap without sufficient explanation, and the other is a giant Jupyter Notebook requiring time and effort to digest. Can we get a better "Hello World!" sample that developers can grok with little effort and be able to tweak and quickly drop into a web app or mobile backend? Not only that, but can someone show us how to access the Gemini API from both places? Don't worry... I got you.
Sep 2024 update: Updated code samples from Gemini 1.0 Pro (gemini-pro) to 1.5 Flash (gemini-1.5-flash) as the LLM model.
Introduction
Are you a developer interested in using Google APIs? You're in the right place as this blog is dedicated to that craft from Python and sometimes Node.js. Today's topic covers Google's answer to ChatGPT: Gemini, Gemini, Gemini. That is, Gemini the app (formerly Bard), Gemini the ML model(s), and Gemini the API. As this is a developer blog, no surprise the focus is on the API. However, Google provides two different platforms to access that API: Google AI and GCP Vertex AI, each with its own QuickStart.
Which QuickStart should you use? Neither... at least not initially.
Motivation
While you can get started with either QuickStart, it can be confusing to developers who want to learn it quickly with the least amount of friction—that's the purpose of a "Hello World!" sample. At this time, a single canonical example isn't available for developers.
Instead, a generic web search leads to documentation and multiple samples from different Google teams hosted on different Google domains, with code written by different Googlers. If you just want to dive in, you shouldn't have to choose between examples first. Let's look at each though.
The GCP Vertex AI Gemini API Python QuickStart features sample code that is lacking. The example is short and somewhat digestible but no explanation is offered in the tutorial. Furthermore, why is it multimodal whereas a text-only demo suffices? I guess you're supposed to just understand all of it without any help.
There is also a Google AI Gemini API Python QuickStart and corresponding code sample which are just as sparse and lacking in detail as the Vertex AI sample, but at least they're consistent and short. The original QuickStart was a bit more overwhelming to say the least. That code sample was/is nearly the opposite of the others: it's a huge Notebook that's got tons of explanation and the kitchen sink. It attempts to show you how to:
- List available models
- Recognize prompt safety concerns & multiple candidate results
- Generate text responses
- Understand embeddings & vector databases
- Send multimodal requests
- Do chat conversations
- Understand context windows & token counts
- Prepare for advanced use cases
That's too much. At this point, you're either knee-deep and stuck, or worse, lost. Notebooks are great tools for learning, demonstration, and sharing purposes, but I can't easily take chunks of one, including its IPython and Colab dependencies, and reuse them in a web app, much less differentiate between Notebook code and Gemini API code.
The purpose of a "QuickStart" tutorial is to give developers working code they can get running in 5 minutes. This is certainly not the case with the Notebook sample. You could possibly get the others running in 5 minutes if you're already a GCP user or familiar with AI Studio, but chances are, you're new to both, so you'll have to learn on your own. The bottom line is that none of these three are optimal "Hello World!" samples to get started with.
Cost & availability of Gemini API: Before jumping into the samples, a quick word on cost & availability. TL;DR: Google AI has a free tier for now while Vertex AI is pay-per-use, and both are available in most regions globally except for the EU.
Google AI: Google AI offers free access to the Gemini API as well as a paid tier. For details see its pricing page. For today's purposes, you should be able to run the samples in this post without cost as long as you stay within the free limits (even once paid plans are in-place). At the time of this writing, the limit is one request per second (1 QPS). See the availability regions page to determine where the API can be used.
GCP Vertex AI: On the other hand, GCP is "pay-per-use," meaning you pay a little bit each time you use a service, and this includes Vertex AI which also has a pricing page. This means that using Vertex AI is not free. (Other GCP products do have a daily or monthly free tier of usage.) See where you can access Generative AI on Vertex AI from its available regions page.
New users to GCP can access the Free Trial, which is $300USD good for 90 days. It is offered only once, and the clock starts ticking immediately, so don't activate it until you've planned out some workloads to run on multiple platform services to make the most of it. Independent of the Free Trial, some GCP products offer an "Always Free" tier of service, meaning a monthly (or daily) quota you can consume without cost before they start to incur billing. Read more about both on the GCP free programs page.
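To stay under a request-rate limit like the 1 QPS free-tier figure mentioned above, you can space out your calls on the client side. Here's a minimal sketch, assuming a single-threaded script; the Throttle class and its parameter names are hypothetical and not part of any Google package, and the actual limit may differ from what's shown here, so check the pricing page.

```python
import time

class Throttle:
    """Block so that consecutive wait() calls are at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval   # 1.0 second matches a 1 QPS limit
        self._clock = clock                # injectable for testing
        self._sleep = sleep                # injectable for testing
        self._last = None                  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)     # pause only for the time still owed
                now = self._clock()
        self._last = now
```

Call throttle.wait() right before each request you send in a loop; the first call returns immediately and later ones pause only as long as needed.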
A better "Hello World!" sample app
As a developer, I don't need to know everything while getting up-to-speed. Just show me the fastest way to code the API, a basic, self-explanatory example I can modify and drop into anything I already have. Also, show me the differences between calling the Gemini API from both Google AI as well as GCP Vertex AI. All of this points to me wanting a better "Hello World!", so I built one... well, four. ;-)
Python
As you know, I generally provide both Python 2 & 3-compatible examples to help 2.x users migrate to 3.x, but the GenAI package from Google isn't available for Python 2, so everything here and in future posts covering the Gemini API is Python 3-only. Let's start with the prerequisites:
- Install the Google GenAI Python package: pip install -U pip google-generativeai (or pip3)
- Create an API key
- Save API key as a string to settings.py as API_KEY = 'YOUR_API_KEY_HERE'
⚠️ WARNING: Keep API keys secure
Storing API keys in files (or hard-coding them in actual code) is for prototyping and learning purposes only. When going to production, put them in environment variables or in a secrets manager. Files like settings.py or .env containing API keys are susceptible to leaks. Under no circumstances should you upload files like those to any public or private repo, have sensitive data like that in Terraform config files, add such files to Docker layers, etc., as once your API key leaks, everyone in the world can use it.
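As a small step beyond settings.py, here's a minimal sketch that reads the key from an environment variable and fails loudly when it's missing. The variable name API_KEY mirrors the one used in this post; the helper name load_api_key is hypothetical and not part of any Google package.

```python
import os

def load_api_key(env=None, var='API_KEY'):
    """Return the API key stored in `var`, or raise if it is unset or blank.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    key = (env.get(var) or '').strip()
    if not key:
        raise RuntimeError('%s is not set; export it before running' % var)
    return key
```

With something like this in place, the sample could call genai.configure(api_key=load_api_key()) instead of importing API_KEY from settings.py.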
Now for the code sample which you can find in the repo:
import google.generativeai as genai
from settings import API_KEY
PROMPT = 'Describe a cat in a few sentences'
MODEL = 'gemini-1.5-flash'
print('** GenAI text: %r model & prompt %r\n' % (MODEL, PROMPT))
genai.configure(api_key=API_KEY)
model = genai.GenerativeModel(MODEL)
response = model.generate_content(PROMPT)
print(response.text)
The first few lines import the Google AI GenAI package and your API_KEY. That's followed by constants for the LLM prompt and the model, in this case, Gemini 1.5 Flash. The model and prompt are then displayed.
The core part of the application configures the use of your API key, creates a model object to use, sends the prompt over to Gemini, stores the generated response from the LLM, and displays it to the end-user. Now that's a reasonable "Hello World!" for the Gemini API in under 10 lines of code. The API's default temperature is set to a value where running the script results in a slightly different LLM response each time... this time the output was:
$ python3 gemtxt-simple-gai.py
** GenAI text: 'gemini-1.5-flash' model & prompt 'Describe a cat in
a few sentences'
A cat is a curious, playful, and affectionate creature, with a
sleek body, soft fur, and sharp claws. It has a distinctive
face with two piercing eyes, a small nose, and whiskers that
twitch constantly. Cats are agile and graceful, and they love
to explore their surroundings, climb trees, and chase after
anything that moves.
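If you'd rather have less variation between runs than the default temperature gives you, the model object accepts generation settings. Here's a minimal sketch, assuming the generation_config argument of GenerativeModel accepts a plain dict with temperature and max_output_tokens keys; the helper names and the default values are my own choices for illustration, not recommendations from the documentation.

```python
def build_generation_config(temperature=0.0, max_output_tokens=256):
    """Return generation settings as a plain dict; lower temperature means less variation."""
    if temperature < 0:
        raise ValueError('temperature must be non-negative')
    return {'temperature': temperature, 'max_output_tokens': max_output_tokens}

def generate(prompt, model_name, api_key, config):
    """Send one prompt to the Gemini API (Google AI) using the given settings."""
    import google.generativeai as genai   # imported lazily so the helper above has no dependency
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name, generation_config=config)
    return model.generate_content(prompt).text
```

Calling generate(PROMPT, MODEL, API_KEY, build_generation_config(0.0)) should give more repeatable output than the default, though identical responses still aren't guaranteed.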
Now let's repeat this exercise with JavaScript.
Node.js (JavaScript)
Node.js has similar prerequisites to perform first:
- Install the Dotenv and Google GenAI NPM packages: npm i dotenv @google/generative-ai
- Create an API key
- Save API key as a string to .env as API_KEY='YOUR_API_KEY_HERE'
Now for the code sample which you can find in the repo:
require('dotenv').config();
const { GoogleGenerativeAI } = require("@google/generative-ai");
const PROMPT = 'Describe a cat in a few sentences';
const MODEL = 'gemini-1.5-flash';
console.log(`** GenAI text: '${MODEL}' model & prompt '${PROMPT}'\n`);
async function main() {
    const genAI = new GoogleGenerativeAI(process.env.API_KEY);
    const model = genAI.getGenerativeModel({model: MODEL});
    const result = await model.generateContent(PROMPT);
    console.log(await result.response.text());
}
main();
Like the Python "Hello World!", import the Google GenAI package and use dotenv to get your API_KEY at the top, then set constants for the LLM prompt and model, in this case, Gemini 1.5 Flash. The console.log() call displays the model and prompt.
The core part of the application configures the use of your API key, creates a model object to use, sends your prompt over, stores the generated response from the model to result, and displays it to the screen to wrap up. While the JavaScript version isn't less than 10 lines of code, it's still short enough to digest in its entirety. Here's the output from running this Node.js version of the script:
$ node gemtxt-simple-gai.js
** GenAI text: 'gemini-1.5-flash' model & prompt 'Describe a cat in
a few sentences'
A cat is a small, furry mammal that is typically kept as a pet.
They are known for their independent and aloof nature, but they
can also be affectionate and playful. Cats come in a wide variety
of colors and patterns, and they can have short, medium, or long
hair. They are carnivores and their diet consists mainly of meat.
Dotenv now optional
You can remove explicit installation and configuration of dotenv in the most recent Node.js releases (20.6.0+). Be sure to point to the .env file when executing your script if you take the relevant lines out:
$ node --env-file=.env gemtxt-simple-gai.js
This also applies to the modular version below.
If you're looking for a more modern JavaScript ECMAScript module, here's the equivalent .mjs file, also available in the repo:
import dotenv from "dotenv"
import { GoogleGenerativeAI } from '@google/generative-ai';
dotenv.config()
const PROMPT = 'Describe a cat in a few sentences';
const MODEL = 'gemini-1.5-flash';
console.log(`** GenAI text: '${MODEL}' model & prompt '${PROMPT}'\n`);
async function main() {
    const genAI = new GoogleGenerativeAI(process.env.API_KEY);
    const model = genAI.getGenerativeModel({model: MODEL});
    const result = await model.generateContent(PROMPT);
    console.log(await result.response.text());
}
main();
Not going to run this one, but you get the idea. Like the Python version, you can tweak and drop something like this into your Node.js app fairly quickly to add the power of GenAI. Both samples use the Gemini API from Google AI, however as mentioned earlier, you can also use the API from the GCP Vertex AI platform.
GCP Vertex AI
With the option of using GCP, you now have to consider where you should use the Gemini API from. If you're new to Gemini, AI/ML, GenAI/LLMs, not currently a GCP customer, or otherwise exploring, I recommend you stick with Google AI to access the Gemini API as it's the most straightforward for experimentation and exploration.
It's also an easy answer if you're currently a GCP customer. Accessing the Gemini API via Vertex AI is probably your best bet because you're already in an environment you're familiar with and using other GCP services anyway. You can read more about the differences accessing the Gemini API between Google AI and Vertex AI in the documentation.
The Vertex AI Gemini API "Hello World!" sample can be executed from your environment, on a GCP compute platform, or from Cloud Shell. Whichever you choose, perform these prerequisites first:
- Create a new GCP project from the console or with gcloud projects create . . .; or reuse an existing project
- Enable the Vertex AI API from the console or with gcloud services enable aiplatform.googleapis.com if you haven't already
  - You may be prompted for billing
- Install the Vertex AI Python SDK: pip install -U pip google-cloud-aiplatform (or pip3)
- Acquire new user-based Application Default Credentials (ADC) and set up in your environment with gcloud auth application-default login
You may notice there's no API key acquisition here. Most GCP services use service accounts for authentication (along with GCP Identity and Access Management [IAM] for authorization) vs. API keys which are authorization-only credentials. Learn more about API key access to Google APIs per an earlier series.
In today's example, the ADC step above sets up use of your user credentials instead of service accounts, very similar to OAuth client IDs credentials as covered in an earlier series on OAuth client ID access to GWS APIs. When moving to production, you will switch to a service account instead because you (a human user) won't be available for authentication requests. (A post series on service accounts is planned.)
Here's the Vertex AI version of the "Hello World!" sample you can grab from the repo:
import vertexai
import vertexai.preview.generative_models as genai
PROMPT = 'Describe a cat in a few sentences'
MODEL = 'gemini-1.5-flash'
print('** GenAI text: %r model & prompt %r\n' % (MODEL, PROMPT))
vertexai.init()
model = genai.GenerativeModel(MODEL)
response = model.generate_content(PROMPT)
print(response.text)
Does it look familiar? It should because it's a near-duplicate of the Google AI version, as illustrated when placed side-by-side:
There are a pair of key differences, the easy one being initialization while the other is what gets imported:
- The Google AI version uses an API key during configuration while Vertex AI uses the user credentials from the ADC already set up earlier with the gcloud command.
- Instead of the Google AI package and API key, the Vertex AI SDK is imported followed by the import of the generative models, currently in preview. (The code here and in the repo will be updated once it becomes generally available.)
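To make those two differences concrete, here's a minimal sketch that folds both versions into one function so the only things that change are the imports and the initialization. The make_model name and the 'googleai'/'vertexai' labels are hypothetical and just for illustration; the calls inside each branch are the same ones used in the samples in this post.

```python
def make_model(platform, model_name='gemini-1.5-flash', api_key=None):
    """Return a GenerativeModel from the chosen platform ('googleai' or 'vertexai')."""
    if platform == 'googleai':
        import google.generativeai as genai             # Google AI package
        genai.configure(api_key=api_key)                 # authenticates with an API key
        return genai.GenerativeModel(model_name)
    if platform == 'vertexai':
        import vertexai                                  # Vertex AI SDK
        import vertexai.preview.generative_models as genai
        vertexai.init()                                  # relies on ADC set up via gcloud
        return genai.GenerativeModel(model_name)
    raise ValueError('unknown platform: %r' % (platform,))
```

Either way, the object you get back supports the same model.generate_content(PROMPT) call, which is why the two scripts look nearly identical.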
As expected, the output is similar to the Google AI samples. Here is an execution of this script from Cloud Shell as an alternative to my laptop:
USER@cloudshell:~/genai (PROJECT_ID)$ python3 gemtxt-simple-gcp.py
** GenAI text: 'gemini-1.5-flash' model & prompt 'Describe a cat
in a few sentences'
A sleek, furry creature with mesmerizing eyes, a cat is a master
of independence and grace. With a playful curiosity and an air
of aloofness, they weave through life with a quiet confidence,
always ready for a good nap or a sudden burst of energy.
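The sample calls vertexai.init() with no arguments, which relies on your environment already knowing which project to use (as it does in Cloud Shell). If you run elsewhere and want to be explicit, vertexai.init() also takes project and location arguments. Here's a minimal sketch, assuming you keep those values in environment variables; the GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_REGION names and the us-central1 default are assumptions on my part, so substitute whatever your setup uses.

```python
import os

def vertex_init_kwargs(env=None, default_location='us-central1'):
    """Build keyword arguments for vertexai.init() from environment variables.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    kwargs = {'location': env.get('GOOGLE_CLOUD_REGION', default_location)}
    project = env.get('GOOGLE_CLOUD_PROJECT')
    if project:                       # leave it out so the SDK can fall back to ADC
        kwargs['project'] = project
    return kwargs

def init_vertex(env=None):
    """Initialize the Vertex AI SDK with explicit settings when available."""
    import vertexai                   # imported lazily; needs google-cloud-aiplatform
    vertexai.init(**vertex_init_kwargs(env))
```

Swapping the bare vertexai.init() for init_vertex() keeps the rest of the script unchanged.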
Summary
Developers are eager to jump into the world of AI/ML, especially GenAI & LLMs, and accessing Google's Gemini models via API is part of that picture. While the Gemini API is accessible from Google AI and the GCP Vertex AI platform, their QuickStart samples either don't explain enough or are too large to digest in one sitting. This post attempts to bridge the gap by providing a better "Hello World!" sample that introduces the API with as little code as possible.
With a more user-friendly working sample, you can start exploring on your own, like changing the prompt or sending multimodal input (Gemini 1.5 Flash accepts images as well as text). Tweak it however you like then drop it into your web app or mobile backend to add generative AI to your code in record time. You can also continue to explore Gemini by going back to the official QuickStarts to further your knowledge. Regardless of what you do, I hope these Python and Node.js samples help kick-start your Gemini API journey. When you're ready, the next post shows you various ways of "upgrading" the baseline Python sample running on Google AI, introducing you to additional features of the Gemini API.
If you found an error in this post, bug in the code, or have a topic you want me to cover in the future, drop a note in the comments below or file an issue at the repo. Thanks for reading, and I hope to meet you if I come through your community... see the travel calendar on my consulting page.
NEXT POST: Part 2: Gemini API 102... Next steps beyond "Hello World"
Resources
- Code samples
- Gemini API (Google AI)
- Gemini API (GCP Vertex AI)
- Gemini API (differences between both platforms)
- Other Gemini resources
WESLEY CHUN, MSCS, is a Google Developer Expert (GDE) in Google Cloud (GCP) & Google Workspace (GWS), author of Prentice Hall's bestselling "Core Python" series, co-author of "Python Web Development with Django", and has written for Linux Journal & CNET. He runs CyberWeb specializing in GCP & GWS APIs and serverless platforms, Python & App Engine migrations, and Python training & engineering. Wesley was one of the original Yahoo!Mail engineers and spent 13+ years on various Google product teams, speaking on behalf of their APIs, producing sample apps, codelabs, and videos for serverless migration and GWS developers. He holds degrees in Computer Science, Mathematics, and Music from the University of California, is a Fellow of the Python Software Foundation, and loves to travel to meet developers worldwide at conferences, user group events, and universities. Follow he/him @wescpy & his technical blog. Find this content useful? Contact CyberWeb or buy him a coffee (or tea)!