Writing simple web apps is one thing, but deploying them globally without having to think about servers, VMs (virtual machines), or DNS is another. That's where this post and serverless come into play. Whether you have a Gemini-powered app or agent, serverless lets you focus on the solution you're building, not what it runs on. In this post, we show you how to deploy a basic web app featuring the Gemini API to serverless on Google Cloud.
Introduction
Welcome to another installment on the blog covering Google APIs for Python (and sometimes Node.js) developers. Here, I cover stuff you won't find in Google's documentation, all while showing you how to use Google APIs from different product groups for what you build. Here are some of the categories covered so far:
We return to Gemini again in this post but focus less on the AI and more on how to more easily deploy apps to the cloud, whether AI-powered or not.
Background
Writing web apps is one thing, but deploying them globally without having to think about servers, VMs (virtual machines), DNS, and scalability is another. That's where this post and serverless come into play. Whether you have a Gemini-powered app or agent, serverless lets you focus on the solution you're building, not what it runs on or even how to run it. Is this you?
There won't be much introductory material as we'll get straight into serverless deployments. For background, note that this is a follow-up to several posts from this blog, all of which make for good reading before proceeding:
There are several large "pieces of the puzzle" to look at, and decisions to be made:
API platforms & credentials -- decide which platform to use when calling the Gemini API then get your credentials and set up your Cloud project
Languages -- decide whether you want to run Node.js or Python (should be easy); each language has several options too
GCP Serverless platforms -- decide which serverless platform you want to run, and by the way, what's serverless anyway?
Each has its own set of prerequisites which must be satisfied before running any of the sample apps.
Gemini API
Gemini is a popular, proprietary, multimodal LLM (large language model), accessed via its public chat interface or as the LLM for agents. Its API lets developers tap into its capabilities directly from web apps or mobile backends, and the former is the use case we're addressing in this post. One aspect of using the API is unexpected: Google makes the Gemini API available from two different platforms. Each has its own set of use cases as well as pricing differences. Learn about both platforms via the resources in the following table:
Since this post is mostly introductory, we'll use the Gemini API from Google AI. However, the current, unified client library lets developers transition to GCP easily, a significant improvement over when there were two distinct client libraries, one for each platform. Compare, contrast, and learn the differences between all three client libraries (two old and one new) in the second post listed above, which aims to help those migrating to Gemini 2.5.
There's no doubt that old samples live forever online and vibecoding LLMs have been trained on both old and new code, so that post serves as a PSA (public service announcement) to show readers the differences between all three libraries so they can upgrade their own applications or fix code using old libraries generated by LLMs.
Requirement(s)
In order to run the sample apps locally or deploy to the cloud, you need to complete the prerequisites:
Get credentials: Go to the GAI API keys page and select an existing API key or click "Create API key" to make a new one. Save the API key text string somewhere safe; we'll come back to it next.
Get project: The "Create API key" dialog also lets you create a new project (or import an existing one), so do that too.
Completely new? If you're completely new to GCP, learn more about developer projects.
[IMG] Google AI create API key page
[IMG] Google AI create API key dialog
Existing GCP users: Experienced GCP users may prefer to do the above from the Cloud Console:
Go to the Credentials page and create an API key or select an existing one.
[IMG] Cloud console create project page
[IMG] Cloud console create API key dialog on create credentials page
Languages
The code featured in this post is available in various versions of Node.js and Python:
Language | Version | Web framework
Node.js | ECMAscript module | Express.js
Node.js | CommonJS script | Express.js
Python | Python 3 | Flask
Python | Python 3 | FastAPI
All code samples are available in this repo. Regular readers will notice it differs from this blog's regular repo, and there's a reason for this. I spent several years on Google's GCP Serverless team representing those products. During that time, I created a repo named (Cloud) Nebulous Serverless to show developers how they can deploy the same app to all platforms without any code changes.
That repo held all sample applications I built meeting this criterion. Google archived that repo after my departure, so I'm continuing all work in my personal fork. The samples in this post also meet that criterion, so that's why they're in this repo and not the normal one. (I also removed "Cloud" from the repo name as there are now non-Cloud sample apps in the fork.)
Requirement(s)
Setup local environment:
Clone the code locally with git clone https://github.com/wescpy/nebulous-serverless.git and go to the app folder in multi/webgem with cd multi/webgem.
Node.js: Ensure you have contemporary versions of Node & NPM (recommend 18+), cd nodejs and install all packages with npm i
Python: Ensure you have a contemporary version of Python (recommend 3.9+) and cd python
For the commands below, depending on your system configuration, you will use one of (pip, pip3, python3 -m pip), but the instructions are generalized to pip.
(optional) Update pip and install uv with pip install -U pip uv
Decide whether to run the Flask or FastAPI version.
Flask: Install all packages: uv pip install -Ur requirements.txt (drop uv if you didn't install it)
FastAPI: Replace Flask files by moving them out of the fastapi folder: mv fastapi/* . then install all packages: uv pip install -Ur requirements.txt (drop uv if you didn't install it)
Set credentials: Previously, you selected a project and created an API key; now save it to one of these files, depending on which language you're using:
Node.js: Save as API_KEY = YOUR_API_KEY to .env
Python: Save as API_KEY = 'YOUR_API_KEY' to settings.py
In addition to API_KEY, these files also contain GCP_METADATA in case you migrate to GCP in the future. See this post on the additional setup and minor code updates you need to make to run your Gemini API-powered apps on Vertex AI.
Use of settings.py as a naming convention follows in Django's footsteps, but alternatively, you can save it to .env and integrate use of python-dotenv to more closely mirror Node.
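For readers who want the .env route in Python without adding a dependency, here is a minimal sketch of what reading that file could look like using only the standard library. It is not what python-dotenv does internally and not necessarily how the sample apps load their key; the function names load_env_file and get_api_key are hypothetical, and the parser only handles simple KEY = VALUE lines like the one shown above.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY = VALUE lines from a .env-style file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate spaces around "=" and optional surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env"):
    """Prefer a real environment variable, then fall back to the file."""
    return os.environ.get("API_KEY") or load_env_file(path).get("API_KEY")
```

Checking the process environment first means the same lookup keeps working if you later decide to supply the key as an environment variable instead of a file.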
Serverless platforms
Google Cloud has several serverless platforms to choose from:
Google App Engine (GAE) -- the "OG" serverless platform that launched back in 2008 & somewhat modernized in 2018; uses customized, proprietary containers, free static file edge-caching, and generous outbound networking free tier
Cloud Run (GCR) -- the latest serverless platform; OCI-compliant containers (Docker, Buildpacks, etc.)
As far as deciding between App Engine and Cloud Run, most people would say go with Cloud Run as it is App Engine's next-generation replacement. It's more flexible (has fewer restrictions), supports modern app deployment (containers), and has many new features, including "Jobs," GPUs, and many more.
If you pick Cloud Run, a running application is known as a "service," and you can have any number of services in a project. If you're new to GCR, your Cloud Run dashboard will be empty.
With Cloud Run, you also have an additional decision: Docker or not? While many developers are familiar with containers and specifying how they should be built via a Dockerfile, others prefer to avoid Docker or managing Dockerfiles, and Cloud Run supports both options. Sample Dockerfiles are provided, but you can delete them before deploying or not have them for your own apps.
⚠️ Cost: billing required (but "free?!?")
Before deploying to the cloud, a word on cost. While many Google products are free to use, GCP products are not. In order to run the sample apps, you must enable billing backed by a financial instrument like a credit card (payment method depends on region/currency). If you're new to GCP, review the billing & onboarding guide. Deploying and running the sample app(s) in this post should not incur any cost because basic usage falls under various free tiers:
Cloud Build has its own free quota as does Cloud Storage (GCS), used to store build artifacts. Cloud Build sends application images to the Cloud Artifact Registry (CAR) (or its predecessor), making them accessible to other GCP services. These eat into GCS & CAR (storage) quotas as does transferring images between services & regions. You may be in a region that does not have a free tier however, so monitor your usage to minimize any costs. (Check out storage use and delete old/unwanted build artifacts via the GCS browser.)
This sample app is very basic and works like this:
Start with an empty form asking the user for several input items, an image to upload and an LLM prompt:
Select an image for Gemini using the normal file-picker:
Enter a suitable prompt (or take the default "Describe this image"):
When the submit button is pressed, the button text changes to "Processing...", and once Gemini is done, a thumbnail of the image is provided along with LLM output:
I ripped these from the original post where you can get all the details about this app.
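To make the flow above concrete, here is a rough sketch of the server-side step: take the uploaded image and the prompt, send both to Gemini, and return the text. It assumes the unified google-genai client library and an API key from Google AI. The model name, the helper names guess_image_type and describe_image, and the error handling are my assumptions for illustration, not a copy of the sample app's code.

```python
import mimetypes

MODEL_NAME = "gemini-2.5-flash"  # assumption: any multimodal Gemini model works


def guess_image_type(filename):
    """Return the image MIME type for an uploaded filename, or None."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type and mime_type.startswith("image/"):
        return mime_type
    return None


def describe_image(image_bytes, filename, api_key, prompt="Describe this image"):
    """Send one image plus a text prompt to Gemini and return the reply text."""
    mime_type = guess_image_type(filename)
    if mime_type is None:
        raise ValueError(f"not a recognized image file: {filename!r}")

    # Imported lazily so the helper above works without the package installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(
        model=MODEL_NAME,
        contents=[
            types.Part.from_bytes(data=image_bytes, mime_type=mime_type),
            prompt,
        ],
    )
    return response.text
```

A web handler would read the uploaded file's bytes and name from the form, call describe_image(), and render the returned text next to the thumbnail.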
The code
The basic web apps in this post are nearly identical to those in that original post. The only differences between that post's repo and this post's repo are the additional files and packages required to deploy that web app to GCP serverless platforms:
Earlier, you changed to the Node folder (with cd nodejs) and installed all packages (npm i), so to run the app locally, execute: node main.mjs (or node main.js for CommonJS).
Python
Earlier, you changed to the Python folder (with cd python). If you decided to run the Flask version, you would've installed all packages (with uv pip install -Ur requirements.txt). If you chose FastAPI instead, you would've replaced the Flask files (with mv fastapi/* .), then run the same install command. To run the app locally (either version), execute: python main.py.
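One detail that lets the same main.py run locally and on serverless is how it picks its port: Cloud Run and App Engine tell the app which port to listen on through the PORT environment variable, while on your own machine that variable is usually unset. The sketch below illustrates the pattern with the standard library only. The tiny WSGI app is a stand-in for the real Flask or FastAPI app, and the names resolve_port and serve as well as the local default of 8080 are assumptions.

```python
import os
from wsgiref.simple_server import make_server


def resolve_port(default=8080):
    """Use the PORT the platform provides, else a local default."""
    return int(os.environ.get("PORT", default))


def app(environ, start_response):
    """Tiny WSGI stand-in for the real Flask/FastAPI app."""
    body = b"Hello from serverless!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve():
    """Start a development server; blocks until interrupted."""
    port = resolve_port()
    # Bind to all interfaces so a container's port mapping can reach it.
    with make_server("0.0.0.0", port, app) as httpd:
        print(f"Listening on port {port}")
        httpd.serve_forever()
```

Calling serve() starts the server on whichever port resolve_port() returns, so nothing in the code needs to change between a laptop and the cloud.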
Deploy to the cloud
With cloud deployments, you have many more options. All are listed below along with instructions; before following them, doublecheck you have a .env or settings.py file with your API key.
Node.js (ECMAscript module) on App Engine
Run gcloud app deploy
Node.js (CommonJS script) on App Engine
Edit package.json and change main.mjs to main.js globally
Run gcloud app deploy
Node.js (ECMAscript module) on Cloud Run with Docker
Doublecheck you have a .env file with your API key
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
Example: gcloud run deploy genai --allow-unauthenticated --source . --region us-west1
Node.js (CommonJS script) on Cloud Run with Docker
Edit package.json and change main.mjs to main.js globally
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
Node.js (ECMAscript module) on Cloud Run without Docker
Delete Dockerfile
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
Node.js (CommonJS script) on Cloud Run without Docker
Edit package.json and change main.mjs to main.js globally
Delete Dockerfile
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
Python Flask on App Engine
Run gcloud app deploy
Python FastAPI on App Engine
Run mv fastapi/* . (if you haven't already)
Edit requirements.txt and uncomment the line for gunicorn (it's required)
Run gcloud app deploy
Python Flask on Cloud Run with Docker
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
Example: gcloud run deploy genai --allow-unauthenticated --source . --region us-west1
Python FastAPI on Cloud Run with Docker
Run mv fastapi/* . (if you haven't already)
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
Python Flask on Cloud Run without Docker
Delete Dockerfile
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
Python FastAPI on Cloud Run without Docker
Run mv fastapi/* . (if you haven't already)
Delete Dockerfile
Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
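All of the variations above boil down to two commands: gcloud app deploy for App Engine and the same gcloud run deploy line for Cloud Run, with the edits to package.json, requirements.txt, or the Dockerfile deciding what actually gets built. If you script your deployments, a small helper along these lines can assemble the right command. This is only a sketch: the deploy_command name and the "appengine"/"cloudrun" labels are hypothetical, and it builds the command without running it.

```python
import shlex


def deploy_command(platform, service=None, region=None):
    """Return the gcloud argv list for the chosen serverless platform."""
    if platform == "appengine":
        return ["gcloud", "app", "deploy"]
    if platform == "cloudrun":
        if not service or not region:
            raise ValueError("Cloud Run needs a service name and a region")
        return [
            "gcloud", "run", "deploy", service,
            "--allow-unauthenticated",
            "--source", ".",
            "--region", region,
        ]
    raise ValueError(f"unknown platform: {platform!r}")


if __name__ == "__main__":
    # Print (not run) the example deployment from the steps above.
    print(shlex.join(deploy_command("cloudrun", "genai", "us-west1")))
```

Run as a script, it prints the example Cloud Run command shown earlier; to execute a command rather than print it, pass the returned list to subprocess.run().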
Here are a couple of results from my deployments to App Engine and Cloud Run:
[IMG] Web app sample result on App Engine
[IMG] Web app sample result on Cloud Run
Wrap-up
The original post covered a basic web app using the Gemini API. In this post, we follow up with that same code sample, but add in the extra files needed to deploy it to the cloud, specifically on GCP serverless platforms, which have the benefit of incurring billing only if your app is getting traffic, unlike on VMs where you're billed 24x7. Learning how to create basic web apps using the Gemini API is useful, but if you want to bring them to the world, deploying to serverless is one way to do it!
If you found an error in this post, a bug in the code, or have a topic you want me to cover in the future, drop a note in the comments below or file an issue at the repo. Also check out other posts in this series covering the Gemini API. Thanks for reading, and I hope to meet you at an upcoming event soon... see the travel calendar at the bottom of my consulting site.
WESLEY CHUN, MSCS, is a Google Developer Expert (GDE) in Google Cloud (GCP) & Google Workspace (GWS), author of Prentice Hall's bestselling "Core Python" series, co-author of "Python Web Development with Django", and has written for Linux Journal & CNET. He's currently an AI Technical Program Manager at Red Hat focused on upstream open source projects that make their way into Red Hat AI products. In his spare time, Wesley helps clients with Google integrations, App Engine migrations, and Python training & engineering. He was one of the original Yahoo!Mail engineers and spent 13+ years on various Google product teams, speaking on behalf of their APIs, producing sample apps, codelabs, and videos for serverless migration and GWS developers. Wesley holds degrees in Computer Science, Mathematics, and Music from the University of California, is a Fellow of the Python Software Foundation, and loves to travel to meet developers worldwide. Follow he/him @wescpy on Tw/X, BS, and his technical blog. Find this content useful? Contact CyberWeb for professional services or buy him a coffee (or tea)!