Wesley Chun (@wescpy)

Simplifying basic (genAI) web app deployment with serverless

TL;DR:

Writing simple web apps is one thing, but deploying them globally without having to think about servers, VMs (virtual machines), or DNS is another. That's where this post and serverless come into play. Whether you have a Gemini-powered app or agent, serverless lets you focus on the solution you're building, not what it runs on. In this post, we show you how to deploy a basic web app featuring the Gemini API to serverless platforms on Google Cloud.

Build with Gemini

Introduction

Welcome to another installment on the blog covering Google APIs for Python (and sometimes Node.js) developers. Here, I cover stuff you won't find in Google's documentation, all while showing you how to use Google APIs from different product groups for what you build. Here are some of the categories covered so far:

We return to Gemini again in this post but focus less on the AI and more on how to easily deploy apps to the cloud, whether AI-powered or not.

Background

Writing web apps is one thing, but deploying them globally without having to think about servers, VMs (virtual machines), DNS, and scalability is another. That's where this post and serverless come into play. Whether you have a Gemini-powered app or agent, serverless lets you focus on the solution you're building, not what it runs on or even how to run it. Is this you?

There won't be much introductory material as we'll get straight into serverless deployments. For more background, note that this is a follow-up to several posts from this blog, all of which make for good reading before proceeding:

  1. The post covering a basic genAI web app whose code was recently updated following advice from...
  2. Another post on upgrading to Gemini 2.5
  3. The post introducing serverless
  4. The post on hosting apps today on Google App Engine
  5. The post on modern app-hosting with Google Cloud Run

Architectural components

There are several large "pieces of the puzzle" to look at, and decisions to be made:

  1. API platforms & credentials -- decide which platform to use when calling the Gemini API then get your credentials and set up your Cloud project
  2. Languages -- decide whether you want to run Node.js or Python (should be easy); each language has several options too
  3. GCP Serverless platforms -- decide which serverless platform you want to run on (and by the way, what's serverless anyway?)

Each has its own set of prerequisites which must be satisfied before running any of the sample apps.

Gemini API

Gemini is a popular, proprietary, multimodal LLM (large language model), accessed via its public chat interface or as the LLM for agents. Its API lets developers tap into its capabilities directly from web apps or mobile backends, and the former is the use case we're addressing in this post. One aspect of using the API is unexpected: Google makes the Gemini API available from two different platforms. Each has its own set of use cases as well as pricing differences. Learn about both platforms via the resources in the following table:

Platform | Use Cases | Pricing
Google AI ("GAI") | Experimenting, free tier/lower cost, lower barrier-to-entry, hobbyists/students | pricing
Vertex AI (Google Cloud/"GCP") | Production AI workloads, existing GCP customers adding and/or considering AI capabilities | pricing

Since this post is mostly introductory, we'll use the Gemini API from Google AI; however, the current, unified client library lets developers transition to GCP easily, a significant improvement over the days when there were two distinct client libraries, one per platform. Compare, contrast, and learn the differences between all three client libraries (two old and one new) in the second post listed above, which aims to help those migrating to Gemini 2.5.
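
To make that concrete, here is a minimal sketch (in Python) of calling the Gemini API through the unified google-genai client library on Google AI; the model name and prompt are illustrative only, and the sample apps in the repo are structured a bit differently:

from google import genai                       # unified client library

client = genai.Client(api_key='YOUR_API_KEY')  # Google AI: auth via API key
response = client.models.generate_content(     # single text-only request
    model='gemini-2.5-flash',
    contents='Describe serverless in one sentence.')
print(response.text)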

There's no doubt that old samples live forever online and vibecoding LLMs have been trained on both old and new code, so that post serves as a PSA (public service announcement) to show readers the differences between all three libraries so they can upgrade their own applications or fix code using old libraries generated by LLMs.

Requirement(s)

In order to run the sample apps locally or deploy to the cloud, you need to complete the prerequisites:

  1. Get credentials: Go to the GAI API keys page and select an existing API key or click "Create API key" to make a new one. Save the API key text string somewhere safe; we'll come back to it next.
  2. Get project: The "Create API key" dialog also lets you create a new project (or import an existing one), so do that too.

Create API key page

[IMG] Google AI create API key page

Create API key dialog

[IMG] Google AI create API key dialog

 

Existing GCP users: Experienced GCP users may prefer to do the above from the Cloud Console:

Create developer project

[IMG] Cloud console create project page

Create API key page/dialog

[IMG] Cloud console create API key dialog on create credentials page

Languages

The code featured in this post is available in several variants for Node.js and Python:

Language | Version | Web framework
Node.js | ECMAScript module | Express.js
Node.js | CommonJS script | Express.js
Python | Python 3 | Flask
Python | Python 3 | FastAPI

All code samples are available in this repo. Regular readers will notice it differs from this blog's regular repo, and there's a reason for this. I spent several years on Google's GCP Serverless team representing those products. During that time, I created a repo named (Cloud) Nebulous Serverless to show developers how they can deploy the same app to all platforms without any code changes.

That repo held all the sample applications I built that met this criterion. Google archived that repo after my departure, so I'm continuing all work in my personal fork. The samples in this post also meet that criterion, which is why they're in this repo and not the normal one. (I also removed "Cloud" from the repo name as there are now non-Cloud sample apps in the fork.)

Requirement(s)

  1. Set up the local environment:
    • Clone the code locally with git clone https://github.com/wescpy/nebulous-serverless.git and go to the app folder with cd nebulous-serverless/multi/webgem.
    • Node.js: Ensure you have contemporary versions of Node & NPM (recommend 18+), cd nodejs and install all packages with npm i
    • Python: Ensure you have a contemporary version of Python (recommend 3.9+) and cd python
      • (optional) Create & activate a virtual environment ("virtualenv") for isolation with python3 -m venv .venv; source .venv/bin/activate
      • For the commands below, depending on your system configuration, you will use one of (pip, pip3, python3 -m pip), but the instructions are generalized to pip.
      • (optional) Update pip and install uv with pip install -U pip uv
      • Decide whether to run the Flask or FastAPI version.
      • Flask: Install all packages: uv pip install -Ur requirements.txt (drop uv if you didn't install it)
      • FastAPI: Replace the Flask files by moving the FastAPI versions out of the fastapi folder: mv fastapi/* . then install all packages: uv pip install -Ur requirements.txt (drop uv if you didn't install it)
  2. Set credentials: Previously, you selected a project and created an API key; now save it to one of these files, depending on which language you're using:
    • Node.js: Save as API_KEY="YOUR_API_KEY" to .env
    • Python: Save as API_KEY = 'YOUR_API_KEY' to settings.py

You can model yours after the provided templates, .env_TMPL or settings_TMPL.py:

.env_TMPL

API_KEY="YOUR_API_KEY"
GCP_METADATA='{
  "project": "YOUR_GCP_PROJECT",
  "location": "YOUR_GCP_REGION"
}'

settings_TMPL.py

API_KEY = 'YOUR_API_KEY'
GCP_METADATA = {
    'project':  'YOUR_GCP_PROJECT',
    'location': 'YOUR_GCP_REGION',
}

In addition to API_KEY, these files also contain GCP_METADATA in case you migrate to GCP in the future. See this post on the additional setup and minor code updates you need to make to run your Gemini API-powered apps on Vertex AI.
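
As a hedged illustration (not the exact code in the repo), the unified client library lets the Python app reuse those GCP_METADATA values to target Vertex AI with roughly a one-line change to how the client is created:

import settings
from google import genai

# Vertex AI (GCP): no API key; pass project & location from GCP_METADATA
client = genai.Client(vertexai=True, **settings.GCP_METADATA)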

Use of settings.py as a naming convention follows in Django's footsteps; alternatively, you can save the key to .env and integrate python-dotenv to more closely mirror the Node.js version.
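
If you go the .env route in Python, a minimal sketch using python-dotenv (assuming it has been pip-installed) looks like this:

import os
from dotenv import load_dotenv

load_dotenv()                   # reads .env from the current directory
API_KEY = os.getenv('API_KEY')  # same variable the Node.js version reads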

Serverless platforms

Serverless computing with Google

Google Cloud has several serverless platforms to choose from:

  1. Google App Engine (GAE) -- the "OG" serverless platform, launched back in 2008 and somewhat modernized in 2018; uses customized, proprietary containers, and offers free static-file edge-caching and a generous outbound-networking free tier
  2. Cloud Run (GCR) -- the latest serverless platform; OCI-compliant containers (Docker, Buildpacks, etc.)

As far as deciding between App Engine and Cloud Run, most people would say go with Cloud Run as it is App Engine's next-generation replacement. It's more flexible (has fewer restrictions), supports modern app deployment (containers), and has many new features, including "Jobs," GPUs, and many more.

See this post to learn more about App Engine, including what it can still do that Cloud Run can't (yet), and see this post to learn more about Cloud Run. Those posts provide much more detail than this post, which focuses primarily on deployment instructions. The apps in the repo run on both platforms without code changes. See this post to learn more about serverless and these platforms.

Requirement(s)

  1. Choose a platform (or experiment and deploy to both).
    • If you pick App Engine, a running application is known as an "app," and you can only have one app in any Cloud project. Enable GAE and create the app in the Cloud console.
    • If you pick Cloud Run, a running application is known as a "service," and you can have any number of services in a project. If you're new to GCR, your Cloud Run dashboard will be empty.
    • With Cloud Run, you also have an additional decision: Docker or not? While many developers are familiar with containers and with specifying how they should be built via a Dockerfile, others prefer to avoid Docker or managing Dockerfiles, and Cloud Run supports both options (see the sketch just after this list). Sample Dockerfiles are provided in the repo, but you can delete them before deploying, or simply omit them from your own apps.
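
For a sense of what the Docker route involves, here is a hedged sketch of a minimal Dockerfile for the Python (Flask) app; it is illustrative only and not necessarily identical to the Dockerfile in the repo:

FROM python:3.12-slim
WORKDIR /app
COPY . .
# install the app's packages plus gunicorn (the WSGI server used in this sketch)
RUN pip install -r requirements.txt gunicorn
# Cloud Run injects $PORT; gunicorn serves the Flask "app" object in main.py
CMD exec gunicorn --bind :$PORT main:app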

⚠️ Cost: billing required (but "free?!?")
Before deploying to the cloud, a word on cost. While many Google products are free to use, GCP products are not. In order to run the sample apps, you must enable billing backed by a financial instrument like a credit card (payment method depends on region/currency). If you're new to GCP, review the billing & onboarding guide. Deploying and running the sample app(s) in this post should not incur any cost because basic usage falls under various free tiers:

  1. Several GCP products (like GAE & GCR) have an "Always Free" tier, a free daily or monthly usage quota before incurring charges. See the GCR pricing and quotas pages for more information. Furthermore, deploying to GCP serverless platforms incurs minor build and storage costs. (Also see similar content in the GAE docs.)

  2. Cloud Build has its own free quota, as does Cloud Storage (GCS), which is used to store build artifacts. Cloud Build sends application images to the Cloud Artifact Registry (CAR) (or its predecessor), making them accessible to other GCP services. These eat into GCS & CAR (storage) quotas, as does transferring images between services & regions. You may be in a region that does not have a free tier, however, so monitor your usage to minimize any costs. (Check storage usage and delete old/unwanted build artifacts via the GCS browser.)

  3. Use the cost calculator to get monthly estimates. You may qualify for credits to offset GCP costs: If you are a startup, consider the GCP for Startups program grants. If you are in education, check out the GCP education programs for students, faculty, and researchers.

The app

This sample app is very basic and works like this:

  1. Start with an empty form asking the user for several input items, an image to upload and an LLM prompt: webgem-empty
  2. Select an image for Gemini using the normal file-picker: webgem-imgpick
  3. Enter a suitable prompt (or take the default "Describe this image"): webgem-imgNpromptSet
  4. When the submit button is pressed, the button text changes to "Processing...", and once Gemini is done, a thumbnail of the image is provided along with LLM output: webgem-results

I ripped these from the original post where you can get all the details about this app.

The code

The basic web apps in this post are nearly identical to those in that original post. The only differences between that post's repo and this post's repo are the additional files and packages required to deploy that web app to GCP serverless platforms:

Node.js original web app

File | Description | Platform
nodejs/.env_TMPL | .env environment settings template | Node
nodejs/package.json | 3rd-party packages | Node
nodejs/main.js | Express.js sample app | Node (CommonJS script)
nodejs/main.mjs | Express.js sample app | Node (ECMAScript module)
nodejs/templates/index.html | Web template | Nunjucks (identical to Jinja2)

Node.js GCP serverless deployments

File | Description | Platform
nodejs/app.yaml | Config file | App Engine
nodejs/Dockerfile | Dockerfile | Cloud Run (with Docker)
nodejs/.dockerignore | .dockerignore | Cloud Run (with Docker)
nodejs/Procfile | Procfile | Cloud Run (without Docker)
nodejs/.gcloudignore | .gcloudignore | App Engine & Cloud Run

Python original web app

File | Description | Platform
python/settings_TMPL.py | settings.py environment settings template | Python 3
python/requirements.txt | Flask 3rd-party packages | Python 3
python/main.py | Flask sample app | Python 3
python/templates/index.html | Web template | Jinja2 (identical to Nunjucks)
python/fastapi/requirements.txt | FastAPI 3rd-party packages | Python 3
python/fastapi/main.py | FastAPI sample app | Python 3

Python GCP serverless deployments

File | Description | Platform
python/app.yaml | Config file (Flask) | App Engine
python/Dockerfile | Dockerfile (Flask) | Cloud Run (with Docker)
python/.dockerignore | .dockerignore | Cloud Run (with Docker)
python/Procfile | Procfile (Flask) | Cloud Run (without Docker)
python/.gcloudignore | .gcloudignore | App Engine & Cloud Run
python/fastapi/app.yaml | Config file (FastAPI) | App Engine
python/fastapi/Dockerfile | Dockerfile (FastAPI) | Cloud Run (with Docker)
python/fastapi/Procfile | Procfile (FastAPI) | Cloud Run (without Docker)
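
To give a feel for what the non-Docker deployment files contain, here are hedged sketches of a minimal app.yaml (App Engine) and Procfile (Cloud Run without Docker) for the Python app; check the repo for the actual files:

app.yaml (sketch)

runtime: python312

Procfile (sketch)

web: gunicorn -b :$PORT main:app

If a Procfile invokes gunicorn, gunicorn generally needs to be listed in requirements.txt so the buildpack installs it.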

Run locally

Node.js

Earlier, you changed to the Node folder (with cd nodejs) and installed all packages (npm i), so to run the app locally, execute node main.mjs (or node main.js for CommonJS).

Python

Earlier, you changed to the Python folder (with cd python). If you decided to run the Flask version, you would've installed all packages (with uv pip install -Ur requirements.txt). If you chose FastAPI instead, you would've replaced the Flask files (with mv fastapi/* .), then run the same install command. To run the app locally (either version), execute python main.py.

Deploy to the cloud

With cloud deployments, you have many more options, all of which are listed below along with instructions. Before running any of them, double-check that you have a .env or settings.py file with your API key.

  • Node.js (ECMAScript module) on App Engine
    1. Run gcloud app deploy
  • Node.js (CommonJS script) on App Engine
    1. Edit package.json and change main.mjs to main.js globally
    2. Run gcloud app deploy
  • Node.js (ECMAScript module) on Cloud Run with Docker
    1. Double-check you have a .env file with your API key
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
      • Example: gcloud run deploy genai --allow-unauthenticated --source . --region us-west1
  • Node.js (CommonJS script) on Cloud Run with Docker
    1. Edit package.json and change main.mjs to main.js globally
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Node.js (ECMAScript module) on Cloud Run without Docker
    1. Delete Dockerfile
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Node.js (CommonJS script) on Cloud Run without Docker
    1. Edit package.json and change main.mjs to main.js globally
    2. Delete Dockerfile
    3. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Python Flask on App Engine
    1. Run gcloud app deploy
  • Python FastAPI on App Engine
    1. Run mv fastapi/* . (if you haven't already)
    2. Edit requirements.txt and uncomment the line for gunicorn (it's required)
    3. Run gcloud app deploy
  • Python Flask on Cloud Run with Docker
    1. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
      • Example: gcloud run deploy genai --allow-unauthenticated --source . --region us-west1
  • Python FastAPI on Cloud Run with Docker
    1. Run mv fastapi/* . (if you haven't already)
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Python Flask on Cloud Run without Docker
    1. Delete Dockerfile
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Python FastAPI on Cloud Run without Docker
    1. Run mv fastapi/* . (if you haven't already)
    2. Delete Dockerfile
    3. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name

Here are a couple of results from my deployments to App Engine and Cloud Run:

Web app on App Engine

[IMG] Web app sample result on App Engine

 

Web app on Cloud Run

[IMG] Web app sample result on Cloud Run

Wrap-up

The original post covered a basic web app using the Gemini API. In this post, we follow up with that same code sample but add the extra files needed to deploy it to the cloud, specifically to GCP serverless platforms, which have the benefit of incurring billing only when your app gets traffic, unlike VMs, where you're billed 24x7. Learning how to create basic web apps using the Gemini API is useful, but if you want to bring it to the world, deploying to serverless is one way to do it!

If you found an error in this post, a bug in the code, or have a topic you want me to cover in the future, drop a note in the comments below or file an issue at the repo. Also check out other posts in this series covering the Gemini API. Thanks for reading, and I hope to meet you at an upcoming event soon... see the travel calendar at the bottom of my consulting site.

PREV POST: Basic web apps using the Gemini API

References

Below are various links relevant to this post:

Code samples

Gemini API (Google AI) general

Gemini models

Other relevant content by the author



WESLEY CHUN, MSCS, is a Google Developer Expert (GDE) in Google Cloud (GCP) & Google Workspace (GWS), author of Prentice Hall's bestselling "Core Python" series, co-author of "Python Web Development with Django", and has written for Linux Journal & CNET. He's currently an AI Technical Program Manager at Red Hat focused on upstream open source projects that make their way into Red Hat AI products. In his spare time, Wesley helps clients with Google integrations, App Engine migrations, and Python training & engineering. He was one of the original Yahoo!Mail engineers and spent 13+ years on various Google product teams, speaking on behalf of their APIs, producing sample apps, codelabs, and videos for serverless migration and GWS developers. Wesley holds degrees in Computer Science, Mathematics, and Music from the University of California, is a Fellow of the Python Software Foundation, and loves to travel to meet developers worldwide. Follow he/him @wescpy on Tw/X, BS, and his technical blog. Find this content useful? Contact CyberWeb for professional services or buy him a coffee (or tea)!
