Wesley Chun (@wescpy)

Simplifying basic (genAI) web app deployment with serverless

TL;DR:

Writing simple web apps is one thing, but deploying them globally without having to think about servers, VMs (virtual machines), or DNS is another. That's where this post and serverless come into play. Whether you have a Gemini-powered app or agent, serverless lets you focus on the solution you're building, not what it runs on. In this post, we show you how to deploy a basic web app featuring the Gemini API to serverless platforms on Google Cloud.

Build with Gemini

Introduction

Welcome to another installment on the blog covering Google APIs for Python (and sometimes Node.js) developers. Here, I cover stuff you won't find in Google's documentation, all while showing you how to use Google APIs from different product groups for what you build. Here are some of the categories covered so far:

We return to Gemini again in this post but focus less on the AI and more on how to easily deploy apps to the cloud, whether AI-powered or not.

Background

Writing web apps is one thing, but deploying them globally without having to think about servers, VMs (virtual machines), DNS, and scalability is another. That's where this post and serverless come into play. Whether you have a Gemini-powered app or agent, serverless lets you focus on the solution you're building, not what it runs on or even how to run it. Is this you?

There won't be much introductory material as we'll get straight into serverless deployments. For more background, note that this is a follow-up to several posts from this blog, all of which make for good reading before proceeding:

  1. The post covering a basic genAI web app whose code was recently updated following advice from...
  2. Another post on upgrading to Gemini 2.5
  3. The post introducing serverless
  4. The post on hosting apps today on Google App Engine
  5. The post on modern app-hosting with Google Cloud Run

Architectural components

There are several large "pieces of the puzzle" to look at, and decisions to be made:

  1. API platforms & credentials -- decide which platform to use when calling the Gemini API then get your credentials and set up your Cloud project
  2. Languages -- decide whether you want to run Node.js or Python (should be easy); each language has several options too
  3. GCP Serverless platforms -- decide which serverless platform you want to run on (and by the way, what's serverless anyway?)

Each has its own set of prerequisites which must be satisfied before running any of the sample apps.

Gemini API

Gemini is a popular, proprietary, multimodal LLM (large language model), accessed via its public chat interface or as the LLM for agents. Its API lets developers tap into its capabilities directly from web apps or mobile backends, and the former is the use case we're addressing in this post. One aspect of using the API is unexpected: Google makes the Gemini API available from two different platforms. Each has its own set of use cases as well as pricing differences. Learn about both platforms via the resources in the following table:

Platform | Use Cases | Pricing
Google AI ("GAI") | Experimenting, free tier/lower cost, lower barrier-to-entry, hobbyists/students | pricing
Vertex AI (Google Cloud/"GCP") | Production AI workloads, existing GCP customers adding and/or considering AI capabilities | pricing

Since this post is mostly introductory, we'll use the Gemini API from Google AI; however, the current, unified client library lets developers transition to GCP easily, a significant improvement over the days when there were two distinct client libraries, one per platform. Compare, contrast, and learn the differences between all three client libraries (two old and one new) in the second post listed above, which aims to help those migrating to Gemini 2.5.
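
To make that concrete, here is a minimal sketch (in Python) of calling the Gemini API through the unified google-genai client library on Google AI; the model name and prompt are illustrative only, and the sample apps in the repo are structured a bit differently:

from google import genai                       # unified client library

client = genai.Client(api_key='YOUR_API_KEY')  # Google AI: auth via API key
response = client.models.generate_content(     # single text-only request
    model='gemini-2.5-flash',
    contents='Describe serverless in one sentence.')
print(response.text)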

There's no doubt that old samples live forever online and vibecoding LLMs have been trained on both old and new code, so that post serves as a PSA (public service announcement) to show readers the differences between all three libraries so they can upgrade their own applications or fix code using old libraries generated by LLMs.

Requirement(s)

In order to run the sample apps locally or deploy to the cloud, you need to complete the prerequisites:

  1. Get credentials: Go to the GAI API keys page and select an existing API key or click "Create API key" to make a new one. Save the API key text string somewhere safe; we'll come back to it next.
  2. Get project: The "Create API key" dialog also lets you create a new project (or import an existing one), so do that too.

Create API key page

[IMG] Google AI create API key page

Create API key dialog

[IMG] Google AI create API key dialog

 

Existing GCP users: Experienced GCP users may prefer to do the above from the Cloud Console:

Create developer project

[IMG] Cloud console create project page

Create API key page/dialog

[IMG] Cloud console create API key dialog on create credentials page

Languages

The code featured in this post is available in several variants for Node.js and Python:

Language | Version | Web framework
Node.js | ECMAScript module | Express.js
Node.js | CommonJS script | Express.js
Python | Python 3 | Flask
Python | Python 3 | FastAPI

All code samples are available in this repo. Regular readers will notice it differs from this blog's regular repo, and there's a reason for this. I spent several years on Google's GCP Serverless team representing those products. During that time, I created a repo named (Cloud) Nebulous Serverless to show developers how they can deploy the same app to all platforms without any code changes.

That repo held all the sample applications I built that met this criterion. Google archived that repo after my departure, so I'm continuing all work in my personal fork. The samples in this post also meet that criterion, which is why they're in this repo and not the normal one. (I also removed "Cloud" from the repo name as there are now non-Cloud sample apps in the fork.)

Requirement(s)

  1. Set up the local environment:
    • Clone the code locally with git clone https://github.com/wescpy/nebulous-serverless.git and go to the app folder with cd nebulous-serverless/multi/webgem.
    • Node.js: Ensure you have contemporary versions of Node & NPM (recommend 18+), cd nodejs and install all packages with npm i
    • Python: Ensure you have a contemporary version of Python (recommend 3.9+) and cd python
      • (optional) Create & activate a virtual environment ("virtualenv") for isolation with python3 -m venv .venv; source .venv/bin/activate
      • For the commands below, depending on your system configuration, you will use one of (pip, pip3, python3 -m pip), but the instructions are generalized to pip.
      • (optional) Update pip and install uv with pip install -U pip uv
      • Decide whether to run the Flask or FastAPI version.
      • Flask: Install all packages: uv pip install -Ur requirements.txt (drop uv if you didn't install it)
      • FastAPI: Replace the Flask files by moving the FastAPI versions out of the fastapi folder: mv fastapi/* . then install all packages: uv pip install -Ur requirements.txt (drop uv if you didn't install it)
  2. Set credentials: Previously, you selected a project and created an API key; now save it to one of these files, depending on which language you're using:
    • Node.js: Save as API_KEY="YOUR_API_KEY" to .env
    • Python: Save as API_KEY = 'YOUR_API_KEY' to settings.py

You can model yours after the provided templates, .env_TMPL or settings_TMPL.py:

.env_TMPL

API_KEY="YOUR_API_KEY"
GCP_METADATA='{
  "project": "YOUR_GCP_PROJECT",
  "location": "YOUR_GCP_REGION"
}'

settings_TMPL.py

API_KEY = 'YOUR_API_KEY'
GCP_METADATA = {
    'project':  'YOUR_GCP_PROJECT',
    'location': 'YOUR_GCP_REGION',
}

In addition to API_KEY, these files also contain GCP_METADATA in case you migrate to GCP in the future. See this post on the additional setup and minor code updates you need to make to run your Gemini API-powered apps on Vertex AI.
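
As a hedged illustration (not the exact code in the repo), the unified client library lets the Python app reuse those GCP_METADATA values to target Vertex AI with roughly a one-line change to how the client is created:

import settings
from google import genai

# Vertex AI (GCP): no API key; pass project & location from GCP_METADATA
client = genai.Client(vertexai=True, **settings.GCP_METADATA)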

Use of settings.py as a naming convention follows in Django's footsteps; alternatively, you can save the key to .env and integrate python-dotenv to more closely mirror the Node.js version.
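
If you go the .env route in Python, a minimal sketch using python-dotenv (assuming it has been pip-installed) looks like this:

import os
from dotenv import load_dotenv

load_dotenv()                   # reads .env from the current directory
API_KEY = os.getenv('API_KEY')  # same variable the Node.js version reads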

Serverless platforms

Serverless computing with Google

Google Cloud has several serverless platforms to choose from:

  1. Google App Engine (GAE) -- the "OG" serverless platform, launched back in 2008 and somewhat modernized in 2018; uses customized, proprietary containers, and offers free static-file edge-caching and a generous outbound-networking free tier
  2. Cloud Run (GCR) -- the latest serverless platform; OCI-compliant containers (Docker, Buildpacks, etc.)

As far as deciding between App Engine and Cloud Run, most people would say go with Cloud Run as it is App Engine's next-generation replacement. It's more flexible (has fewer restrictions), supports modern app deployment (containers), and has many new features, including "Jobs," GPUs, and many more.

See this post to learn more about App Engine, including what it can still do that Cloud Run can't (yet), and see this post to learn more about Cloud Run. Those posts provide much more detail than this post, which focuses primarily on deployment instructions. The apps in the repo run on both platforms without code changes. See this post to learn more about serverless and these platforms.

Requirement(s)

  1. Choose a platform (or experiment and deploy to both).
    • If you pick App Engine, a running application is known as an "app," and you can only have one app in any Cloud project. Enable GAE and create the app in the Cloud console.
    • If you pick Cloud Run, a running application is known as a "service," and you can have any number of services in a project. If you're new to GCR, your Cloud Run dashboard will be empty.
    • With Cloud Run, you also have an additional decision: Docker or not? While many developers are familiar with containers and with specifying how they should be built via a Dockerfile, others prefer to avoid Docker or managing Dockerfiles, and Cloud Run supports both options (see the sketch just after this list). Sample Dockerfiles are provided in the repo, but you can delete them before deploying, or simply omit them from your own apps.
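
For a sense of what the Docker route involves, here is a hedged sketch of a minimal Dockerfile for the Python (Flask) app; it is illustrative only and not necessarily identical to the Dockerfile in the repo:

FROM python:3.12-slim
WORKDIR /app
COPY . .
# install the app's packages plus gunicorn (the WSGI server used in this sketch)
RUN pip install -r requirements.txt gunicorn
# Cloud Run injects $PORT; gunicorn serves the Flask "app" object in main.py
CMD exec gunicorn --bind :$PORT main:app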

⚠️ Cost: billing required (but "free?!?")
Before deploying to the cloud, a word on cost. While many Google products are free to use, GCP products are not. In order to run the sample apps, you must enable billing backed by a financial instrument like a credit card (payment method depends on region/currency). If you're new to GCP, review the billing & onboarding guide. Deploying and running the sample app(s) in this post should not incur any cost because basic usage falls under various free tiers:

  1. Several GCP products (like GAE & GCR) have an "Always Free" tier, a free daily or monthly usage quota before incurring charges. See the GCR pricing and quotas pages for more information. Furthermore, deploying to GCP serverless platforms incurs minor build and storage costs. (Also see similar content in the GAE docs.)

  2. Cloud Build has its own free quota, as does Cloud Storage (GCS), which is used to store build artifacts. Cloud Build sends application images to the Cloud Artifact Registry (CAR) (or its predecessor), making them accessible to other GCP services. These eat into GCS & CAR (storage) quotas, as does transferring images between services & regions. You may be in a region that does not have a free tier, however, so monitor your usage to minimize any costs. (Check storage usage and delete old/unwanted build artifacts via the GCS browser.)

  3. Use the cost calculator to get monthly estimates. You may qualify for credits to offset GCP costs: If you are a startup, consider the GCP for Startups program grants. If you are in education, check out the GCP education programs for students, faculty, and researchers.

The app

This sample app is very basic and works like this:

  1. Start with an empty form asking the user for several input items, an image to upload and an LLM prompt: webgem-empty
  2. Select an image for Gemini using the normal file-picker: webgem-imgpick
  3. Enter a suitable prompt (or take the default "Describe this image"): webgem-imgNpromptSet
  4. When the submit button is pressed, the button text changes to "Processing...", and once Gemini is done, a thumbnail of the image is provided along with LLM output: webgem-results

I ripped these from the original post where you can get all the details about this app.

The code

The basic web apps in this post are nearly identical to those in that original post. The only differences between that post's repo and this post's repo are the additional files and packages required to deploy that web app to GCP serverless platforms:

Node.js original web app

File | Description | Platform
nodejs/.env_TMPL | .env environment settings template | Node
nodejs/package.json | 3rd-party packages | Node
nodejs/main.js | Express.js sample app | Node (CommonJS script)
nodejs/main.mjs | Express.js sample app | Node (ECMAScript module)
nodejs/templates/index.html | Web template | Nunjucks (identical to Jinja2)

Node.js GCP serverless deployments

File | Description | Platform
nodejs/app.yaml | Config file | App Engine
nodejs/Dockerfile | Dockerfile | Cloud Run (with Docker)
nodejs/.dockerignore | .dockerignore | Cloud Run (with Docker)
nodejs/Procfile | Procfile | Cloud Run (without Docker)
nodejs/.gcloudignore | .gcloudignore | App Engine & Cloud Run

Python original web app

File | Description | Platform
python/settings_TMPL.py | settings.py environment settings template | Python 3
python/requirements.txt | Flask 3rd-party packages | Python 3
python/main.py | Flask sample app | Python 3
python/templates/index.html | Web template | Jinja2 (identical to Nunjucks)
python/fastapi/requirements.txt | FastAPI 3rd-party packages | Python 3
python/fastapi/main.py | FastAPI sample app | Python 3

Python GCP serverless deployments

File | Description | Platform
python/app.yaml | Config file (Flask) | App Engine
python/Dockerfile | Dockerfile (Flask) | Cloud Run (with Docker)
python/.dockerignore | .dockerignore | Cloud Run (with Docker)
python/Procfile | Procfile (Flask) | Cloud Run (without Docker)
python/.gcloudignore | .gcloudignore | App Engine & Cloud Run
python/fastapi/app.yaml | Config file (FastAPI) | App Engine
python/fastapi/Dockerfile | Dockerfile (FastAPI) | Cloud Run (with Docker)
python/fastapi/Procfile | Procfile (FastAPI) | Cloud Run (without Docker)
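
To give a feel for what the non-Docker deployment files contain, here are hedged sketches of a minimal app.yaml (App Engine) and Procfile (Cloud Run without Docker) for the Python app; check the repo for the actual files:

app.yaml (sketch)

runtime: python312

Procfile (sketch)

web: gunicorn -b :$PORT main:app

If a Procfile invokes gunicorn, gunicorn generally needs to be listed in requirements.txt so the buildpack installs it.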

Run locally

Node.js

Earlier, you changed to the Node folder (with cd nodejs) and installed all packages (npm i), so to run the app locally, execute node main.mjs (or node main.js for CommonJS).

Python

Earlier, you changed to the Python folder (with cd python). If you decided to run the Flask version, you would've installed all packages (with uv pip install -Ur requirements.txt). If you chose FastAPI instead, you would've replaced the Flask files (with mv fastapi/* .), then run the same install command. To run the app locally (either version), execute python main.py.

Deploy to the cloud

With cloud deployments, you have many more options, all of which are listed below along with instructions. Before running any of them, double-check that you have a .env or settings.py file with your API key.

  • Node.js (ECMAScript module) on App Engine
    1. Run gcloud app deploy
  • Node.js (CommonJS script) on App Engine
    1. Edit package.json and change main.mjs to main.js globally
    2. Run gcloud app deploy
  • Node.js (ECMAScript module) on Cloud Run with Docker
    1. Double-check you have a .env file with your API key
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
      • Example: gcloud run deploy genai --allow-unauthenticated --source . --region us-west1
  • Node.js (CommonJS script) on Cloud Run with Docker
    1. Edit package.json and change main.mjs to main.js globally
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Node.js (ECMAScript module) on Cloud Run without Docker
    1. Delete Dockerfile
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Node.js (CommonJS script) on Cloud Run without Docker
    1. Edit package.json and change main.mjs to main.js globally
    2. Delete Dockerfile
    3. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Python Flask on App Engine
    1. Run gcloud app deploy
  • Python FastAPI on App Engine
    1. Run mv fastapi/* . (if you haven't already)
    2. Edit requirements.txt and uncomment the line for gunicorn (it's required)
    3. Run gcloud app deploy
  • Python Flask on Cloud Run with Docker
    1. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
      • Example: gcloud run deploy genai --allow-unauthenticated --source . --region us-west1
  • Python FastAPI on Cloud Run with Docker
    1. Run mv fastapi/* . (if you haven't already)
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Python Flask on Cloud Run without Docker
    1. Delete Dockerfile
    2. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name
  • Python FastAPI on Cloud Run without Docker
    1. Run mv fastapi/* . (if you haven't already)
    2. Delete Dockerfile
    3. Run gcloud run deploy SVC_NAME --allow-unauthenticated --source . --region REGION, replacing SVC_NAME with your service name

Here are a couple of results from my deployments to App Engine and Cloud Run:

Web app on App Engine

[IMG] Web app sample result on App Engine

 

Web app on Cloud Run

[IMG] Web app sample result on Cloud Run

Wrap-up

The original post covered a basic web app using the Gemini API. In this post, we follow up with that same code sample but add the extra files needed to deploy it to the cloud, specifically to GCP serverless platforms, which have the benefit of incurring billing only when your app gets traffic, unlike VMs, where you're billed 24x7. Learning how to create basic web apps using the Gemini API is useful, but if you want to bring it to the world, deploying to serverless is one way to do it!

If you found an error in this post, a bug in the code, or have a topic you want me to cover in the future, drop a note in the comments below or file an issue at the repo. Also check out other posts in this series covering the Gemini API. Thanks for reading, and I hope to meet you at an upcoming event soon... see the travel calendar at the bottom of my consulting site.

PREV POST: Basic web apps using the Gemini API

References

Below are various links relevant to this post:

Code samples

Gemini API (Google AI) general

Gemini models

Other relevant content by the author



WESLEY CHUN, MSCS, is a Google Developer Expert (GDE) in Google Cloud (GCP) & Google Workspace (GWS), author of Prentice Hall's bestselling "Core Python" series, co-author of "Python Web Development with Django", and has written for Linux Journal & CNET. He's currently an AI Technical Program Manager at Red Hat focused on upstream open source projects that make their way into Red Hat AI products. In his spare time, Wesley helps clients with Google integrations, App Engine migrations, and Python training & engineering. He was one of the original Yahoo!Mail engineers and spent 13+ years on various Google product teams, speaking on behalf of their APIs, producing sample apps, codelabs, and videos for serverless migration and GWS developers. Wesley holds degrees in Computer Science, Mathematics, and Music from the University of California, is a Fellow of the Python Software Foundation, and loves to travel to meet developers worldwide. Follow he/him @wescpy on Tw/X, BS, and his technical blog. Find this content useful? Contact CyberWeb for professional services or buy him a coffee (or tea)!
