Build a serverless AI App with Cloud Run, Python, Gemini and Vertex AI

This is part 1 in a series!

Show me the code:
https://github.com/dllewellyn/hello-gemini

Introduction

In this series of tutorials, we’re going to use a suite of Google tools (AI, cloud and dev) to create and deploy an AI-powered application. We’ll steer clear of chatbots, because they are tedious in the extreme, and focus on building something more interesting.

Creating a "Hello World" Flask Application with Google IDX and the Gemini API

To get started, we're going to use the IDX 'hello world' template for Gemini and Flask. This gives us a really quick way to get set up with some AI.

Project Setup

Navigate to Google IDX: Begin by accessing the Google IDX platform in your web browser.
Select the Gemini API Template: From the IDX welcome screen, locate and click on the "Gemini API" template under the "Start something new with a template" section.

Configure New Workspace: A "New Workspace" window will appear. Here, customise the following settings:
- Name your workspace: Provide a descriptive name for your project, such as "hello-gemini."
- Environment: Choose the "Python Web App (Flask)" option from the dropdown menu.

Create Workspace: Once configured, click the "Create" button to start creating the workspace.
Await Setup Completion: IDX will set up the necessary environment for your Flask application. This process might take a few moments, and a progress bar will indicate the status.

Next Steps

With your workspace ready, you can proceed to develop your Flask application within the provided environment.

Looking through the Hello World application

Obtain a Gemini API Key

Before you begin, you'll need an API key to access the Gemini API.

Visit the Google Cloud Console.
Navigate to the 'API Keys' section.
Click 'Create API Key'.
Choose an existing Google Cloud project or create a new one.
Copy the generated API key. Remember to store your API key securely!
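One way to keep the key out of your source code is to read it from an environment variable at startup. The sketch below assumes a variable named GEMINI_API_KEY; that name and the load_api_key helper are illustrative choices, not something the template requires.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in env_var, or fail loudly if it is missing."""
    # GEMINI_API_KEY is an assumed name; use whichever variable you actually set.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before starting the app")
    return key
```

The returned value can then be passed to genai.configure(api_key=...) in place of a hard-coded string.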

Set up the Flask Application

Create a Python file (e.g., main.py) and import the necessary libraries:

import os
import json
import google.generativeai as genai
from flask import Flask, render_template, request, jsonify

# Replace 'YOUR_API_KEY' with your actual API key
API_KEY = 'YOUR_API_KEY'
genai.configure(api_key=API_KEY)
app = Flask(__name__)

# ... (Rest of the code will be added in the following steps)

Create the HTML Template

Create an HTML file (e.g., index.html) to serve as the front-end for your web application. This template will display images of baked goods, an input field for the prompt, and a results section.

<!DOCTYPE html>
<html>
<head>
    <title>Baking with the Gemini API</title>
</head>
<body>
    <h1>Baking with the Gemini API</h1>
    <div>
        <img src="images/baked-good-1.jpg" alt="Baked Good 1">
        <img src="images/baked-good-2.jpg" alt="Baked Good 2">
        <img src="images/baked-good-3.jpg" alt="Baked Good 3">
    </div>
    <div>
        <label for="prompt">Provide an example recipe for the baked goods in:</label>
        <input type="text" id="prompt" name="prompt">
        <button onclick="generateRecipe()">Go</button>
    </div>
    <div id="results">
        <h2>Results will appear here</h2>
    </div>

    <script>
        function generateRecipe() {
            // ... (JavaScript code to handle user input and send requests to the Flask app)
        }
    </script>
</body>
</html>

Define Flask Routes and Functions

In your main.py file, define the routes and functions to handle requests from the front-end and interact with the Gemini API.

```python
@app.route('/')
def index():
    return render_template('index.html')

@app.route('/generate', methods=['POST'])
def generate_recipe():
    # The front-end posts the chosen image URL and the user's prompt as form fields
    selected_image = request.form['selected_image']
    prompt = request.form['prompt']

    # ... (Code to load the selected image into image_data, e.g. as a PIL image)

    # generate_content accepts a list of parts (text and images);
    # stream=True yields the response in chunks as they are generated
    model = genai.GenerativeModel('gemini-1.5-pro')
    response = model.generate_content([prompt, image_data], stream=True)

    # Collect the streamed chunks into a single string for the JSON reply
    streamed_response = ''.join(chunk.text for chunk in response)

    return jsonify({'status': 'success', 'response': streamed_response})
```
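The route above replies with a single JSON object once the text is ready. An alternative is to forward each chunk to the browser as it arrives, for example as server-sent events. The sketch below shows only the framing step, with plain strings standing in for Gemini chunks; the sse_events helper and the [DONE] sentinel are assumptions for illustration, not part of the template.

```python
import json
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap each text chunk in a server-sent event frame."""
    for text in chunks:
        # JSON-encode the payload so newlines in the text cannot break the frame
        payload = json.dumps({"text": text})
        yield f"data: {payload}\n\n"
    # Hypothetical sentinel telling the client that the stream has finished
    yield "data: [DONE]\n\n"
```

In Flask, a generator like this can be handed to Response(..., mimetype='text/event-stream'), though the front-end would then need to read the stream rather than call response.json().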


### Handle User Input and Display Results

In your `index.html` file, add JavaScript code to handle user input, send requests to the Flask app, and display the streamed response from the Gemini API.


```javascript
// ... (Previous code from step 3)

function generateRecipe() {
    const selectedImage = ''; // Placeholder ... (Get the URL of the selected image)
    const prompt = document.getElementById('prompt').value;

    fetch('/generate', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/x-www-form-urlencoded'
        },
        // URLSearchParams escapes characters such as & and = in the values
        body: new URLSearchParams({ selected_image: selectedImage, prompt: prompt })
    })
    .then(response => response.json())
    .then(data => {
        if (data.status === 'success') {
            const resultsDiv = document.getElementById('results');
            resultsDiv.innerHTML = ''; // Clear previous results

            // ... (Code to display the streamed response in the resultsDiv)
        } else {
            // ... (Handle errors)
        }
    })
    .catch(error => {
        // ... (Handle errors)
    });
}
```
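Form-encoded bodies need their values escaped, otherwise a prompt containing & or = is split into the wrong fields before Flask ever sees it. The Python sketch below mimics what the server would parse in each case, using the standard library's urllib.parse; the sample prompt and image path are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Made-up sample values containing characters that are special in form bodies
prompt = "cookies & cream = best?"
image = "images/baked-good-1.jpg"

# Body built by plain string interpolation, with no escaping
raw_body = f"selected_image={image}&prompt={prompt}"
# Body built with proper escaping of each value
safe_body = urlencode({"selected_image": image, "prompt": prompt})

# parse_qs plays the role of the server decoding the form fields
raw_fields = parse_qs(raw_body)
safe_fields = parse_qs(safe_body)

print(raw_fields.get("prompt"))   # the prompt is cut off at the first '&'
print(safe_fields.get("prompt"))  # the prompt survives intact
```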

Run the Flask Application

Run your Flask app using the following command:

#!/bin/sh

python -m flask --app main run -p $PORT --debug
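The script relies on a PORT environment variable being set. If you ever start the app directly from Python instead, a small helper can read the same variable. This is a minimal sketch: the resolve_port name and the 8080 fallback are assumptions, not part of the template.

```python
import os


def resolve_port(default: int = 8080) -> int:
    """Return the port from the PORT environment variable, or a fallback."""
    raw = os.environ.get("PORT", "")
    # Fall back when PORT is unset or not a plain number; 8080 is an assumed default
    return int(raw) if raw.isdigit() else default
```

In main.py this could be used as app.run(port=resolve_port()) when launching with python main.py rather than the flask command.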

Now, you should be able to access your web application in your browser, select an image of a baked good, enter a prompt, and generate a baking recipe using the Gemini API. If you're running inside IDX, you can use the built-in web preview instead: press Cmd+Shift+P (on a Mac), enter 'Web preview', and the preview window will open.