Sebastián Rojas Ricaurte

Posted on Mar 14

Integrate Google Gemini AI into your Angular 17 app

#angular #gemini #ai #programming

A step-by-step guide: build a simple application to test Gemini Pro and Gemini Pro Vision via the official client.

Index

What is Google Gemini?
Google AI Studio API key
Create the Angular Application
Logic code
- Initialize model
- Generate text from text-only input (text)
- Generate text from text-and-images input (multimodal)
- Build multi-turn conversations (chat)
- Generate content using streaming (stream)
Template code

What is Google Gemini?

Gemini is a multimodal AI model that can understand and generate text, as well as other types of information like audio, images, videos, and code. According to Google, Gemini is its most capable AI model, and Gemini Ultra is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding).

Ultra 1.0 (preview): The most capable model for large-scale, highly complex text and image reasoning tasks, coming in early 2024. Used by Gemini Advanced (formerly Bard).
Pro 1.0 / 1.5 (available): The best-performing model, with features for a wide variety of text and image reasoning tasks.
Nano 1.0 (preview): The most efficient model, built for on-device experiences and offline use cases. It uses the device's own processing power at no cost.

In addition to the Gemini models, there are the Gemma open models.
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.

Google AI Studio API key

Go to aistudio.google.com and create an API key.

Create the Angular Application

Run ng new google-gemini-angular-demo --ssr=false.

Setting up

Once the project is created, add an environment variable to store the API key you just generated. Go to the new project folder and run ng g environments. Then add googleAiApiKey to both environment files.

In the src/environments/environment.ts file:

export const environment = { googleAiApiKey: '' };

In the src/environments/environment.development.ts file:

export const environment = {
  googleAiApiKey: 'YOUR_API_KEY', // paste the key from Google AI Studio; never commit a real key
};
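
Because the production environment file ships with an empty key, it can help to fail fast when the key is missing instead of waiting for a rejected API call. The following is a minimal sketch; requireApiKey is a hypothetical helper, not part of Angular or the Gemini SDK.

```typescript
// Hypothetical helper (an assumption for illustration): validates the key
// read from the environment file before it is handed to the SDK.
function requireApiKey(key: string | undefined | null): string {
  const trimmed = (key ?? '').trim();
  if (trimmed.length === 0) {
    throw new Error(
      'Google AI API key is missing; set googleAiApiKey in the environment file.'
    );
  }
  return trimmed;
}
```

It could be called as requireApiKey(environment.googleAiApiKey) wherever the client is created.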

Add the official client to access Gemini models. Run npm i @google/generative-ai.

Logic code

Initialize model

Add this to your app.component.ts file:

import {
  GoogleGenerativeAI,
  HarmBlockThreshold,
  HarmCategory,
  Part,
} from '@google/generative-ai';
import { environment } from '../environments/environment';
...
  /**
   * Creates the Google Gemini model from the SDK, configured with defaults.
   * @param model 'gemini-pro' | 'gemini-pro-vision'
   * @returns the configured generative model
   */
  private initializeModel(model: 'gemini-pro' | 'gemini-pro-vision') {
    const googleGenerativeAI = new GoogleGenerativeAI(
      environment.googleAiApiKey
    );
    // Safety settings and generation config are separate model parameters.
    const safetySettings = [
      {
        category: HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
      },
    ];
    const generationConfig = {
      temperature: 0.9,
      topP: 1,
      topK: 32,
      maxOutputTokens: 100, // limit output
    };
    return googleGenerativeAI.getGenerativeModel({
      model,
      safetySettings,
      generationConfig,
    });
  }

You can use the default safety settings (https://ai.google.dev/docs/safety_setting_gemini), which block content with a medium or high probability of being unsafe, or change them to suit your needs. In the example, the harassment threshold is tightened to block outputs with a low or higher probability of being harmful.
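
To make the threshold names concrete, here is a small sketch of how a block threshold relates to a harm probability rating. The types below are local and illustrative: they reuse the names from the Gemini API but are not imported from the SDK, and the function is a simplified model of the documented behaviour, not the service's actual implementation.

```typescript
// Local, illustrative types (assumptions): they mirror Gemini API names
// but are NOT the SDK's HarmBlockThreshold / HarmProbability enums.
type HarmProbability = 'NEGLIGIBLE' | 'LOW' | 'MEDIUM' | 'HIGH';
type BlockThreshold =
  | 'BLOCK_NONE'
  | 'BLOCK_ONLY_HIGH'
  | 'BLOCK_MEDIUM_AND_ABOVE'
  | 'BLOCK_LOW_AND_ABOVE';

const PROBABILITY_ORDER: HarmProbability[] = ['NEGLIGIBLE', 'LOW', 'MEDIUM', 'HIGH'];

// Lowest probability that each threshold blocks; null means nothing is blocked.
const MIN_BLOCKED: Record<BlockThreshold, HarmProbability | null> = {
  BLOCK_NONE: null,
  BLOCK_ONLY_HIGH: 'HIGH',
  BLOCK_MEDIUM_AND_ABOVE: 'MEDIUM',
  BLOCK_LOW_AND_ABOVE: 'LOW',
};

// Returns true when a response rated `probability` would be blocked.
function wouldBlock(probability: HarmProbability, threshold: BlockThreshold): boolean {
  const min = MIN_BLOCKED[threshold];
  if (min === null) return false;
  return PROBABILITY_ORDER.indexOf(probability) >= PROBABILITY_ORDER.indexOf(min);
}
```

With BLOCK_LOW_AND_ABOVE, as used in the example, only responses rated NEGLIGIBLE for harassment pass through.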

See https://ai.google.dev/models/gemini for the models that are currently available, along with their default settings. There is a limit of 60 requests per minute. Learn more about model parameters here.
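
If the demo buttons are clicked repeatedly, a client-side guard can help stay under that request limit. This is a rough sketch of a sliding-window limiter; the class name and its injectable clock are assumptions made for illustration, and the server still enforces the real quota.

```typescript
// Hypothetical client-side limiter; the 60-per-minute default comes from the text above.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(maxRequests = 60, windowMs = 60_000, now: () => number = () => Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, which makes the class easy to test
  }

  /** Returns true and records the call if a request is allowed right now. */
  tryAcquire(): boolean {
    const current = this.now();
    const cutoff = current - this.windowMs;
    // Drop timestamps that have left the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(current);
    return true;
  }
}
```

Each demo method could check tryAcquire() before calling the model and skip the request when it returns false.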

Now, let's add the four demo methods to the app.component.ts file. First, add some shared state and a helper to reset it:

...
  history: any;
  prompt?: string;
  multipartPrompt?: Part[];
  stringsToBeTyped: string[] = [];
...
  private clean() {
    this.history = undefined;
    this.prompt = undefined;
    this.multipartPrompt = undefined;
    this.stringsToBeTyped = [];
  }

Generate text from text-only input (text)

 /**
   * Demonstrates Gemini Pro with a text-only input.
   */
  async textOnlyDemo() {
    this.clean();

    this.prompt =
      "You are a local historian researching the Bennington Triangle disappearances. Write a news report for a national audience detailing your recent findings, including interviews with eyewitnesses (vary details for each response - sightings of strange lights, unusual sounds, personal connection to a missing person). Maintain a neutral tone, presenting the facts while acknowledging the case's lack of resolution.";
    const result = await this.initializeModel('gemini-pro').generateContent(
      this.prompt
    );
    const response = await result.response;
    this.stringsToBeTyped = [response.text()];
  }

Generate text from text-and-images input (multimodal)

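  // inject() must run in an injection context, so resolve the service in a
  // field initializer instead of inside the click handler.
  private fileConversionService = inject(FileConversionService);

  /**
   * Demonstrates how to use Gemini Pro Vision with text and images as input (using an image in src/assets for convenience).
   */
  async multimodalDemo() {
    this.clean();

    try {
      const imageBase64 = await this.fileConversionService.convertToBase64(
        'assets/cheesecake.jpg'
      );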

      // Check for successful conversion to Base64
      if (typeof imageBase64 !== 'string') {
        console.error('Image conversion to Base64 failed.');
        return;
      }

      this.multipartPrompt = [
        {
          inlineData: {
            mimeType: 'image/jpeg',
            data: imageBase64,
          },
        },
        {
          text: 'Provide a recipe.',
        },
      ];
      const result = await this.initializeModel(
        'gemini-pro-vision'
      ).generateContent(this.multipartPrompt);
      const response = await result.response;
      this.stringsToBeTyped = [response.text()];
    } catch (error) {
      console.error('Multimodal demo failed (file conversion or generation)', error);
    }
  }

Remember to put a cheesecake image at src/assets/cheesecake.jpg.

Image requirements for Gemini:

Supported MIME types: image/png, image/jpeg, image/webp, image/heic, and image/heif.
A maximum of 16 images per request.
A maximum of 4MB per request, including images and text.
Large images are scaled down to fit within 3072 by 3072 pixels while maintaining their original aspect ratio.
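
These limits can be checked on the client before a request is sent. The sketch below is only an approximation: validateImages and its types are hypothetical, it assumes the 4MB limit applies to the decoded image bytes plus the prompt text, and it estimates the text size from the character count.

```typescript
const SUPPORTED_MIME_TYPES = ['image/png', 'image/jpeg', 'image/webp', 'image/heic', 'image/heif'];
const MAX_IMAGES = 16;
const MAX_TOTAL_BYTES = 4 * 1024 * 1024;

// Hypothetical shape: `data` is the Base64 payload without the data URL prefix.
interface InlineImage {
  mimeType: string;
  data: string;
}

// Size in bytes of the binary data encoded by a Base64 string.
function base64ByteLength(data: string): number {
  const padding = data.endsWith('==') ? 2 : data.endsWith('=') ? 1 : 0;
  return Math.floor((data.length * 3) / 4) - padding;
}

// Returns a list of problems; an empty list means the input looks acceptable.
function validateImages(images: InlineImage[], promptText: string): string[] {
  const problems: string[] = [];
  if (images.length > MAX_IMAGES) {
    problems.push(`Too many images: ${images.length} (maximum ${MAX_IMAGES}).`);
  }
  for (const image of images) {
    if (!SUPPORTED_MIME_TYPES.includes(image.mimeType)) {
      problems.push(`Unsupported MIME type: ${image.mimeType}.`);
    }
  }
  // Assumption: text size is approximated by its character count.
  const totalBytes = images.reduce(
    (sum, image) => sum + base64ByteLength(image.data),
    promptText.length
  );
  if (totalBytes > MAX_TOTAL_BYTES) {
    problems.push(`Request too large: about ${totalBytes} bytes (maximum ${MAX_TOTAL_BYTES}).`);
  }
  return problems;
}
```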

To convert the input image into Base64, let's create a FileConversion service. Run ng g s FileConversion, then replace the contents of the generated file-conversion.service.ts with:

import { HttpClient } from '@angular/common/http';
import { Injectable } from '@angular/core';
import { firstValueFrom } from 'rxjs';

@Injectable({
  providedIn: 'root',
})
export class FileConversionService {
  constructor(private http: HttpClient) {}
  /** Downloads the file and resolves with its Base64 payload (no data URL prefix). */
  async convertToBase64(filePath: string): Promise<string> {
    const blob = await firstValueFrom(
      this.http.get(filePath, { responseType: 'blob' })
    );
    return new Promise<string>((resolve, reject) => {
      const reader = new FileReader();
      // onload fires only on success, so reader.result is a data URL string here.
      reader.onload = () => {
        const base64data = reader.result as string;
        resolve(base64data.substring(base64data.indexOf(',') + 1)); // Extract only the Base64 data
      };
      reader.onerror = (error) => {
        reject(error);
      };
      reader.readAsDataURL(blob);
    });
  }
}
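
The service discards the data URL prefix, so the demo hard-codes image/jpeg as the MIME type. As a possible variation, the prefix can be parsed instead of thrown away. parseDataUrl below is a hypothetical helper written for illustration, not part of the original service.

```typescript
// Hypothetical result shape, matching the fields of an inlineData part.
interface InlineData {
  mimeType: string;
  data: string;
}

// Splits a Base64 data URL such as "data:image/jpeg;base64,AAAA"
// into its MIME type and Base64 payload.
function parseDataUrl(dataUrl: string): InlineData {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error('Not a Base64 data URL');
  }
  return { mimeType: match[1], data: match[2] };
}
```

The returned object could be used directly as the inlineData value of a prompt part, so other image formats work without changing the demo code.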

Be sure to provide the HttpClient service in your app configuration file app.config.ts:

import { ApplicationConfig } from '@angular/core';
import { provideRouter } from '@angular/router';

import { routes } from './app.routes';
import { provideHttpClient } from '@angular/common/http';

export const appConfig: ApplicationConfig = {
  providers: [provideRouter(routes), provideHttpClient()],
};

Build multi-turn conversations (chat)

You can use the first user message in the history as a system prompt. Just remember to include a model response that acknowledges the instructions. For example:

User: Take on the character of a superhero and write in that style. Do not break character. Please respond if you understand these instructions.

Model: I understand.
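
The instruction and acknowledgement pair can be generated by a small helper. This is a sketch under assumptions: withSystemPrompt and ChatTurn are hypothetical names, and the turn shape mirrors the history objects used in this article (role plus a string parts field).

```typescript
// Hypothetical type mirroring the history entries used in this article.
interface ChatTurn {
  role: 'user' | 'model';
  parts: string;
}

// Prepends an instruction turn and the model's acknowledgement to a history.
function withSystemPrompt(
  instructions: string,
  rest: ChatTurn[] = [],
  acknowledgement = 'I understand.'
): ChatTurn[] {
  return [
    { role: 'user', parts: instructions },
    { role: 'model', parts: acknowledgement },
    ...rest,
  ];
}
```

The result could be assigned to this.history before calling startChat.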

/**
   * Demonstrates how to use Gemini Pro to build a multi-turn conversation
   */
  async chatDemo() {
    this.clean();
    this.history = [
      {
        role: 'user',
        parts: 'Hi, Gemini!',
      },
      {
        role: 'model',
        parts: "It's great to meet you. What do you want to know?",
      },
    ];

    const chat = this.initializeModel('gemini-pro').startChat({
      history: this.history,
      generationConfig: {
        maxOutputTokens: 100,
      },
    });

    this.prompt = 'What is the largest number with a name? Brief answer.';
    const result = await chat.sendMessage(this.prompt);
    const response = await result.response;
    this.stringsToBeTyped = [response.text()];
  }

Generate content using streaming (stream)

/**
   * Demonstrates how to use Gemini Pro to generate content using streaming
   */
  async streamDemo() {
    this.clean();
    this.prompt = 'Generate a poem.';

    const prompt = {
      contents: [
        {
          role: 'user',
          parts: [
            {
              text: this.prompt,
            },
          ],
        },
      ],
    };
    const streamingResp = await this.initializeModel(
      'gemini-pro'
    ).generateContentStream(prompt);

    for await (const item of streamingResp.stream) {
      console.log('stream chunk: ' + item.text());
      this.stringsToBeTyped.push('stream chunk:  ' + item.text());
    }
    console.log(
      'aggregated response: ' + (await streamingResp.response).text()
    );
  }
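
The loop above pushes each chunk as a separate string. If a single growing text is preferred, the chunks can be accumulated instead. The sketch below uses a fake async generator in place of streamingResp.stream so it runs without the SDK; collectStream, fakeStream, and TextChunk are hypothetical names introduced for illustration.

```typescript
// Minimal shape of a streamed chunk, as used in this article: an object with text().
interface TextChunk {
  text(): string;
}

// Consumes the stream, reports the text accumulated so far after each chunk,
// and resolves with the full text.
async function collectStream(
  stream: AsyncIterable<TextChunk>,
  onPartial: (partial: string) => void
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    full += chunk.text();
    onPartial(full);
  }
  return full;
}

// Stand-in for streamingResp.stream, for local experiments only.
async function* fakeStream(pieces: string[]): AsyncGenerator<TextChunk> {
  for (const piece of pieces) {
    yield { text: () => piece };
  }
}
```

In the component, onPartial could assign the partial text to a property bound in the template.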

Template code

Let's add some code to our main component template so we can try this in the browser.

Add the Typed.js wrapper for Angular so we can create the typing effect. Run npm i ngx-typed-js.

Add this to the app.component.html file (complete code here):

<main class="main">
  <div class="content">
...
      <h1>Google Gemini Angular Demo</h1>
      <h2>Please choose a demo:</h2>
      <div class="btn-group">
        <button (click)="textOnlyDemo()">Unique prompt</button>
        <button (click)="multimodalDemo()">Image prompt</button>
        <button (click)="chatDemo()">Prompt with chat history</button>
        <button (click)="streamDemo()">Streaming</button>
      </div>
      @if(prompt || multipartPrompt) { @if(history) {
      <h2>Chat history</h2>
      @for(item of history; track item) {
      <p>
        <b>{{ item.role }}</b
        >: @if (item.parts[0].length > 1) {
        <i>{{ item.parts.join(", ") }}</i> }@else {<i>{{ item.parts }}</i
        >}
      </p>
      } } @if (multipartPrompt) {
      <h2>Prompt</h2>
      @for(part of multipartPrompt; track part) { @if(part.inlineData) {
      <img
        [src]="'data:' + part.inlineData.mimeType + ';base64,' + part.inlineData.data"
        style="max-width: 300px; max-height: 200px"
      />
      }@else {
      <h3>{{ part.text }}</h3>
      } } }@else {
      <h2>Prompt: {{ prompt }}</h2>
      } @for (string of stringsToBeTyped; track string) {
      <ngx-typed-js [strings]="[string]">
        <pre class="typing"></pre>
      </ngx-typed-js>
      } @empty {
      <p>Waiting for response to start</p>
      <br />
      <div class="dot-flashing"></div>
      } }
    </div>
  </div>
</main>
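
The image preview needs a data URL built from the part's MIME type and Base64 payload. If you prefer to keep that string building out of the template, a small function can do it. toDataUrl is a hypothetical helper, shown here only as a sketch.

```typescript
// Hypothetical helper: builds a data URL that an <img> src can display.
function toDataUrl(mimeType: string, base64Data: string): string {
  return `data:${mimeType};base64,${base64Data}`;
}
```

Exposed as a component method, it could be bound as [src]="toDataUrl(part.inlineData.mimeType, part.inlineData.data)".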

Add this to the src/styles.scss file:

.typed-cursor {
    display: none !important;
}

And change the contents of src/app/app.component.scss to this.

Great! You are now able to use Gemini features. If you get lost along the way, refer to the complete code.

Top comments (2)

ASP • Mar 16

You did not mention the CORS issue. Are you really able to send requests to the Google AI Gemini API?

Sebastián Rojas Ricaurte • Mar 19

What's your issue? The tutorial was developed and tested locally as described.
