
Evan Lin

Posted on • Originally published at evanlin.com on

Google Gemma2/PaliGemma: Notes on Learning and Applications

title: [Google Gemma2/PaliGemma] Gemma2/PaliGemma Study Notes, Application Scope
published: false
date: 2024-07-11 00:00:00 UTC
tags: 
canonical_url: https://www.evanlin.com/google-gemma2_study_note/
---

## Google AI Dev - Gemma2 && PaliGemma

![img](https://www.evanlin.com/images/2022/mmexport1720575701272.jpg)

This image briefly explains the two main products of the Gemma family:

- Gemma 2: The second generation of Gemma
- PaliGemma: The first-generation VLM (Vision-Language Model)

### PaliGemma Related Resources:

- [Demo code for PaliGemma](https://huggingface.co/spaces/big-vision/paligemma-hf) (on Hugging Face)

![image-20240711223813438](https://www.evanlin.com/images/2022/image-20240711223813438.png)

The image above shows the benchmarks for the PaliGemma model.

Across the listed test datasets and methods, the model achieves good accuracy.

### Directly Deploying PaliGemma on GCP

[https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/paligemma](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/paligemma)

![](https://www.evanlin.com/images/2022/image-20240711224714480.png)

Gemma 2 can be deployed from the Vertex AI Model Garden in the same way: [https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemma2](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemma2)

![image-20240712105925552](https://www.evanlin.com/images/2022/image-20240712105925552.png)

# Application Scope for Gemma

Running Gemma as a first pass can significantly reduce the number of tokens sent to a larger LLM. The following are several directions to consider:

## Detection of Personal Privacy

Gemma can screen content that may contain personal information more effectively.

### Previous Approach:

Detecting and removing personal information has always been difficult. It usually requires many regular expressions, and even then some cases slip through. An LLM is well suited to this kind of detection. However, information security regulations generally do not allow users' personal information to be sent directly to a third party, so a hosted LLM is not an option. Gemma, running locally, can handle this step instead.
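To make the limits of the regex approach concrete, here is a minimal Python sketch. The two patterns are assumptions for illustration only (a Taiwan-style ID of one letter plus nine digits, and a bare 10 to 16 digit run treated as an account number); they are not taken from the original tests, and real formats vary.

```python
import re

# Hypothetical patterns, for illustration only; real formats vary by country and bank.
PATTERNS = {
    "national_id": re.compile(r"\b[A-Z][12]\d{8}\b"),
    "bank_account": re.compile(r"\b\d{10,16}\b"),
}


def find_pii(text: str) -> dict[str, list[str]]:
    """Return the matches of every pattern that fired on the text."""
    hits = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits


# Structured identifiers are caught...
print(find_pii("My ID is A123456789 and my account is 001234567890"))
# ...but a free-form address passes through undetected.
print(find_pii("I live at No. 7, Section 5, Xinyi Road, Taipei"))
```

The second call returns an empty result even though the text contains an address, which is the kind of omission that makes a model-based check attractive.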

### How to handle this with Gemma (on-device LLM)


Check if the following content contains personal information, address, ID number, bank account number, and reply with Yes or No

I want to find a house in Taipei
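The prompt above could be wired into a gate that runs before anything leaves the device. The sketch below is one possible shape, not the author's implementation: `generate` is a hypothetical callable standing in for whatever local Gemma runtime you use (prompt text in, reply text out), and `fake_gemma` is a stub so the example runs without a model.

```python
from typing import Callable, Optional

PII_PROMPT = (
    "Check if the following content contains personal information, address, "
    "ID number, bank account number, and reply with Yes or No\n\n{content}"
)


def contains_pii(content: str, generate: Callable[[str], str]) -> bool:
    """Ask a local model whether the content holds personal data.

    `generate` is a hypothetical stand-in for an on-device Gemma call.
    """
    reply = generate(PII_PROMPT.format(content=content))
    # Models often add punctuation or an explanation, so inspect only the first word.
    words = reply.strip().lower().split()
    first = words[0].strip(".,!:") if words else ""
    # Fail closed: anything other than a clear "no" is treated as containing PII.
    return first != "no"


def forward_if_safe(
    content: str,
    generate: Callable[[str], str],
    send_to_cloud: Callable[[str], str],
) -> Optional[str]:
    """Forward content to a third-party service only when the local check passes."""
    if contains_pii(content, generate):
        return None
    return send_to_cloud(content)


def fake_gemma(prompt: str) -> str:
    # Stub so the sketch runs without a model; replace with a real local call.
    return "Yes." if "A123456789" in prompt else "No"


print(contains_pii("I want to find a house in Taipei", fake_gemma))
print(forward_if_safe("My ID is A123456789", fake_gemma, lambda text: "sent"))
```

Treating an unclear reply as "contains personal information" is a deliberate choice here: a false positive only blocks one message, while a false negative leaks data to a third party.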


### How to handle this with PaliGemma


Check if the image content contains personal information, address, identity number, or bank account number, and reply with Yes or No.


### Practical Testing - Gemma2 / PaliGemma

#### Text Testing Gemma2-9B

![iTerm2 2024-07-12 11.49.58](https://www.evanlin.com/images/2022/iTerm2%202024-07-12%2011.49.58.png)

In this test, the people mentioned were changed to "someone" and "spouse", and the ID numbers and bank account numbers were effectively removed.

#### Image Testing PaliGemma

![Google Chrome 2024-07-12 11.10.02](https://www.evanlin.com/images/2022/Google%20Chrome%202024-07-12%2011.10.02.png)

![image-20240712111911951](https://www.evanlin.com/images/2022/image-20240712111911951.png)

## Determining if "Large" LLM Intervention is Needed

The biggest fear when putting a chatbot into a group chat is a token explosion, because every message has to be checked to decide whether the LLM should step in. Gemma can handle this check effectively.

### Previous Approach

I've seen some people handle it like this: `Set the prefix "@" "!" to call LLM`, so the LLM is only called when a message starts with that prefix.

These methods are not wrong, but they often make the text in the chat room difficult to understand.

### How to handle this with Gemma (on-device LLM)

You can use a simple prompt to make this decision:


Check if the following text requires the assistance of a customer service representative. Answer yes/no

I ate danbing this morning
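A routing step built on the prompt above might look like the following sketch. Both `small_model` and `large_model` are hypothetical callables (prompt in, text out) standing in for a local Gemma call and a larger hosted LLM; the stubs exist only so the example runs.

```python
from typing import Callable, Optional

ROUTER_PROMPT = (
    "Check if the following text requires the assistance of a customer "
    "service representative. Answer yes/no\n\n{message}"
)


def needs_large_llm(message: str, small_model: Callable[[str], str]) -> bool:
    """Use the small model as a yes/no gate for each chat message."""
    reply = small_model(ROUTER_PROMPT.format(message=message))
    # Stay silent unless the reply clearly starts with "yes".
    return reply.strip().lower().startswith("yes")


def handle_group_message(
    message: str,
    small_model: Callable[[str], str],
    large_model: Callable[[str], str],
) -> Optional[str]:
    """Call the large model only when the small model asks for it."""
    if not needs_large_llm(message, small_model):
        return None  # ordinary chatter: the bot does not respond
    return large_model(message)


def stub_small(prompt: str) -> str:
    # Stub gate: pretend only messages mentioning "refund" need help.
    return "yes" if "refund" in prompt else "no"


large_calls = []


def stub_large(message: str) -> str:
    # Stub for the large model; records each call so the savings are visible.
    large_calls.append(message)
    return "How can I help with your refund?"


for text in ["I ate danbing this morning", "How do I get a refund?"]:
    print(text, "->", handle_group_message(text, stub_small, stub_large))
print("large model calls:", len(large_calls))
```

With this shape, casual messages cost only a local inference, and the large model's tokens are spent only on messages that actually need a response.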


### Practical Testing - Gemma

![image-20240712122131249](https://www.evanlin.com/images/2022/image-20240712122131249.png)

## Future Outlook

The so-called "small models" or "open models" are not only useful for on-device applications. From a cloud services perspective, the same on-device LLM models can also be used for related applications. There should be more such applications in the future.