DEV Community

ptvty


Next-Level Web Applications with On-Device Generative AI: A Look at Google Chrome's Built-In Gemini Nano LLM

Web development may be about to change significantly: Google Chrome Canary's latest experimental feature, the window.ai API, lets websites use on-device generative AI. With Google’s Gemini Nano model built into the browser, websites can offer smarter, more personalized experiences directly on the user's device. Let's dive into what this means and how you can use it to supercharge your web applications.

Meet Gemini Nano

Gemini Nano is a compact yet powerful AI model from Google. It is the same model used in some Google Pixel phones for offline AI features. Its small size and impressive capabilities make it perfect for on-device applications, ensuring users benefit from advanced AI without needing to be online.

What is the window.ai API?

Google Chrome Canary consistently introduces exciting new features for developers, and the window.ai API is no exception. This API enables your website's JavaScript code to interact directly with Gemini Nano, a model that operates on the user's device. This means all AI tasks are performed locally using the computer’s GPU, ensuring no data is sent over the internet. This approach significantly enhances privacy and allows for offline functionality.

How to Use the window.ai API

Getting started with the window.ai API is straightforward. Here’s how you can integrate it into your website:

  1. Install Google Chrome Canary: Ensure you have the latest version of Google Chrome Canary installed.

  2. Enable the Experimental Feature: Follow the instructions to enable the window.ai experimental feature.

  3. Check for API Availability: In your code, verify that window.ai is defined to ensure the Local AI API is available in the user’s browser.

  4. Start Sending Prompts: Once you know the API is available, you can begin sending prompts to Gemini Nano and receiving responses using the ai.createTextSession() and session.prompt() APIs.

  5. Utilize the AI's Response: Use the AI's response in various ways, such as displaying it to the user, updating your UI, or for more complex processing.

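// Feature-detect before calling the API: window.ai is only defined
// when the experimental feature is available and enabled.
if (window.ai) {
    logOceanPoem();
} else {
    console.log("The window.ai API is not available on this browser.");
}

// Creates a text session, sends a single prompt, and logs the reply.
async function logOceanPoem() {
    try {
        // temperature 0 and topK 1 keep the output as predictable as possible.
        const session = await ai.createTextSession({
            temperature: 0,
            topK: 1
        });
        const poem = await session.prompt("Write a poem about the ocean.");
        console.log(poem);
    } catch (error) {
        console.error("Error generating text:", error);
    }
}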

// => ' In the realm of vast and boundless blue,
//     Where secrets hide and mysteries brew,
//     There lies a realm of ...'

Some Exciting Use Cases

The possibilities with on-device generative AI are endless. Here are a few exciting ways you can use it:

Privacy-Focused Apps

Use local AI in sensitive domains such as health and finance, or wherever strict data protection policies apply, so that user data never has to leave the device.

Offline-First PWAs

Enhance chatbots and virtual assistants with more natural and responsive interactions, even when offline.

Enhanced User Inputs

Create smart user interfaces by integrating AI into the existing form components. For example, a select box can suggest the most relevant options if a user types an invalid option, rather than just displaying "Not found."

// The user types "android" into a searchable select box to fill in their occupation, not knowing the granularity of the available options

await session.prompt(`which are the 3 closest phrases to "android" from the following phrases:
1. DevOps Engineer
2. SEO specialist
3. Dentist
4. Cashier
5. Mobile Developer
6. Web Developer
7. Carpenter
8. Desktop Developer`);

// => ' The three closest phrases to "Android" from the given phrases are: 1. Mobile Developer 2. Web Developer'
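Note that the sample reply above renumbers its answers, so the numbers in a reply cannot be trusted to point back at the original list. A minimal sketch of one way to handle this is to match the reply against the option labels instead. The helper names buildOptionsPrompt, matchOptions, and suggestOptions are hypothetical and not part of the window.ai API; the session object is assumed to come from ai.createTextSession().

```javascript
// Hypothetical helpers for illustration; not part of the window.ai API.

// Build the prompt with the question first and one numbered option per line.
function buildOptionsPrompt(typed, options) {
    const list = options.map((option, idx) => `${idx + 1}. ${option}`).join("\n");
    return `which are the 3 closest phrases to "${typed}" from the following phrases:\n${list}`;
}

// Return the known options mentioned in the reply, in the order the reply
// mentions them. Matching is a case-insensitive substring search, so it
// ignores whatever numbering the model chose to use.
function matchOptions(response, options) {
    const text = response.toLowerCase();
    return options
        .map((option) => ({ option, at: text.indexOf(option.toLowerCase()) }))
        .filter((hit) => hit.at !== -1)
        .sort((a, b) => a.at - b.at)
        .map((hit) => hit.option);
}

// Assumes `session` was created with ai.createTextSession().
async function suggestOptions(session, typed, options) {
    const response = await session.prompt(buildOptionsPrompt(typed, options));
    return matchOptions(response, options);
}
```

Because this is a plain substring match, an option whose label is contained in another label (for example "Developer" and "Web Developer") would also match, so a real select box may need a stricter comparison.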

Assisted Writing

Enhance text areas with AI writing assistance, allowing text summarization and rephrasing without sending user data to external servers.

await session.prompt(`Rephrase this sentence: 
Facebook suggested your profile, I looked at your profile and I found your story inspiring.`);

// => ' Facebook displayed your profile to me, and upon reviewing it, I found your story to be incredibly inspiring.'

Accessibility and Improved Website Navigation

Create user interfaces that people with disabilities can use more comfortably. For example, a prompt box can intelligently recommend the most relevant actions, pages, sub-menus, and settings based on the user's intent. LLMs can easily understand the different words that users might use to express the same idea.

const choices = ['Like post', 'Save post', 'Share post', 'Block user', 'Follow user'];

await session.prompt(`Which choice is the most relevant to "I do not want to see any posts from this user again!":
${choices.map((choice, idx) => `${idx + 1}. ${choice}`).join('\n')}
`);

// => ' The most relevant choice to the command "I do not want to see any posts from this user again!" is **"Block user"**.'
await session.prompt(`Which choice is the most relevant to "I want to see future posts from this user!":
${choices.map((choice, idx) => `${idx + 1}. ${choice}`).join('\n')}
`);

// => ' The most relevant choice is **5: Follow user.**'
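To act on the user's intent rather than just display the model's answer, the reply has to be mapped back to something the app can run. The following is a rough sketch of that idea, assuming each choice label is paired with a handler function. The runIntent function and the handlers object are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch: runIntent and the handlers map are illustrative names.
// `session` is assumed to come from ai.createTextSession().
async function runIntent(session, intent, handlers) {
    const labels = Object.keys(handlers);
    const list = labels.map((label, idx) => `${idx + 1}. ${label}`).join("\n");
    const response = await session.prompt(
        `Which choice is the most relevant to "${intent}":\n${list}`
    );

    // Find the first known label that the reply mentions (case-insensitive).
    const text = response.toLowerCase();
    const label = labels.find((candidate) => text.includes(candidate.toLowerCase()));

    // Return null instead of guessing when the reply names no known choice.
    return label ? handlers[label]() : null;
}
```

For destructive actions such as blocking a user, it is probably safer to show the matched action as a suggestion and let the user confirm it than to run the handler directly.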

Improved Text Search

// When a "Find in page" fails to find an exact match, try to search for similar words!

await session.prompt(`Generate 10 words similar to "battery runtime", in the context of "smart phones"`);

// => '1. Battery life 2. Battery endurance 3. Battery longevity 4. Battery capacity 5. Battery duration 6. Battery power 7. Battery energy 8. Battery performance 9. Battery efficiency 10. Battery usage'
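The reply comes back as one numbered string, so it needs to be split into separate terms before it can drive a search. Here is a minimal sketch, assuming the reply follows the "1. term 2. term" shape shown above. parseNumberedList and findSimilarTerms are hypothetical helper names, and pageText stands for whatever text the page wants to search.

```javascript
// Hypothetical helpers for illustration.

// Split a reply such as "1. Battery life 2. Battery endurance" into its items.
function parseNumberedList(response) {
    return response
        .split(/(?:^|\s)\d+\.\s+/)
        .map((item) => item.trim())
        .filter((item) => item.length > 0);
}

// Return the suggested terms that actually occur in the page text
// (case-insensitive), in the order the model listed them.
function findSimilarTerms(pageText, terms) {
    const haystack = pageText.toLowerCase();
    return terms.filter((term) => haystack.includes(term.toLowerCase()));
}
```

A "Find in page" fallback could then highlight the first term that findSimilarTerms returns, or list all of them as "Did you mean" suggestions.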

Chat With Your Data

Although Gemini Nano is not the best model for factual reasoning, it can handle basic questions. For example, it can let users ask questions about the data in a table.

await session.prompt(`Who is the tallest person in the following list:
1. Alice, Weight: 52 kg, Height: 169 cm.
2. Bob, Weight: 71 kg, Height: 174 cm.
3. Eve, Weight: 66 kg, Height: 172 cm.
`);

// => 'The tallest person in the list is Bob, who weighs 71 kg and is 174 cm tall.'
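In a real page the rows would come from your data rather than a hand-written string. Here is a small sketch that builds the same prompt from an array of objects and, since the model is not the strongest at factual reasoning, cross-checks the reply against a value computed in plain JavaScript. The field names (name, weightKg, heightCm) and both helper names are assumptions made for this example.

```javascript
// Hypothetical helpers; the row shape { name, weightKg, heightCm } is an
// assumption made for this example.

// Put the question first, then one record per line.
function buildTablePrompt(question, rows) {
    const lines = rows.map(
        (row, idx) => `${idx + 1}. ${row.name}, Weight: ${row.weightKg} kg, Height: ${row.heightCm} cm.`
    );
    return `${question}\n${lines.join("\n")}\n`;
}

// Compute the tallest person directly and check that the reply names them.
function replyNamesTallest(response, rows) {
    const tallest = rows.reduce((best, row) => (row.heightCm > best.heightCm ? row : best));
    return response.includes(tallest.name);
}
```

When the check fails, the app can fall back to showing the computed answer instead of the model's reply.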

Tips for Prompt Engineering

To get the best results from the AI, follow these prompt engineering tips:

  • Play with Parameters: For factual questions, set temperature to 0 and topK to 1 for more predictable and accurate answers.
  • Use Line Breaks: Put each item on its own line, and keep related data on the same line to maintain context.
  • Use Parentheses: Add elaborations and clarifications in parentheses to vague words and phrases.
  • Main Question First: Put the main question at the beginning of the prompt, followed by related data.
  • Response Parsing: The model usually uses Markdown to format its response. Use this to your benefit. For example, if you're looking for a few keywords but the response is long, just take the bold words (inside double asterisks).
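The response parsing tip can be sketched in a few lines. This is only an illustration, assuming the model wraps its key phrases in double asterisks as in the sample replies earlier in this post. extractBold is a hypothetical helper name.

```javascript
// Hypothetical helper: collect every phrase wrapped in double asterisks.
function extractBold(markdown) {
    return Array.from(markdown.matchAll(/\*\*(.+?)\*\*/g), (match) => match[1]);
}
```

For the reply ' The most relevant choice is **5: Follow user.**', this returns an array with the single string '5: Follow user.'. A reply with no bold text yields an empty array, so the caller should be ready to fall back to the full response.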

Why On-Device AI is Awesome

  • Privacy and Security: All data stays on the user's device, enhancing privacy and reducing the risk of data breaches.
  • Offline Functionality: Apps work seamlessly without an internet connection, providing uninterrupted service.
  • Better Performance: Using the device's GPU can lead to faster response times and a more responsive application.
  • Cost Savings: Less reliance on cloud-based services can lower your operational costs.

Things to Keep in Mind

While the window.ai API is fantastic, there are a few things to consider:

  • Device Compatibility: Currently, this feature is exclusive to Google Chrome Canary. Provide a fallback for other browsers and for devices that may not have a GPU available.
  • Resource Use: Running AI models can be heavy on resources, especially on lower-end hardware.
  • Model Limits: On-device models might not be as powerful as their cloud-based counterparts, so balance functionality accordingly.
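One way to deal with these limits is to treat the on-device model as an optional enhancement. The sketch below tries the local model first and hands the prompt to a fallback when window.ai is missing or the session fails. generateText and the fallback callback are hypothetical names; the fallback might call your own server or simply return a default message.

```javascript
// Hypothetical sketch: generateText and `fallback` are illustrative names.
// `scope` defaults to the global object so that `scope.ai` is window.ai in a
// browser; passing it in explicitly makes the function easy to test.
async function generateText(prompt, fallback, scope = globalThis) {
    if (scope.ai && typeof scope.ai.createTextSession === "function") {
        try {
            const session = await scope.ai.createTextSession({
                temperature: 0,
                topK: 1
            });
            return await session.prompt(prompt);
        } catch (error) {
            // The local model failed (for example on unsupported hardware),
            // so continue to the fallback below.
        }
    }
    return fallback(prompt);
}
```

A call might look like generateText("Rephrase this sentence: ...", (prompt) => "AI suggestions are unavailable."), which keeps the page usable even where the experimental API does not exist.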

Conclusion

The window.ai API in Google Chrome Canary brings on-device generative AI to web development. By running the model locally, it can enhance privacy, performance, and user experience. Because it is still an experimental feature, its potential for wider adoption remains uncertain. Nonetheless, exploring its capabilities now can pave the way for developing smarter, more responsive, and more secure web applications in the future.
