Like many of us, I have been fascinated by the capabilities of tools like ChatGPT, Google Gemini, and Claude. As someone who loves coding and trying new tools, I wanted to explore what these models can offer via their APIs.
I had previously experimented with OpenAI, so I wanted to try something different. I opted for Google Gemini, mainly because of its generous free tier and low costs.
My aim was to create a simple proof-of-concept (POC) page for checking grammatical errors: you paste or type text into a text box and receive feedback from the AI. I quickly set up the app with the following tech stack:
- Nuxt 3 (Vue)
- TypeScript
- SCSS
I then started by creating an API endpoint, using Nuxt’s server/api folder, to manage all LLM interactions server-side.
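In Nuxt 3, a server route is just a file under server/api. Here’s a minimal sketch of what such an endpoint could look like; the check.post.ts file name and the checkGrammar helper are placeholder names for illustration, not my exact code:

// server/api/check.post.ts (hypothetical file name)
// defineEventHandler and readBody are auto-imported by Nuxt in server routes
export default defineEventHandler(async (event) => {
  // Read the text submitted from the frontend form
  const { text } = await readBody<{ text: string }>(event);

  // checkGrammar wraps the Gemini calls shown below (placeholder name)
  return await checkGrammar(text);
});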
🔌 11 lines of code to connect to the model
Surprisingly, I only needed these 11 lines of code to get started, using the JavaScript SDK provided by Google: https://www.npmjs.com/package/@google/generative-ai
import { GoogleGenerativeAI, SchemaType } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({
  model: 'gemini-1.5-flash',
  generationConfig: {
    responseMimeType: 'application/json',
    responseSchema: schema,
    maxOutputTokens: 800,
    temperature: 0.1,
  },
});
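The API key is read from an environment variable. Locally, that can simply be a .env file at the project root, which Nuxt loads automatically in development (the variable name just has to match the code above):

# .env (keep this out of version control)
GEMINI_API_KEY=your-api-key-here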
After defining the model, I could send it instructions along with the text provided by the user in the frontend app.
Below is a simplified version of the function that initiates a chat with the model.
const chat = model.startChat({
  history: [
    {
      role: 'user',
      parts: [
        {
          text: 'Instructions to check grammar correctness go here',
        },
      ],
    },
    ...textToCheck,
  ],
});

const result = await chat.sendMessage(text);
const structuredResponse = result.response;
return structuredResponse;
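Because the model is configured with responseMimeType: 'application/json', the response body is a JSON string that matches the schema, so it can be parsed straight into a typed object. A minimal sketch (the GrammarFeedback interface is my own naming, not part of the SDK):

// Shape of the JSON defined by the response schema (illustrative name)
interface GrammarFeedback {
  feedback: string;
  suggested_response?: string | null;
}

// response.text() returns the raw JSON string generated by the model
const parsed: GrammarFeedback = JSON.parse(structuredResponse.text());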
🖌️ Frontend
The frontend consists of a simple form with a text box and a 'Submit' button, where the user can type or paste text.
Submitting sends a request to the API endpoint created above, which processes the text with Gemini and returns the results to the frontend.
On the dashboard’s right panel, the response shows a summary and, when available, suggested changes.
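For reference, here’s a minimal sketch of that form as a Vue component; the /api/check path and the component details are assumptions rather than my exact code:

<script setup lang="ts">
// ref and $fetch are auto-imported by Nuxt 3
interface GrammarFeedback {
  feedback: string;
  suggested_response?: string | null;
}

const text = ref('');
const result = ref<GrammarFeedback | null>(null);

async function submit() {
  // POST the user's text to the server route; $fetch parses the JSON reply
  result.value = await $fetch<GrammarFeedback>('/api/check', {
    method: 'POST',
    body: { text: text.value },
  });
}
</script>

<template>
  <form @submit.prevent="submit">
    <textarea v-model="text" />
    <button type="submit">Submit</button>
  </form>
</template>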
⚒️ Shaping the response
One standout feature of Google Gemini's API is its support for custom response schemas.
Using this feature, I could configure the model to indicate whether a sentence is grammatically correct and, if not, to provide a corrected version.
Here’s the schema I used:
const schema = {
  type: 'object' as SchemaType.OBJECT,
  properties: {
    feedback: { type: 'string' as SchemaType.STRING },
    suggested_response: {
      type: 'string' as SchemaType.STRING,
      description: '',
      nullable: true,
    },
  },
  required: ['feedback'],
};
For example, when evaluating the sentence “Their going to be here soon.”, the model returns:
{
  "feedback": "The sentence has a grammatical error. \"Their\" should be \"They're\" (They are). Also, there should be a space after \"going\".",
  "suggested_response": "They're going to be here soon."
}
This structured output makes it easy to process and display results in the frontend.
Interestingly, you can also describe each property, and what you expect from it in the response, using the description key within the schema object.
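For example, the empty description in the schema above could be filled in with something like this (the wording is just an illustration):

suggested_response: {
  type: 'string' as SchemaType.STRING,
  // Tells the model what this field should contain
  description: 'A corrected version of the text, only when errors are found',
  nullable: true,
},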
You can find out more in the official docs: https://ai.google.dev/gemini-api/docs/structured-output?lang=node
Final thoughts
It certainly feels like writing this short post to share my journey took longer than the LLM integration itself.
Although my knowledge of AI is limited, I see immense potential in this technology. With much of the complexity abstracted away, we can focus on building products and exploring endless opportunities for innovation. Whether it’s a tool that simplifies our day-to-day tasks or a larger, more ambitious project, we can embrace the possibilities this technology offers.
If you think this grammar checker could be useful to you, feel free to try it out here https://grammaco.alessioch.com/
Cover image from Novoto Studio