DEV Community: Ashwin Mehta

Best use of Gemini in everyday life.

Ashwin Mehta — Sun, 31 May 2026 14:19:30 +0000

Google's Agentic Leap: How Gemini Turned Workspace Into Your Autonomous Executive Assistant

#ai #googlecloud #gemini #agentaichallenge

Google's Agentic Leap: How Gemini Turned Workspace Into Your Autonomous Executive Assistant

Ashwin Mehta — Sun, 31 May 2026 14:13:55 +0000

Imagine opening your laptop, typing a single sentence, and watching your AI assistant pull up a photo from last summer, draft a follow-up email to your professor based on that photo's context, and organize the relevant syllabus files from your Drive—all in seconds, without you clicking through a single tab.

This isn't a future roadmap. It is the Agentic Era of personal computing, and Google just quietly moved everyone into it.

By deeply embedding Gemini into Gmail, Google Drive, and Google Photos, Google has evolved from a standard search-and-retrieval ecosystem into a proactive, connected network of AI agents. Here is a breakdown of how this shift changes everything, and exactly how you can use it right now.

From "Search Box" to "Action Agent"
For decades, using Google tools meant doing the heavy lifting yourself. If you needed to find a specific document, you opened Drive, guessed the keywords, found the file, copied the text, opened Gmail, and pasted it in.

In the agentic era, Gemini handles the coordination. Because it has secure, cross-app access to your digital footprint, it acts as a central brain that can read context, find information across fragmented silos, and execute multi-step tasks on your behalf.

1. Google Photos: Natural Language Visual Search
Instead of scrolling endlessly through thousands of images to find a receipt, a certificate, or a specific memory, Gemini treats your photo gallery like an indexable database. You can ask it to isolate images based on highly specific, contextual descriptions, bypassing traditional metadata tags entirely.

2. Gmail: The Auto-Drafting Inbox
Gemini doesn't just suggest quick replies anymore. It acts as an email concierge. It can analyze massive email threads, synthesize the core action items, and draft complex, formal responses or follow-ups that match the required tone—saving you the friction of starting from a blank page.

3. Google Drive: Instant Knowledge Synthesis
Hunting down PDFs, sheets, or slide decks is a massive time sink. Gemini can parse through your entire cloud storage instantly. You can ask it to compare data across two different documents, summarize a massive project proposal, or pull out specific system architectures from a presentation deck without ever opening the files.

How to Use Gemini’s Agentic Superpowers
To get the most out of this integrated ecosystem, you need to change how you talk to the AI. Instead of asking generic questions, give it action-oriented, cross-app commands.

Here are three powerful ways to use it today:

1. Cross-App Summarization & Outreach
The Prompt: "Look through my Google Drive for the latest project architecture document on 'AuraRAG'. Summarize the key methodology steps, and draft a formal email to my HOD updating them on the progress."

What happens: Gemini instantly queries your Drive, reads the technical details, condenses the data, and opens a perfectly formatted draft in Gmail ready for your review.

2. Visual Retrieval & Contextual Processing
The Prompt: "Find my latest photo of the presentation whiteboard from campus on Google Photos, extract the text from the System Architecture section, and save it as a bulleted list in a new document."

What happens: It bridges the gap between your visual media and text processing, pulling the exact image you need and converting raw pixels into actionable text data.

3. Deep In-Inbox Research
The Prompt: "Search my Gmail for all confirmation emails regarding 'The Arcade' or 'Code Vipassana' programs from the last two months. Create a neat table summarizing my points earned and leaderboard positions."

What happens: Instead of opening multiple emails and manually copying numbers, Gemini crawls the specific sub-set of emails, parses the data points, and builds a clean Markdown table right in your chat interface.

Welcome to the Agentic Era
We are moving away from the era of "software as a tool" and entering the era of "software as a collaborator." Google's decision to weave Gemini directly into the fabric of Workspace means your apps no longer live in isolation. They talk to each other, understand your context, and execute workflows that used to take ten minutes of tedious clicking.

Meme for this week

Ashwin Mehta — Thu, 15 Jan 2026 15:38:00 +0000

Stop Chatting, Start Building: A Developer’s Guide to Google AI Studio

Ashwin Mehta — Wed, 07 Jan 2026 19:42:20 +0000

Introduction
We’ve all been there. You’re building a feature, you open ChatGPT or Claude, you paste in your requirements, you get some code, and then you copy-paste it back into your IDE. It works, but it’s manual. It’s brittle. And it’s hard to automate. If you are a developer, you need to stop using consumer chatbots for your workflow and start using Google AI Studio. It is arguably the most underrated tool in the AI stack right now: effectively an IDE for prompt engineering that hands you API-ready code on a silver platter.
Here is how to go from a vague idea to a running Python script in less than 5 minutes.

1 Why Google AI Studio?
Before we dive in, why switch?

It’s Fast: The “Flash” models (Gemini 1.5 and the new 2.5 Flash) are incredibly fast and cheap.
Huge Context: You can paste entire codebases or hour-long videos into the prompt window (1M+ tokens).
The “Get Code” Button: This is the killer feature. One click converts your playground session into Python, JavaScript, or cURL.

2 Step 1: The Setup (No Credit Card Required)
Go to AI Studio. You can sign in with your standard Google account.
You’ll see an interface that looks like a chatbot, but with more knobs and dials.
• Left Panel: Your history and prompt library.
• Middle: The prompt interface (Chat, Freeform, or Structured).
• Right Panel: Model settings (Temperature, Safety settings).
Pro Tip: Select Gemini 2.5 Flash (or the latest Flash model available). It is the perfect balance of intelligence and speed for most dev tasks.

3 Step 2: Structure Your Prompt with “System Instructions”
In a standard chat app, you have to constantly remind the bot: “You are a senior Python engineer, don’t give me explanations, just code.”
In AI Studio, you set this once in the “System Instructions” box at the top left.
Example System Instruction:
“You are a rigid data extraction assistant. You only output valid JSON. You never explain your work. If data is missing, use null.”
Now, every message you send will adhere to these rules automatically.

4 Step 3: The “Get Code” Workflow
Let’s build a simple tool: A Jargon Buster that takes complex tech paragraphs and simplifies them for a non-technical manager.

Set your System Instruction: “You are a technical translator. Rewrite the input text to be understood by a non-technical PM.”
Test it: Type “The K8s pod crashlooped because the OOMKiller terminated the container.” → Result: “The server kept restarting because it ran out of memory.”
Export it: Look for the “Get Code” button (usually top right, near the “Run” button). Click it, and select Python. You will get something like this:

import os
from google import genai

# Make sure to set your GEMINI_API_KEY environment variable
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

response = client.models.generate_content(
    model="gemini-2.5-flash",
    config={
        "system_instruction": "You are a technical translator. Rewrite the input text to be understood by a non-technical PM."
    },
    contents=["The K8s pod crashlooped because the OOMKiller terminated the container."]
)

print(response.text)
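
For reuse beyond a one-off script, the exported snippet can be wrapped in a function. This is a minimal sketch and not what AI Studio generates: `bust_jargon`, `make_client`, and `SYSTEM_INSTRUCTION` are hypothetical names. The client is passed in as an argument so the function can be exercised with a stand-in object and no API key.

```python
import os

# Same system instruction as the playground session above.
SYSTEM_INSTRUCTION = (
    "You are a technical translator. Rewrite the input text to be "
    "understood by a non-technical PM."
)


def make_client():
    """Create a real client; imported lazily so this module loads without the SDK."""
    from google import genai  # requires the google-genai package

    return genai.Client(api_key=os.environ["GEMINI_API_KEY"])


def bust_jargon(text, client, model="gemini-2.5-flash"):
    """Send one paragraph through the model and return the simplified text."""
    if not text.strip():
        raise ValueError("text must not be empty")
    response = client.models.generate_content(
        model=model,
        config={"system_instruction": SYSTEM_INSTRUCTION},
        contents=[text],
    )
    return response.text
```

With the SDK installed and `GEMINI_API_KEY` set, calling `bust_jargon(paragraph, make_client())` should return the rewritten paragraph.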

5 Advanced Feature: Structured Outputs (JSON Mode)
This is where AI Studio separates itself from the pack. If you are building an app, you don’t want text; you want JSON.

Click the plus (+) icon or look for “Structured Prompt” options.
Define your Schema. You can literally tell it: “I want an object with sentiment (enum: positive, negative) and keywords (list of strings).”
Gemini is now forced to follow this structure. It cannot hallucinate a new key or give you a conversational intro.
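
When the reply feeds application code, a client-side check is still a cheap safeguard. The sketch below assumes the sentiment/keywords schema described above. The `SENTIMENT_SCHEMA` dict layout and the `parse_sentiment` helper are illustrative names, not something AI Studio exports.

```python
import json

# Illustrative schema mirroring the description above: a sentiment enum
# plus a list of keyword strings. The dict layout is an assumption.
SENTIMENT_SCHEMA = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative"]},
        "keywords": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["sentiment", "keywords"],
}


def parse_sentiment(raw):
    """Parse a JSON reply and check it against SENTIMENT_SCHEMA by hand."""
    data = json.loads(raw)  # raises ValueError on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    extra = set(data) - set(SENTIMENT_SCHEMA["properties"])
    missing = set(SENTIMENT_SCHEMA["required"]) - set(data)
    if extra or missing:
        raise ValueError(
            f"unexpected keys {sorted(extra)}, missing keys {sorted(missing)}"
        )
    allowed = SENTIMENT_SCHEMA["properties"]["sentiment"]["enum"]
    if data["sentiment"] not in allowed:
        raise ValueError(f"invalid sentiment: {data['sentiment']!r}")
    keywords = data["keywords"]
    if not isinstance(keywords, list) or not all(
        isinstance(k, str) for k in keywords
    ):
        raise ValueError("keywords must be a list of strings")
    return data
```

Any reply that is not exactly the expected object raises `ValueError`, so the calling code has a single failure path to handle.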

6 Practical Use Cases
Here are three things I’ve built using this exact workflow:

PR Summarizer: A script that reads a git diff and generates a bulleted summary for the Pull Request description.
Error Log Analyzer: I paste a stack trace, and the model outputs the file name and line number of the likely culprit in JSON format.
Meeting Notes to Tickets: I drop an audio file of a standup meeting into AI Studio (yes, it accepts audio!) and ask it to extract “Action Items” as a list.
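
As a rough illustration of the first use case, the input side of a PR summarizer might look like the sketch below. It is not the original script: `read_diff`, `build_pr_prompt`, and the `MAX_DIFF_CHARS` cap are hypothetical, and the resulting prompt would still need to be sent to the model using the exported code.

```python
import subprocess

MAX_DIFF_CHARS = 20_000  # arbitrary cap for this sketch, not a model limit


def read_diff(base="main"):
    """Return the diff between `base` and the working tree via `git diff`."""
    result = subprocess.run(
        ["git", "diff", base], capture_output=True, text=True, check=True
    )
    return result.stdout


def build_pr_prompt(diff, max_chars=MAX_DIFF_CHARS):
    """Wrap a diff in instructions, truncating very large diffs."""
    if not diff.strip():
        raise ValueError("diff is empty; nothing to summarize")
    truncated = len(diff) > max_chars
    note = "\n[diff truncated]" if truncated else ""
    return (
        "Summarize the following git diff as a bulleted list for a "
        "Pull Request description.\n\n"
        f"{diff[:max_chars]}{note}"
    )
```

Keeping prompt construction separate from the API call makes it easy to inspect exactly what the model will see before spending any tokens.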

7 Conclusion
The gap between “using AI” and “building with AI” is smaller than you think. Google AI Studio bridges that gap by letting you prototype visually and export programmatically. Stop writing your prompt templates from scratch. Build them in the Studio, click “Get Code,” and ship it.

Google Nano Banana: How Prompt Structure Changes AI Image Results

Ashwin Mehta — Tue, 30 Dec 2025 14:09:42 +0000

Introduction
While experimenting with Google’s Nano Banana 🍌 (the popular nickname for the Gemini 2.5 Flash Image model), I realized something interesting:

AI image quality doesn’t depend only on the model—it heavily depends on how you prompt it.

In this post, I’ll share a simple prompting framework I learned that makes AI-generated images more controlled, expressive, and realistic, even for beginners.

This blog is written from a learning-by-doing perspective, not a theoretical one.

What Is Google Nano Banana?
Google Nano Banana is a lightweight multimodal AI model that focuses on:

Image understanding
Reasoning-based generation
Predicting what happens next instead of just static outputs

The real power comes from structured prompts.

The 5-Step Prompt Formula (Core Learning)

Through experimentation, I found that breaking prompts into components dramatically improves results.

The 5 Key Prompt Elements

Subject – Who or what is in the image
Action – What the subject is doing
Scene – Where it happens
Style – Visual aesthetic or era
Composition – Camera angle or framing

Example Prompt - Create an image of me (subject) laughing (action) in a 1960s café (scene). Make it a close-up shot in a vintage photography style (composition and style).
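
If you generate prompts from code, the five elements map naturally onto a small data structure. This is a sketch under my own naming: `ImagePrompt` and `render` are hypothetical, and the sentence template simply follows the example prompt above.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    subject: str      # who or what is in the image
    action: str       # what the subject is doing
    scene: str        # where it happens
    style: str        # visual aesthetic or era
    composition: str  # camera angle or framing

    def render(self):
        """Join the five elements into a single prompt string."""
        for name, value in vars(self).items():
            if not value.strip():
                raise ValueError(f"{name} must not be empty")
        return (
            f"Create an image of {self.subject} {self.action} in {self.scene}. "
            f"Make it a {self.composition} in a {self.style} style."
        )
```

Filling in "me", "laughing", "a 1960s café", "vintage photography", and "close-up shot" reproduces the example prompt, and changing one field at a time makes it easy to see which element drives a change in the result.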

Going Beyond Static Images: “What If” Reasoning

One of the coolest things about Nano Banana is reasoning-based continuation.

Step 1: Set a clear stage
Generate an image of a person standing and holding a 3-tier cake.

Step 2: Trigger an action
Now generate an image showing what would happen if they tripped.

The model doesn’t just redraw—it predicts the next logical outcome, including:

Body posture
Object movement
Environmental reaction

This feels closer to storytelling, not image generation.
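
The two-step pattern can be captured as an ordered pair of prompts. The helper below is a hypothetical sketch (`what_if_turns` is my own name). It assumes both prompts are sent in the same conversation so the model can build on the first image.

```python
def what_if_turns(stage, event):
    """Return the two prompts of a stage-then-trigger sequence, in order."""
    stage = stage.strip().rstrip(".")
    event = event.strip().rstrip(".")
    if not stage or not event:
        raise ValueError("both stage and event are required")
    return [
        f"Generate an image of {stage}.",
        f"Now generate an image showing what would happen if {event}.",
    ]
```

Calling it with "a person standing and holding a 3-tier cake" and "they tripped" yields the two prompts used in the steps above.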

What I Learned from This Experiment

Key Takeaways
AI models perform better with structured context
“What if” prompts unlock reasoning ability
Prompting is becoming a skill, not just typing text
Composition matters as much as description

Common Mistakes Beginners Make

Writing very long, unstructured prompts
Mixing multiple scenes at once
Ignoring camera composition
Expecting AI to “guess” intent

Best Practices for Prompting

Think like a director, not a user
Separate what, where, and how
Add actions to make images dynamic
Test small changes and iterate