DEV Community

Nadine


How I Built a Secret Agent

I recently made an accidental but interesting discovery while building an app. I managed to create an agent-like system using nothing more than Gemini's function calling feature, effectively building an agent’s brain without the traditional, continuous infrastructure required to host a full agent.

The key finding? This $0/hr serverless approach not only significantly reduced infrastructure costs but also proved to be a far more helpful debugger than the broad, general-purpose agent provided by my IDE.


֎ Persistent Agents

Traditional AI agents (which I call Persistent Agents) require continuous hosting on managed services and underlying infrastructure. Big tech companies offer impressive agent-designer spaces and no-code interfaces, but hosting the result can quickly become prohibitively expensive.

The issue lies in the idle cost: the moment an agent is deployed, at least one compute node is required to run the service, and that cost accrues continuously (often hourly) even when the agent is inactive or receiving no traffic.
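To make the idle cost concrete, here is a rough back-of-the-envelope calculation. The hourly rate is an assumption for illustration, not a quote from any real provider:

```python
# Rough idle-cost estimate for one always-on compute node.
# The hourly rate below is an illustrative assumption, not a real quote.
HOURLY_RATE_USD = 0.05      # assumed cost of one small compute node
HOURS_PER_MONTH = 24 * 30   # ~720 hours in a month

idle_cost_per_month = HOURLY_RATE_USD * HOURS_PER_MONTH
print(f"Idle cost: ${idle_cost_per_month:.2f}/month")  # $36.00/month with zero traffic
```

Even at a modest assumed rate, the meter runs whether or not anyone talks to the agent.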

So what exactly does this buy you?

A persistent agent is generally equipped with tools and can handle:

  • Complex, multi-step reasoning.
  • Dynamic decision-making on when and how to call tools.
  • Management of long-running conversational memory.
  • External actions, like authenticating on your behalf (when permission is granted).

🛠️ Function Calling as Your Agent

I realised that, for my application's specific workflows, the most valuable parts of an agent were its dynamic reasoning and its ability to use tools, not its continuous hosting status. I also had no need for external actions.

I decided to capture the core functionality of an agent without the overhead of continuous deployment. I applied tool-use logic directly via Gemini’s function calling. The tools themselves, including the logic for search, retrieval, etc., are hardcoded into my conversational frontend.

The AI's role becomes the Stateless Agent 🧠. It uses function calling to translate the user’s natural language query into a structured function call and arguments.

The application executes the call, and the resulting data is sent back to the model for a natural language response to the user.
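The app-side half of that loop can be sketched in a few lines. The tool names and the shape of the model's structured call below are illustrative assumptions, not the post's actual code:

```python
# Sketch of the application side of the function-calling loop.
# Tool names and the shape of the model's call are illustrative assumptions.

def fuzzy_search(query: str) -> dict:
    """Hypothetical fallback text-search tool."""
    corpus = ["reset password", "update profile", "delete account"]
    return {"results": [doc for doc in corpus if query.lower() in doc]}

# Registry of functions the model is allowed to call -- this registry
# *is* the stateless agent's entire world.
TOOL_REGISTRY = {"fuzzy_search": fuzzy_search}

def execute_function_call(call: dict) -> dict:
    """Run the structured call the model produced; the returned payload is
    what would be sent back to the model for the natural-language reply."""
    fn = TOOL_REGISTRY.get(call["name"])
    if fn is None:
        return {"error": f"unknown tool: {call['name']}"}
    return fn(**call["args"])

# Pretend the model translated "find the password docs" into this call:
result = execute_function_call({"name": "fuzzy_search", "args": {"query": "password"}})
print(result)  # {'results': ['reset password']}
```

The model never executes anything itself: it only emits a name and arguments, and the application decides what actually runs.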

Since I am already making calls to the Gemini model for text generation and other things, this method allows me to combine the reasoning and response steps into a single API call, reducing the transaction cost. This is how I anticipate achieving an 80% reduction in operating costs compared to maintaining a persistent agent infrastructure.


🪲 How I Discovered My Agent

My application is designed to fall back to a fuzzy text-matching search when vector search fails. I was coding in my IDE with a popular code assistant model running, but my search pipeline was failing and the IDE agent could not find the issue. It kept writing new unit tests that passed in the development environment but failed repeatedly in production.
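For context, the fallback pattern itself is simple. This is a minimal sketch using the stdlib's difflib for the fuzzy step; the function names and the simulated failure are assumptions, not the real pipeline:

```python
import difflib

def vector_search(query: str) -> list[str]:
    # Placeholder: the real app would query a vector index here; we
    # simulate the failure mode that triggers the fallback.
    raise ConnectionError("vector backend unreachable")

def fuzzy_search(query: str, corpus: list[str]) -> list[str]:
    # Fuzzy text matching via stdlib difflib.
    return difflib.get_close_matches(query, corpus, n=3, cutoff=0.5)

def search(query: str, corpus: list[str]) -> list[str]:
    try:
        return vector_search(query)
    except Exception:
        return fuzzy_search(query, corpus)

docs = ["reset password", "reset passwords guide", "billing overview"]
print(search("reset password", docs))  # ['reset password', 'reset passwords guide']
```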

The agent was overcomplicating things, drowning in the specifics of the code, unit tests, and the immediate task. Each time I summarised the issue, its lack of persistent memory about the operational environment made it feel like I was talking to a blank slate.

Finally, in sheer desperation, I ran my own application’s frontend and typed into the message input: “What is the problem?”

The response from my little agent's brain was immediate and shockingly direct. It informed me that it could not communicate with the backend and, therefore, could not perform the search function it was supposed to execute.

The issue, it turned out, was a simple CORS policy error preventing the backend from communicating with the frontend. The traditional IDE agent was trapped in code complexity; my function-calling agent could immediately identify what was wrong.
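For readers who haven't hit this failure before: a CORS error means the browser refuses to let the frontend read the backend's responses unless the backend explicitly allows the frontend's origin. The post doesn't name the real framework or origin, so this is only a sketch of the kind of headers that resolve it, with a made-up allowed origin:

```python
# Sketch of the response headers that resolve a CORS error like the one
# described above. The allowed origin is a made-up example; the post
# doesn't name the real frontend's origin or backend framework.

ALLOWED_ORIGINS = {"http://localhost:3000"}  # assumed frontend origin

def cors_headers(request_origin: str) -> dict[str, str]:
    """Headers the backend must attach so the browser lets the frontend
    read the response; an empty dict means the origin is rejected."""
    if request_origin not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }

print(cors_headers("http://localhost:3000")["Access-Control-Allow-Origin"])
# http://localhost:3000
```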


🔒 The Security Lesson in Focus

This unexpected diagnostic capability actually stems from the agent's architectural limitations. It was forced to reason only about the predefined tool functions available in its system instructions.

I then asked it how it was performing the search. It began referencing internal file paths and implementation details. This was an unintended data leak because I had not provided specific instructions or response settings on how to constrain its reply.
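One cheap mitigation, alongside tighter system instructions, is to redact anything that looks like an internal file path from the model's reply before it reaches the user. The pattern below is a simple assumption for illustration, not a complete guardrail:

```python
import re

# Redact strings that look like internal file paths before displaying a
# model reply. The pattern is an assumed heuristic; a real guardrail
# would pair this with explicit system instructions.
_PATH_PATTERN = re.compile(r"(?:[A-Za-z]:)?(?:/[\w.\-]+){2,}")

def redact_internal_paths(reply: str) -> str:
    return _PATH_PATTERN.sub("[redacted]", reply)

print(redact_internal_paths("I search via /srv/app/search/fuzzy.py"))
# I search via [redacted]
```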

That’s the real value of the Stateless Agent: it lives intrinsically inside the code's purpose, defined solely by the functions it is permitted to use. It doesn't need vast context; it needs focused context.

The biggest takeaway from this experiment is that the most useful tooling isn't necessarily a massive, stateful "IDE Agent" that watches your every keystroke. Instead, there is real value in composing stateless, focused expert agents that live intrinsically inside the purpose of the code.
