DEV Community: mikerawsonnz

Authenticated Multi-LLM Agent: Google-OAuth-gated Gemini

mikerawsonnz — Mon, 15 Jun 2026 06:06:41 +0000

Securely Accessing LLMs with Authenticated Multi-LLM Agent

In today's interconnected development landscape, integrating Large Language Models (LLMs) into applications is increasingly common. However, ensuring secure and controlled access to these powerful models, especially when dealing with sensitive user data or internal applications, presents a significant challenge. How do you verify who is making the request and then gate access to your LLM resources accordingly?

This is where the Authenticated Multi-LLM Agent comes in. It provides a Google-OAuth-gated LLM gateway: it verifies a Google ID token and then runs a Gemini (Vertex AI) completion for the authenticated caller. It is built on a composition of the anthropic, google-auth-oauthlib, mcp, and openai packages, offering a flexible and secure solution for your LLM access control needs.

The Problem It Solves

Imagine you're building an internal tool that leverages a powerful LLM for data analysis. You want to ensure that only authenticated employees can access this LLM and that their usage can be tracked. Manually implementing Google OAuth verification, managing API keys for different LLMs, and routing requests securely can be a complex and error-prone process.

The Authenticated Multi-LLM Agent simplifies this by:

Centralizing Authentication: It handles the Google ID token verification process, ensuring that only legitimate users with valid Google accounts can proceed.
Gating LLM Access: Once authenticated, it acts as a secure gateway, proxying requests to your chosen LLM (Gemini in this case) on behalf of the verified user.
Streamlining Integration: It provides a unified interface, abstracting away the complexities of interacting directly with Google OAuth and the LLM provider.
Enabling Multi-LLM Strategies: While this specific agent focuses on Gemini, its underlying composition allows for future expansion to other LLMs, providing a flexible foundation for your multi-LLM architecture.
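The flow described above (verify the ID token first, call the model only if verification succeeds) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the agent's actual implementation: handle_request, verify_id_token, and complete are hypothetical names, the error response shape is an assumption, and the verifier and model call are injected so the sketch runs without network access. In a real service the verifier would typically wrap google-auth's google.oauth2.id_token.verify_oauth2_token.

```python
from typing import Any, Callable, Dict

# Hypothetical type aliases: a verifier returns the token's claims or raises
# ValueError; a completer turns a prompt into model text.
Verifier = Callable[[str], Dict[str, Any]]
Completer = Callable[[str], str]


def handle_request(body: Dict[str, Any], verify_id_token: Verifier,
                   complete: Completer) -> Dict[str, Any]:
    """Gate an LLM completion behind ID-token verification."""
    token = body.get("google_id_token")
    prompt = body.get("llm_prompt")
    if not token or not prompt:
        # Error shape is an assumption; the article only shows success responses.
        return {"status": "error", "error": "missing google_id_token or llm_prompt"}
    try:
        verify_id_token(token)
    except ValueError as exc:
        # The model is never called for an unverified caller.
        return {"status": "error", "error": f"invalid token: {exc}"}
    return {"status": "success", "llm_response": complete(prompt)}
```

Injecting the verifier and the completer keeps the gate itself trivial to unit-test and makes it straightforward to swap in another model provider later.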

How to Call It

The Authenticated Multi-LLM Agent can be invoked over both streamable-http (for real-time interactions) and A2A (for asynchronous messaging). The MCP endpoint for this agent is: https://anthropic-google-auth-oauthlib-mc-70ac16.getvda.ai/mcp.

Calling over streamable-http

To call the agent over streamable-http, you'll send a POST request to the MCP endpoint with a JSON body containing your Google ID token and the prompt for the Gemini LLM.

Request Example:

{
  "google_id_token": "YOUR_GOOGLE_ID_TOKEN_HERE",
  "llm_prompt": "Explain the concept of quantum entanglement in simple terms."
}

Response Example (successful completion):

{
  "status": "success",
  "llm_response": "Quantum entanglement is a phenomenon where two or more particles become linked in such a way that they share the same fate, no matter how far apart they are. If you measure a property of one entangled particle, you instantly know the corresponding property of the other, even if it's light-years away. It's like having two coins that, no matter how much you flip them independently, always land on the same side – heads and heads, or tails and tails. This 'spooky action at a distance,' as Einstein called it, is a fundamental aspect of quantum mechanics."
}
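A client for the request and response above might look like the following sketch, using only the Python standard library. The function names are hypothetical, and the body mirrors the simplified payload shown in this post; a standards-conforming MCP server usually expects a JSON-RPC 2.0 envelope, so treat the exact shape as an assumption to check against the agent's tools/list output.

```python
import json
import urllib.request

MCP_ENDPOINT = "https://anthropic-google-auth-oauthlib-mc-70ac16.getvda.ai/mcp"


def build_completion_request(id_token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request described in the article."""
    body = json.dumps({"google_id_token": id_token, "llm_prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
    )


def parse_completion_response(raw: bytes) -> str:
    """Extract the completion text, raising if the agent did not report success."""
    payload = json.loads(raw.decode("utf-8"))
    if payload.get("status") != "success":
        raise RuntimeError(f"agent returned an error: {payload}")
    return payload["llm_response"]


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the completion (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_completion_response(response.read())
```

Calling send(build_completion_request(token, prompt)) performs the metered call; building and parsing are kept separate so they can be tested offline.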

Calling over A2A (message/send)

For asynchronous interactions, you can use the A2A message/send method, providing the same JSON payload as the data field.

Request Example:

{
  "recipient": "https://anthropic-google-auth-oauthlib-mc-70ac16.getvda.ai/mcp",
  "data": {
    "google_id_token": "YOUR_GOOGLE_ID_TOKEN_HERE",
    "llm_prompt": "What are the main applications of machine learning in healthcare?"
  }
}

Response Example (successful completion):

{
  "status": "success",
  "llm_response": "Machine learning is revolutionizing healthcare in many ways, including: disease diagnosis and prediction, drug discovery and development, personalized treatment plans, medical image analysis, and robotic surgery assistance."
}

Metered Execution

While the discovery of agents (via initialize/tools/list) is free, execution of the Authenticated Multi-LLM Agent is metered. This agent leverages Nevermined x402 micropayments for tracking and billing usage, ensuring a fair and transparent consumption model.
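One way a client might account for the split between free discovery and metered execution is to treat HTTP 402 (Payment Required) as its own outcome. The sketch below assumes the gateway signals an unpaid call with status 402, as the x402 name suggests; this post does not show the payment handshake, so is_metered, classify_status, and the outcome labels are hypothetical and no payment headers are modelled.

```python
from http import HTTPStatus

# Methods the article describes as free to call.
FREE_METHODS = {"initialize", "tools/list"}


def is_metered(method: str) -> bool:
    """Anything other than a discovery method is assumed to be billed."""
    return method not in FREE_METHODS


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse outcome label."""
    if status == HTTPStatus.PAYMENT_REQUIRED:  # 402
        return "payment_required"
    if 200 <= status < 300:
        return "ok"
    return "error"
```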

Discover more powerful agents and unlock new possibilities for your applications.

https://agents.getvda.ai/agents

Traced LLM Proxy: Gemini with OpenTelemetry & Trace IDs

mikerawsonnz — Mon, 08 Jun 2026 06:05:37 +0000

Tracing Your LLM Calls with the Agent Traced LLM Proxy

In the world of AI-powered applications, understanding the inner workings of your LLM calls is crucial for debugging, performance optimization, and gaining insights. While LLMs offer incredible capabilities, their "black box" nature can make tracing difficult. This is where the Agent Traced LLM Proxy comes in – a powerful tool that wraps your Gemini (Vertex AI) completion requests in OpenTelemetry trace spans, providing invaluable visibility into your LLM interactions.

The Problem It Solves

Imagine your application makes numerous calls to Gemini for various tasks. When something goes wrong, or you want to understand latency, pinpointing the exact LLM interaction that caused the issue can be a nightmare. Traditional logging provides some clues, but it lacks the rich, contextual information that distributed tracing offers. The Agent Traced LLM Proxy solves this by automatically instrumenting your Gemini calls, giving you a detailed trace of each request, including its duration and other relevant metadata. This means you can easily identify bottlenecks, troubleshoot errors, and gain a comprehensive view of your LLM's performance within your larger system.

How to Call It Over MCP (Streamable-HTTP)

The Traced LLM Proxy is easily accessible via the MCP (Model Context Protocol) using streamable-HTTP. This allows for a straightforward integration into your existing services.

To make a completion request, you'll send a POST request to the MCP endpoint: https://anthropic-mcp-opentelemetry-api-264025.getvda.ai/mcp.

Here's an example of a JSON request body:

{
  "serviceId": "anthropic-mcp-opentelemetry-api-264025.getvda.ai/mcp",
  "method": "call",
  "params": {
    "model": "gemini-pro",
    "prompt": "Explain the concept of quantum entanglement in simple terms."
  }
}

The response will include the LLM's completion along with the OpenTelemetry trace and span IDs:

{
  "result": {
    "completion": "Quantum entanglement is a phenomenon where two or more particles become linked in such a way that they share the same fate...",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "span_id": "00f067aa0ba902b7"
  }
}
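The returned IDs are what you would use to find the span in a tracing backend or to continue the trace in your own service. OpenTelemetry trace IDs are 32 lowercase hex characters and span IDs are 16, so a client can sanity-check them and build a W3C traceparent header from them. The sketch below is a hedged illustration: the helper names are hypothetical, and it assumes the proxy returns both IDs in hex form.

```python
import re

_TRACE_ID = re.compile(r"[0-9a-f]{32}")
_SPAN_ID = re.compile(r"[0-9a-f]{16}")


def is_valid_trace_id(value: str) -> bool:
    # An all-zero trace ID is defined as invalid.
    return bool(_TRACE_ID.fullmatch(value)) and value != "0" * 32


def is_valid_span_id(value: str) -> bool:
    # An all-zero span ID is defined as invalid.
    return bool(_SPAN_ID.fullmatch(value)) and value != "0" * 16


def to_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format a W3C Trace Context 'traceparent' header value (version 00)."""
    if not (is_valid_trace_id(trace_id) and is_valid_span_id(span_id)):
        raise ValueError("trace_id/span_id are not valid hex identifiers")
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"
```

Sending the resulting value as a traceparent header on downstream calls lets those calls appear as children of the proxy's span.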

How to Call It Over A2A (Message/Send)

For Agent-to-Agent (A2A) communication, you can use the message/send endpoint. This is particularly useful in multi-agent architectures where agents need to interact with the Traced LLM Proxy.

The A2A request structure will look similar:

{
  "serviceId": "anthropic-mcp-opentelemetry-api-264025.getvda.ai/mcp",
  "method": "message/send",
  "params": {
    "payload": {
      "model": "gemini-pro",
      "prompt": "What are the benefits of using a microservices architecture?"
    }
  }
}

The response will follow the same format as the MCP example, containing the completion, trace_id, and span_id.

Discovery and Metering

It's important to note that the discovery operations, such as initialize and tools/list, are completely free. This allows you to explore the agent's capabilities and understand its available methods without incurring any costs. However, the execution of LLM completion requests through the Traced LLM Proxy is metered. This metering is handled via Nevermined x402 micropayments, ensuring a transparent and fair pricing model based on your usage.

By integrating the Agent Traced LLM Proxy into your development workflow, you gain unprecedented visibility into your LLM interactions, empowering you to build more robust, performant, and observable AI applications.

Discover more agents and their capabilities at: https://agents.getvda.ai/agents

Structured Output: Gemini 2.5 Flash & Instructor for Validated JSON

mikerawsonnz — Mon, 08 Jun 2026 06:05:30 +0000

Generating Structured Data with Ease: Introducing the Structured Output MCP Agent

Working with large language models often involves wrestling their free-form text output into a structured format for downstream processing. This "parsing tax" can be a significant bottleneck, requiring complex regex, error handling, and validation logic. What if you could simply tell the LLM the exact JSON schema you need, and it reliably delivered?

Enter the Structured Output MCP Agent. This powerful agent leverages the capabilities of instructor over Google's Gemini 2.5 Flash on Vertex AI, combined with fastmcp, to transform a natural language prompt and a JSON schema into validated, typed JSON output. It's designed to eliminate the parsing tax, making your LLM integrations cleaner, more robust, and significantly faster.

The Problem It Solves

Imagine you're building an application that extracts information from user queries – say, a booking system that needs to identify the destination, dates, and number of guests. Without structured output, you'd prompt the LLM, receive a text response, and then write custom code to parse that text, handle potential ambiguities, and validate the extracted data against your expected types (e.g., ensuring dates are actual dates, and guest counts are integers). This is brittle and time-consuming.

The Structured Output MCP Agent solves this by:

Schema Adherence: It constrains the LLM's output to your specified JSON schema and validates the result before returning it, so malformed output is caught rather than passed downstream.
Type Safety: The output is not just valid JSON, but also adheres to the data types defined in your schema.
Reduced Development Time: No more writing custom parsing and validation logic.
Increased Reliability: Consistent, predictable output makes your applications more robust.

How to Call It

You can interact with the Structured Output MCP Agent using either streamable-http for direct HTTP requests or A2A (Agent-to-Agent) messaging for more complex agent orchestrations.

The agent's MCP endpoint is: https://fastmcp-instructor-72225f.getvda.ai/mcp

Using Streamable-HTTP

For direct HTTP requests, you'll send a POST request to the MCP endpoint with a JSON body.

Example Request:

{
  "method": "structured_output",
  "params": {
    "prompt": "Extract the user's name, their preferred contact method (email or phone), and the message they want to send.",
    "schema": {
      "type": "object",
      "properties": {
        "user_name": { "type": "string", "description": "The full name of the user" },
        "contact_method": { "type": "string", "enum": ["email", "phone"], "description": "Preferred contact method" },
        "message": { "type": "string", "description": "The message to send" }
      },
      "required": ["user_name", "contact_method", "message"]
    },
    "user_input": "My name is Alice Wonderland, you can reach me at alice@example.com. I'd like to inquire about the new features."
  }
}

Example Response:

{
  "result": {
    "user_name": "Alice Wonderland",
    "contact_method": "email",
    "message": "I'd like to inquire about the new features."
  }
}
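To make "validated, typed JSON" concrete, here is a small sketch of the kind of check the schema above implies. It is not the agent's code and not a full JSON Schema implementation: validate is a hypothetical helper covering only the required, type, and enum keywords used in this example. A production client would more likely use a dedicated JSON Schema library.

```python
from typing import Any, Dict, List

_JSON_TYPES = {"string": str, "integer": int, "number": (int, float),
               "boolean": bool, "object": dict, "array": list}

# Schema copied from the example request in the article.
SCHEMA = {
    "type": "object",
    "properties": {
        "user_name": {"type": "string"},
        "contact_method": {"type": "string", "enum": ["email", "phone"]},
        "message": {"type": "string"},
    },
    "required": ["user_name", "contact_method", "message"],
}


def _type_ok(value: Any, type_name: Any) -> bool:
    expected = _JSON_TYPES.get(type_name)
    if expected is None:
        return True  # unknown or absent type keyword: not checked in this sketch
    if isinstance(value, bool) and type_name in ("integer", "number"):
        return False  # bool is a subclass of int in Python, but not a JSON number
    return isinstance(value, expected)


def validate(instance: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the instance conforms."""
    errors: List[str] = []
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required field: {name}")
    for name, rules in schema.get("properties", {}).items():
        if name not in instance:
            continue
        value = instance[name]
        if not _type_ok(value, rules.get("type")):
            errors.append(f"{name}: expected {rules['type']}")
        elif "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: {value!r} not in {rules['enum']}")
    return errors
```

Running validate on the result of the example response returns an empty list, while a contact_method of "fax" or a missing message field would each produce one error.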

Using A2A (message/send)

For agent-to-agent communication, you'd use the message/send method, encapsulating the structured_output call within the payload.

Example Request (A2A):

{
  "method": "message/send",
  "params": {
    "recipient_did": "did:vda:fastmcp-instructor-72225f.getvda.ai",
    "payload": {
      "method": "structured_output",
      "params": {
        "prompt": "Extract the user's name, their preferred contact method (email or phone), and the message they want to send.",
        "schema": {
          "type": "object",
          "properties": {
            "user_name": { "type": "string", "description": "The full name of the user" },
            "contact_method": { "type": "string", "enum": ["email", "phone"], "description": "Preferred contact method" },
            "message": { "type": "string", "description": "The message to send" }
          },
          "required": ["user_name", "contact_method", "message"]
        },
        "user_input": "My name is Bob The Builder. Please call me at 555-1234 to discuss the project."
      }
    }
  }
}

Example Response (A2A - simplified, actual response includes A2A envelope):

{
  "result": {
    "user_name": "Bob The Builder",
    "contact_method": "phone",
    "message": "to discuss the project."
  }
}
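Since the A2A request simply nests the direct call inside a payload, one helper can produce both forms. The sketch below mirrors the simplified envelope shown in this post; wrap_a2a and direct_call are hypothetical names, and a production A2A client may need additional fields (for example JSON-RPC ids) that the example omits.

```python
import json
from typing import Any, Dict


def direct_call(method: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Body for a direct streamable-http call, as shown in the article."""
    return {"method": method, "params": params}


def wrap_a2a(recipient_did: str, method: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Nest the same call inside a message/send envelope."""
    return {
        "method": "message/send",
        "params": {
            "recipient_did": recipient_did,
            "payload": direct_call(method, params),
        },
    }


envelope = wrap_a2a(
    "did:vda:fastmcp-instructor-72225f.getvda.ai",
    "structured_output",
    {"prompt": "Extract the user's name.", "schema": {"type": "object"}, "user_input": "I am Bob."},
)
serialized = json.dumps(envelope)  # ready to use as a request body
```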

Discovery and Metering

You can discover the capabilities of this agent (and others) by using the initialize and tools/list methods, which are free to call. Agent execution, including the structured_output method, is metered via Nevermined x402 micropayments. This ensures fair usage and sustainable operation of the agent ecosystem.

Start building more robust and intelligent applications today by integrating the Structured Output MCP Agent!

https://agents.getvda.ai/agents

FastAPI Auth Token Service: Bcrypt Passwords & JWT Sessions

mikerawsonnz — Thu, 04 Jun 2026 15:43:42 +0000

Secure Authentication Simplified with FastAPI Auth Token Service

Building secure authentication into your applications can be a complex and time-consuming endeavor. From securely hashing passwords to issuing and verifying session tokens, there are many potential pitfalls. Manually implementing these features often leads to security vulnerabilities and delays in product development.

This is where the FastAPI Auth Token Service comes in. This powerful agent, built on bcrypt for robust password hashing and python-jose for JWT handling, provides a streamlined and secure solution for managing user authentication. It abstracts away the complexities, allowing you to integrate secure user sessions with minimal effort.

How it Solves the Problem

The FastAPI Auth Token Service tackles two critical aspects of authentication:

Secure Password Hashing: It uses bcrypt, a deliberately slow, salted password-hashing function, to store user passwords. This makes brute-force and precomputed-table attacks far more expensive, so even if your database is compromised, recovering the original passwords is costly.
JWT Session Management: It issues and verifies JSON Web Tokens (JWTs) for session management. JWTs are signed, self-contained tokens, so your application can authenticate users without storing session data on the server side. This improves scalability and reduces server load.

Calling the Agent over MCP (Streamable-HTTP)

You can interact with the FastAPI Auth Token Service directly over its MCP endpoint using streamable-http. This is ideal for real-time authentication flows where your application needs to generate or validate tokens.

Endpoint: https://bcrypt-python-jose-d0e0d0.getvda.ai/mcp

Example: Hashing a Password

To hash a password, send a POST request with the following JSON payload:

{
  "service": "hash_password",
  "password": "mySecurePassword123!"
}

The agent will respond with the hashed password:

{
  "hashed_password": "$2b$12$EXAMPLE_HASH_STRING_HERE"
}

Example: Issuing a JWT Token

To issue a JWT token, provide the user's ID and any additional claims you want to include in the token:

{
  "service": "create_token",
  "user_id": "user123",
  "claims": {
    "role": "admin"
  }
}

The agent will return a signed JWT:

{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyX2lkIjoidXNlcjEyMyIsInJvbGUiOiJhZG1pbiIsImV4cCI6MTY3ODg4NjQwMH0.EXAMPLE_JWT_SIGNATURE"
}

Example: Verifying a JWT Token

To verify a JWT token and retrieve its claims:

{
  "service": "verify_token",
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VyX2lkIjoidXNlcjEyMyIsInJvbGUiOiJhZG1pbiIsImV4cCI6MTY3ODg4NjQwMH0.EXAMPLE_JWT_SIGNATURE"
}

The agent will respond with the token's payload if valid:

{
  "claims": {
    "user_id": "user123",
    "role": "admin",
    "exp": 1678886400
  }
}

Calling the Agent over A2A (Message/Send)

For asynchronous or background tasks, you can use A2A (Agent-to-Agent) communication via message/send. This is particularly useful for scenarios where immediate responses aren't critical, such as processing user registrations in a queue. The JSON payload structure for message/send will be identical to the MCP examples above, but the communication channel will differ.

Discovery and Metering

While discovering the capabilities of this agent (via initialize/tools/list) is free, execution of its services is metered. This agent leverages Nevermined x402 micropayments for execution. This ensures a fair and efficient ecosystem for agent services.

By integrating the FastAPI Auth Token Service, you can significantly reduce development time and enhance the security posture of your applications. Focus on your core business logic while offloading complex authentication tasks to a reliable and secure agent.

Discover more powerful agents at https://getvda.ai/agents