<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Thamindu Hatharasinghe</title>
    <description>The latest articles on DEV Community by Thamindu Hatharasinghe (@thamindudev).</description>
    <link>https://dev.to/thamindudev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3687480%2Fe2916b4b-9351-4b3a-bcfe-45ca02159397.png</url>
      <title>DEV Community: Thamindu Hatharasinghe</title>
      <link>https://dev.to/thamindudev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thamindudev"/>
    <language>en</language>
    <item>
      <title>Securing AI Agents: A Deep Dive into MCP Authorization</title>
      <dc:creator>Thamindu Hatharasinghe</dc:creator>
      <pubDate>Tue, 10 Mar 2026 04:30:00 +0000</pubDate>
      <link>https://dev.to/thamindudev/securing-ai-agents-a-deep-dive-into-mcp-authorization-23m0</link>
      <guid>https://dev.to/thamindudev/securing-ai-agents-a-deep-dive-into-mcp-authorization-23m0</guid>
      <description>&lt;p&gt;The Model Context Protocol (MCP) is rapidly becoming the standard for connecting AI models to external tools, databases, and APIs. While experimenting with MCP on local environments is seamless, transitioning these autonomous AI agents to production systems introduces a massive security challenge: authorization. Without strict access controls, every connected LLM client essentially gets unrestricted access to all exposed tools. Let's dive into how MCP authorization works and the architectural patterns required to keep your data safe.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Shift to Server-Side, Request-Time Enforcement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A common misconception is that securing the initial connection to an MCP server is enough. However, MCP authorization relies on server-side enforcement at request time.&lt;/p&gt;

&lt;p&gt;Every single attempt an AI agent makes to read data, execute a task, or call an external API must pass through an authorization gateway. This is evaluated dynamically using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token-based Authorization: Validating cryptographic tokens (such as JWTs) passed with each request.&lt;/li&gt;
&lt;li&gt;Scoped Capability Access: Ensuring the token permits only specific actions (e.g., read-only vs. write).&lt;/li&gt;
&lt;li&gt;Role-Based Access Control (RBAC): Checking established policies to confirm that the identity behind the agent is permitted to perform the task.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Implementing the Gateway Pattern&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When building an MCP Server, your middleware needs to intercept tool execution requests.&lt;/p&gt;
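&lt;p&gt;As a rough sketch of that interception point, here is a minimal, hypothetical Python middleware check combining the scope and RBAC evaluations described above (the policy shape, claim names, and function names are illustrative assumptions, not part of the MCP specification):&lt;/p&gt;

```python
# Minimal sketch of a request-time authorization check for MCP tool calls.
# Policy shape and names are illustrative, not a real MCP SDK API.

# Static RBAC policy: role -&gt; set of permitted (tool, action) pairs
POLICY = {
    "support-agent": {("tickets.read", "read")},
    "admin": {("tickets.read", "read"), ("tickets.delete", "write")},
}

def authorize_tool_call(claims, tool, action):
    """Return True only if the token's scopes AND the caller's role permit the call."""
    # 1. Scoped capability check: the token itself must carry the scope
    required_scope = f"{tool}:{action}"
    if required_scope not in claims.get("scopes", []):
        return False
    # 2. RBAC check: the identity behind the agent must be allowed by policy
    allowed = POLICY.get(claims.get("role"), set())
    return (tool, action) in allowed

claims = {"sub": "agent-42", "role": "support-agent", "scopes": ["tickets.read:read"]}
print(authorize_tool_call(claims, "tickets.read", "read"))     # True
print(authorize_tool_call(claims, "tickets.delete", "write"))  # False
```

&lt;p&gt;The key point is that this function runs on every single tool execution request, not once at connection time.&lt;/p&gt;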

&lt;p&gt;&lt;strong&gt;Developer Impact &amp;amp; Best Practices&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As developers, deploying MCP means adopting a Zero-Trust architecture for AI. You must build your systems around these core principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enforce Least Privilege: Never grant an agent blanket access. If an agent only needs to read a ticket, do not give it API credentials to delete tickets.&lt;/li&gt;
&lt;li&gt;Use Short-Lived Scoped Tokens: Tokens should expire quickly and be strictly scoped to the current active session or specific task context.&lt;/li&gt;
&lt;li&gt;Authorize Every Call: Never rely on session state alone. Validate permissions on every single tool execution request.&lt;/li&gt;
&lt;li&gt;Strict Auditing: Every allowed and denied request must be logged with identity context. If an AI agent hallucinates and attempts a destructive action, you need the audit trail to prove your gateway stopped it.&lt;/li&gt;
&lt;/ul&gt;
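&lt;p&gt;The short-lived, scoped-token principle above can be sketched with nothing but Python's standard library. This is a toy illustration under assumed names; in production you would reach for a proper JWT/OAuth library rather than hand-rolling tokens:&lt;/p&gt;

```python
# Toy sketch of minting and verifying a short-lived, scoped token (stdlib only).
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-signing-key"  # assumption: a shared secret held by the gateway

def mint_token(agent_id, scopes, ttl_seconds=300):
    """Issue a token that expires quickly and is scoped to specific actions."""
    payload = {"sub": agent_id, "scopes": scopes, "exp": time.time() + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(payload).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig

def verify_token(token):
    """Reject tampered or expired tokens; return the payload otherwise."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # signature mismatch: token was tampered with
    payload = json.loads(base64.urlsafe_b64decode(body))
    if time.time() > payload["exp"]:
        return None  # token expired
    return payload

token = mint_token("agent-007", ["tickets.read:read"], ttl_seconds=60)
print(verify_token(token)["scopes"])  # ['tickets.read:read']
```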

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MCP unlocks incredible potential for AI agents, but it also opens direct pipelines into our databases and APIs. Building robust, request-time authorization layers isn't just a best practice—it's a fundamental requirement for production.&lt;/p&gt;

&lt;p&gt;How are you currently managing API keys and permissions for the LLM agents in your projects? Let's discuss in the comments below!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>mcp</category>
      <category>architecture</category>
    </item>
    <item>
      <title>No QA? No Problem! Replacing Manual Testing with Google Antigravity Agents</title>
      <dc:creator>Thamindu Hatharasinghe</dc:creator>
      <pubDate>Mon, 09 Mar 2026 04:30:00 +0000</pubDate>
      <link>https://dev.to/thamindudev/no-qa-no-problem-replacing-manual-testing-with-google-antigravity-agents-5c7p</link>
      <guid>https://dev.to/thamindudev/no-qa-no-problem-replacing-manual-testing-with-google-antigravity-agents-5c7p</guid>
      <description>&lt;p&gt;If you've ever worked in a fast-paced development environment, you know the struggle: you push a critical feature, but there is no dedicated QA engineer available to validate it. The burden of end-to-end (E2E) testing falls back on the developers. Writing resilient automated UI tests takes almost as much time as writing the feature itself, and manual testing breaks your flow state.&lt;/p&gt;

&lt;p&gt;Enter Google Antigravity, a completely new approach to this problem utilizing autonomous browser agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Beyond Autocomplete: The Agent-First Paradigm&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Google Antigravity is not just another LLM wrapper that auto-completes your code in VS Code. It is fundamentally built as an Agent-first platform. This means it is designed for action and autonomous execution rather than just text generation.&lt;/p&gt;

&lt;p&gt;Instead of writing brittle Selenium or Playwright scripts relying on hardcoded CSS selectors (which break the moment a designer changes a class name), you deploy an Antigravity Agent. You define the intent of the test, and the agent figures out the execution.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Bash
# Example: Initializing an Antigravity agent in your local environment
antigravity init --role qa-tester --target http://localhost:3000

# Instructing the agent using natural language
antigravity run "Navigate to the auth page, create a new user account, verify the email input validation, and attempt to access the protected dashboard route."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How the Agent Navigates the DOM&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When executing a command, the Antigravity agent doesn't just ping APIs. It seamlessly integrates with your browser environment. It autonomously opens a headless (or headed) browser instance, navigates to the specified URLs, and parses the DOM visually and structurally.&lt;/p&gt;

&lt;p&gt;It locates elements based on context and accessibility trees—just like a real human user. It clicks buttons, types text into input fields, handles dropdowns, and waits for dynamic content to load without needing explicit waitForTimeout commands. This is true Automated UI Testing powered by Agentic reasoning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trust through Artifacts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The biggest hurdle with AI agents is trust. How do you know the agent actually tested the application and didn't just hallucinate a "Test Passed" result?&lt;/p&gt;

&lt;p&gt;Antigravity solves this by generating comprehensive Verification Artifacts. It doesn't just give you a boolean output. For every execution, the agent provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-resolution Screenshots of key interaction points.&lt;/li&gt;
&lt;li&gt;Browser Recordings (Video traces) showing the exact cursor movements and page navigations.&lt;/li&gt;
&lt;li&gt;Task Completion Reports detailing the steps taken, network requests intercepted, and console errors caught during the session.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives developers deterministic proof of the test execution, making debugging incredibly straightforward.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Developer Impact&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Integrating Google Antigravity drastically reduces the feedback loop. The absence of a dedicated manual QA team is no longer a bottleneck that slows down your CI/CD pipeline. By leaning into Agentic Development, you maintain high software quality while actually accelerating your development speed. You write the code, the Agent tests the user journey.&lt;/p&gt;

&lt;p&gt;Have you started integrating AI agents into your testing workflows yet, or are you still relying on manual E2E scripts? Let's discuss in the comments below!&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>testing</category>
      <category>ai</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>How to Scale Claude Code with an MCP Gateway: Centralize Tools and Control Costs</title>
      <dc:creator>Thamindu Hatharasinghe</dc:creator>
      <pubDate>Sun, 08 Mar 2026 04:30:00 +0000</pubDate>
      <link>https://dev.to/thamindudev/how-to-scale-claude-code-with-an-mcp-gateway-centralize-tools-and-control-costs-22na</link>
      <guid>https://dev.to/thamindudev/how-to-scale-claude-code-with-an-mcp-gateway-centralize-tools-and-control-costs-22na</guid>
      <description>&lt;p&gt;Hey developers, &lt;a class="mentioned-user" href="https://dev.to/thamindudev"&gt;@thamindudev&lt;/a&gt; here. If you have been utilizing Anthropic's Claude Code as your primary terminal agent, you already know how significantly it can accelerate your daily development workflows. However, as your team scales and your reliance on various LLM providers increases, you inevitably hit a wall. Managing multiple API keys, tracking erratic token costs, and maintaining a fragmented set of tools across different environments quickly turns into a logistical nightmare. This is where introducing a Model Context Protocol (MCP) Gateway, such as Bifrost, becomes a critical architectural decision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Technical Deep Dive: The Gateway Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An MCP Gateway acts as a dedicated control plane situated directly between your Claude Code terminal agent and your backend infrastructure, which includes both your LLM providers (OpenAI, Anthropic, Azure) and your various MCP servers. Instead of Claude Code establishing direct, unmonitored connections to these external services, all traffic is routed through the gateway.&lt;/p&gt;

&lt;p&gt;This architecture introduces a highly necessary layer of abstraction. For instance, instead of hardcoding provider logic within your local environment, you can configure the gateway to handle Multi-Provider Routing dynamically based on availability or cost parameters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Bash
# Conceptual: Initializing Claude Code to route through an MCP Gateway instead of direct API endpoints
export ANTHROPIC_API_KEY="mcp-gateway-token-xyz"
export MCP_GATEWAY_ENDPOINT="https://gateway.internal.corp/v1"

# The gateway intercepts the request, logs the intent, 
# applies budget policies, and routes to the cheapest/fastest LLM.
claude "Refactor the authentication module using the centralized auth tool"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The gateway handles the heavy lifting by maintaining a centralized Tool Registry. When Claude Code requests a specific tool execution, the gateway verifies permissions, resolves the tool endpoint, and proxies the execution securely.&lt;/p&gt;
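&lt;p&gt;In simplified, hypothetical form, that verify-resolve-proxy flow might look like the following (the registry shape and handler names are assumptions for illustration, not Bifrost's actual API):&lt;/p&gt;

```python
# Conceptual sketch of a gateway-side tool registry with permission checks.
# Registry shape and handler names are assumptions, not a real gateway API.

def read_ticket(args):
    """Stand-in for a real MCP tool endpoint behind the gateway."""
    return {"ticket": args["id"], "status": "open"}

# Central registry: tool name -&gt; required permission plus resolved handler
TOOL_REGISTRY = {
    "tickets.read": {"permission": "tickets:read", "handler": read_ticket},
}

def proxy_tool_call(identity_permissions, tool_name, args):
    """Verify permissions, resolve the tool endpoint, then proxy the execution."""
    entry = TOOL_REGISTRY.get(tool_name)
    if entry is None:
        return {"error": "unknown tool"}
    if entry["permission"] not in identity_permissions:
        return {"error": "forbidden"}  # a real gateway would also log the denial
    return entry["handler"](args)      # proxied execution

print(proxy_tool_call({"tickets:read"}, "tickets.read", {"id": "T-1"}))
# {'ticket': 'T-1', 'status': 'open'}
```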

&lt;p&gt;&lt;strong&gt;Developer Impact: Governance and Observability&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Implementing this pattern shifts your AI operations from a chaotic, decentralized state to a highly governed workflow. The primary impacts on your development team include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Cost &amp;amp; Budget Control: You can finally set hard limits on token expenditure per project or developer. The gateway tracks exact usage across all LLM providers, preventing unexpected billing surprises at the end of the month.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Seamless Multi-Provider Switching: If a specific model goes down or a better alternative is released, you update the routing logic at the gateway level. Your local Claude Code configuration remains entirely unchanged.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Comprehensive Logging: Every prompt, tool execution, and LLM response is logged centrally. This observability is vital for debugging complex agentic workflows and ensuring compliance with internal security policies.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Scaling AI terminal agents like Claude Code requires more than just distributing licenses; it requires robust infrastructure. By leveraging an MCP Gateway, you abstract the complexity of LLM management, enforce strict cost controls, and provide a unified, secure tool registry for your entire engineering team.&lt;/p&gt;

&lt;p&gt;Have you started integrating MCP Gateways into your AI workflows yet, or are you still relying on direct API connections? Let's discuss in the comments below!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>softwaredevelopment</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>Decoding the Visual Architecture of Gemini AI: Gradients, Motion, and Trust</title>
      <dc:creator>Thamindu Hatharasinghe</dc:creator>
      <pubDate>Sat, 07 Mar 2026 04:30:00 +0000</pubDate>
      <link>https://dev.to/thamindudev/decoding-the-visual-architecture-of-gemini-ai-gradients-motion-and-trust-io2</link>
      <guid>https://dev.to/thamindudev/decoding-the-visual-architecture-of-gemini-ai-gradients-motion-and-trust-io2</guid>
      <description>&lt;p&gt;AI isn't just about massive parameter counts and backend APIs anymore; it's heavily about how humans interface with constantly evolving, non-linear machine logic. Google’s design team recently unveiled the visual design system behind Gemini AI, and it provides a masterclass in UI/UX architecture. As developers, we often focus on response latency and token limits, but the frontend presentation—how an AI communicates its "thinking" state—is what ultimately builds user trust. Let's break down the mechanics of Gemini's dynamic visual language.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Technical Deep Dive: Beyond Static Components&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At the core of Gemini's frontend is a complete departure from static UI components. The system relies heavily on directional gradients and foundational circular shapes. Instead of rendering a traditional loading spinner, Gemini utilizes purposeful animations and sharp leading edges within gradients to indicate the directional flow of data and energy.&lt;/p&gt;

&lt;p&gt;Google drew inspiration from their design heritage, specifically leveraging the negative space of circles to convey harmony and comfort. In code, achieving this fluid, amorphous gradient state without burning excessive GPU cycles requires highly optimized CSS and potentially WebGL for complex states. Here is a conceptual representation of how you might structure such an active thinking state in CSS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CSS
/* Conceptual representation of an AI 'thinking' gradient */
.gemini-gradient-container {
  background: radial-gradient(circle at 50% 50%, rgba(66, 133, 244, 0.8), transparent 70%);
  animation: pulse-synthesis 3s infinite cubic-bezier(0.4, 0, 0.2, 1);
  border-radius: 50%;
  filter: blur(12px);
  will-change: transform, opacity;
}

@keyframes pulse-synthesis {
  0% { transform: scale(0.95); opacity: 0.7; }
  50% { transform: scale(1.05); opacity: 1; }
  100% { transform: scale(0.95); opacity: 0.7; }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Developer Impact&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What does this mean for those of us building AI-integrated applications? The key takeaway is the concept of "softness in the face of change." When your application's output is generative and inherently unpredictable, the UI must compensate by being approachable and familiar.&lt;/p&gt;

&lt;p&gt;Google draws a direct parallel to Susan Kare's pioneering work on the original Macintosh—translating abstract machine logic into human-friendly visual metaphors. If you are building an AI agent, a chatbot, or integrating LLMs into existing SaaS workflows, relying on static text boxes isn't enough anymore. You need to implement responsive motion that maps directly to the AI's processing lifecycle (listening, analyzing, synthesizing, and responding) to make the complex processes transparent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
Designing for AI is fundamentally different from traditional CRUD app design. The interface itself must feel alive, adaptable, and inherently trustworthy. The shift from rigid layouts to fluid, motion-driven states is the next big leap in front-end architecture. How are you handling loading states and "AI thinking" visual cues in your current projects? Let's discuss in the comments below!&lt;/p&gt;

</description>
      <category>frontend</category>
      <category>ai</category>
      <category>ui</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Mastering the Core: Why Fundamentals Beat Frameworks Every Time</title>
      <dc:creator>Thamindu Hatharasinghe</dc:creator>
      <pubDate>Fri, 06 Mar 2026 13:25:19 +0000</pubDate>
      <link>https://dev.to/thamindudev/mastering-the-core-why-fundamentals-beat-frameworks-every-time-28g</link>
      <guid>https://dev.to/thamindudev/mastering-the-core-why-fundamentals-beat-frameworks-every-time-28g</guid>
      <description>&lt;p&gt;Hello devs, &lt;a class="mentioned-user" href="https://dev.to/thamindudev"&gt;@thamindudev&lt;/a&gt; here.&lt;/p&gt;

&lt;p&gt;If you look at the JavaScript ecosystem or backend tooling today, it feels like a new framework is born every week. We spend countless hours migrating from React to Next.js, figuring out Nuxt for Vue, or debating whether Angular is making a comeback. But while we are busy chasing the shiny new tools, we often neglect the bedrock of our profession: Computer Science Fundamentals.&lt;/p&gt;

&lt;p&gt;The reality is that frameworks are ephemeral, but fundamentals are eternal. Let's dive into why betting on the core concepts is the best investment you can make for your engineering career.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Illusion of the Framework Developer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you only learn a framework, you are essentially learning a highly opinionated API wrapper around core technologies. It feels incredibly productive at first. You can spin up a routing system, manage state, and deploy an application in minutes.&lt;/p&gt;

&lt;p&gt;But what happens when the framework abstracts away a performance bottleneck? If you don't understand how the DOM actually works, React's Virtual DOM reconciliation just feels like magic—until your application grinds to a halt due to unnecessary re-renders.&lt;/p&gt;

&lt;p&gt;Consider a scenario where you need to look up a value in a massive dataset. A developer heavily reliant on libraries might just reach for a heavy array method or a third-party utility. A developer who understands fundamental Data Structures will instantly recognize that a Hash Map (or a JavaScript Set/Map) changes the time complexity from O(n) to O(1).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// The Framework/Library dependent way (O(n) time complexity)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Alice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Bob&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}];&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;findUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// The Fundamental Data Structure way (O(1) time complexity)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userMap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Alice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Bob&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;findUserOptimized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;userMap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI, DevOps, and the Shift in Engineering&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As full-stack developers and system administrators, we are seeing AI tools write our boilerplate code faster than ever before. AI can easily scaffold a React component or an Express server. However, what AI struggles with is high-level System Architecture and complex Database Design.&lt;/p&gt;

&lt;p&gt;If you know how relational databases handle indexing, how B-Trees work under the hood, or how TCP/IP handshakes affect your microservices latency, you transition from being a "coder" to an "architect." Frameworks teach you how to write an app. Fundamentals teach you why an app scales, fails, or gets compromised.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Developer Impact&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Focusing on fundamentals directly impacts your daily workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Debugging: You debug faster because you understand the engine, not just the steering wheel.&lt;/li&gt;
&lt;li&gt;Adaptability: When your company inevitably switches from Framework A to Framework B, your transition takes days, not months.&lt;/li&gt;
&lt;li&gt;System Design: You make better technical decisions when planning features, naturally avoiding technical debt.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What's your take on this? Should junior developers still start with building raw HTML/JS and fundamental logic, or dive straight into frameworks to stay motivated? Let's discuss in the comments below!&lt;/p&gt;

</description>
      <category>softwareengineering</category>
      <category>webdev</category>
      <category>programming</category>
      <category>career</category>
    </item>
    <item>
      <title>Your AI App Will Never Crash Again: Building High Availability with LiteLLM</title>
      <dc:creator>Thamindu Hatharasinghe</dc:creator>
      <pubDate>Thu, 05 Mar 2026 05:28:26 +0000</pubDate>
      <link>https://dev.to/thamindudev/your-ai-app-will-never-crash-again-building-high-availability-with-litellm-5hac</link>
      <guid>https://dev.to/thamindudev/your-ai-app-will-never-crash-again-building-high-availability-with-litellm-5hac</guid>
      <description>&lt;p&gt;If there is one absolute truth in software development, it is that external dependencies will eventually fail. When building full-stack applications powered by Large Language Models (LLMs), tying your entire architecture to a single API provider like OpenAI introduces a massive single point of failure. If their servers go down, or you hit an unexpected rate limit, your application crashes.&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;LiteLLM&lt;/strong&gt;, a 100% open-source AI gateway that fundamentally changes how we handle AI API integrations. With over 33.8K stars on GitHub, it serves as a universal proxy, allowing you to seamlessly swap between OpenAI, Anthropic, Gemini, and over 100 other models.&lt;/p&gt;

&lt;h2&gt;The Architecture of Resilience: Automatic Fallback Routing&lt;/h2&gt;

&lt;p&gt;The standout feature of LiteLLM is its built-in router, designed specifically for high availability (HA). It allows you to define fallback mechanisms directly in your code or via a centralized proxy server.&lt;/p&gt;

&lt;p&gt;If a primary request to OpenAI times out or returns a &lt;code&gt;500 Internal Server Error&lt;/code&gt;, LiteLLM instantly intercepts the failure and routes the exact same prompt to a designated secondary model (like Claude 3 or Gemini 1.5 Pro). Your users experience slightly higher latency, but they never see a crash screen.&lt;/p&gt;

&lt;h2&gt;Implementation Example&lt;/h2&gt;

&lt;p&gt;Here is how you can set up a robust routing mechanism using the Python SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;litellm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Router&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# Define your available models and credentials
&lt;/span&gt;&lt;span class="n"&gt;model_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;litellm_params&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;litellm_params&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3-opus-20240229&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the router with fallback logic
&lt;/span&gt;&lt;span class="n"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_list&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model_list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;fallbacks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}],&lt;/span&gt; &lt;span class="c1"&gt;# If gpt-4o fails, use claude-3
&lt;/span&gt;    &lt;span class="n"&gt;num_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Execute the completion
&lt;/span&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explain Kubernetes ingress controllers.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All routed attempts failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Why This Matters for Developers and DevOps&lt;/h2&gt;

&lt;p&gt;Integrating LiteLLM isn't just about preventing downtime; it streamlines the entire development lifecycle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Universal API Standard:&lt;/strong&gt; You no longer need to write custom wrapper classes for different SDKs. LiteLLM standardizes everything into the OpenAI API format. You write your prompt logic once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero Vendor Lock-in:&lt;/strong&gt; Want to migrate your entire production system from OpenAI to Anthropic? With LiteLLM, it's a configuration change, not a massive code refactor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Optimization &amp;amp; Load Balancing:&lt;/strong&gt; You can route simpler queries to cheaper, faster models (like Gemini Flash) and reserve heavy logical tasks for GPT-4o, effectively managing your API budgets dynamically.&lt;/li&gt;
&lt;/ul&gt;
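&lt;p&gt;As a rough sketch of that "configuration change, not a refactor" point: the snippet below (illustrative model names and environment-variable names, not a definitive setup) registers two providers under a single alias. Application code only ever references the alias, so migrating providers means editing the list, not the call sites:&lt;/p&gt;

```python
# Sketch: provider choice lives in configuration, not in application code.
# Model names and env-var names here are illustrative.
import os

model_list = [
    {
        "model_name": "primary",  # alias used by application code
        "litellm_params": {
            "model": "gpt-4o",
            "api_key": os.environ.get("OPENAI_API_KEY"),
        },
    },
    {
        "model_name": "primary",  # same alias, different provider
        "litellm_params": {
            "model": "claude-3-opus-20240229",
            "api_key": os.environ.get("ANTHROPIC_API_KEY"),
        },
    },
]

# Every entry resolves to the same alias, so swapping or adding a
# provider never touches the code that calls router.completion().
aliases = {entry["model_name"] for entry in model_list}
print(aliases)
```

&lt;p&gt;With a setup like this, the &lt;code&gt;router.completion(model="primary", ...)&lt;/code&gt; call from earlier stays identical no matter which provider backs the alias.&lt;/p&gt;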

&lt;h2&gt;Final Thoughts&lt;/h2&gt;

&lt;p&gt;As AI integration becomes a standard requirement in web and system development, treating LLM endpoints with the same rigorous infrastructure standards as databases or microservices is non-negotiable. LiteLLM provides the missing infrastructure layer to make your AI apps enterprise-ready.&lt;/p&gt;

&lt;p&gt;Have you started implementing fallback strategies for your AI integrations yet? Let me know your approach in the comments below!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>architecture</category>
      <category>python</category>
    </item>
  </channel>
</rss>
