DEV Community: Apurba Singh

Traffic vs Truth: Building a Global Auction Platform with DynamoDB and Aurora DSQL

Apurba Singh — Wed, 17 Jun 2026 21:49:54 +0000

This article was created as part of my submission to the H0: Hack the Zero Stack with Vercel v0 and AWS Databases Hackathon.

The Core Philosophy

Over the years I've worked on systems that experienced unpredictable traffic spikes, from consumer-facing platforms to real-time security monitoring systems.

One pattern kept appearing.

Distributed systems often treat every incoming event as if it deserves the same level of consistency, durability, and global coordination.

In reality, most events are temporary.

A bid is a perfect example.

Thousands of bids may arrive for the same advertising slot. Most are evaluated and discarded. Only one eventually becomes the winning settlement that affects a financial ledger.

This led to a simple design philosophy:

Traffic and truth are different data products.

Not every event deserves global consistency.

Quant Edge Exchange was built to explore what happens when we separate those responsibilities and allow each database to focus on the workload it handles best.

The Problem

Imagine a global advertising exchange.

us-east-1        → $5.20
eu-west-1        → $5.30
ap-southeast-1   → $5.40
sa-east-1        → $5.35

Thousands of bids arrive.

Most are temporary.

Only one becomes authoritative.

Winner = ap-southeast-1
Amount = $5.40

The winning settlement becomes part of a financial ledger and must remain globally consistent.
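The selection in the example above can be sketched in a few lines. This is a minimal illustration that assumes the winner is simply the highest amount; `RegionalBid`, `pickWinner`, and the tie-break rule are hypothetical names and choices, and a real evaluator would likely also weigh fields such as latency and quality score.

```typescript
interface RegionalBid {
  region: string;
  amountUsd: number;
}

// Returns the highest bid, or undefined when no bids were received.
// Ties keep the earliest bid in input order (an assumed tie-break rule).
function pickWinner(bids: readonly RegionalBid[]): RegionalBid | undefined {
  let best: RegionalBid | undefined;
  for (const bid of bids) {
    if (best === undefined || bid.amountUsd > best.amountUsd) {
      best = bid;
    }
  }
  return best;
}

const sampleBids: RegionalBid[] = [
  { region: "us-east-1", amountUsd: 5.2 },
  { region: "eu-west-1", amountUsd: 5.3 },
  { region: "ap-southeast-1", amountUsd: 5.4 },
  { region: "sa-east-1", amountUsd: 5.35 },
];

const winningBid = pickWinner(sampleBids);
console.log(winningBid); // the ap-southeast-1 bid at 5.4
```

Only `winningBid` needs to reach the ledger; the other three bids can stay in the ingestion layer and expire.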

The question became:

Should all 10,000 bids pay the same coordination cost as the single winning settlement?

My answer was no.

Architecture Overview

Transient Bid Traffic
        ↓
     DynamoDB
        ↓
 Bid Evaluation
        ↓
 Authoritative Outcome
        ↓
   Aurora DSQL
        ↓
 OCC + Full Jitter
        ↓
 Financial Truth
        ↓
 Observability Layer

The architecture follows one rule:

The cost of coordination should match the value of the data.

Why DynamoDB?

DynamoDB became the ingestion layer.

The platform stores bid events using a single-table design.

PK = SLOT#<slotId>
SK = REGION#<region>#BID#<bidId>

Example:

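import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";

// The document client marshals plain JavaScript values into DynamoDB attributes.
const dynamodb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

await dynamodb.send(
  new PutCommand({
    TableName: "bid_events",
    Item: {
      PK: `SLOT#${slotId}`,
      SK: `REGION#${region}#BID#${bidId}`,
      bidAmount,
      latencyMs,
      qualityScore,
      timestamp: new Date().toISOString(),
      // TTL must be a Unix epoch timestamp in seconds.
      ttl: expiryTime,
    },
  })
);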

This allows:

High-throughput writes
Regional traffic aggregation
Hot slot detection
TTL cleanup
Analytics without joins

DynamoDB acts as a high-speed ingestion buffer.

Its job is not to determine truth.

Its job is to absorb traffic.
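The key layout can be captured in a few pure helpers. This is a sketch under assumptions: the helper names (`bidEventKey`, `regionSortKeyPrefix`, `ttlEpochSeconds`) and the sample ids are hypothetical and are not taken from the project's code.

```typescript
interface BidEventKey {
  PK: string;
  SK: string;
}

// Builds the composite key described above: one partition per slot,
// with the region and bid id encoded in the sort key.
function bidEventKey(slotId: string, region: string, bidId: string): BidEventKey {
  return {
    PK: `SLOT#${slotId}`,
    SK: `REGION#${region}#BID#${bidId}`,
  };
}

// Sort-key prefix for a begins_with condition that selects one region's
// bids inside a slot partition.
function regionSortKeyPrefix(region: string): string {
  return `REGION#${region}#BID#`;
}

// DynamoDB TTL attributes hold a Unix epoch time in seconds,
// not milliseconds and not an ISO string.
function ttlEpochSeconds(now: Date, retentionSeconds: number): number {
  return Math.floor(now.getTime() / 1000) + retentionSeconds;
}

const exampleKey = bidEventKey("slot_123", "ap-southeast-1", "bid_42");
console.log(exampleKey.PK); // SLOT#slot_123
console.log(exampleKey.SK.startsWith(regionSortKeyPrefix("ap-southeast-1"))); // true
```

Because the region comes first in the sort key, all bids for one slot and region sit next to each other, which is what makes regional aggregation a single prefix query.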

Why Aurora DSQL?

Eventually an auction closes.

The data is no longer traffic.

It becomes truth.

Aurora DSQL stores the authoritative state of the platform.

CREATE TABLE settlements (
    settlement_id UUID PRIMARY KEY,
    slot_id UUID NOT NULL,
    winner_account_id UUID NOT NULL,
    winning_bid NUMERIC(18,4),
    settled_at TIMESTAMPTZ
);

Other core tables include:

enterprise_accounts
ad_slots
ad_bids
settlements
financial_ledger
conflict_events
simulation_runs

These represent:

Winning bids
Settlement history
Account balances
Financial audit trails
Conflict telemetry

Only events that become business truth cross this boundary.

Why Vercel and Next.js?

The database architecture was the primary focus, but I also wanted to explore how globally distributed application runtimes interact with globally distributed databases.

Quant Edge Exchange was built using Next.js and deployed on Vercel.

The frontend dashboards, simulation engine, ingestion analytics, settlement telemetry, and ledger monitoring all run through server-side API routes.

One challenge was avoiding database initialization during build time.

Instead of creating clients globally, services are loaded lazily at runtime.

export async function getDsqlClient() {
  const { pool } = await import("./dsql-client");
  return pool;
}

This allowed:

Successful Vercel builds
Runtime-only AWS connections
Clean separation of UI and infrastructure
Centralized database observability
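One way to keep initialization out of the build step is a small memoizing wrapper, sketched below. `lazyOnce`, `FakePool`, and `getPool` are hypothetical names used for illustration; the stand-in factory only counts how often it runs and does not open a real connection.

```typescript
// Wraps a factory so it runs at most once, on first use, instead of at import time.
function lazyOnce<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = factory();
    }
    return cached;
  };
}

// Hypothetical stand-in for a real connection pool.
interface FakePool {
  id: number;
}

let poolsCreated = 0;

const getPool = lazyOnce<FakePool>(async () => {
  poolsCreated += 1;
  return { id: poolsCreated };
});
```

Nothing is created until a request handler first calls `getPool()`, and every later call reuses the same promise. One trade-off of this simple version is that a failed first attempt stays cached, so a production variant would probably clear the cache on rejection.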

The frontend visualizes the system.

The backend owns ingestion, settlement, conflict handling, and telemetry.

The Interesting Part: Settlement Contention

Ingestion is easy.

Settlement is where distributed systems become interesting.

Imagine two workers trying to settle the same auction.

Worker A
settle(slot_123)

Worker B
settle(slot_123)

Without coordination, duplicate settlements become possible.

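Aurora DSQL protects correctness through optimistic concurrency control over snapshot-isolated transactions: conflicting writes to the same rows are detected at commit time. For this to catch duplicate settlements, both workers have to write the same row, for example a settlement keyed by slot.

Under contention, one transaction commits while the other is rejected with a serialization conflict and has to be retried.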

Rather than treating this as failure, Quant Edge Exchange treats contention as a normal distributed systems event.

The platform implements:

Optimistic Concurrency Control (OCC)
Exponential Backoff
Full Jitter Retry

Example retry logic:

const delay =
  Math.random() *
  (BASE_DELAY * Math.pow(2, attempt));

await sleep(delay);

Example execution:

Attempt #1
❌ Serialization Conflict

Attempt #2
❌ Serialization Conflict

Attempt #3
✅ Settlement Accepted

This prevents synchronized retry storms while preserving transactional correctness.
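A fuller version of the retry loop might look like the sketch below. It rests on several assumptions: that the driver surfaces conflicts as errors carrying SQLSTATE `40001` in a `code` property, that the whole transaction is re-run on each attempt, and that a delay cap is wanted. `withOccRetry`, `fullJitterDelay`, and the option names are hypothetical.

```typescript
// SQLSTATE 40001 is the PostgreSQL serialization_failure code. Treating it as the
// retryable conflict signal is an assumption about how the driver reports conflicts.
const SERIALIZATION_FAILURE = "40001";

interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  random?: () => number; // injectable for tests
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

// Full jitter: a uniformly random delay in [0, min(cap, base * 2^attempt)).
function fullJitterDelay(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number
): number {
  return random() * Math.min(capMs, baseMs * Math.pow(2, attempt));
}

function isSerializationConflict(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === SERIALIZATION_FAILURE
  );
}

// Re-runs the whole operation on conflict; any other error is rethrown immediately.
async function withOccRetry<T>(
  operation: () => Promise<T>,
  opts: RetryOptions
): Promise<T> {
  const random = opts.random ?? Math.random;
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const lastAttempt = attempt + 1 >= opts.maxAttempts;
      if (lastAttempt || !isSerializationConflict(err)) {
        throw err;
      }
      await sleep(fullJitterDelay(attempt, opts.baseDelayMs, opts.maxDelayMs, random));
    }
  }
}
```

The operation passed in should open, run, and commit a fresh transaction each time, since a transaction that hit a conflict cannot be resumed. Capping the delay keeps late attempts from waiting arbitrarily long.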

Observability

Most demos stop after a successful write.

I wanted to expose what was happening underneath.

The platform includes:

Ingestion Analytics Dashboard
Settlement Monitoring
Financial Ledger Activity
Regional Traffic Distribution
Winning Region Analysis
OCC Conflict Telemetry

Example conflict metrics collected:

{
  conflicts: 12,
  retries: 31,
  resolved: 12,
  resolutionRate: 100
}

The goal was to make contention visible instead of hiding it.
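The relationship between those fields can be made explicit with a small helper. This is an illustrative sketch; `summarizeConflicts` and the convention of reporting 100 when there were no conflicts are assumptions, not the project's actual telemetry code.

```typescript
interface ConflictMetrics {
  conflicts: number;
  retries: number;
  resolved: number;
  resolutionRate: number; // percentage, 0-100
}

function summarizeConflicts(
  conflicts: number,
  retries: number,
  resolved: number
): ConflictMetrics {
  // With no conflicts there is nothing to resolve; reporting 100 is an assumed convention.
  const resolutionRate = conflicts === 0 ? 100 : (resolved / conflicts) * 100;
  return { conflicts, retries, resolved, resolutionRate };
}

console.log(summarizeConflicts(12, 31, 12).resolutionRate); // 100
```

In the sample above, 12 conflicts needed 31 retries in total, so the average conflicted settlement was retried between two and three times before it was accepted.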

What I Learned

Building Quant Edge Exchange reinforced a lesson I had already seen in production systems.

The fastest architecture is not the one with the biggest database.

The fastest architecture is the one that assigns the correct consistency model to the correct workload.

DynamoDB and Aurora DSQL are often compared against each other.

After building this project, I think they are strongest when used together.

DynamoDB absorbs traffic.

Aurora DSQL protects truth.

Everything else becomes easier to reason about.

Try It Yourself

Demo:
https://quant-edge-exchange.vercel.app/

Devpost:
https://devpost.com/software/quant-edge-exchange

GitHub:
https://github.com/apurba-labs/quant-edge-exchange

Traffic is cheap.

Truth is expensive.

The architecture should know the difference.

Karate Platform: Finishing the Tournament Experience

Apurba Singh — Fri, 05 Jun 2026 09:41:07 +0000

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

Karate Platform is a tournament experience platform designed for martial arts competitions.

The project helps spectators, athletes, coaches, and organizers stay connected through:

🔴 Live Tournament Dashboard
🥋 Live Match Arena
🏆 Tournament Brackets
🏢 Tournament Center
Real-time match visualization
Spectator-focused tournament experience

Links

👉 Live Demo: https://karate-platform.vercel.app/

👉 GitHub Repository: https://github.com/apurba-labs/karate-platform

The Comeback Story

🏆 Completion Arc: The Before & After Journey

Before the Challenge

This repository sat unfinished for months.

My original vision was a much larger SaaS platform covering athlete registration, dojo management, tournament operations, rankings, examinations, and event administration.

Ironically, the biggest blocker was not backend development.

Database schemas, CRUD operations, APIs, and authentication are familiar engineering tasks that can be implemented incrementally. What repeatedly stopped the project was the presentation layer.

I had individual components, partial database models, and unfinished management screens, but I could not clearly visualize how a spectator, parent, coach, or organizer would actually experience a live tournament.

After the Challenge

The GitHub Finish-up-a-thon helped me rethink the project from the user's perspective.

Instead of focusing on unfinished backend features, I focused entirely on the tournament journey:

Home Page → Live Tournament Dashboard → Live Match Arena → Brackets → Tournament Center

That shift became the breakthrough.

Once the user experience was defined, the project transformed from a collection of disconnected screens into a cohesive product with a clear purpose and direction.

What the Finish-up-a-thon Helped Me Finish

During the challenge I:

Migrated the frontend experience to Tailwind CSS
Rebuilt the Home Page
Created the Live Tournament Dashboard
Built the Live Match Arena experience
Added interactive bracket visualization
Added a Tournament Center for organizers
Connected the entire tournament journey together
Deployed the project publicly

🛠️ Use of Underlying Technology

The project uses modern frontend technologies to simulate a live tournament environment.

Key implementations include:

React and TypeScript for component-driven development
Tailwind CSS for responsive and modern UI design
React Router dynamic routing for arena-specific navigation (/live-match/:ringId)
State-driven live match simulation for scores, penalties, timers, and event feeds
Browser-native Web Audio API integration for referee whistle effects and match feedback

These technologies allowed me to rapidly prototype and validate the tournament experience before investing in a larger backend infrastructure.
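A state-driven simulation like this is often modeled as a reducer. The sketch below is an assumption about the shape, not the project's actual code: `MatchState`, `MatchAction`, and `matchReducer` are hypothetical names, and the point values follow standard kumite scoring (yuko 1, waza-ari 2, ippon 3).

```typescript
type Competitor = "AKA" | "AO";

// Kumite scoring values: yuko = 1, waza-ari = 2, ippon = 3.
const SCORE_POINTS = { yuko: 1, "waza-ari": 2, ippon: 3 } as const;
type ScoreKind = keyof typeof SCORE_POINTS;

interface MatchState {
  scores: Record<Competitor, number>;
  penalties: Record<Competitor, number>;
  secondsLeft: number;
  feed: string[];
}

type MatchAction =
  | { type: "score"; competitor: Competitor; kind: ScoreKind }
  | { type: "penalty"; competitor: Competitor }
  | { type: "tick" };

function initialMatchState(seconds: number): MatchState {
  return {
    scores: { AKA: 0, AO: 0 },
    penalties: { AKA: 0, AO: 0 },
    secondsLeft: seconds,
    feed: [],
  };
}

// Pure transition function: returns a new state and never mutates the old one.
function matchReducer(state: MatchState, action: MatchAction): MatchState {
  switch (action.type) {
    case "score":
      return {
        ...state,
        scores: {
          ...state.scores,
          [action.competitor]:
            state.scores[action.competitor] + SCORE_POINTS[action.kind],
        },
        feed: [...state.feed, `${action.competitor} scores ${action.kind}`],
      };
    case "penalty":
      return {
        ...state,
        penalties: {
          ...state.penalties,
          [action.competitor]: state.penalties[action.competitor] + 1,
        },
        feed: [...state.feed, `${action.competitor} receives a penalty`],
      };
    case "tick":
      return { ...state, secondsLeft: Math.max(0, state.secondsLeft - 1) };
  }
}

const demoActions: MatchAction[] = [
  { type: "score", competitor: "AKA", kind: "ippon" },
  { type: "tick" },
  { type: "score", competitor: "AO", kind: "yuko" },
  { type: "penalty", competitor: "AO" },
];

const demoMatch = demoActions.reduce(matchReducer, initialMatchState(180));
console.log(demoMatch.scores); // AKA: 3, AO: 1
```

The function has the `(state, action) => state` shape that React's `useReducer` expects, so the same logic could later be driven by real WebSocket events instead of a simulated timer.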

📱 Usability, User Experience & Creativity

The inspiration came from a real-world challenge faced by many parents, guardians, coaches, and supporters who cannot always attend tournaments in person.

The interface was designed around a simple goal:

Help someone follow a tournament remotely without feeling disconnected from the event.

Key UX decisions include:

High-contrast AKA (Red) and AO (Blue) competitor layouts inspired by real karate scoring systems
Dedicated Live Arena experience for following a specific match
Real-time event feed and match timer
Visual bracket progression for quick tournament tracking
Mobile-friendly responsive layouts
Dashboard-first navigation for quick access to tournament activity

Rather than focusing on administrative workflows, I prioritized the spectator experience and designed the platform around the emotional journey of following a live competition.

One of the most important improvements during the Finish-up-a-thon was transforming a collection of disconnected pages into a complete tournament flow:

Home Page → Live Tournament Dashboard → Live Match Arena → Brackets → Tournament Center

This created a much more intuitive and engaging experience for both spectators and organizers while giving the project a clear identity and direction.

⭐ What I'm Most Proud Of

The Live Match Arena.

This feature represents the original vision that inspired the project.

A spectator can move from the tournament dashboard into a dedicated arena view, follow a specific match, monitor scores and penalties, and experience a focused view of tournament action.

For the first time, I can clearly see how the platform could evolve into a complete tournament solution.

My Experience with GitHub Copilot

GitHub Copilot became much more than a code completion tool during this project.

The biggest challenge wasn't writing backend APIs or database models—it was figuring out how to present a live martial arts tournament experience in a way that felt intuitive and engaging.

Throughout the Finish-up-a-thon, I used Copilot as a collaborative implementation partner. It helped me rapidly explore UI layouts, iterate on component structures, experiment with tournament dashboard concepts, and refine the overall user journey.

Some of the areas where Copilot had the biggest impact were:

Accelerating the Tailwind CSS migration
Prototyping the Live Tournament Dashboard
Building the Live Match Arena experience
Refining responsive layouts and UI components
Exploring bracket visualization approaches
Helping connect multiple screens into a cohesive tournament flow

What I found most valuable was the ability to quickly test ideas.

Instead of spending hours debating a layout or interaction pattern, I could explore multiple approaches, evaluate them, and continue iterating.

The result wasn't simply writing code faster—it was finally gaining enough momentum to complete a project that had been sitting unfinished for months.

For me, GitHub Copilot's biggest contribution was helping transform a collection of disconnected ideas into a finished experience that I could confidently deploy, demonstrate, and continue building after the challenge.

What's Next

Future development will focus on:

Real-time WebSocket updates
Live streaming integration
Athlete registration workflows
Automated bracket generation
Multi-dojo support
Tournament management tools
SaaS capabilities for martial arts organizations

The Finish-up-a-thon helped me solve the hardest problem: defining the tournament experience itself.

Now that the presentation layer and user journey are clear, the remaining engineering work feels significantly more achievable.

We Didn’t Want Another AI Wrapper — So We Explored a High-Speed Hermes Orchestrator for Engineering Crews

Apurba Singh — Mon, 25 May 2026 20:53:59 +0000

This is a submission for the Hermes Agent Challenge

Our goal was not to build another AI wrapper, but to explore how Hermes Agent behaves as a persistent orchestration layer coordinating specialized autonomous workers inside real engineering governance workflows.

Most AI systems today are still fundamentally single-threaded assistants wrapped inside nicer interfaces.
You type a prompt, the model responds, and the workflow ends there.

But our problem was different.

Over the last few years we worked closely with alumni groups, business operators, SaaS platforms, and community engineering teams. One recurring issue appeared everywhere:

People did not simply want AI-generated text.
They wanted workflow intelligence.

They wanted systems capable of:

coordinating technical tasks,
evaluating operational risks,
planning execution flows,
synthesizing structured engineering decisions,
and operating reliably across multiple autonomous workers.

That realization eventually led us toward Hermes Agent.

Not because we wanted another chatbot.

But because we wanted to explore orchestration.

The Core Idea

We started asking ourselves a simple question:

What happens when Hermes stops behaving like a conversational assistant and starts behaving like a managerial orchestration layer?

That question became the foundation of our experiment.

The result was Gotihub Hermes Crew.

The name itself carries the philosophy behind the project.

Gotihub is derived from the Bengali word Goti (গতি), meaning Speed.

We wanted to explore whether autonomous engineering workers could coordinate quickly, reliably, and structurally inside real governance workflows.

The result became a high-speed multi-agent engineering orchestration system capable of analyzing GitHub repositories through specialized autonomous workers coordinated by Hermes.

Project Links

Live Demo

https://crew.gotihub.com

GitHub Repository

https://github.com/apurba-labs/gotihub-hermes-crew

Why We Didn’t Want a Single Monolithic Agent

One massive prompt window handling:

security analysis,
architecture auditing,
roadmap planning,
and executive synthesis

quickly becomes expensive, unstable, and difficult to govern.

So instead of forcing one model to think about everything simultaneously, we separated:

Execution from Governance

Execution Layer

Specialized Gemma workers execute focused engineering tasks independently.

Governance Layer

Hermes coordinates, synthesizes, and manages the outputs generated by those workers.

That separation became the most important architectural decision in the project.

The Multi-Agent Architecture

Our orchestration pipeline has four components, run in three stages:

SecurityAgent performs repository security analysis.
ArchitectureAgent evaluates structural and maintainability health.
PlanningAgent generates engineering roadmap recommendations.
Hermes Master synthesizes everything into a structured managerial report.

The important detail is that the first stage executes concurrently.

We intentionally used Python’s native asynchronous execution model instead of sequential blocking pipelines.

Stage 1 Concurrency with `asyncio.gather`

The first orchestration layer launches multiple specialized workers simultaneously:

SecurityAgent
ArchitectureAgent

Both execute inside an asyncio.gather() orchestration block.

This allowed us to explore:

concurrent repository analysis,
isolated engineering responsibilities,
and structured task specialization.

Instead of treating AI as a single giant context window, we treated it like a coordinated engineering crew.

System Workflow Architecture

The orchestration workflow powering the system is intentionally separated into:

concurrent execution,
planning synthesis,
and executive orchestration.

This structure allowed us to keep responsibilities isolated while still producing a consolidated engineering report.

Hermes as the Orchestrator

This is where Hermes became genuinely interesting.

Hermes does not directly parse raw repositories in our architecture.

Instead, Hermes behaves like a managerial synthesis layer.

The worker agents generate:

summaries,
issue reports,
confidence scores,
engineering recommendations.

Hermes then:

resolves overlap,
synthesizes cross-agent conclusions,
generates executive summaries,
and produces structured JSON outputs.

In other words:

The workers execute.
Hermes governs.

That orchestration philosophy changed how we approached agent systems entirely.

Multi-Subdomain Infrastructure Design

As the system evolved, we realized orchestration architecture alone was not enough.

We also needed infrastructure separation.

So we deployed the ecosystem using multiple subdomains and isolated routing layers:

gotihub.com → corporate site
agl.gotihub.com → SaaS engine
crew.gotihub.com → Hermes orchestration platform

Behind the scenes:

FastAPI handled orchestration,
Docker managed runtime isolation,
Nginx routed ingress traffic,
Ollama powered local inference,
and Hermes coordinated the synthesis layer.

Most importantly:

The inference backbone was never exposed directly to the public internet.

Internal AI Backbone Architecture

The deployment topology evolved into something closer to a lightweight orchestration mesh.

This allowed multiple services to share:

one centralized inference core,
isolated application routing,
and internal-only AI communication.

Real Engineering Problems We Hit

This project was not smooth.

And honestly, that’s where most of the learning happened.

The Local Compute Bottleneck

Our earliest orchestration runs were extremely slow.

One real telemetry session looked like this:

[TELEMETRY] GitHubLoader fetched 8 files in 5.91 seconds.

[Orchestrator] Starting Full Pipeline...
[TELEMETRY] Stage 1 took 218.68 seconds.
[TELEMETRY] Stage 2 took 72.19 seconds.
[TELEMETRY] Stage 3 took 120.18 seconds.

[TELEMETRY] Pipeline Complete! Total Runtime: 411.05 seconds.

The bottleneck was not orchestration.

It was:

oversized repository context,
local inference latency,
verbose prompt chains,
and massive token generation overhead.

That distinction mattered.

Because it meant the architecture itself was scalable — but inference strategy needed optimization.

What We Optimized

We eventually began improving runtime by:

reducing repository context size,
prioritizing critical engineering files,
limiting unnecessary token generation,
shrinking synthesis payloads,
and improving async orchestration boundaries.

The system became dramatically more stable once we stopped treating every file equally.

Defensive Failure Engineering

One of the most important lessons came from structured output failures.

Large orchestration chains occasionally returned:

malformed JSON,
partial synthesis blocks,
or incomplete manager responses.

Instead of allowing pipeline collapse, we added:

fallback execution paths,
JSON cleanup layers,
defensive parsing,
and structured failure recovery.

That forced us to think less like prompt engineers and more like systems engineers.
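A cleanup layer of this kind can be sketched as a tolerant extractor that returns a structured failure instead of throwing. The project itself is written in Python; this TypeScript version is only an illustration to match the other examples in this feed, and `extractJsonObject` and `ParseResult` are hypothetical names.

```typescript
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; reason: string; raw: string };

// Pulls the first balanced {...} object out of model output that may be wrapped
// in prose or Markdown code fences, then parses it.
function extractJsonObject(raw: string): ParseResult<Record<string, unknown>> {
  const start = raw.indexOf("{");
  if (start === -1) {
    return { ok: false, reason: "no JSON object found", raw };
  }

  let depth = 0;
  let inString = false;
  let escaped = false;

  for (let i = start; i < raw.length; i++) {
    const ch = raw.charAt(i);
    if (inString) {
      // Braces inside string literals must not affect nesting depth.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) {
        try {
          const value = JSON.parse(raw.slice(start, i + 1)) as Record<string, unknown>;
          return { ok: true, value };
        } catch {
          return { ok: false, reason: "malformed JSON", raw };
        }
      }
    }
  }
  return { ok: false, reason: "truncated JSON object", raw };
}

const FENCE = "`".repeat(3);
const modelOutput =
  `Here is the report:\n${FENCE}json\n` +
  `{"summary": "2 issues {critical}", "confidence": 0.9}\n${FENCE}`;

console.log(extractJsonObject(modelOutput).ok); // true
```

Returning a failure value with the raw text attached lets the orchestrator choose a fallback path, such as re-prompting or emitting a partial report, rather than letting one bad response collapse the pipeline.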

Why Hermes Actually Worked Well

Frameworks like CrewAI are excellent for rapidly assembling conversational agent pipelines.

But our exploration focused on something slightly different:

persistent orchestration,
structured engineering outputs,
governance-oriented workflows,
and isolated worker responsibilities.

We wanted Hermes to operate less like a conversational assistant and more like an engineering coordination layer.

That distinction became the entire philosophy behind the project.

What Fascinated Us Most

The most interesting part was not whether AI could generate text.

It was whether autonomous workers could coordinate reliably inside real operational systems.

That changes the conversation entirely.

Instead of asking:

“Can AI answer questions?”

We started asking:

“Can AI workers collaborate responsibly inside engineering governance workflows?”

Hermes gave us a practical way to explore that future.

And honestly, that exploration became far more valuable than simply building another AI wrapper.

Built With

Hermes Agent
FastAPI
Python AsyncIO
Ollama
Gemma 3
Docker
Nginx
SQLite
Next.js

Final Thoughts

This project is still evolving.

We are actively optimizing:

orchestration runtime,
inference efficiency,
streaming telemetry,
structured synthesis,
and governance reliability.

But the biggest thing we learned was this:

Autonomous systems become genuinely interesting when they stop behaving like isolated chatbots and start behaving like coordinated engineering workers.

That is the future we wanted to explore with Hermes.

And we are excited to continue building toward it.

GotiHub AGL — Building Governance-First AI Workflows with Local Gemma 4

Apurba Singh — Tue, 19 May 2026 17:47:04 +0000

This is a submission for the Gemma 4 Challenge: Write About Gemma 4

AI can recommend. Governance decides. ⚖️

A governance-first institutional workflow platform powered by localized Gemma 4 reasoning and privacy-preserving verification.

🚀 What I Built

I built GotiHub AGL, a governance-first AI workflow platform designed for high-trust institutional operations like:

alumni verification
compliance approvals
governance reviews
sensitive administrative workflows

Instead of giving AI autonomous authority, the platform keeps humans inside the decision loop while allowing Gemma 4 to perform localized reasoning, risk analysis, and workflow orchestration.

The system runs fully local using Gemma 4 via Ollama, ensuring sensitive institutional data never leaves the organization's infrastructure.

💡 Inspiration

Many institutions still rely on:

spreadsheets
fragmented approval chains
manual phone verification
disconnected audit systems

At the same time, organizations want to adopt AI — but they are uncomfortable sending confidential internal records to external cloud providers.

That inspired one central question:

What if AI reasoning could stay local, governance could remain human-controlled, and institutional verification could become cryptographically auditable?

That became the foundation of GotiHub AGL.

🏛️ How Gemma 4 Helps Trusted Communities

Many long-standing institutions — schools, alumni associations, NGOs, cooperatives, and local governance groups — still depend heavily on manual trust systems built over decades.

These communities often face intense, everyday operational challenges:

Verifying historical member records
Approving sensitive financial requests
Validating multi-decade alumni credential rolls
Handling community donation approvals
Preventing duplicate claims or suspicious submissions
Preserving member privacy while maintaining absolute accountability

Traditional cloud AI solutions create an immediate trust roadblock for these organizations because sensitive institutional records must leave their physical control and pass through external commercial APIs.

For institutional compliance, data leakage is an unacceptable risk.

Google's Gemma 4 completely changed that for us.

By running Gemma 4 locally inside a containerized workspace, GotiHub AGL allows community organizations to introduce frontier-level AI-assisted governance while keeping institutional data fully inside their own self-hosted infrastructure.

🚀 The Local Institutional Workflow

👉 An alumni secretary submits a verification request.

👉 Gemma 4 locally reviews historical inconsistencies and checks policy parameters.

👉 Low-risk files auto-route to immediate micro-payment or clearance hooks.

👉 High-risk anomalies escalate automatically to senior committee members via a Filament UI.

👉 Approved workflows generate cryptographically sealed, auditable records.

👉 Sensitive community records NEVER leave the organization's server.

This creates a governance model where local intelligence strengthens trusted communities instead of attempting to replace human accountability.

🧠 Why Gemma 4 Worked So Well for This Project

Several deep architectural improvements inside the Gemma 4 family directly enabled us to build GotiHub AGL with enterprise reliability on limited-budget infrastructure.

1️⃣ Interleaved Hybrid Attention for Massive Records

Institutional workflows often involve processing:

long historical registries
multi-step approval documents
large verification chains

Traditionally, long-context evaluation exhausts server RAM because the Key-Value (KV) cache grows linearly with context length in every attention layer.

Gemma 4 addresses this with a hybrid interleaved attention mechanism, which alternates between:

Local Sliding Window layers
Global Attention layers

Combined with Proportional RoPE (p-RoPE), it drastically compresses the memory footprint.

This architectural breakthrough allowed our lightweight VPS nodes (running a standard 4GB swap space on Contabo infrastructure) to process extensive context windows without triggering Linux Out-Of-Memory (OOM) freezes.
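The memory effect of interleaving can be estimated with simple arithmetic. The sketch below uses made-up illustrative numbers, not Gemma 4's actual layer counts, window size, or head dimensions; `AttentionLayout` and `kvCacheBytes` are hypothetical names.

```typescript
interface AttentionLayout {
  globalLayers: number; // layers that cache the full context
  localLayers: number; // layers that cache only a sliding window
  windowTokens: number; // sliding-window size in tokens
  kvHeads: number;
  headDim: number;
  bytesPerValue: number; // e.g. 2 for 16-bit values
}

// KV cache bytes = 2 (keys and values) * kvHeads * headDim * bytesPerValue
//                  * cached tokens, summed over all layers.
function kvCacheBytes(layout: AttentionLayout, contextTokens: number): number {
  const perTokenPerLayer = 2 * layout.kvHeads * layout.headDim * layout.bytesPerValue;
  const globalTokens = layout.globalLayers * contextTokens;
  const localTokens =
    layout.localLayers * Math.min(contextTokens, layout.windowTokens);
  return perTokenPerLayer * (globalTokens + localTokens);
}

const MIB = 1024 * 1024;

// Illustrative configurations only.
const allGlobal: AttentionLayout = {
  globalLayers: 6, localLayers: 0, windowTokens: 1024,
  kvHeads: 4, headDim: 128, bytesPerValue: 2,
};
const interleaved: AttentionLayout = { ...allGlobal, globalLayers: 1, localLayers: 5 };

console.log(kvCacheBytes(allGlobal, 32768) / MIB); // 384
console.log(kvCacheBytes(interleaved, 32768) / MIB); // 74
```

In this toy setup the cache shrinks by roughly five times at a 32K context. The global layers still grow with context length, so interleaving reduces the footprint rather than removing it.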

2️⃣ Native System Prompt Support & Rigid JSON Constraints

Our orchestration backend depends entirely on structured, predictable machine outputs.

Brittle regular expression parsing quickly becomes unstable if a model changes formatting slightly.

Gemma 4 solved this elegantly.

{
  "risk_score": 9,
  "decision": "ESCALATE",
  "explanation": "Context reveals a missing historical graduation timestamp."
}

Gemma 4 supports the system role natively.

This makes it much easier to hold the model to a fixed output schema.

By invoking Ollama's JSON mode with Gemma 4, our Laravel backend receives syntactically valid JSON and can check it against a fixed contract, which gives stable extraction of machine-readable governance metrics like:

risk_score
decision
escalation_state
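Valid JSON is not yet a valid contract, so the receiving side still has to check the fields before routing. The platform does this in Laravel; the sketch below shows the same idea in TypeScript for consistency with the other examples. The `"APPROVE"` value, the threshold of 7, and the names `parseVerdict` and `routeVerdict` are assumptions for illustration.

```typescript
// "ESCALATE" appears in the example payload; "APPROVE" is an assumed counterpart.
type GovernanceDecision = "APPROVE" | "ESCALATE";

interface GovernanceVerdict {
  risk_score: number;
  decision: GovernanceDecision;
  explanation: string;
}

const ESCALATION_THRESHOLD = 7; // hypothetical cut-off

// Returns null unless every field is present with the expected type.
function parseVerdict(json: string): GovernanceVerdict | null {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { risk_score, decision, explanation } = data as Record<string, unknown>;
  if (typeof risk_score !== "number" || !Number.isFinite(risk_score)) return null;
  if (decision !== "APPROVE" && decision !== "ESCALATE") return null;
  if (typeof explanation !== "string") return null;

  return { risk_score, decision, explanation };
}

function routeVerdict(json: string): "AUTO_CLEAR" | "HUMAN_REVIEW" {
  const verdict = parseVerdict(json);
  // Fail closed: anything unparseable goes to a human.
  if (verdict === null) return "HUMAN_REVIEW";
  if (verdict.decision === "ESCALATE" || verdict.risk_score >= ESCALATION_THRESHOLD) {
    return "HUMAN_REVIEW";
  }
  return "AUTO_CLEAR";
}

const examplePayload =
  '{"risk_score": 9, "decision": "ESCALATE", ' +
  '"explanation": "Context reveals a missing historical graduation timestamp."}';

console.log(routeVerdict(examplePayload)); // HUMAN_REVIEW
```

Sending malformed or unexpected output to human review keeps the model in an advisory role, which matches the governance-first goal of the platform.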

3️⃣ Mixture-of-Experts (MoE) & Multi-Token Prediction (MTP)

One of our core goals was proving that community-scale AI does not require massive cloud infrastructure.

Gemma 4 enables this through several major architectural innovations.

⚡ Mixture-of-Experts (MoE)

The 26B Gemma 4 architecture uses:

128 total experts
only 8 active experts per token route

This means only 8 of 128 experts (about 6%) run for each token, so per-token compute is closer to that of a much smaller dense model, although all experts still have to be held in memory.

For institutional governance systems, this creates:

lower latency
lower infrastructure cost
faster local inference
scalable community deployment

⚡ Multi-Token Prediction (MTP)

Gemma 4 also introduces speculative decoding through Multi-Token Prediction (MTP).

This lets the model draft several future tokens at once and verify them in a single pass, which improves generation throughput and reduces latency.

In practice, this gave our governance workflows noticeably faster response times even on affordable VPS infrastructure.

4️⃣ True Open-Source Sovereignty (Apache 2.0)

Because Google released Gemma 4 under the fully open Apache 2.0 license, it becomes a massive win for:

community ownership
institutional sovereignty
long-term governance stability

Schools, NGOs, and developing regions no longer need to rely entirely on:

volatile API pricing
external commercial dependencies
closed proprietary AI systems

Organizations can safely deploy permanent, localized AI governance infrastructure fully under their own control.

🛠️ Technical Architecture

GotiHub AGL operates across three isolated but connected services:

┌──────────────────────────────────────────┐
│      GotiHub AGL (Laravel Platform)      │
│ Laravel 13 • Filament • MySQL • Nginx    │
└──────────────────────────────────────────┘
                    │
                    ▼
┌──────────────────────────────────────────┐
│      Laravel AGL Intelligence Layer      │
│      Local Gemma 4 via Ollama            │
└──────────────────────────────────────────┘
                    │
                    ▼
┌──────────────────────────────────────────┐
│     Midnight Verification Sidecar        │
│      Bun / Node.js ZK Verification       │
└──────────────────────────────────────────┘

⚡ Infrastructure Challenges We Solved

Running local LLMs alongside traditional web infrastructure introduced several real-world engineering problems:

Linux OOM crashes
inference spikes
Docker memory contention
container orchestration instability
VPS resource exhaustion

To stabilize the platform, we implemented:

swap partition tuning
Docker memory isolation
internal network segmentation
controlled inference boundaries
optimized container orchestration

This became one of the most valuable engineering lessons of the project.

❤️ The Bigger Vision

The future of AI is not about autonomous systems running unchecked.

It is about governed collaboration between:

humans
institutions
localized intelligence

GotiHub AGL explores an architecture where:

AI assists governance
humans remain accountable
privacy stays protected
communities retain sovereignty over their own data

Gemma 4 made that future possible on accessible infrastructure.

AI can recommend. Governance decides. ⚖️

🛠️ Production Verification Details

Core Stack

Laravel 13
Filament Panels
Docker
Nginx
MySQL

AI Orchestration Layer

Laravel AGL
Ollama
Gemma 4 (E4B / MoE Variants)

Cryptographic Verification

Midnight Bridge
Node.js / Bun Sidecar
Zero-Knowledge Verification Isolation

Infrastructure Hardening

UFW Firewall Protection
Fail2Ban Brute-Force Protection
Internal Docker Network Isolation
Localized AI Inference Boundaries

🚀 The End of the Memory Wall — And the Beginning of the Coordination Problem

Apurba Singh — Mon, 27 Apr 2026 19:10:52 +0000

This is a submission for the Google Cloud NEXT Writing Challenge

At Google Cloud NEXT ’26, we didn’t just get faster AI. We removed one of the oldest limits in computing: The Memory Wall.

Now agents can think faster than ever.

But as a Senior Solution Architect, I see a new bottleneck emerging:

Agents can now act faster than we can coordinate them.

From Compute Bottlenecks to Coordination Bottlenecks

For 15 years, building distributed systems meant fighting infrastructure limits:

High-latency networks
Expensive, scarce compute
Severe memory constraints

At Google Cloud NEXT ’26, the paradigm shifted. With infrastructure like the TPU 8i, we are no longer blocked by raw compute.

We are entering a new phase:

Systems can think fast enough. Now they need to work together reliably.

The Breakthrough Isn’t Just Models; It’s Silicon

While most attention went to models, the real shift for system builders is underneath:

Boardfly topology reduces communication distance to ~7 hops
On-chip memory keeps reasoning context close to compute
Collective acceleration reduces coordination overhead

These changes remove the memory wall—the hidden cost where reasoning slows down because data has to move.

Why the Memory Wall Matters for Agents

AI agents don’t just compute—they reason in loops.

Each step depends on:

context
memory
previous decisions

Previously:

every step incurred a latency penalty
agents spent more time waiting than thinking

Now:

reasoning becomes fast
concurrency becomes cheap

And once thinking becomes cheap, coordination becomes expensive.

We’ve Seen This Before

In the microservices era, we had:

service-to-service chatter
race conditions
distributed state conflicts

We introduced:

queues
locks
orchestration

Now we face the same problem again—just with higher stakes.

Because agents don’t just respond…

They reason over time.

The New Failure Mode: Reasoning Race Conditions

If you run hundreds of agents without coordination:

they read stale state
they overwrite each other
they make decisions based on outdated reality

You don’t get scale.

You get reasoning race conditions.

A Practical Direction: Agent Governance Layer (AGL)

From building production systems, one thing becomes clear quickly:

Coordination cannot be optional.

This leads to what I think of as an Agent Governance Layer (AGL)—a control plane for agent behavior.

1. Identity → Semantic Scoping

Agents need more than roles.

They need:

scoped context
bounded permissions
intent-aware access

What is this agent allowed to do right now?

2. Synchronization → Reasoning Mutex

Agents must not blindly write to shared state.

They need:

controlled execution
conflict awareness
coordination across time

Especially when:

a “transaction” includes human latency

3. State Awareness → Versioned Systems

Shared memory must be:

versioned
validated before commit
conflict-aware

Otherwise:

stale reasoning
silent corruption
unpredictable outcomes
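
One way to picture "versioned and validated before commit" is a small optimistic-concurrency sketch. Everything here (`VersionedState`, `StaleWriteError`, the `budget` field) is a hypothetical name chosen for illustration, and the store is single-process and not thread-safe:

```python
from dataclasses import dataclass


class StaleWriteError(Exception):
    """Raised when a commit is based on an outdated version."""


@dataclass
class VersionedState:
    value: dict
    version: int = 0

    def read(self):
        # Return a copy of the state plus the version the reader saw.
        return dict(self.value), self.version

    def commit(self, new_value: dict, expected_version: int) -> int:
        # Validate before commit: reject writes based on stale reads.
        if expected_version != self.version:
            raise StaleWriteError(
                f"expected v{expected_version}, store is at v{self.version}"
            )
        self.value = dict(new_value)
        self.version += 1
        return self.version


state = VersionedState({"budget": 100})
snapshot_a, seen_a = state.read()
snapshot_b, seen_b = state.read()

snapshot_a["budget"] -= 30
state.commit(snapshot_a, seen_a)      # agent A commits first

snapshot_b["budget"] -= 50
try:
    state.commit(snapshot_b, seen_b)  # agent B reasoned over stale state
    conflict = False
except StaleWriteError:
    conflict = True
```

Agent B's write is rejected instead of silently overwriting agent A's, so B has to re-read and reason again.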

4. Intent Logging → The “Why” Layer

In agent systems, debugging changes:

Not:

what happened?

But:

why did the agent decide this?

Intent becomes the new observability.
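
As a rough illustration, an intent log entry could carry the reason alongside the action. The field names below are assumptions for this sketch, not a standard schema:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class IntentRecord:
    agent_id: str
    action: str
    reason: str          # why the agent chose this action
    state_version: int   # which version of shared state it reasoned over


def to_log_line(record: IntentRecord) -> str:
    # One JSON object per line keeps the "why" queryable next to the "what".
    return json.dumps(asdict(record), sort_keys=True)


line = to_log_line(
    IntentRecord(
        agent_id="agent-7",
        action="pause_campaign",
        reason="spend exceeded budget in the state it read",
        state_version=12,
    )
)
```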

A New Metric: Reasoning Health

We used to monitor:

CPU
memory
latency

Now we must also monitor:

conflict frequency
stale reasoning
retry loops
failed commits

Reasoning Health will define system reliability in the agentic era.
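
A minimal sketch of how such counters might be tracked, assuming hypothetical event names like `commit_ok` and `commit_conflict`:

```python
from collections import Counter


class ReasoningHealth:
    """Counts coordination events; event names are illustrative only."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, event: str) -> None:
        self.counts[event] += 1

    def conflict_rate(self) -> float:
        # Share of commit attempts that were rejected as conflicts.
        attempts = self.counts["commit_ok"] + self.counts["commit_conflict"]
        return self.counts["commit_conflict"] / attempts if attempts else 0.0


health = ReasoningHealth()
for event in ["commit_ok", "commit_ok", "commit_ok", "commit_conflict", "retry"]:
    health.record(event)
rate = health.conflict_rate()
```

A rising conflict rate would be one signal that agents are reasoning over stale state more often than they should.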

Closing Thought

We are moving from systems that execute

to systems that reason

Google solved the infrastructure problem.

Now we have to solve the coordination problem.

Running 1,000 agents is easy.

Making them behave like a system is not.

Discussion

If you’re building with agents today:

How are you handling shared state?

Are you trusting the system—or actively governing it?

🚀 The Architect’s Blueprint: Securing Local Agentic Workflows with OpenClaw

Apurba Singh — Fri, 24 Apr 2026 22:59:07 +0000

This is a submission for the OpenClaw Writing Challenge

The Real Question Behind Agentic AI

Most discussions around agentic AI focus on capability—what agents can do, how autonomous they are, how “smart” they feel.

But in production systems, that’s not the real question.

The real question is governance.

Who is allowed to act?

When are they allowed to act?

And what happens when multiple agents act at the same time?

As someone building high-compliance, scalable systems, these are the constraints that define whether a system survives in production—or fails silently.

Context: From Microservices to Agentic Systems

Over the past several years, I’ve worked on regulated, high-volume architectures where automated responders interact with critical systems.

A consistent pattern emerged:

Intelligence without control becomes a liability.

In my current work on platforms like GotiHub, I separate:

Workflow orchestration
AI processing layers

This separation is not optional—it’s what allows systems to scale safely.

When I explored OpenClaw, I saw an opportunity to apply the same discipline to agentic workflows.

The Local-First Advantage (Done Right)

OpenClaw’s local-first model isn’t just about privacy—it’s about reducing the attack surface.

When implemented properly, it enables:

Zero-Trust Data Sovereignty: Vector data (e.g., Weaviate) stays within controlled environments (local or VPC).
Secure Secret Handling: Skills rely on local environment variables, avoiding exposure through external LLM logging layers.
Deterministic Execution Boundaries: Agent capabilities can be tightly scoped and enforced.

These are not just features—they are architectural primitives for secure systems.

The Concurrency Problem No One Talks About

Here’s the gap I don’t see discussed enough:

What happens when multiple agents share state?

Imagine:

50 OpenClaw instances
All reading and writing to shared Markdown memory files
No coordination mechanism

This is not just a performance issue.

It’s a data integrity problem:

race conditions
inconsistent memory state
unpredictable behavior

In traditional microservices, we solve this with:

Redis locks
message queues
transactional boundaries

But in many agentic setups, this layer is missing.

A Practical Approach: Governance Over Intelligence

From my experience, scaling agentic systems requires two distinct control layers:

1. Identity Layer (Scope Control)

Question: Should this agent be allowed to act?

Using something like laravel-iam, each agent operates within a defined permission scope:

access to specific memory regions
allowed actions
role-based constraints

This ensures agents never operate with a “master key.”

2. Synchronization Layer (State Control)

Question: When is this agent allowed to act?

This is where a centralized control mechanism—like a Laravel Approval Engine—becomes critical.

Before an agent writes to shared memory:

It must request a state lock
If another agent holds the lock → request is queued
Once approved → action proceeds

This transforms:

uncontrolled concurrency → audited, deterministic workflows
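
The request, queue, and proceed steps can be sketched as a first-in, first-out lock with an audit trail. This is a single-process illustration with hypothetical names (`StateLock`, `agent-1`), not the Laravel Approval Engine's actual API:

```python
from collections import deque
from typing import Optional


class StateLock:
    """FIFO lock over one shared resource (single-process sketch)."""

    def __init__(self) -> None:
        self.holder = None      # agent id currently holding the lock
        self.waiting = deque()  # queued agent ids, oldest first
        self.audit = []         # (event, agent_id) pairs

    def request(self, agent_id: str) -> bool:
        # Grant immediately if free, otherwise queue the request.
        if self.holder is None:
            self.holder = agent_id
            self.audit.append(("granted", agent_id))
            return True
        self.waiting.append(agent_id)
        self.audit.append(("queued", agent_id))
        return False

    def release(self, agent_id: str) -> Optional[str]:
        # Only the holder may release; the next queued agent is promoted.
        if self.holder != agent_id:
            raise PermissionError(f"{agent_id} does not hold the lock")
        self.audit.append(("released", agent_id))
        self.holder = self.waiting.popleft() if self.waiting else None
        if self.holder is not None:
            self.audit.append(("granted", self.holder))
        return self.holder


lock = StateLock()
first = lock.request("agent-1")        # lock is free, granted
second = lock.request("agent-2")       # lock is held, queued
next_holder = lock.release("agent-1")  # agent-2 is promoted
```

The `audit` list is what makes the workflow reviewable afterwards: every grant, queue, and release is recorded in order.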

Example: An Enterprise Approval Skill

Here’s a simplified example of how a governed skill might look:

# Skill: Enterprise Approval Check

## Description:
Checks if an agent has permission to trigger a deploy.

## Constraints:
- Validate role via `laravel-iam`
- Return 403 if unauthorized

## Execution:
POST {{APP_URL}}/api/v1/approvals/check

Headers:
  Authorization: Bearer {{AGENT_IAM_TOKEN}}

Body:
{
  "action": "deploy",
  "actor": "{{user_id}}"
}

This isn’t about limiting agents—it’s about making their behavior predictable, auditable, and safe.

Lessons from Production Systems

A few principles that consistently hold:

Scoped Skills Over Global Access: Narrow permissions reduce risk dramatically.
Audit Logs Are Non-Negotiable: Observability is essential to detect reasoning drift and unintended behavior.
Performance Beats “Over-Intelligence”: Smaller local models (e.g., LLaMA, Mistral) are often faster, cheaper, and more reliable for most workloads.

Closing Thought

If agentic systems are going to operate in real production environments, they must evolve:

From autonomous scripts → to governed systems.

OpenClaw provides a powerful foundation for local-first experimentation.
The next step is layering identity, synchronization, and control on top of that foundation.

Discussion

I’m curious how others are approaching this:

How are you managing shared state and concurrency in local agent workflows?
Are you relying on implicit behavior—or introducing explicit control layers?

Let’s discuss.

Laravel Is Growing Up — So I Built a Workflow Engine That Matches It (Clean Architecture + IAM + Token Approval)

Apurba Singh — Mon, 20 Apr 2026 00:43:04 +0000

This week’s DEV Community digest highlighted something interesting:

Laravel developers are moving beyond “fat controllers” toward clean architecture and enterprise-grade systems.

That’s exactly what I’ve been working on.

⚠️ The Problem (We All Faced)

If you've built any of these:

Leave approval
Expense approval
Purchase workflows

You already know the reality:

Business logic inside controllers
Role checks everywhere
Email spam for approvals
Hard to scale, harder to maintain

And every project?

You rebuild the same workflow logic again.

💸 Why Most Teams Get It Wrong

Typical solutions:

SaaS tools
Zapier automation
Email-based approvals

Which leads to:

Recurring cost
Limited customization
Poor visibility
No real control

🧠 What I Built Instead

I didn’t build another feature.

I built a reusable approval workflow engine:

Multi-level approval pipelines
Role-based access (IAM-ready)
Event-driven lifecycle
Token-based approvals (no login required)
Smart notification batching

🧩 The Architecture (This Is the Key)

Adapters (API / CLI / Queue)
        ↓
Workflow Manager
        ↓
Workflow Engine (Pure Logic)
        ↓
Domain Models (State)
        ↓
Events → Listeners → Notifications

Key Idea:

The engine knows nothing about HTTP, UI, or SaaS.

🔥 What Makes It Different

1. Headless Workflow Engine

$manager->start('requisition', $payload);
$manager->approve($workflowId, $userId);

No controller dependency. Works anywhere.

2. IAM-Ready (But Decoupled)

Engine does NOT handle auth
It only receives user_id
IAM handles permissions externally

👉 Clean separation = scalable system

3. Token-Based Approval (Game Changer)

POST /api/v1/approvals/token/approve

Secure
Expiring
Single-use

👉 Approve directly from email / Slack
👉 No login required
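
To show what "secure, expiring, single-use" can mean in practice, here is a rough Python sketch. It is not the package's implementation; the class name, the one-hour TTL, and the in-memory store are assumptions made for illustration:

```python
import hashlib
import secrets
import time


class ApprovalTokens:
    """In-memory token store: random, expiring, single-use (sketch only)."""

    def __init__(self, ttl_seconds: int = 3600) -> None:
        self.ttl_seconds = ttl_seconds
        # sha256(token) -> (workflow_id, approver_id, expires_at)
        self._pending = {}

    @staticmethod
    def _digest(token: str) -> str:
        # Store only a hash so a leaked store does not reveal usable tokens.
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, workflow_id: str, approver_id: str, now=None) -> str:
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)  # unguessable random token
        self._pending[self._digest(token)] = (
            workflow_id,
            approver_id,
            now + self.ttl_seconds,
        )
        return token

    def redeem(self, token: str, now=None):
        now = time.time() if now is None else now
        # pop() removes the entry, so a token can never be used twice.
        entry = self._pending.pop(self._digest(token), None)
        if entry is None:
            return None
        workflow_id, approver_id, expires_at = entry
        if now > expires_at:
            return None
        return workflow_id, approver_id


tokens = ApprovalTokens(ttl_seconds=3600)
token = tokens.issue("wf-1", "user-7", now=0)
first_use = tokens.redeem(token, now=10)
second_use = tokens.redeem(token, now=20)
```

The `now` parameter exists only so the expiry logic can be exercised without waiting; a real deployment would persist tokens in a database rather than a dict.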

4. Smart Notification Batching

Instead of:

10 approvals → 10 emails ❌

You get:

10 approvals → 1 email ✅
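
The batching idea reduces to grouping pending items by approver before anything is sent. A minimal sketch, assuming each pending item is a dict with hypothetical `approver` and `record_id` keys:

```python
from collections import defaultdict


def build_digests(pending):
    """Group pending approvals so each approver gets one message."""
    by_approver = defaultdict(list)
    for item in pending:
        by_approver[item["approver"]].append(item["record_id"])
    return [
        {
            "to": approver,
            "subject": f"{len(ids)} approvals pending",
            "record_ids": ids,
        }
        for approver, ids in by_approver.items()
    ]


pending = [{"approver": "manager@example.com", "record_id": n} for n in range(10)]
digests = build_digests(pending)
```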

5. Idempotent Workflow Execution

hash('sha256', json_encode($payload))

Prevents duplicate workflows on retries.
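
The same idea in a small Python sketch: hash a canonical form of the payload and refuse to start a second workflow for the same key. `WorkflowStore` and the key format are hypothetical, chosen only to illustrate the technique:

```python
import hashlib
import json


def idempotency_key(workflow_type: str, payload: dict) -> str:
    # Sort keys so logically equal payloads always hash the same way.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{workflow_type}:{canonical}".encode()).hexdigest()


class WorkflowStore:
    def __init__(self) -> None:
        self._by_key = {}

    def start(self, workflow_type: str, payload: dict):
        # Returns (workflow_id, created); a retry gets the existing id back.
        key = idempotency_key(workflow_type, payload)
        if key in self._by_key:
            return self._by_key[key], False
        workflow_id = len(self._by_key) + 1
        self._by_key[key] = workflow_id
        return workflow_id, True


store = WorkflowStore()
first = store.start("requisition", {"amount": 120, "item": "laptop"})
retry = store.start("requisition", {"item": "laptop", "amount": 120})
```

Canonical serialization matters here: without sorted keys, the retry above would hash differently even though the payload is the same.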

6. Extensible Plugin System (One of My Favorite Parts)

One thing I really wanted was flexibility.

So I added a plugin system:

Hook into workflow events
Add integrations (Slack, email, APIs)
Extend behavior without touching core

Example:

class SlackPlugin extends BasePlugin
{
    public function boot(): void
    {
        $this->listen(WorkflowCompleted::class, function ($event) {
            // send slack notification
        });
    }
}

🧪 Built Like a Real System

Full lifecycle testing
Authorization validation
Event-driven consistency
Duplicate protection

💼 Real-World Direction

This system is designed for:

SaaS platforms
Banking workflows
Enterprise approval pipelines
Internal automation systems

🔗 Open Source

👉 https://github.com/apurba-labs/laravel-approval-engine

🤝 Let’s Talk

If you're building:

Workflow systems
Approval pipelines
RBAC / IAM architectures

📩 LinkedIn: https://www.linkedin.com/in/apurba-narayan-singh/
📧 apurbansinghdev@gmail.com

💬 I’d Love Your Feedback

How are you handling approvals today?
Are you using SaaS tools or building in-house?
What’s the biggest pain in your workflow systems?

Let’s discuss 👇

From Python to Laravel: Why I Built My Own IAM System Instead of Using Existing Packages

Apurba Singh — Sat, 04 Apr 2026 05:30:57 +0000

As a backend developer, I’ve spent most of my career working with Python — FastAPI, Django, Flask.
I’ve always cared about one thing deeply:
👉 building systems that scale without becoming messy
But there was one problem I kept running into… no matter the stack.

🧠 The Problem: The “Global Role” Trap
At first, everything looks simple:
• Users
• Roles
• Permissions
But as systems grow, things start breaking.
Most RBAC (Role-Based Access Control) packages assume:
👉 a user is either an Admin… or they aren’t.
But real-world systems are never that simple.

A real scenario:
• A user is a Manager in Branch A
• The same user is a Viewer in Branch B
Now ask yourself:
👉 How do you model this cleanly?

Most of the time, we don’t.
We write conditions like:
if ($user->role === 'manager' && $branch_id === 1) { ... }
And slowly…
• logic spreads everywhere
• dependencies grow
• and one small change breaks multiple parts of the system

😵 When It Became a Problem
Across multiple projects, I saw the same pattern:
• Roles started multiplying
• Permissions became unclear
• Debugging access issues became painful
It didn’t matter if I was using Python or Laravel.
👉 The problem wasn’t the framework.
👉 The problem was the model.

🔄 The Turning Point
While working on Laravel-based systems, I explored existing solutions like Spatie.
They are great — clean, simple, and widely used 👏
But for complex systems, I kept hitting limitations:
• No real support for contextual authority
• Difficult to manage multi-tenant permissions
• Hard to model relationships between roles and scopes
At some point, I stopped trying to “work around” the problem.
👉 I decided to rethink it.

🚀 Building Laravel IAM
Instead of focusing only on roles, I started thinking in terms of:
👉 relationships + context + resolution

This led me to build:
Laravel IAM (v0.2.0)

⚙️ The Core Idea: The Four Levels of Truth
Instead of hardcoding logic, the system resolves permissions through layered specificity:

Global → *.* (Super Admin)
Resource Wildcard → invoice.*
Action Wildcard → *.approve
Atomic Permission → invoice.approve

This makes permission checks:
• predictable
• scalable
• easy to reason about
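
A rough Python sketch of layered resolution follows. It is not the package's code: the `resolve` function is hypothetical, it assumes permissions are written as `resource.action`, and it assumes the global wildcard is spelled `*.*`:

```python
def resolve(granted: set, permission: str):
    """Return the level that grants `permission`, or None if denied."""
    resource, action = permission.split(".", 1)
    candidates = [
        ("global", "*.*"),
        ("resource wildcard", f"{resource}.*"),
        ("action wildcard", f"*.{action}"),
        ("atomic", permission),
    ]
    # Check the broadest grant first, then narrow down to the exact match.
    for level, pattern in candidates:
        if pattern in granted:
            return level
    return None


finance_role = {"invoice.*", "expense.view"}
auditor_role = {"*.approve"}

by_resource = resolve(finance_role, "invoice.approve")
by_action = resolve(auditor_role, "expense.approve")
denied = resolve(finance_role, "expense.approve")
```

Returning the matching level instead of a bare boolean makes it easier to explain why access was granted when debugging.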

🧩 Context Matters
The same role doesn’t mean the same thing everywhere.
So the system supports:
• Tenant-based roles
• Team-based roles
• Branch-level permissions
👉 Without turning your code into a mess
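
The Manager-in-Branch-A, Viewer-in-Branch-B case can be sketched by keying role assignments on (user, scope) instead of user alone. The names and data below are made up for illustration and do not reflect the package's storage model:

```python
ROLE_PERMISSIONS = {
    "manager": {"expense.approve", "expense.view"},
    "viewer": {"expense.view"},
}

# Roles are assigned per (user, scope) pair, not globally.
ASSIGNMENTS = {
    ("user-1", "branch-a"): "manager",
    ("user-1", "branch-b"): "viewer",
}


def can(user_id: str, permission: str, scope: str) -> bool:
    role = ASSIGNMENTS.get((user_id, scope))
    if role is None:
        return False  # no role in this scope means no access
    return permission in ROLE_PERMISSIONS.get(role, set())


approve_in_a = can("user-1", "expense.approve", "branch-a")
approve_in_b = can("user-1", "expense.approve", "branch-b")
```

The calling code never mentions branch ids or role names, which is what keeps the conditionals from spreading.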

💡 What I Learned
This journey taught me something important:
👉 Authorization is not about roles — it’s about context
And even more importantly:
👉 Architecture matters more than framework

⚙️ Under the Hood
Some design decisions behind the system:
• Registry Pattern → decoupled resources & actions
• Flexible Role Assignment → supports IDs, slugs, or models
• Scoped Middleware → supports contextual authorization
• Blade Directives → clean UI permission checks
And yes — everything is backed by a test suite simulating real workflows ✅

🛠️ Open Source
I’ve open-sourced the project and would genuinely love feedback:
📦 https://packagist.org/packages/apurba-labs/laravel-iam
💻 https://github.com/apurba-labs/laravel-iam

💬 Let’s Talk
How do you handle complex permissions in your systems?
Have you faced similar challenges with RBAC?

This is a submission for the 2026 WeCoded Challenge (https://dev.to/challenges/wecoded-2026): Echoes of Experience

Built with ☕ and logic by Apurba Labs.

#Laravel #PHP #Python #IAM #RBAC #SaaS #Backend #OpenSource #WeCoded #wecoded2026

I’m a Python Developer — So I Built a Better IAM System for Laravel

Apurba Singh — Fri, 03 Apr 2026 19:16:58 +0000

I’m a Python/FastAPI Developer — So I Built an IAM System in Laravel
As a backend developer working with FastAPI, Django, and Flask, I’ve always cared deeply about clean architecture and scalable authorization systems.
But every time I built a SaaS product, I ran into the same problem:
👉 Permissions become messy… very quickly.

🧠 The Real Problem: Contextual Authority
Let’s say:
• A user is a Manager in Branch A
• The same user is a Viewer in Branch B
Most RBAC systems struggle here.
You either:
• add tons of conditional logic ❌
• or end up with tightly coupled, hard-to-maintain permission rules ❌

😵 The Breaking Point
When systems grow, you start seeing:
• Role explosions (too many roles)
• Nested dependencies
• Hardcoded permission checks
• “Who can do what?” becomes unclear
I faced this repeatedly in Python projects…
and surprisingly, the same issue exists in Laravel.

🚀 So I Built: Laravel IAM (v0.2.0)
Instead of patching the problem, I designed a system that handles:
✔ Contextual permissions (per scope: tenant, team, branch)
✔ Wildcard permissions (expense.*, *.*)
✔ Hierarchical access (manage → all actions)
✔ Dynamic resolution (no hardcoded roles)

⚙️ The Core Idea: “Four Levels of Truth”
The engine resolves permissions using a layered approach:

Direct Permission → exact match
Wildcard Match → resource.*
Hierarchy Rule → resource.manage
Global Access → *.*

This allows instant and predictable permission resolution, even in complex SaaS environments.

🔥 Why Not Just Use Existing Packages?
Packages like Spatie are great for basic RBAC 👏
But they don’t fully solve:
• Context-based access control
• Dynamic multi-tenant systems
• Workflow-aware permission resolution

💡 Example
IAM::can($user, 'expense.approve');
No complex conditionals.
No hardcoded roles.
Just clean, predictable logic.

🛠️ Open Source — Try It
I’ve open-sourced the project and would love feedback from the community:
📦 Packagist: https://packagist.org/packages/apurba-labs/laravel-iam
💻 GitHub: https://github.com/apurba-labs/laravel-iam

💬 Let’s Discuss
How do you handle contextual permissions in your projects?
Have you faced similar issues with RBAC systems?

#Laravel #PHP #FastAPI #RBAC #IAM #SaaS #Backend #OpenSource

I Built a Laravel Approval Engine to Stop Email Spam 🚀

Apurba Singh — Wed, 25 Mar 2026 23:33:15 +0000

Over the last few months, while working on enterprise Laravel projects, I noticed a recurring "Notification Nightmare."

Every company needs an approval workflow (requisitions, invoices, PTO), but most systems flood managers with separate notifications for every single item.

I decided to build a solution: Laravel Approval Engine.

🔥 The "Smart Batching" Concept

The core problem with enterprise workflows isn't the approval logic; it's notification fatigue. Instead of sending 50 separate emails for 50 pending approvals, my engine buffers them into 1 smart batch. The manager receives a single, clean digest with secure, token-based links to approve everything at once.

🏗️ How it Works

The architecture is designed to be modular and plug-and-play.

Define a Module: Use php artisan make:workflow-module to create a logic class.
Queue Records: Your business models enter a "pending" state.
The Processor: A scheduled artisan command bundles pending records into a Batch.
Action: The approver receives a single email. They can Approve All, Reject, or View Details via a secure Next.js dashboard.

🧠 Technical Highlights

Multi-stage Workflows: Easily route from Manager -> Finance -> CEO.
Token-based Security: Approvers don't even need to log in to take action.
Event-Driven: Hooks for every stage (Created, Approved, Escalated).
Next.js Dashboard: A sleek frontend for managing the workflow status.
Laravel 12 Ready: Built to work with the latest PHP 8.2+ features.

📊 The Workflow Flow

graph TD
    A[Pending Records] --> B[Smart Batch Created]
    B --> C[Email Digest Sent]
    C --> D[Approver Clicks Link]
    D --> E[Stage Resolver]
    E --> F{Next Stage?}
    F -- Yes --> G[Create Next Batch]
    F -- No --> H[Workflow Completed]
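
The stage resolver step in the graph above can be sketched in a few lines of Python. The stage list mirrors the Manager -> Finance -> CEO example; the function names are hypothetical and not the package's API:

```python
STAGES = ["manager", "finance", "ceo"]


def resolve_next_stage(current: str):
    # Return the stage after `current`, or None when the workflow is done.
    index = STAGES.index(current)
    return STAGES[index + 1] if index + 1 < len(STAGES) else None


def run_workflow(start: str = "manager"):
    # Walk every stage, recording one batch per stage, then complete.
    trail = []
    stage = start
    while stage is not None:
        trail.append(f"batch:{stage}")
        stage = resolve_next_stage(stage)
    trail.append("workflow_completed")
    return trail


trail = run_workflow()
```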

🚀 Try the Demo

I've included a demo inside the repo so you can see it in action in under 2 minutes:

git clone https://github.com/apurba-labs/laravel-approval-engine
cd laravel-approval-engine/example/laravel-demo
composer install
php artisan approval:demo

🔗 GitHub

I’d love for the community to check it out, give it a star, or suggest new features!

👉 Get the code on GitHub: https://github.com/apurba-labs/laravel-approval-engine

Would love to hear your feedback! How do you handle complex approval routing in your own Laravel apps?
