DEV Community: Willian R Moraes

I couldn't find a production-ready Go framework for AI agents. So I built one.

Willian R Moraes — Fri, 29 May 2026 05:13:54 +0000

How I built a Go framework for AI agents in production, and the architecture decisions that actually matter.

Fifteen years of writing software professionally, and I had never open-sourced a single line of code.

It's not that I didn't want to. That's just how it works when you build inside a company: the code is theirs, the problems are specific to their domain, and by the time you finally had something you could abstract into something useful, you had already moved on to the next crisis.

Two months ago I decided to change that.

I was building a conversational AI agent in Go. I needed a framework. I went looking and found... Python. More Python. A few Go repos abandoned in 2023. And a lot of "just wrap the OpenAI SDK" advice, which works fine until you have real traffic and the agent starts replying twice to the same message.

Nothing production-ready. Nothing with real architecture. Nothing I could hand to a team and say: this will hold up.

So I built it. The result is eywa: a Go framework for conversational AI agents, with hexagonal architecture, v1.0.0, MIT license, open source.

The name comes from Avatar: Eywa is the neural network that connects all living beings on Pandora. The metaphor fit: a system that connects LLMs, channels, memory, and tools into a single organism, where each part perceives and responds to the environment.

Here's what I learned building it.

Why Go and not Python

The AI ecosystem lives in Python. LangChain, LlamaIndex, CrewAI: all Python. If you're prototyping or running notebooks, that makes total sense.

But if you're running something in production at real scale (real users sending real messages, and you need observability, concurrency control, and something that doesn't fall over at 3 a.m.), Go is a completely different story.

Go gives you:

Goroutines and real concurrency primitives
The race detector (go test -race), which flags data races at test time; Python has no built-in equivalent
Predictable performance, with no GC surprise at the worst moment
Static types that eliminate entire categories of bugs at compile time

The Python frameworks assume you'll have one request at a time, or that you'll handle concurrency through queues outside the framework. When you're dealing with WhatsApp webhooks at scale, with multiple events from the same user arriving milliseconds apart, that assumption breaks.

The Architecture: Hexagonal, Not "Here's Your OpenAI Wrapper"

The core principle of eywa is that the business domain must have absolutely no knowledge of infrastructure.

No OpenAI SDK imports in domain code. No Redis calls. No MongoDB queries. Just interfaces, which eywa calls ports.

The domain defines what it needs. Infrastructure implements it. Wiring happens at startup.

Here's the Bond port, the distributed lock:

type Bond interface {
    AcquireLock(ctx context.Context, key string, ttl time.Duration) (bool, error)
    ReleaseLock(ctx context.Context, key string) error
    ExtendLock(ctx context.Context, key string, ttl time.Duration) error
}

The domain knows it can acquire and release locks. It doesn't know that the implementation uses Redis Redlock underneath. In tests, you inject a no-op. In production, you inject the Redis adapter.

Same pattern for the Oracle, the LLM abstraction:

type OracleRequest struct {
    Model         string
    SystemPrompt  string
    Messages      []OracleMessage
    Temperature   float64
    MaxTokens     int
    Tools         []OracleTool
    UseTools      bool
    Attachments   []LLMAttachment
}

The domain sends an OracleRequest. Whether it goes to Anthropic, OpenAI, Gemini, Bedrock, or VertexAI is an infrastructure detail. Swap providers at startup. Run several at the same time. The domain doesn't know and doesn't need to.

This is not over-engineering. It's what makes the system testable, maintainable, and survivable when the next LLM provider launches and everyone wants to switch.

The Lexicon: Names That Mean Something

One thing I invested heavily in: naming. Not just clean variable names, but a consistent domain vocabulary that every piece of the code uses.

Yes, the names are intentional. I wanted a consistent domain vocabulary instead of "Manager", "Service", "Handler", and "Util", which say nothing about what the component actually does in the context of an AI agent.

Name	What it is
Weave	Runtime engine: orchestrates everything per event
Spirit	Agent configuration: LLM, tools, system prompt, behavior
Pulse	Inbound event: a message received from a channel
Oracle	LLM abstraction: send a prompt, receive a response
Bond	Distributed lock: prevents concurrent duplicate responses
Voice	Outbound adapter: sends the reply back to the channel
Scout	Context enrichment: runs before the LLM call
Lore	RAG: retrieval-augmented generation
Imprint	Long-term memory injection
Vigil	Human-in-the-loop: pauses the agent for a human response
Rite	Approval workflow: gates actions behind human confirmation
Conduit	MCP (Model Context Protocol) client adapter

When your code says bond.AcquireLock(...) instead of redisLock.Lock(...), you stop thinking about infrastructure and start thinking about the domain. Terminology is design.

The Problem Nobody Talks About: Concurrent Pulses

Here's a scenario that happens in production and that almost no framework handles:

A user sends a message on WhatsApp. The webhook fires. Your agent starts processing: LLM call in flight, 800ms into it.

The user gets impatient and sends the same message again. A second webhook fires.

Now you have two goroutines processing the same user's context simultaneously. The first finishes, writes the response, and updates memory. The second finishes, writes another response using stale memory state, and overwrites the first update.

The user gets two responses. Memory is inconsistent. You've created a race condition at the application level.

This is what Bond is for.

Before the Weave processes any Pulse, it acquires a distributed lock keyed on the user's session ID. If the lock is already held by another goroutine, the event is discarded. Only one Pulse is processed per user at a time, always.

The contract is precise: AcquireLock returns (false, nil) when the lock is busy (the expected case), and (false, error) only for infrastructure failures. That distinction matters, because the caller handles the two differently.

Scouts: Context Enrichment Before Every LLM Call

The pipeline that runs on every Pulse before the LLM call:

Pulse → Scouts → Pathfinder → Spirit → Oracle → Actions → Voice

Scouts are sequential context enrichment steps. They read from external systems and inject knowledge into the Pulse before the model sees anything.

type Scout interface {
    GetName() string
    Harvest(ctx context.Context, event *entities.Pulse) error
    IsApplicable(event *entities.Pulse) bool
}

The critical design decision: Scouts are fail-open.

A Scout that returns an error gets logged. The pipeline continues without its data. The LLM call happens anyway.

Why? Because if a Scout is hitting a CRM to enrich the user's context, and the CRM is slow that morning, you don't want your entire agent to stop responding. You want it to keep working with less context, gracefully.

Wiring: How Everything Connects

The entire Weave is assembled at startup with a fluent builder:

weave, err := eywa.NewWeaveBuilder(ctx).
    WithRepositories(spiritRepo, memoryRepo, echoRepo, chronicleRepo).
    WithBond(bond).
    WithActionRegistry(eywa.NewActionRegistry()).
    WithScoutRegistry(eywa.NewScoutRegistry()).
    AddOracle(eywaopenai.NewOracle(apiKey)).
    WithConfig(config).
    Build()

MongoDB for Spirit configuration and conversation history. Redis for distributed locking and in-flight memory. OpenAI as the Oracle. Everything injected, nothing global.

To add Anthropic as an additional provider:

AddOracle(eywaopenai.NewOracle(openaiKey)).
AddOracle(eywaanthropic.NewOracle(anthropicKey)).

Spirits define which provider they use. The OracleFactory selects the right one at runtime.

19 Modules: Pay Only for What You Use

The framework ships as 19 independent Go modules, including:

github.com/wmulabs/eywa                     # core
github.com/wmulabs/eywa/fiber               # HTTP adapter
github.com/wmulabs/eywa/mongo               # MongoDB repositories
github.com/wmulabs/eywa/redis               # Redis Bond + memory
github.com/wmulabs/eywa/mcp                 # MCP client (Conduit)
github.com/wmulabs/eywa/providers/anthropic
github.com/wmulabs/eywa/providers/openai
github.com/wmulabs/eywa/providers/gemini
github.com/wmulabs/eywa/providers/bedrock
github.com/wmulabs/eywa/providers/vertexai
github.com/wmulabs/eywa/providers/weaviate
github.com/wmulabs/eywa/providers/qdrant
github.com/wmulabs/eywa/providers/pgvector
github.com/wmulabs/eywa/providers/pinecone
github.com/wmulabs/eywa/channels/whatsapp
github.com/wmulabs/eywa/gcp/cloudtasks
github.com/wmulabs/eywa/gcp/gcs
github.com/wmulabs/eywa/gcp/gemini

If you don't use Bedrock, you don't import it. You don't get its dependencies in your go.sum. You don't get its security surface. Go developers care about this.

Security Was Not Left for Later

Before calling this v1.0, I did a serious security review:

SSRF blocked in every outbound HTTP client: private IPs, loopback, IMDS (169.254.169.254), file://, ftp://
io.LimitReader on every HTTP response, so a malicious server can't OOM your agent
API keys stored as SHA-256 and compared with subtle.ConstantTimeCompare, so no timing attacks
Rate limiting on auth endpoints
go test -race across all 19 modules on every CI push, no exceptions

None of this is exciting. All of it matters.

What's Next

eywa is at v1.0.0. Stable, hardened, documented.

If you're building AI agents in Go, or you want to but couldn't find anything serious enough to base a production system on, I'd love your feedback.

Pull requests welcome. Issues welcome. Direct criticism welcome.

→ github.com/wmulabs/eywa

Maybe the world didn't need one more AI framework. But it definitely needed more engineering in it.

I Couldn't Find a Production-Ready Go Framework for AI Agents. So I Built One.

Willian R Moraes — Fri, 29 May 2026 05:03:27 +0000

How I built a production-grade Go framework for conversational AI agents — and the architecture decisions that actually matter.

Fifteen years of writing software professionally, and I had never open-sourced a single thing.

Not because I didn't want to. It's just how it works when you build inside companies — the code belongs to them, the problems are specific to their domain, and by the time you could abstract something useful, you've already moved on to the next fire.

Two months ago I decided to change that.

I was building a conversational AI agent in Go. Needed a framework. Went looking for one and found... Python. More Python. A few Go repos that were abandoned in 2023. And a lot of "just wrap the OpenAI SDK" advice that works fine until you have real traffic and your agent starts responding twice to the same message.

Nothing production-ready. Nothing with actual architecture. Nothing I could hand to a team and say this will hold up.

So I built it. The result is eywa — a Go framework for conversational AI agents, hexagonal architecture, v1.0.0, MIT license, open source.

The name comes from Avatar — Eywa is the neural network connecting all living things on Pandora. The metaphor fit: a system that connects LLMs, channels, memory, and tools into a single organism, where each part perceives and responds to the environment.

Here's what I learned building it.

Why Go, not Python

The AI ecosystem lives in Python. LangChain, LlamaIndex, CrewAI — all Python. If you're prototyping, exploring, or running notebooks, this makes complete sense.

But if you're running something in production at scale — where real users are sending real messages and you need observability, concurrency control, and something that doesn't fall over at 3am — Go is a very different story.

Go gives you:

Goroutines and real concurrency primitives
The race detector (go test -race), which flags data races at test time; Python has no built-in equivalent
Predictable latency: a concurrent garbage collector with short pauses, so no GC surprise at the worst moment
Static types that catch entire categories of bugs at compile time

Most of the Python frameworks I looked at assume you'll have one request at a time, or that you'll handle concurrency via queues outside the framework. When you're dealing with WhatsApp webhooks at scale, with multiple events per user arriving milliseconds apart, that assumption breaks.

The Architecture: Hexagonal, Not "Here's Your OpenAI Wrapper"

The core principle in eywa is that the business domain should have absolutely zero knowledge of infrastructure.

No OpenAI SDK imports in domain code. No Redis calls. No MongoDB queries. Just interfaces — what eywa calls ports.

The domain defines what it needs. Infrastructure implements it. Wiring happens at startup.

Here's the Bond port — the distributed lock:

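// Bond is the distributed-lock port. Domain code depends only on this interface.
type Bond interface {
    // AcquireLock returns (false, nil) when the lock is already held,
    // and a non-nil error only for infrastructure failures.
    AcquireLock(ctx context.Context, key string, ttl time.Duration) (bool, error)
    ReleaseLock(ctx context.Context, key string) error
    ExtendLock(ctx context.Context, key string, ttl time.Duration) error
}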

The domain knows it can acquire and release locks. It does not know that the implementation uses Redis Redlock under the hood. In tests, you inject a no-op. In production, you inject the Redis adapter.
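A minimal sketch of what a test double for this port could look like, assuming nothing beyond the interface above. The memoryBond type, its constructor, and the acquireTwice helper are hypothetical names for illustration, not code from the eywa repo. It keeps expiry times in a map and skips owner tokens, which a real distributed lock would need.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Bond mirrors the port shown in the article.
type Bond interface {
	AcquireLock(ctx context.Context, key string, ttl time.Duration) (bool, error)
	ReleaseLock(ctx context.Context, key string) error
	ExtendLock(ctx context.Context, key string, ttl time.Duration) error
}

// memoryBond is a hypothetical in-process Bond for tests.
// It tracks only expiry times; there are no owner tokens.
type memoryBond struct {
	mu      sync.Mutex
	expires map[string]time.Time
}

func newMemoryBond() *memoryBond {
	return &memoryBond{expires: make(map[string]time.Time)}
}

func (b *memoryBond) AcquireLock(_ context.Context, key string, ttl time.Duration) (bool, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if exp, held := b.expires[key]; held && time.Now().Before(exp) {
		return false, nil // busy is the expected case, not an error
	}
	b.expires[key] = time.Now().Add(ttl)
	return true, nil
}

func (b *memoryBond) ReleaseLock(_ context.Context, key string) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.expires, key)
	return nil
}

var errNotHeld = errors.New("lock not held")

func (b *memoryBond) ExtendLock(_ context.Context, key string, ttl time.Duration) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	exp, held := b.expires[key]
	if !held || !time.Now().Before(exp) {
		return errNotHeld
	}
	b.expires[key] = time.Now().Add(ttl)
	return nil
}

// acquireTwice reports the results of two back-to-back acquisitions of one key.
func acquireTwice(b Bond, key string) (first, second bool) {
	ctx := context.Background()
	first, _ = b.AcquireLock(ctx, key, time.Minute)
	second, _ = b.AcquireLock(ctx, key, time.Minute)
	return first, second
}

func main() {
	first, second := acquireTwice(newMemoryBond(), "session:42")
	fmt.Println(first, second) // true false
}
```

Because domain code only sees Bond, swapping this in for the Redis adapter is a one-line change in test setup.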

Same pattern for the Oracle (the LLM abstraction):

type OracleRequest struct {
    Model         string
    SystemPrompt  string
    Messages      []OracleMessage
    Temperature   float64
    MaxTokens     int
    Tools         []OracleTool
    UseTools      bool
    Attachments   []LLMAttachment
}

The domain sends an OracleRequest. Whether that goes to Anthropic, OpenAI, Gemini, Bedrock, or VertexAI is an infrastructure concern. Swap providers at startup. Run multiple providers simultaneously. The domain doesn't care.

This is not over-engineering. It's what makes the system testable, maintainable, and survivable when the next LLM provider comes out and everyone wants to switch.

The Lexicon: Names That Actually Mean Something

One thing I invested heavily in: naming. Not just clean variable names — a consistent domain vocabulary that every piece of code uses.

Yes, the names are intentional. I wanted a consistent domain vocabulary instead of "Manager", "Service", "Handler", and "Util" — names that tell you nothing about what the component actually does in the context of an AI agent.

Name	What it is
Weave	The runtime engine — orchestrates everything per event
Spirit	Agent configuration — LLM, tools, system prompt, behavior
Pulse	Inbound event — a message received from a channel
Oracle	LLM abstraction — send prompt, receive response
Bond	Distributed lock — prevents concurrent duplicate responses
Voice	Outbound adapter — sends replies back to the channel
Scout	Context enrichment step — runs before the LLM call
Lore	RAG — retrieval-augmented generation
Imprint	Long-term memory injection
Vigil	Human-in-the-loop takeover
Rite	Approval workflow — gates actions behind human confirmation
Conduit	MCP (Model Context Protocol) client adapter

When your code says bond.AcquireLock(...) instead of redisLock.Lock(...), you stop thinking about infrastructure and start thinking about the domain. Terminology is design.

The Problem Nobody Talks About: Concurrent Pulses

Here's a scenario that happens in production and that almost no framework handles:

A user sends a WhatsApp message. The webhook fires. Your agent starts processing — LLM call in progress, 800ms into it.

The user gets impatient and sends the same message again. Second webhook fires.

Now you have two goroutines processing the same user's context simultaneously. The first finishes, writes the response and updates memory. The second finishes, writes another response using stale memory state, overwriting the first update.

The user gets two responses. Memory is inconsistent. You've introduced a race condition at the application level.

This is Bond.

Before the Weave processes any Pulse, it acquires a distributed lock keyed by the user's session ID. If the lock is already held, the event is discarded. Only one Pulse is processed per user at a time, ever.

The contract is precise: AcquireLock returns (false, nil) when the lock is held (expected case), and (false, error) only for infrastructure failures. This distinction matters — the caller handles them differently.
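To make the three outcomes concrete, here is a rough sketch of how a caller might branch on that contract. The handlePulse function, the stubBond type, the "session:" key prefix, and the 30-second TTL are all assumptions for illustration; the actual Weave internals may look different.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Bond mirrors the port shown in the article.
type Bond interface {
	AcquireLock(ctx context.Context, key string, ttl time.Duration) (bool, error)
	ReleaseLock(ctx context.Context, key string) error
	ExtendLock(ctx context.Context, key string, ttl time.Duration) error
}

// stubBond returns a canned AcquireLock result so each branch can be exercised.
type stubBond struct {
	acquired bool
	err      error
}

func (s stubBond) AcquireLock(context.Context, string, time.Duration) (bool, error) {
	return s.acquired, s.err
}
func (s stubBond) ReleaseLock(context.Context, string) error               { return nil }
func (s stubBond) ExtendLock(context.Context, string, time.Duration) error { return nil }

// handlePulse is a hypothetical entry point that takes the lock,
// runs the processing function, and releases the lock afterwards.
func handlePulse(ctx context.Context, bond Bond, sessionID string, process func(context.Context) error) (string, error) {
	key := "session:" + sessionID                         // key format is an assumption
	ok, err := bond.AcquireLock(ctx, key, 30*time.Second) // TTL is an assumption
	if err != nil {
		// Infrastructure failure: surface it so the caller can retry or alert.
		return "error", fmt.Errorf("acquire lock %q: %w", key, err)
	}
	if !ok {
		// Lock is busy: another Pulse for this session is in flight.
		return "discarded", nil
	}
	defer func() { _ = bond.ReleaseLock(ctx, key) }()
	if err := process(ctx); err != nil {
		return "error", err
	}
	return "processed", nil
}

// outcome runs handlePulse against a stubBond and returns only the status.
func outcome(acquired, infraDown bool) string {
	var lockErr error
	if infraDown {
		lockErr = errors.New("redis unreachable")
	}
	status, _ := handlePulse(context.Background(), stubBond{acquired, lockErr}, "42",
		func(context.Context) error { return nil })
	return status
}

func main() {
	fmt.Println(outcome(true, false))  // processed
	fmt.Println(outcome(false, false)) // discarded
	fmt.Println(outcome(false, true))  // error
}
```

One plausible mapping for a webhook handler: acknowledge the discarded case as a success so the channel does not retry, and return a server error for the infrastructure case so it does.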

Scouts: Context Enrichment Before Every LLM Call

The pipeline that runs on every Pulse before the LLM call:

Pulse → Scouts → Pathfinder → Spirit → Oracle → Actions → Voice

Scouts are sequential context enrichment steps. They read from external systems and inject knowledge into the Pulse before the model sees anything.

type Scout interface {
    GetName() string
    Harvest(ctx context.Context, event *entities.Pulse) error
    IsApplicable(event *entities.Pulse) bool
}

The critical design decision: Scouts are fail-open.

A Scout that returns an error gets logged. The pipeline continues without its data. The LLM call still happens.

Why? Because if a Scout is hitting a CRM to enrich the user's context, and the CRM is having a slow morning, you don't want your entire agent to stop responding. You want it to keep working with less context, gracefully.
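A fail-open runner can be sketched in a few lines. This is an illustration under assumptions, not the eywa scout registry: the Pulse struct here is a stand-in for entities.Pulse with a made-up Context field, and funcScout, runScouts, and demo are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
)

// Pulse is a stand-in for entities.Pulse; the Context field is assumed.
type Pulse struct {
	Text    string
	Context map[string]string
}

// Scout mirrors the interface shown in the article, using the local Pulse.
type Scout interface {
	GetName() string
	Harvest(ctx context.Context, event *Pulse) error
	IsApplicable(event *Pulse) bool
}

// funcScout adapts a plain function to the Scout interface.
type funcScout struct {
	name    string
	harvest func(ctx context.Context, event *Pulse) error
}

func (s funcScout) GetName() string                             { return s.name }
func (s funcScout) Harvest(ctx context.Context, e *Pulse) error { return s.harvest(ctx, e) }
func (s funcScout) IsApplicable(*Pulse) bool                    { return true }

// runScouts runs each applicable Scout in order. A failing Scout is logged
// and skipped; it never aborts the pipeline. The names of failed Scouts are
// returned so the caller can record them.
func runScouts(ctx context.Context, scouts []Scout, event *Pulse) []string {
	var failed []string
	for _, s := range scouts {
		if !s.IsApplicable(event) {
			continue
		}
		if err := s.Harvest(ctx, event); err != nil {
			log.Printf("scout %s failed, continuing without it: %v", s.GetName(), err)
			failed = append(failed, s.GetName())
		}
	}
	return failed
}

// demo runs one failing Scout followed by one that succeeds.
func demo() (failedCount int, plan string) {
	event := &Pulse{Text: "hi", Context: map[string]string{}}
	scouts := []Scout{
		funcScout{"crm", func(context.Context, *Pulse) error { return errors.New("crm timeout") }},
		funcScout{"plan", func(_ context.Context, e *Pulse) error { e.Context["plan"] = "pro"; return nil }},
	}
	failed := runScouts(context.Background(), scouts, event)
	return len(failed), event.Context["plan"]
}

func main() {
	n, plan := demo()
	fmt.Println(n, plan) // 1 pro
}
```

The CRM Scout fails, the plan Scout still enriches the Pulse, and the LLM call would proceed with whatever context made it through.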

Wiring It All Together

The entire Weave is assembled at startup with a fluent builder:

weave, err := eywa.NewWeaveBuilder(ctx).
    WithRepositories(spiritRepo, memoryRepo, echoRepo, chronicleRepo).
    WithBond(bond).
    WithActionRegistry(eywa.NewActionRegistry()).
    WithScoutRegistry(eywa.NewScoutRegistry()).
    AddOracle(eywaopenai.NewOracle(apiKey)).
    WithConfig(config).
    Build()

MongoDB for Spirit configuration and conversation history. Redis for distributed locking and in-flight memory. OpenAI as the Oracle. Everything injected — nothing global.

To add Anthropic as an additional provider:

AddOracle(eywaopenai.NewOracle(openaiKey)).
AddOracle(eywaanthropic.NewOracle(anthropicKey)).

Spirits define which provider they use. The OracleFactory selects the right one at runtime.
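As an illustration of runtime selection, here is one way a provider lookup could work. Everything in it is assumed for the sketch: the Oracle interface shape, the fakeOracle type, the oracleRegistry, and the provider strings. The article only shows OracleRequest and names the OracleFactory, so treat this as the general idea and not as the eywa API.

```go
package main

import (
	"context"
	"fmt"
)

// OracleRequest is trimmed to the fields this sketch needs.
type OracleRequest struct {
	Model        string
	SystemPrompt string
}

// Oracle is an assumed shape for the LLM port.
type Oracle interface {
	Provider() string
	Complete(ctx context.Context, req OracleRequest) (string, error)
}

// fakeOracle echoes its provider and the requested model instead of calling an LLM.
type fakeOracle struct{ provider string }

func (o fakeOracle) Provider() string { return o.provider }
func (o fakeOracle) Complete(_ context.Context, req OracleRequest) (string, error) {
	return o.provider + ":" + req.Model, nil
}

// oracleRegistry is a hypothetical lookup from provider name to Oracle.
type oracleRegistry struct {
	byProvider map[string]Oracle
}

func newOracleRegistry(oracles ...Oracle) *oracleRegistry {
	r := &oracleRegistry{byProvider: make(map[string]Oracle, len(oracles))}
	for _, o := range oracles {
		r.byProvider[o.Provider()] = o
	}
	return r
}

// For returns the Oracle registered for a provider, or an error if none is wired.
func (r *oracleRegistry) For(provider string) (Oracle, error) {
	o, ok := r.byProvider[provider]
	if !ok {
		return nil, fmt.Errorf("no oracle registered for provider %q", provider)
	}
	return o, nil
}

// answerWith resolves the provider a Spirit asks for and sends one request.
func answerWith(provider, model string) (string, error) {
	reg := newOracleRegistry(fakeOracle{"openai"}, fakeOracle{"anthropic"})
	o, err := reg.For(provider)
	if err != nil {
		return "", err
	}
	return o.Complete(context.Background(), OracleRequest{Model: model})
}

func main() {
	out, err := answerWith("anthropic", "some-model")
	fmt.Println(out, err) // anthropic:some-model <nil>
}
```

A Spirit that names a provider nobody registered fails fast with a clear error instead of silently falling back to a different model.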

19 Modules: Pay for What You Use

The framework ships as 19 independent Go modules, including:

github.com/wmulabs/eywa                     # core
github.com/wmulabs/eywa/fiber               # HTTP adapter
github.com/wmulabs/eywa/mongo               # MongoDB repositories
github.com/wmulabs/eywa/redis               # Redis Bond + memory
github.com/wmulabs/eywa/mcp                 # MCP client (Conduit)
github.com/wmulabs/eywa/providers/anthropic
github.com/wmulabs/eywa/providers/openai
github.com/wmulabs/eywa/providers/gemini
github.com/wmulabs/eywa/providers/bedrock
github.com/wmulabs/eywa/providers/vertexai
github.com/wmulabs/eywa/providers/weaviate
github.com/wmulabs/eywa/providers/qdrant
github.com/wmulabs/eywa/providers/pgvector
github.com/wmulabs/eywa/providers/pinecone
github.com/wmulabs/eywa/channels/whatsapp
github.com/wmulabs/eywa/gcp/cloudtasks
github.com/wmulabs/eywa/gcp/gcs
github.com/wmulabs/eywa/gcp/gemini

If you don't use Bedrock, you don't import it. You don't get its dependencies in your go.sum. You don't get its security surface. Go developers care about this.

Security Was Not an Afterthought

Before calling this v1.0, I went through a proper security review:

SSRF protection on all outbound HTTP — private IPs, loopback, IMDS (169.254.169.254), file://, ftp:// all blocked
io.LimitReader on every HTTP response body — a misbehaving server can't OOM your agent
SHA-256 API key storage with subtle.ConstantTimeCompare — no timing attacks
Rate limiting on auth endpoints
go test -race across all 19 modules on every CI push, no exceptions
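The first two items can be sketched with the standard library alone. This is a minimal illustration, not the eywa implementation: the function names, timeouts, and size limits are assumptions. The IP check runs in the dialer's Control hook, so it sees the address after DNS resolution, and the body reader asks for one byte more than the limit to detect oversized responses.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/url"
	"strings"
	"syscall"
	"time"
)

// isBlockedIP reports whether ip is loopback, private, link-local
// (which covers the 169.254.169.254 metadata address), or unspecified.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() || ip.IsUnspecified()
}

// blocked is a string wrapper; input that does not parse as an IP is treated as blocked.
func blocked(addr string) bool {
	ip := net.ParseIP(addr)
	return ip == nil || isBlockedIP(ip)
}

// allowedScheme accepts only http and https, which rejects file://, ftp://, and the rest.
func allowedScheme(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return u.Scheme == "http" || u.Scheme == "https"
}

// newGuardedClient returns a client whose dialer rejects blocked addresses.
// Control receives the resolved IP, so a hostname that points at an
// internal address is rejected too.
func newGuardedClient() *http.Client {
	dialer := &net.Dialer{
		Timeout: 5 * time.Second, // assumed value
		Control: func(_, address string, _ syscall.RawConn) error {
			host, _, err := net.SplitHostPort(address)
			if err != nil {
				return err
			}
			if blocked(host) {
				return fmt.Errorf("dial to %s blocked", host)
			}
			return nil
		},
	}
	return &http.Client{
		Timeout:   10 * time.Second, // assumed value
		Transport: &http.Transport{DialContext: dialer.DialContext},
	}
}

var errTooLarge = errors.New("response body exceeds limit")

// readCapped reads at most max bytes and fails if the body is larger.
func readCapped(r io.Reader, max int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, max+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > max {
		return nil, errTooLarge
	}
	return data, nil
}

// fitsLimit reports whether body can be read within max bytes.
func fitsLimit(body string, max int64) bool {
	_, err := readCapped(strings.NewReader(body), max)
	return err == nil
}

func main() {
	_ = newGuardedClient()
	fmt.Println(blocked("169.254.169.254"), blocked("10.0.0.5"), blocked("127.0.0.1"), blocked("93.184.216.34"))
	// true true true false
	fmt.Println(allowedScheme("file:///etc/passwd"), allowedScheme("https://example.com"))
	// false true
	fmt.Println(fitsLimit("hello", 4), fitsLimit("hello", 5))
	// false true
}
```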

None of this is exciting. All of it matters.
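For the API key item, a small sketch of the idea: keep only the SHA-256 digest of each key and compare digests in constant time. The function names are hypothetical, and how eywa actually stores or looks up keys is not shown in this post.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// hashKey returns the SHA-256 digest of an API key. Only the digest is stored.
func hashKey(key string) [32]byte {
	return sha256.Sum256([]byte(key))
}

// keyMatches hashes the presented key and compares it to the stored digest
// in constant time, so the comparison does not leak how many bytes matched.
func keyMatches(stored [32]byte, presented string) bool {
	got := hashKey(presented)
	return subtle.ConstantTimeCompare(stored[:], got[:]) == 1
}

// checkKey is a convenience helper for the demo: it hashes the issued key
// as it would be at creation time, then verifies a presented key against it.
func checkKey(issued, presented string) bool {
	return keyMatches(hashKey(issued), presented)
}

func main() {
	fmt.Println(checkKey("sk-test-123", "sk-test-123")) // true
	fmt.Println(checkKey("sk-test-123", "sk-test-124")) // false
}
```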

What's Next

eywa is at v1.0.0. Stable, production-hardened, documented.

If you're building AI agents in Go — or you've been wanting to but couldn't find something serious enough to base a production system on — I'd genuinely love your feedback.

Pull requests welcome. Issues welcome. Blunt criticism welcome.

→ github.com/wmulabs/eywa

Maybe the world didn't need another AI framework. But it definitely needed more engineering in the ones it had.