DEV Community

Sanjeev Kumar


How We Made Grantex Enterprise-Grade: 3,332 Tests, Zero Failures

TL;DR: We ran a comprehensive security, resilience, and API quality audit on Grantex — the open authorization protocol for AI agents. We found and fixed 20+ issues across security (timing attacks, missing rate limits, open CORS), resilience (no DB transactions, no connection pools, no retry logic), and API consistency (phantom types, inconsistent error codes). The result: 3,332 tests across 27 packages with a 100% pass rate.


What is Grantex?

Grantex is OAuth 2.0 for AI agents. When an AI agent calls Salesforce, Jira, or Stripe on behalf of a human, Grantex ensures the agent can only do what it was authorized to do — read contacts, not delete them.

  • 3 SDKs: TypeScript, Python, Go
  • 53 pre-built tool manifests for scope enforcement
  • Integrations: LangChain, Anthropic, OpenAI Agents, CrewAI, Google ADK, Vercel AI, MCP
  • W3C Verifiable Credentials, FIDO2/WebAuthn, DID infrastructure
  • Apache 2.0, IETF Internet-Draft

The Audit

We ran 5 parallel deep-dive audits:

1. Security Audit

Found:

  • Admin authentication compared secrets with !== (vulnerable to timing attacks)
  • Zero rate limits on signup, session creation, SAML callbacks
  • CORS wide open (the permissive default of the Fastify CORS plugin, which allows any origin)
  • SSO state parameter was plain base64 (forgeable)
  • JWT exp claim not validated in grant introspection
  • No scope format validation (could send 10MB scope strings)

Fixed:

  • crypto.timingSafeEqual() for admin auth
  • Rate limits on every POST/PATCH/DELETE endpoint
  • CORS locked to origin: false
  • SSO state HMAC-signed with timing-safe verification
  • JWT exp claim validated during grant introspection
  • Scope validation: max 256 chars, max 100 scopes
  • 7 HTTP security headers (HSTS, X-Content-Type-Options, X-Frame-Options, etc.)
  • 1MB body size limit
  • Input trimming on all string fields
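
To make the first and fourth fixes concrete, here is a minimal sketch of a timing-safe comparison and an HMAC-signed SSO state, using only Node's built-in crypto module. The helper names (safeEqual, signState, verifyState) and the "body.signature" layout are assumptions for illustration, not the actual Grantex code.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Constant-time string comparison. timingSafeEqual throws on buffers of
// different lengths, so both inputs are hashed to fixed-size digests first.
function safeEqual(a: string, b: string): boolean {
  const ha = createHash("sha256").update(a).digest();
  const hb = createHash("sha256").update(b).digest();
  return timingSafeEqual(ha, hb);
}

// Hypothetical state format: base64url(JSON payload) + "." + base64url(HMAC-SHA256).
function signState(payload: Record<string, unknown>, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const mac = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${mac}`;
}

// Returns the payload when the signature matches, otherwise null.
function verifyState(state: string, secret: string): Record<string, unknown> | null {
  const parts = state.split(".");
  if (parts.length !== 2) return null;
  const [body, mac] = parts;
  const expected = createHmac("sha256", secret).update(body).digest("base64url");
  if (!safeEqual(mac, expected)) return null;
  return JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
}
```

With plain base64, anyone can decode the state, edit it, and re-encode it. With the signature attached, an edited body no longer matches its MAC, so verifyState rejects it.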

2. Resilience Audit

Found:

  • No database connection pool settings
  • Token exchange: 4 sequential DB writes without transaction
  • Redis errors silently swallowed
  • No graceful shutdown (workers leaked on SIGTERM)
  • Health check just returned 200 (didn't check DB or Redis)
  • No SDK retry logic in any language

Fixed:

  • Connection pool settings: max 20 connections, 30s idle timeout, 10s connect timeout, 30min max lifetime
  • Token exchange wrapped in sql.begin() transaction
  • Redis error handlers with reconnection logging
  • Graceful shutdown: SIGTERM → stop workers → close Redis → close DB
  • Deep health check: SELECT 1 + PING with 2s timeout, 503 on degraded
  • Config validation on startup (exits on missing DATABASE_URL, REDIS_URL, etc.)
  • SDK retry with exponential backoff + jitter in TypeScript, Python, and Go
  • Webhook retry jitter (20% randomization to prevent thundering herd)
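
For readers who have not written retry logic before, the backoff-plus-jitter idea can be sketched as below. This is not the Grantex SDK's retry code: the option names, the withRetry helper, and the choice of "full jitter" (a random delay between zero and the exponential ceiling) are all assumptions made for the example.

```typescript
// Hypothetical option names; the real SDK configuration may differ.
interface RetryOptions {
  maxRetries: number;  // retries allowed after the first attempt
  baseDelayMs: number; // ceiling for the first retry delay
  maxDelayMs: number;  // upper bound on any single delay
}

// Full jitter: a random delay in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
// The random source is injectable so the function can be tested deterministically.
function backoffDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls fn until it succeeds, the retry budget is spent, or the error
// is classified as non-retryable.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
  isRetryable: (err: unknown) => boolean = () => true,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= opts.maxRetries || !isRetryable(err)) throw err;
      const delay = backoffDelay(attempt, opts);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The jitter is what prevents a thundering herd: without it, every client that failed at the same moment retries at the same moment too.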

3. API Consistency Audit

Found:

  • SDK types defined pagination fields (total, page, pageSize) that the API never returned
  • Admin error responses missing code and requestId
  • HTTP 422 used for parse errors (should be 400)
  • Inconsistent error formats across routes

Fixed:

  • SDK types aligned with actual API responses
  • All error responses standardized: { message, code, requestId }
  • 422 reserved for semantic validation only
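
A rough sketch of the 400-versus-422 split and the standardized error body follows. The handler name, the error code strings, and the reading of the scope limits (256 characters per scope, 100 scopes per request) are assumptions for illustration. Only the { message, code, requestId } shape comes from the audit itself.

```typescript
// The standardized error body described above.
interface ApiError {
  message: string;
  code: string;
  requestId: string;
}

interface HandlerResult {
  status: number;
  body: ApiError | { scopes: string[] };
}

function errorResult(status: number, code: string, message: string, requestId: string): HandlerResult {
  return { status, body: { message, code, requestId } };
}

// Hypothetical handler: 400 when the body cannot be parsed at all,
// 422 when it parses but breaks a semantic rule.
function handleCreateGrant(rawBody: string, requestId: string): HandlerResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return errorResult(400, "BAD_REQUEST", "Request body is not valid JSON", requestId);
  }

  const scopes =
    typeof parsed === "object" && parsed !== null
      ? (parsed as { scopes?: unknown }).scopes
      : undefined;

  // Assumed limits: at most 100 scopes, each at most 256 characters.
  const valid =
    Array.isArray(scopes) &&
    scopes.length <= 100 &&
    scopes.every((s) => typeof s === "string" && s.length <= 256);

  if (!valid) {
    return errorResult(422, "VALIDATION_ERROR", "scopes must be an array of short strings", requestId);
  }
  return { status: 201, body: { scopes: scopes as string[] } };
}
```

Because every failure path goes through one helper, no route can forget code or requestId.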

4. Observability

Found:

  • console.log and console.error everywhere
  • No structured logging
  • No log levels

Fixed:

  • Structured pino JSON logging (zero console.* remaining)
  • LOG_LEVEL env var with dev-mode pino-pretty
  • Worker child loggers with context bindings
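
If "child loggers with context bindings" is unfamiliar, the toy below shows the idea without any dependencies. It is deliberately not pino: it is a small stand-in that borrows pino's numeric levels and JSON-lines output so the mechanics are visible. The createLogger and parseLevel names and the worker field are hypothetical.

```typescript
type Level = "debug" | "info" | "warn" | "error";
type Bindings = Record<string, unknown>;

// Numeric values follow pino's convention (debug 20 ... error 50).
const LEVELS: Record<Level, number> = { debug: 20, info: 30, warn: 40, error: 50 };

interface Logger {
  log(level: Level, msg: string, fields?: Bindings): void;
  child(bindings: Bindings): Logger;
}

// Falls back to "info" when LOG_LEVEL is unset or unrecognized.
function parseLevel(value: string | undefined): Level {
  const known = ["debug", "info", "warn", "error"];
  return value !== undefined && known.includes(value) ? (value as Level) : "info";
}

function createLogger(
  minLevel: Level,
  write: (line: string) => void = (line) => process.stdout.write(line + "\n"),
  bindings: Bindings = {},
): Logger {
  return {
    log(level, msg, fields = {}) {
      if (LEVELS[level] < LEVELS[minLevel]) return; // below threshold: drop
      write(JSON.stringify({ level: LEVELS[level], time: Date.now(), ...bindings, ...fields, msg }));
    },
    // A child inherits the parent's bindings and adds its own.
    child(extra) {
      return createLogger(minLevel, write, { ...bindings, ...extra });
    },
  };
}

// Usage: every line from this worker carries its name automatically.
const rootLogger = createLogger(parseLevel(process.env.LOG_LEVEL));
const workerLogger = rootLogger.child({ worker: "webhook-delivery" });
workerLogger.log("info", "worker started");
```

The payoff over console.log is that each line is machine-parseable and already tagged with its context, so filtering logs by worker or request needs no string matching.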

5. Documentation Gaps

Found:

  • 29 API endpoints implemented but undocumented
  • SDK overview tables missing 6-9 resource clients
  • 4 SDK docs pages missing (Python/Go vault + passports)
  • No Security Hardening guide
  • No Operations guide

Fixed:

  • 35 new API reference pages
  • All overview tables complete (20/20 services)
  • Security Hardening + Operations guides
  • SDK retry configuration documented
  • CHANGELOG updated

The Numbers

Metric                     | Before               | After
Security issues            | 12                   | 0
Endpoints with rate limits | ~5                   | All
SDK retry logic            | None                 | All 3 languages
Health check depth         | Shallow (200 always) | DB + Redis (503 on degraded)
Structured logging         | None                 | Full pino
Test count                 | ~2,800               | 3,332
Pass rate                  | ~99%                 | 100%
Packages tested            | ~20                  | 27

Try It

npm install @grantex/sdk    # TypeScript
pip install grantex          # Python
go get github.com/mishrasanjeev/grantex-go  # Go

Star the repo if you're building AI agents that need real authorization.


Grantex is Apache 2.0 licensed. IETF Internet-Draft submitted. SOC 2 Type I certified.
