Wallet Guy

611 Tests Later: How We Built Reliable Wallet Infrastructure for AI

Building reliable wallet infrastructure for AI agents isn't just about getting transactions to work. It's about ensuring they work safely and consistently, and that they won't drain your funds when something goes wrong. After writing 626+ test files and implementing 3-layer security with default-deny policies, we learned that the hardest part isn't the crypto; it's the guardrails.

Why Testing Matters for Crypto Infrastructure

When you're building wallet infrastructure that AI agents will control, every edge case becomes a potential fund drain. Unlike traditional applications where a bug might show the wrong UI or corrupt some data, wallet bugs can permanently lose money. There's no "undo" button on the blockchain.

This is why we built WAIaaS with security-first architecture: 3-layer protection (session auth → policy enforcement → human approval), default-deny policies, and extensive testing across every component. The result is wallet infrastructure that developers can actually trust with real funds.

The Security Architecture

Layer 1: Session Authentication

AI agents don't get direct wallet access. Instead, they receive time-limited session tokens with specific permissions:

# Create a session for an AI agent
curl -X POST http://127.0.0.1:3100/v1/sessions \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{"walletId": "<wallet-uuid>"}'

WAIaaS separates access into three distinct auth methods:

  • masterAuth: System administrator (wallet creation, policy management)
  • sessionAuth: AI agents (transactions within policy limits)
  • ownerAuth: Fund owners (transaction approval, emergency controls)
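
As a quick reference, here is a small shell sketch that maps each auth method to the HTTP headers used in this article's curl examples. The `auth_headers` helper is hypothetical and only summarizes those examples; it is not part of WAIaaS.

```shell
# Hypothetical helper: print the header names each auth method uses
# in the curl examples in this article.
auth_headers() {
  case "$1" in
    masterAuth)  echo "X-Master-Password" ;;
    sessionAuth) echo "Authorization" ;;
    ownerAuth)   echo "X-Owner-Signature X-Owner-Message" ;;
    *)           echo "unknown auth method: $1" >&2; return 1 ;;
  esac
}

auth_headers sessionAuth   # prints: Authorization
```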

Layer 2: Policy Engine with Default-Deny

The policy engine enforces 21 policy types across 4 security tiers: INSTANT, NOTIFY, DELAY, APPROVAL. Most importantly, it defaults to deny—your agent can't touch tokens you haven't explicitly allowed.

# Create a spending limit policy
curl -X POST http://127.0.0.1:3100/v1/policies \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{
    "walletId": "<wallet-uuid>",
    "type": "SPENDING_LIMIT",
    "rules": {
      "instant_max_usd": 100,
      "notify_max_usd": 500,
      "delay_max_usd": 2000,
      "delay_seconds": 900,
      "daily_limit_usd": 5000
    }
  }'
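
To make the tiers concrete, here is a minimal shell sketch of how a transaction amount could map to a tier under the SPENDING_LIMIT example. The `classify_tier` function and its inclusive thresholds are assumptions for illustration, not the actual WAIaaS logic, and the sketch ignores `daily_limit_usd` and `delay_seconds`.

```shell
# Thresholds copied from the SPENDING_LIMIT example policy.
INSTANT_MAX_USD=100
NOTIFY_MAX_USD=500
DELAY_MAX_USD=2000

# Hypothetical helper: map a whole-dollar USD amount to a security tier.
# Treating each threshold as an inclusive upper bound is an assumption.
classify_tier() {
  if [ "$1" -le "$INSTANT_MAX_USD" ]; then
    echo "INSTANT"
  elif [ "$1" -le "$NOTIFY_MAX_USD" ]; then
    echo "NOTIFY"
  elif [ "$1" -le "$DELAY_MAX_USD" ]; then
    echo "DELAY"
  else
    echo "APPROVAL"
  fi
}

classify_tier 50     # prints: INSTANT
classify_tier 750    # prints: DELAY
classify_tier 2500   # prints: APPROVAL
```

Anything above `delay_max_usd` falls through to APPROVAL in this sketch, so the strictest tier is the fallback rather than the most permissive one.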

Critical policies for security:

  • ALLOWED_TOKENS: Whitelist which tokens the agent can transfer (default-deny)
  • CONTRACT_WHITELIST: Whitelist which contracts the agent can call (default-deny)
  • APPROVED_SPENDERS: Control which protocols can be approved for token spending
  • SPENDING_LIMIT: Amount-based security tiers

Without these policies configured, transactions are blocked. No surprises.
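
The default-deny idea itself fits in a few lines. The sketch below assumes a space-separated allowlist and two hypothetical helpers, `is_token_allowed` and `check_transfer`; the real policy engine is certainly more involved, but the shape is the same: anything not listed is denied.

```shell
# Hypothetical allowlist, using the token identifier from the quick start.
ALLOWED_TOKENS="native:solana"

# Succeeds only if $1 appears in the allowlist; an empty list denies everything.
is_token_allowed() {
  for allowed in $ALLOWED_TOKENS; do
    if [ "$allowed" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

check_transfer() {
  if is_token_allowed "$1"; then
    echo "ALLOW $1"
  else
    echo "DENY $1"
  fi
}

check_transfer "native:solana"                                  # prints: ALLOW native:solana
check_transfer "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v"   # prints: DENY EPjFWdd5...
```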

Layer 3: Human-in-the-Loop Approval

For high-value transactions that exceed policy limits, WAIaaS requires human approval through multiple channels:

  • WalletConnect: Mobile wallet approval
  • Telegram: Encrypted bot notifications
  • Push notifications: Real-time alerts

# Approve a pending transaction
curl -X POST http://127.0.0.1:3100/v1/transactions/<tx-id>/approve \
  -H "X-Owner-Signature: <ed25519-or-secp256k1-signature>" \
  -H "X-Owner-Message: <signed-message>"

Testing Strategy: 626+ Test Files

Building reliable crypto infrastructure requires comprehensive testing at every layer. Our test coverage includes:

  • Unit tests: Individual functions and components
  • Integration tests: API endpoints and service interactions
  • End-to-end tests: Complete transaction flows
  • Policy tests: Security enforcement scenarios
  • Network tests: Multi-chain transaction handling

The 7-stage transaction pipeline is particularly critical to test thoroughly:

  1. Validate: Check transaction format and basic requirements
  2. Authenticate: Verify session tokens and permissions
  3. Policy: Enforce spending limits, whitelists, and security tiers
  4. Wait: Handle delays for DELAY-tier transactions
  5. Execute: Submit transactions to blockchain networks
  6. Confirm: Monitor for confirmation and handle failures
  7. Stages: Coordinate the entire pipeline

Each stage has its own test suite covering success cases, error conditions, and edge cases.
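
A staged pipeline like this is easy to test because each stage can fail on its own. Here is a rough sketch, assuming each stage is a function that returns non-zero on failure and that the coordinator stops at the first failing stage. The `stage_*` functions are hypothetical stand-ins, and `run_pipeline` plays the coordinating role from item 7.

```shell
# Hypothetical stage functions: return 0 on success, non-zero on failure.
stage_validate()     { [ -n "$1" ]; }             # reject an empty request
stage_authenticate() { return 0; }
stage_policy()       { [ "$1" != "blocked" ]; }   # stand-in for a policy denial
stage_wait()         { return 0; }
stage_execute()      { return 0; }
stage_confirm()      { return 0; }

# Run the stages in order and stop at the first failure.
run_pipeline() {
  for stage in validate authenticate policy wait execute confirm; do
    if ! "stage_$stage" "$1"; then
      echo "FAILED at $stage"
      return 1
    fi
  done
  echo "CONFIRMED"
}

run_pipeline "transfer"          # prints: CONFIRMED
run_pipeline "blocked" || true   # prints: FAILED at policy
```

Because the policy stage runs before execute, a denied request never reaches the network in this sketch.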

Real-World Testing with 14 DeFi Protocols

WAIaaS integrates with 14 DeFi protocol providers, each requiring different transaction patterns and security considerations:

  • DEXs: Jupiter (Solana), 0x Protocol (EVM)
  • Lending: Aave V3, Kamino
  • Staking: Lido, Jito
  • Bridges: LI.FI, Across
  • Derivatives: Hyperliquid perpetuals
  • Prediction markets: Polymarket

Testing these integrations revealed a common security pattern: simulate the action with a dry run before executing it.

# Test DeFi action with dry-run first
curl -X POST http://127.0.0.1:3100/v1/actions/jupiter-swap/swap \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "inputMint": "So11111111111111111111111111111111111111112",
    "outputMint": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
    "amount": "1000000000",
    "dryRun": true
  }'

The dry-run capability lets you simulate transactions before execution—critical for testing and agent development.

Multi-Chain Testing Complexity

Supporting 2 chain types (EVM and Solana) across 18 networks means testing transaction formats, gas estimation, nonce management, and confirmation patterns for each combination. Each chain has different:

  • Transaction formats: EVM uses RLP-encoded transactions, Solana uses its own compact binary serialization
  • Signing schemes: secp256k1 vs ed25519
  • Gas mechanics: EVM gas vs Solana compute units
  • Confirmation patterns: Block times and finality rules

Our test suites validate these differences to ensure consistent behavior regardless of the underlying blockchain.
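
One way to keep per-chain tests honest is to drive them from a single lookup of what differs. The `chain_profile` helper below is a hypothetical sketch that records only the signing scheme and fee unit from the list above; a real test matrix would carry more fields.

```shell
# Hypothetical lookup: signing scheme and fee unit for each chain type.
chain_profile() {
  case "$1" in
    evm)    echo "secp256k1 gas" ;;
    solana) echo "ed25519 compute-units" ;;
    *)      echo "unsupported"; return 1 ;;
  esac
}

# Iterate over both chain types, as a parameterized test suite might.
for chain in evm solana; do
  echo "$chain: $(chain_profile "$chain")"
done
```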

Docker Testing and Deployment

Reliable infrastructure needs reliable deployment. WAIaaS includes comprehensive Docker support with auto-provisioning and health checks:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3100/health"]
  interval: 30s
  timeout: 5s
  start_period: 10s
  retries: 3

The Docker deployment handles:

  • Auto-provisioning: Generate secure credentials on first boot
  • Health monitoring: Continuous availability checking
  • Secret management: Production-ready credential handling
  • Non-root execution: Security-hardened container user (UID 1001)
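
If you script against the daemon, it helps to wait for it to become healthy before sending requests. The `wait_until_ok` function below is a hypothetical sketch that mirrors the retry idea of the healthcheck: it runs a probe command up to a fixed number of times with a pause between attempts.

```shell
# Hypothetical helper: retry a probe command until it succeeds.
# $1 = max attempts, $2 = seconds between attempts, rest = probe command
wait_until_ok() {
  attempts_left=$1
  pause=$2
  shift 2
  while [ "$attempts_left" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      echo "healthy"
      return 0
    fi
    attempts_left=$((attempts_left - 1))
    if [ "$attempts_left" -gt 0 ]; then
      sleep "$pause"
    fi
  done
  echo "unhealthy"
  return 1
}

# Example against the health endpoint (needs a running daemon):
# wait_until_ok 3 30 curl -f --max-time 5 http://127.0.0.1:3100/health
```

Passing the probe as a command keeps the function testable without a network, for example `wait_until_ok 3 0 true`.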

API Testing with OpenAPI Specification

All 39 REST API route modules are documented with OpenAPI 3.0 specifications and include comprehensive test coverage. The interactive API reference at /reference makes testing straightforward:

# Download complete API spec for testing tools
curl http://127.0.0.1:3100/doc -o openapi.json

# View interactive docs (open is macOS-only; use xdg-open on Linux)
open http://127.0.0.1:3100/reference

Quick Start: Deploy and Test

Ready to test the security architecture yourself? Here's the minimal setup:

  1. Deploy with Docker:
git clone https://github.com/minhoyoo-iotrust/WAIaaS.git
cd WAIaaS
docker compose up -d

  2. Create a wallet and session:
# Create wallet (masterAuth)
curl -X POST http://127.0.0.1:3100/v1/wallets \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{"name": "test-wallet", "chain": "solana", "environment": "devnet"}'

# Create session for testing
curl -X POST http://127.0.0.1:3100/v1/sessions \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{"walletId": "<wallet-uuid>"}'

  3. Set up default-deny policies:
# Without ALLOWED_TOKENS policy, all transfers are blocked
curl -X POST http://127.0.0.1:3100/v1/policies \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{
    "walletId": "<wallet-uuid>",
    "type": "ALLOWED_TOKENS", 
    "rules": {
      "tokens": [{"address": "native:solana", "symbol": "SOL", "chain": "solana"}]
    }
  }'

  4. Test the security:
# This will succeed (SOL is whitelisted)
curl -X POST http://127.0.0.1:3100/v1/transactions/send \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "type": "TRANSFER",
    "to": "test-address",
    "amount": "0.001",
    "dryRun": true
  }'

# This will fail (USDC not whitelisted)
curl -X POST http://127.0.0.1:3100/v1/transactions/send \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "type": "TOKEN_TRANSFER", 
    "token": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
    "to": "test-address",
    "amount": "1.0",
    "dryRun": true
  }'

  5. Check the test results:
# View API health and status
curl http://127.0.0.1:3100/health

# Check policy enforcement logs
docker compose logs waiaas-daemon | grep POLICY

The security layers work: without explicit token whitelisting, transfers are blocked. Your AI agent can only touch funds you've explicitly allowed.

For comprehensive security testing, explore the Admin Web UI at http://127.0.0.1:3100/admin to configure policies, monitor transactions, and test approval flows.

What's Next

The 626+ test files represent our commitment to building crypto infrastructure developers can actually trust with real funds. Default-deny policies, 3-layer security, and comprehensive testing aren't just features—they're requirements for safe AI agent wallets.

Ready to build secure wallet infrastructure for your AI agents? Explore the complete codebase on GitHub (https://github.com/minhoyoo-iotrust/WAIaaS) or try the live system at waiaas.ai.
