Wallet Guy

Posted on Apr 4

611 Tests Later: How We Built Reliable Wallet Infrastructure for AI

#ai #testing #security #opensource

Building reliable wallet infrastructure for AI agents isn't just about getting transactions to work. It's about making sure they work safely and consistently, and that nothing drains your funds when something goes wrong. After writing 626+ test files and implementing 3-layer security with default-deny policies, we learned that the hardest part isn't the crypto; it's the guardrails.

Why Testing Matters for Crypto Infrastructure

When you're building wallet infrastructure that AI agents will control, every edge case becomes a potential fund drain. Unlike traditional applications where a bug might show the wrong UI or corrupt some data, wallet bugs can permanently lose money. There's no "undo" button on the blockchain.

This is why we built WAIaaS with a security-first architecture: 3-layer protection (session auth → policy enforcement → human approval), default-deny policies, and extensive testing across every component. The result is wallet infrastructure that developers can actually trust with real funds.

The Security Architecture

Layer 1: Session Authentication

AI agents don't get direct wallet access. Instead, they receive time-limited session tokens with specific permissions:

# Create a session for an AI agent
curl -X POST http://127.0.0.1:3100/v1/sessions \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{"walletId": "<wallet-uuid>"}'

WAIaaS separates access into three distinct auth methods, and AI agents only ever use sessionAuth:

masterAuth: System administrator (wallet creation, policy management)
sessionAuth: AI agents (transactions within policy limits)
ownerAuth: Fund owners (transaction approval, emergency controls)
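To make the separation concrete, here is a minimal Python sketch that maps each auth method to the HTTP headers used in this post's curl examples. The pairing of method to header is inferred from those examples, and the `AuthMethod` enum and `auth_headers` helper are hypothetical names, not part of WAIaaS.

```python
from enum import Enum


class AuthMethod(Enum):
    MASTER = "masterAuth"    # system administrator
    SESSION = "sessionAuth"  # AI agent
    OWNER = "ownerAuth"      # fund owner


def auth_headers(method: AuthMethod, **credentials: str) -> dict[str, str]:
    """Return the headers for one auth method (names taken from the curl examples)."""
    if method is AuthMethod.MASTER:
        return {"X-Master-Password": credentials["password"]}
    if method is AuthMethod.SESSION:
        return {"Authorization": f"Bearer {credentials['token']}"}
    if method is AuthMethod.OWNER:
        return {
            "X-Owner-Signature": credentials["signature"],
            "X-Owner-Message": credentials["message"],
        }
    raise ValueError(f"unknown auth method: {method}")
```

Keeping the three credential types in separate code paths makes it harder for an agent-side client to send an administrator or owner credential by accident.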

Layer 2: Policy Engine with Default-Deny

The policy engine enforces 21 policy types across 4 security tiers: INSTANT, NOTIFY, DELAY, APPROVAL. Most importantly, it defaults to deny—your agent can't touch tokens you haven't explicitly allowed.

# Create a spending limit policy
curl -X POST http://127.0.0.1:3100/v1/policies \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{
    "walletId": "<wallet-uuid>",
    "type": "SPENDING_LIMIT",
    "rules": {
      "instant_max_usd": 100,
      "notify_max_usd": 500,
      "delay_max_usd": 2000,
      "delay_seconds": 900,
      "daily_limit_usd": 5000
    }
  }'
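One plausible reading of these thresholds, sketched in Python: an amount falls into the first tier whose maximum it does not exceed, and anything above `delay_max_usd` needs approval. The inclusive boundaries are an assumption, `classify_tier` is a hypothetical helper, and the daily limit is deliberately not modeled because its exact behavior is not described here.

```python
def classify_tier(amount_usd: float, rules: dict) -> str:
    """Map a USD amount to a security tier using SPENDING_LIMIT-style rules.

    Assumption: thresholds are inclusive upper bounds.
    """
    if amount_usd < 0:
        raise ValueError("amount must be non-negative")
    if amount_usd <= rules["instant_max_usd"]:
        return "INSTANT"
    if amount_usd <= rules["notify_max_usd"]:
        return "NOTIFY"
    if amount_usd <= rules["delay_max_usd"]:
        return "DELAY"  # would be held for rules["delay_seconds"] before executing
    return "APPROVAL"


rules = {
    "instant_max_usd": 100,
    "notify_max_usd": 500,
    "delay_max_usd": 2000,
    "delay_seconds": 900,
    "daily_limit_usd": 5000,
}
```

With the rules above, a $50 transfer would be INSTANT, $250 NOTIFY, $1,500 DELAY (held for 15 minutes), and $2,500 APPROVAL.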

Critical policies for security:

ALLOWED_TOKENS: Whitelist which tokens the agent can transfer (default-deny)
CONTRACT_WHITELIST: Whitelist which contracts the agent can call (default-deny)
APPROVED_SPENDERS: Control which protocols can be approved for token spending
SPENDING_LIMIT: Amount-based security tiers

Without these policies configured, transactions are blocked. No surprises.
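The default-deny idea itself is small enough to sketch. The Python below is an illustration rather than the WAIaaS implementation: it assumes policies shaped like the ALLOWED_TOKENS request body shown later in this post, and `is_token_allowed` is a hypothetical name.

```python
def is_token_allowed(policies: list[dict], token_address: str) -> bool:
    """Default-deny: allow only if some ALLOWED_TOKENS policy lists the address."""
    for policy in policies:
        if policy.get("type") != "ALLOWED_TOKENS":
            continue
        for token in policy.get("rules", {}).get("tokens", []):
            if token.get("address") == token_address:
                return True
    # No matching whitelist entry (or no policy at all) means the transfer is blocked.
    return False


sol_only = [{
    "type": "ALLOWED_TOKENS",
    "rules": {"tokens": [{"address": "native:solana", "symbol": "SOL", "chain": "solana"}]},
}]
```

The important property is the final `return False`: an empty policy list blocks everything, so a misconfiguration fails closed instead of open.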

Layer 3: Human-in-the-Loop Approval

For high-value transactions that exceed policy limits, WAIaaS requires human approval through multiple channels:

WalletConnect: Mobile wallet approval
Telegram: Encrypted bot notifications
Push notifications: Real-time alerts

# Approve a pending transaction
curl -X POST http://127.0.0.1:3100/v1/transactions/<tx-id>/approve \
  -H "X-Owner-Signature: <ed25519-or-secp256k1-signature>" \
  -H "X-Owner-Message: <signed-message>"

Testing Strategy: 626+ Test Files

Building reliable crypto infrastructure requires comprehensive testing at every layer. Our test coverage includes:

Unit tests: Individual functions and components
Integration tests: API endpoints and service interactions
End-to-end tests: Complete transaction flows
Policy tests: Security enforcement scenarios
Network tests: Multi-chain transaction handling

The 7-stage transaction pipeline is particularly critical to test thoroughly:

Validate: Check transaction format and basic requirements
Authenticate: Verify session tokens and permissions
Policy: Enforce spending limits, whitelists, and security tiers
Wait: Handle delays for DELAY-tier transactions
Execute: Submit transactions to blockchain networks
Confirm: Monitor for confirmation and handle failures
Stages: Coordinate the entire pipeline, sequencing the steps above and handling failures between them

Each stage has its own test suite covering success cases, error conditions, and edge cases.
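A staged pipeline like this is easy to test when each stage is a plain function and the runner stops at the first failure. The Python sketch below shows that shape under stated assumptions: `run_pipeline`, `StageError`, and the two toy stages are hypothetical and much simpler than the real stages.

```python
from typing import Callable

Stage = Callable[[dict], dict]  # receives the transaction context, returns the updated one


class StageError(Exception):
    """Raised when a stage fails; records which stage and what had already run."""

    def __init__(self, stage: str, reason: str, completed: list[str]):
        super().__init__(f"{stage}: {reason}")
        self.stage = stage
        self.completed = completed


def run_pipeline(stages: list[tuple[str, Stage]], tx: dict) -> dict:
    """Run stages in order and stop at the first failure."""
    ctx = dict(tx)
    completed: list[str] = []
    for name, stage in stages:
        try:
            ctx = stage(ctx)
        except Exception as exc:
            raise StageError(name, str(exc), list(completed)) from exc
        completed.append(name)
    return {**ctx, "completed": completed}


def validate(ctx: dict) -> dict:
    if not ctx.get("to"):
        raise ValueError("missing recipient")
    return ctx


def policy(ctx: dict) -> dict:
    if ctx.get("token") not in ctx.get("allowed_tokens", []):
        raise PermissionError("token not whitelisted")
    return ctx
```

Because a failure carries the stage name and the list of completed stages, a test can assert that a policy violation never reaches the execute step.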

Real-World Testing with 14 DeFi Protocols

WAIaaS integrates with 14 DeFi protocol providers, each requiring different transaction patterns and security considerations:

DEXs: Jupiter (Solana), 0x Protocol (EVM)
Lending: Aave V3, Kamino
Staking: Lido, Jito
Bridges: LI.FI, Across
Derivatives: Hyperliquid perpetuals
Prediction markets: Polymarket

Testing these integrations revealed common security patterns:

# Test DeFi action with dry-run first
curl -X POST http://127.0.0.1:3100/v1/actions/jupiter-swap/swap \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "inputMint": "So11111111111111111111111111111111111111112",
    "outputMint": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
    "amount": "1000000000",
    "dryRun": true
  }'

The dry-run capability lets you simulate transactions before execution—critical for testing and agent development.
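One way an agent could use this is to simulate first and only send the real request if the simulation passes. The Python sketch below assumes a caller-supplied `submit` function that posts the payload and returns parsed JSON; the `"success"` response field is hypothetical, so adjust it to whatever the API actually returns.

```python
from typing import Callable


def simulate_then_execute(submit: Callable[[dict], dict], payload: dict) -> dict:
    """Send the payload with dryRun=True first; execute only if the simulation succeeds.

    Assumption: the response carries a boolean "success" field (hypothetical).
    """
    simulation = submit({**payload, "dryRun": True})
    if not simulation.get("success", False):
        return {"executed": False, "simulation": simulation}
    result = submit({**payload, "dryRun": False})
    return {"executed": True, "simulation": simulation, "result": result}
```

Injecting `submit` keeps the helper testable without a running daemon: a test can pass a fake that records every payload it receives.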

Multi-Chain Testing Complexity

Supporting 2 chain types (EVM and Solana) across 18 networks means testing transaction formats, gas estimation, nonce management, and confirmation patterns for each combination. Each chain has different:

Transaction formats: EVM transactions are RLP-encoded, while Solana uses its own compact binary wire format
Signing schemes: secp256k1 vs ed25519
Gas mechanics: EVM gas vs Solana compute units
Confirmation patterns: Block times and finality rules

Our test suites validate these differences to ensure consistent behavior regardless of the underlying blockchain.
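For test suites, these differences can be captured in a small lookup table that parametrized tests iterate over. The Python sketch below records only what is listed above; `ChainProfile` and `profile_for` are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainProfile:
    chain_type: str
    signing_scheme: str
    fee_unit: str


CHAIN_PROFILES = {
    "evm": ChainProfile("evm", "secp256k1", "gas"),
    "solana": ChainProfile("solana", "ed25519", "compute units"),
}


def profile_for(chain_type: str) -> ChainProfile:
    """Look up the per-chain expectations; unknown chain types are rejected."""
    try:
        return CHAIN_PROFILES[chain_type.lower()]
    except KeyError:
        raise ValueError(f"unsupported chain type: {chain_type}") from None
```

A test can then loop over `CHAIN_PROFILES.values()` and run the same assertions against each chain type.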

Docker Testing and Deployment

Reliable infrastructure needs reliable deployment. WAIaaS includes comprehensive Docker support with auto-provisioning and health checks:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3100/health"]
  interval: 30s
  timeout: 5s
  start_period: 10s
  retries: 3

The Docker deployment handles:

Auto-provisioning: Generate secure credentials on first boot
Health monitoring: Continuous availability checking
Secret management: Production-ready credential handling
Non-root execution: Security-hardened container user (UID 1001)
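In integration tests it helps to wait for the daemon to report healthy before sending requests. The Python sketch below loosely mirrors the healthcheck settings above as a client-side wait. It is a simplification and does not reproduce Docker's exact healthcheck semantics; `wait_until_healthy` and `http_probe` are hypothetical helpers.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str = "http://127.0.0.1:3100/health", timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers without an HTTP or network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:  # includes URLError, HTTPError, and socket timeouts
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 3,
    interval: float = 30.0,
    start_period: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait start_period, then probe up to `retries` times, pausing between attempts."""
    sleep(start_period)
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False
```

Calling `wait_until_healthy(http_probe)` before a test run avoids flaky failures caused by hitting the API while the container is still starting.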

API Testing with OpenAPI Specification

All 39 REST API route modules are documented with OpenAPI 3.0 specifications and include comprehensive test coverage. The interactive API reference at /reference makes testing straightforward:

# Download complete API spec for testing tools
curl http://127.0.0.1:3100/doc -o openapi.json

# View interactive docs (`open` is macOS-specific; use `xdg-open` on Linux)
open http://127.0.0.1:3100/reference
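Once the spec is downloaded, a few lines of Python can enumerate every operation so a test suite can check that each one is covered. This sketch relies only on the standard OpenAPI layout (`paths` → path → HTTP method); `list_operations` and `load_spec` are hypothetical helper names.

```python
import json

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs for every operation in an OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def load_spec(filename: str = "openapi.json") -> dict:
    """Load the spec saved by the curl command above."""
    with open(filename, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `for method, path in list_operations(load_spec()): print(method, path)` prints one line per endpoint.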

Quick Start: Deploy and Test

Ready to test the security architecture yourself? Here's the minimal setup:

Deploy with Docker:

git clone https://github.com/minhoyoo-iotrust/WAIaaS.git
cd WAIaaS
docker compose up -d

Create a wallet and session:

# Create wallet (masterAuth)
curl -X POST http://127.0.0.1:3100/v1/wallets \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{"name": "test-wallet", "chain": "solana", "environment": "devnet"}'

# Create session for testing
curl -X POST http://127.0.0.1:3100/v1/sessions \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{"walletId": "<wallet-uuid>"}'

Set up default-deny policies:

# Without ALLOWED_TOKENS policy, all transfers are blocked
curl -X POST http://127.0.0.1:3100/v1/policies \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{
    "walletId": "<wallet-uuid>",
    "type": "ALLOWED_TOKENS",
    "rules": {
      "tokens": [{"address": "native:solana", "symbol": "SOL", "chain": "solana"}]
    }
  }'

Test the security:

# This will succeed (SOL is whitelisted)
curl -X POST http://127.0.0.1:3100/v1/transactions/send \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "type": "TRANSFER",
    "to": "<recipient-address>",
    "amount": "0.001",
    "dryRun": true
  }'

# This will fail (USDC not whitelisted)
curl -X POST http://127.0.0.1:3100/v1/transactions/send \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "type": "TOKEN_TRANSFER",
    "token": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
    "to": "<recipient-address>",
    "amount": "1.0",
    "dryRun": true
  }'
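The same two checks can be scripted instead of typed by hand. The Python sketch below builds the request that the curl commands above send, using only the standard library. `build_send_request` is a hypothetical helper, and the response format is not assumed here, so inspect what the daemon returns before asserting on it.

```python
import json
import urllib.request


def build_send_request(
    base_url: str, session_token: str, payload: dict, dry_run: bool = True
) -> urllib.request.Request:
    """Build a POST to /v1/transactions/send with session auth and a dryRun flag."""
    body = json.dumps({**payload, "dryRun": dry_run}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/transactions/send",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {session_token}",
        },
    )
```

Passing the result to `urllib.request.urlopen` sends it. Defaulting `dry_run` to `True` means a test script has to opt in explicitly before it can move funds.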

Check the test results:

# View API health and status
curl http://127.0.0.1:3100/health

# Check policy enforcement logs
docker compose logs waiaas-daemon | grep POLICY

The security layers work: without explicit token whitelisting, transfers are blocked. Your AI agent can only touch funds you've explicitly allowed.

For comprehensive security testing, explore the Admin Web UI at http://127.0.0.1:3100/admin to configure policies, monitor transactions, and test approval flows.

What's Next

The 626+ test files represent our commitment to building crypto infrastructure developers can actually trust with real funds. Default-deny policies, 3-layer security, and comprehensive testing aren't just features—they're requirements for safe AI agent wallets.

Ready to build secure wallet infrastructure for your AI agents? Explore the complete codebase on GitHub (https://github.com/minhoyoo-iotrust/WAIaaS) or try the live system at waiaas.ai.

DEV Community