Wallet Guy

Posted on Apr 21

631+ Tests for AI Agent Wallets: How We Validate Every Transaction Before It Hits Mainnet

#ai #testing #security #tutorial

Testing AI agent wallets is like debugging a program that can spend real money—you need bulletproof validation before any code touches mainnet. WAIaaS runs 631+ tests across its monorepo to ensure every transaction, policy check, and security layer works exactly as intended when your AI agents start managing actual funds.

Why Testing Matters for Financial AI Agents

When your trading bot executes a $10,000 swap or your DeFi agent stakes ETH, there's no "undo" button. A single bug in transaction validation could drain wallets. A faulty policy engine might let agents bypass spending limits. Broken auth could expose private keys.

Traditional software testing catches crashes and logic errors. Crypto wallet testing catches financial disasters.

WAIaaS Testing Architecture

The WAIaaS codebase includes 631+ test files spread across its 15-package monorepo. Every component that touches money, keys, or transactions has dedicated test coverage.

Core Components Under Test

The test suite validates:

7-stage transaction pipeline: each stage (including validate, auth, policy, wait, execute, and confirm) has isolated tests
21 policy types across 4 security tiers: INSTANT, NOTIFY, DELAY, and APPROVAL validation
3 authentication methods: masterAuth (Argon2id), ownerAuth (SIWS/SIWE), sessionAuth (JWT HS256)
15 DeFi protocol integrations: Jupiter, Lido, Hyperliquid, Polymarket, and others
Cross-chain functionality: 18 networks across Solana and EVM chains
39 REST API routes: every endpoint that agents call
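
To make "isolated tests per stage" concrete, here is a minimal sketch of a staged pipeline in Python. It is not WAIaaS code: `run_pipeline`, the stage signature, and the two toy stages are hypothetical, and only the stage names `validate` and `policy` come from the list above.

```python
from typing import Callable, Optional

# Assumption: a stage takes a transaction dict and returns an error string, or None on success.
Stage = Callable[[dict], Optional[str]]


def run_pipeline(tx: dict, stages: list[tuple[str, Stage]]) -> tuple[bool, list[str]]:
    """Run stages in order and stop at the first one that reports an error."""
    completed: list[str] = []
    for name, stage in stages:
        error = stage(tx)
        if error is not None:
            return False, completed + [f"{name}: {error}"]
        completed.append(name)
    return True, completed


def validate(tx: dict) -> Optional[str]:
    # Toy check: both fields must be present and non-empty.
    return None if tx.get("to") and tx.get("amount") else "missing field"


def policy(tx: dict) -> Optional[str]:
    # Toy check: a hard-coded limit stands in for a real policy engine.
    return None if float(tx["amount"]) <= 10 else "over limit"
```

Because each stage is a plain function, it can be tested on its own with a hand-built transaction, and the pipeline can be tested separately with stub stages.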

Transaction Validation Testing

Before any transaction reaches the blockchain, it passes through comprehensive validation:

# Test transaction building and validation
curl -X POST http://127.0.0.1:3100/v1/transactions/send \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "type": "TRANSFER",
    "to": "recipient-address",
    "amount": "0.1",
    "dryRun": true
  }'

The dry-run API lets you test transaction logic without spending gas. Tests validate:

Address format checking
Balance sufficiency
Gas estimation accuracy
Policy compliance simulation
Multi-signature coordination
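
The first two checks can be pictured as a small pre-flight validator. This is a sketch under stated assumptions, not the WAIaaS implementation: the function name `preflight_check`, the EVM-only address rule, and the error strings are all made up for illustration.

```python
import re
from decimal import Decimal

# Assumption: EVM-style addresses only (0x followed by 40 hex characters).
# Solana addresses are base58 and would need a different rule.
EVM_ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")


def preflight_check(to: str, amount: str, balance: str, est_fee: str) -> list[str]:
    """Return a list of problems; an empty list means the dry run may proceed."""
    problems = []
    if not EVM_ADDRESS.fullmatch(to):
        problems.append("invalid address format")
    # Decimal avoids float rounding on amounts that arrive as strings.
    value, fee = Decimal(amount), Decimal(est_fee)
    if value <= 0:
        problems.append("amount must be positive")
    if value + fee > Decimal(balance):
        problems.append("insufficient balance for amount plus fee")
    return problems
```

Returning every problem at once, rather than raising on the first, makes it easier for a test to assert on exactly which checks failed.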

Policy Engine Testing

The policy engine enforces 21 different policy types with default-deny security. Tests ensure policies actually block unauthorized transactions:

# Test spending limit enforcement
curl -X POST http://127.0.0.1:3100/v1/policies \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{
    "walletId": "<wallet-uuid>",
    "type": "SPENDING_LIMIT",
    "rules": {
      "instant_max_usd": 10,
      "notify_max_usd": 100,
      "delay_max_usd": 1000,
      "delay_seconds": 300,
      "daily_limit_usd": 500
    }
  }'

Policy tests verify:

Default-deny behavior when ALLOWED_TOKENS isn't configured
Tier assignment logic (INSTANT → NOTIFY → DELAY → APPROVAL)
Rate limiting across time windows
Contract whitelist enforcement
Cross-chain policy inheritance
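
The tier assignment logic can be sketched in a few lines of Python. The rule keys match the SPENDING_LIMIT request above, but the comparison semantics (inclusive upper bounds, APPROVAL as the fallback) are an assumption, and `assign_tier` is a hypothetical name rather than a WAIaaS function.

```python
def assign_tier(rules: dict, amount_usd: float) -> str:
    """Map a USD amount to a security tier using SPENDING_LIMIT-style thresholds."""
    # Assumption: each threshold is an inclusive upper bound for its tier.
    if amount_usd <= rules["instant_max_usd"]:
        return "INSTANT"
    if amount_usd <= rules["notify_max_usd"]:
        return "NOTIFY"
    if amount_usd <= rules["delay_max_usd"]:
        return "DELAY"
    # Anything above the largest threshold needs explicit approval.
    return "APPROVAL"


rules = {"instant_max_usd": 10, "notify_max_usd": 100, "delay_max_usd": 1000}
for amount in (5, 50, 500, 5000):
    print(amount, assign_tier(rules, amount))
```

Boundary values (exactly 10, 100, or 1000 USD) are the cases most worth covering in tests, since an off-by-one in the comparison silently moves a transaction into a weaker tier.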

DeFi Integration Testing

Each of the 15 DeFi protocol providers has dedicated test coverage for complex multi-step operations:

# Test Jupiter swap simulation
curl -X POST http://127.0.0.1:3100/v1/actions/jupiter-swap/swap \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wai_sess_<token>" \
  -d '{
    "inputMint": "So11111111111111111111111111111111111111112",
    "outputMint": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v",
    "amount": "1000000000",
    "dryRun": true
  }'

DeFi tests cover:

Price impact calculations
Slippage protection
Multi-hop routing validation
Liquidity availability checks
Protocol-specific error handling
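
Slippage protection and price impact both come down to simple arithmetic. The sketch below uses the standard formulas; the function names and the basis-point parameter are illustrative and are not taken from the WAIaaS or Jupiter APIs.

```python
def min_amount_out(quoted_out: int, slippage_bps: int) -> int:
    """Smallest acceptable output, in base units, for a slippage tolerance in basis points."""
    if not 0 <= slippage_bps <= 10_000:
        raise ValueError("slippage_bps must be between 0 and 10000")
    # Integer math with floor division keeps the result in whole base units.
    return quoted_out * (10_000 - slippage_bps) // 10_000


def price_impact_pct(mid_price: float, execution_price: float) -> float:
    """Percent by which the execution price (output per unit of input) falls below the mid price."""
    return (mid_price - execution_price) / mid_price * 100
```

For example, a quote of 1,000,000 base units with a 50 bps tolerance gives a floor of 995,000 base units. A swap that would return less than that should be rejected.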

Authentication Security Testing

The three-layer auth system undergoes rigorous testing:

# Test session creation and validation
curl -X POST http://127.0.0.1:3100/v1/sessions \
  -H "Content-Type: application/json" \
  -H "X-Master-Password: my-secret-password" \
  -d '{"walletId": "<wallet-uuid>"}'

Auth tests validate:

Argon2id password hashing resistance
JWT token expiration and renewal
SIWS/SIWE signature verification
Session isolation between wallets
Token revocation mechanisms
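
To show what expiration and session-isolation tests exercise, here is a minimal HS256 sign-and-verify sketch using only the Python standard library. It is not the WAIaaS auth code: the `wallet_id` claim name and both function names are assumptions, and a production system would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def verify_hs256(token: str, secret: bytes, wallet_id: str, now: Optional[float] = None) -> bool:
    """Check the signature, the expiry, and that the token is bound to the expected wallet."""
    try:
        header, payload, sig = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(sig)):
            return False
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return False
    now = time.time() if now is None else now
    # "wallet_id" is a hypothetical claim name used to model session isolation.
    return claims.get("exp", 0) > now and claims.get("wallet_id") == wallet_id
```

Passing `now` explicitly lets a test cover expiry without sleeping or patching the clock.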

Running WAIaaS Tests

Tests run automatically in CI/CD, but you can execute them locally:

# Clone and setup
git clone https://github.com/minhoyoo-iotrust/WAIaaS.git
cd WAIaaS
pnpm install

# Run all tests
pnpm test

# Run specific package tests
pnpm --filter @waiaas/daemon test
pnpm --filter @waiaas/core test

# Integration tests with Docker
docker compose -f docker-compose.test.yml up --abort-on-container-exit

Docker Test Environment

WAIaaS includes Docker-based integration testing that spins up the full stack:

# docker-compose.test.yml
services:
  daemon:
    image: ghcr.io/minhoyoo-iotrust/waiaas:latest
    environment:
      - NODE_ENV=test
      - WAIAAS_AUTO_PROVISION=true
    volumes:
      - test-data:/data

  e2e-tests:
    build: packages/e2e-tests
    depends_on:
      - daemon
    environment:
      - WAIAAS_BASE_URL=http://daemon:3100

# Named volume referenced by the daemon service
volumes:
  test-data:

This catches integration issues that unit tests miss—like network timeouts, database locks, or race conditions in the 7-stage transaction pipeline.

Security-First Test Cases

Beyond functional testing, WAIaaS includes security-focused test scenarios:

Wallet Isolation Testing

Cross-wallet transaction attempts (should fail)
Session token reuse across wallets (should fail)
Policy inheritance between wallets (should be isolated)

Attack Vector Testing

Replay attack prevention
Rate limit bypass attempts
Authorization header manipulation
SQL injection in wallet names/metadata
XSS in admin UI components
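
As an illustration of the SQL injection case, the sketch below shows the kind of assertion such a test makes: hostile wallet names are treated as data, never as SQL. It assumes a SQLite-backed store and a hypothetical `wallets` table. The real WAIaaS schema and storage layer may differ.

```python
import sqlite3


def find_wallet_ids(conn: sqlite3.Connection, name: str) -> list[str]:
    # The "?" placeholder sends the name as a bound parameter,
    # so quotes inside it cannot change the structure of the query.
    rows = conn.execute("SELECT id FROM wallets WHERE name = ?", (name,)).fetchall()
    return [row[0] for row in rows]


# Hypothetical in-memory fixture for the test.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wallets (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO wallets VALUES (?, ?)",
    [("w1", "trading-bot"), ("w2", "staking-agent")],
)
```

A classic payload such as `x' OR '1'='1` should match no rows here, whereas a query built by string concatenation would return every wallet.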

Fund Safety Testing

Insufficient balance handling
Gas estimation edge cases
Network failure recovery
Partial transaction completion
Emergency stop functionality

Quick Start: Testing Your Integration

Want to validate WAIaaS before trusting it with real funds? Follow these steps:

Deploy with auto-provision to avoid manual setup:

   docker run -d \
     --name waiaas-test \
     -p 127.0.0.1:3100:3100 \
     -v waiaas-test-data:/data \
     -e WAIAAS_AUTO_PROVISION=true \
     ghcr.io/minhoyoo-iotrust/waiaas:latest

Create test wallet and session:

   # Get auto-generated password
   docker exec waiaas-test cat /data/recovery.key

   # Create wallet
   waiaas quickset --mode testnet

Test transaction simulation:

   # All transactions with dryRun: true are simulated only
   curl -X POST http://127.0.0.1:3100/v1/transactions/send \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $WAIAAS_SESSION_TOKEN" \
     -d '{"type": "TRANSFER", "to": "test-address", "amount": "0.001", "dryRun": true}'

Validate policy enforcement:

   # Create restrictive spending limit
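   # The password is read from inside the container; $WALLET_ID is spliced in
   # by closing and reopening the single quotes so the shell expands it
   curl -X POST http://127.0.0.1:3100/v1/policies \
     -H "Content-Type: application/json" \
     -H "X-Master-Password: $(docker exec waiaas-test cat /data/recovery.key)" \
     -d '{"walletId": "'"$WALLET_ID"'", "type": "SPENDING_LIMIT", "rules": {"instant_max_usd": 1}}'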

   # Try transaction above limit (should be denied or queued)

Test the full 45-tool MCP integration with Claude Desktop or other MCP clients to validate the AI agent experience end-to-end.

What's Next

The test suite is there to give you confidence that WAIaaS handles your funds securely, but running your own validation is always recommended. Check out the full codebase on GitHub (https://github.com/minhoyoo-iotrust/WAIaaS) and explore the interactive API documentation at waiaas.ai to see exactly what each endpoint does before your agents start using it.

DEV Community