Have you ever found yourself staring at a codebase that started as a clean, well-intentioned application and somehow evolved into a tangled web of interdependencies? The kind where touching one component sends ripples through six others, and deploying a new feature feels like performing surgery on a house of cards? If so, you're not alone—and there's a better way to think about application design.
I've spent considerable time exploring architectural patterns that address this complexity head-on, and what I've found is that the most resilient, maintainable applications share a common trait: they're built around capabilities rather than features. The examples here are drawn from patterns I've started using myself, built on years of wrestling with these problems. In this post, I'll walk you through the key patterns that enable this approach and explain why they matter for modern application development.
## The Core Problem: Entanglement
Let me paint a picture. Developer BW joins a team working on an e-commerce platform. She's tasked with adding order history export functionality. Simple enough, right? Except the order history component directly imports the user service, which depends on the authentication module, which has hooks into the notification system, which... you get the idea. What should be a self-contained feature becomes an exercise in understanding the entire application's dependency graph.
The traditional response has been to "be more careful" with dependencies or to introduce dependency injection frameworks. These help, but they treat symptoms rather than causes. The fundamental problem is architectural: we've allowed our components to know too much about each other.
## Thinking in Capabilities
A capability is a discrete, self-contained unit of functionality that can be mounted into an application without hard dependencies on other capabilities. Think of capabilities as building blocks with well-defined shapes—they snap into place without needing to know what other blocks exist around them.
What makes this concept particularly powerful is that capabilities are full-stack by nature. A single capability encapsulates both its user interface components and its backend processing logic. The order history capability doesn't just render a list of orders—it also handles data fetching, caching, export generation, and any server-side operations that feature requires. Some capabilities are UI-heavy, presenting rich interactive experiences. Others are entirely headless, performing background processing, scheduled tasks, or API integrations without any visual component at all. Most live somewhere in between.
This full-stack encapsulation means you're not coordinating between a "frontend order history component" and a "backend order history service" that happen to share a name. You're working with a single cohesive unit that owns its entire vertical slice of functionality. When you add the order history capability to an application, you get everything it needs to function—viewers, actions, data access, business logic—in one portable package.
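To make this concrete, here's a minimal sketch of the outward shape such a capability might have. The `Capability` and `CapabilityContext` interfaces are illustrative assumptions, not the API of any particular framework:

```typescript
// Hypothetical shape of a full-stack capability: viewers (UI) and
// actions (backend logic) travel together in one unit.
interface CapabilityContext {
  emit(event: string, payload: unknown): void;
}

interface Capability {
  id: string;
  // Rendering is stubbed as string output here; real viewers would be
  // framework components. Headless capabilities leave this empty.
  viewers: Record<string, (props: Record<string, unknown>) => string>;
  actions: Record<string, (params: any, context: CapabilityContext) => Promise<unknown>>;
}

const orderHistory: Capability = {
  id: 'order-history',
  viewers: {
    compact: (props) => `<OrderList pageSize=${props.pageSize} />`,
    full: (props) => `<OrderTable showFilters=${props.showFilters} />`,
  },
  actions: {
    exportOrders: async (params, context) => {
      context.emit('orders-exported', { format: params.format });
      return { status: 'ok', format: params.format };
    },
  },
};
```

Because everything the feature needs travels together, mounting `orderHistory` into a new application brings its UI and its logic as a single unit.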
The theory is this: if each capability is truly isolated, you can add, remove, or modify one without affecting the others. You can test them independently. You can even share them across entirely different applications.
But isolation alone isn't enough. Capabilities still need to cooperate. The key insight is that cooperation doesn't require coupling—it requires communication. And that's where event-driven patterns come in.
## Event-Driven Communication: The Great Decoupler
Rather than calling methods on other capabilities directly, isolated capabilities communicate through events. If capability A needs to inform capability B that something happened, A emits an event. B subscribes to that event if it cares about it. Neither knows the other exists.
Take our e-commerce scenario. When an order is placed, the order capability emits an order-placed event with the relevant data. The notification capability, which has subscribed to this event, picks it up and sends the confirmation email. The analytics capability logs the conversion. The inventory capability updates stock levels. Each handles its concern independently.
```typescript
// Order capability emits an event
context.emit('order-placed', {
  orderId: order.id,
  customerId: order.customerId,
  total: order.total,
  timestamp: Date.now(),
});

// Notification capability subscribes (in a completely separate module)
eventBus.subscribe('order-placed', async (event) => {
  await sendOrderConfirmation(event.payload.orderId, event.payload.customerId);
});
```
The beauty here is that the order capability doesn't import, reference, or even know about the notification capability. If you remove notifications entirely, the order capability continues functioning. This is graceful degradation in action—the application becomes resilient to component failures or removals.
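The bus itself doesn't need to be exotic. A minimal in-memory sketch (illustrative; a production bus would add async delivery and error isolation) shows exactly why removal is safe: emitting to zero subscribers is a no-op.

```typescript
// Minimal in-memory event bus sketch. Names and API are illustrative.
type Handler = (event: { type: string; payload: any }) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(type: string, handler: Handler): () => void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
    // Return an unsubscribe function so capabilities can clean up on unmount
    return () => {
      const current = this.handlers.get(type) ?? [];
      this.handlers.set(type, current.filter((h) => h !== handler));
    };
  }

  emit(type: string, payload: any): void {
    // No subscribers is fine: emitters never depend on listeners existing
    for (const handler of this.handlers.get(type) ?? []) {
      handler({ type, payload });
    }
  }
}

const bus = new EventBus();
const seen: string[] = [];
bus.subscribe('order-placed', (e) => seen.push(e.payload.orderId));
bus.emit('order-placed', { orderId: 'ord-1' });
// Emitting an event nobody listens to does not throw
bus.emit('order-cancelled', { orderId: 'ord-2' });
```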
## Separating Presentation from Operation: The Viewer/Action Pattern
Here's a pattern I've come to appreciate deeply: the explicit separation of viewers (UI components) and actions (operations). At first glance, this might seem like a rehash of MVC or its descendants. But there's a crucial difference—actions are defined as first-class, schema-validated operations that can be invoked by any interface, not just your UI.
Consider a media player capability. It defines actions like play, pause, seek, and setVolume. The viewer components—PlayButton, PauseButton, VolumeSlider—invoke these actions, but they're just one interface to the underlying operations.
```typescript
const mediaActions = defineActions({
  play: {
    id: 'play',
    name: 'Play',
    description: 'Start or resume media playback',
    params: {}, // No parameters needed
    execute: async (_, context) => {
      await context.state.mediaElement.play();
      context.emit('playback-started', {
        mediaId: context.state.currentMedia.id,
      });
      return { status: 'playing' };
    },
  },
  seek: {
    id: 'seek',
    name: 'Seek',
    description: 'Jump to a specific position in the media',
    params: {
      type: 'object',
      properties: {
        position: {
          type: 'number',
          description: 'Position in seconds',
        },
      },
      required: ['position'],
    },
    execute: async ({ position }, context) => {
      context.state.mediaElement.currentTime = position;
      return {
        status: 'seeked',
        newPosition: position,
      };
    },
  },
});
```
This is why I believe we're finally ready for AI-integrated applications. Those same action definitions—complete with descriptions, parameter schemas, and examples—can be automatically transformed into AI tools. An LLM assistant can invoke seek(position: 45) exactly as a user would click on the seek bar. The action definition is the single source of truth for both human and machine interfaces.
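As a sketch of that transformation, here's how an action definition might be mapped onto the kind of function-calling tool schema most LLM APIs accept. The `Action` shape mirrors the seek example above; `toAiTool` and the exact output format are illustrative assumptions:

```typescript
// Sketch: deriving an AI tool definition from an action definition.
// The output loosely follows common LLM function-calling schemas.
interface Action {
  id: string;
  description: string;
  params: object; // JSON Schema for parameters
  execute: (params: any, context: any) => Promise<unknown>;
}

function toAiTool(action: Action) {
  return {
    type: 'function',
    function: {
      name: action.id,
      description: action.description,
      parameters: action.params, // JSON Schema passes through unchanged
    },
  };
}

const seek: Action = {
  id: 'seek',
  description: 'Jump to a specific position in the media',
  params: {
    type: 'object',
    properties: { position: { type: 'number', description: 'Position in seconds' } },
    required: ['position'],
  },
  execute: async ({ position }) => ({ status: 'seeked', newPosition: position }),
};

const tool = toAiTool(seek);
```

Nothing about the action changes to support this; the schema that validates a UI-triggered call is the same schema the model reads to construct one.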
This pattern also clarifies how capabilities can be both UI-driven and headless. A capability with viewers and actions presents interactive experiences—the user clicks, the action executes. But a headless capability simply defines actions without viewers. Consider a data synchronization capability that runs background jobs, or an integration capability that processes webhooks from external services. These capabilities have actions (sync, process, transform) but no UI components. They're still first-class capabilities, mountable and configurable through the same mechanisms, just without a visual presence. The viewer/action pattern accommodates both modes naturally—viewers are optional, actions are the core.
## Mounting Points: Surfaces and Experiences
Now let's talk about where capabilities actually appear in your application. This is where the concepts of surfaces and experiences become useful.
A surface represents a major area or context in your application—think of it as a distinct "place" where users accomplish goals. An e-commerce platform might have surfaces like "cart," "account," "admin dashboard," and "product catalog." Each surface has its own layout, context providers, and behavioral characteristics.
An experience, on the other hand, represents a user journey or workflow within a surface. Within the cart surface, you might have a "checkout" experience or a "quick-buy" experience. Experiences shape how capabilities are presented and orchestrated for specific user flows.
Capabilities are then mounted at the intersection of surfaces and experiences. Order history might be mounted at account:overview with a compact view, while the same capability appears at account:orders with full filtering and pagination. Same capability, different presentations based on context.
```typescript
// Same capability, context-aware presentation
{
  viewers: {
    'account:orders': {
      viewer: 'full',
      props: {
        pageSize: 20,
        showFilters: true,
      },
    },
    'account:overview': {
      viewer: 'compact',
      props: {
        pageSize: 5,
        showFilters: false,
      },
    },
  },
}
```
This mounting system provides remarkable flexibility. You define capabilities once and configure how they appear across your application declaratively. Need to add order history to a new mobile surface? Add a mounting configuration—you don't touch the capability code.
## Configuration Hierarchy: Flexibility Without Chaos
Speaking of configuration, here's a pattern that solves a surprisingly common tension: how do you provide sensible defaults while allowing runtime customization without losing control?
The answer is a configuration hierarchy that flows through multiple layers:
- Code defaults (checked into source control, deployed with the application)
- Persisted configuration (modified by admins at runtime, stored in your persistence layer)
- Runtime evaluation (feature flags, A/B tests, conditional rules)
Each layer can override the previous, but you always know where your configuration originates. If an admin changes a capability setting, that's recorded in the persistence layer. If a feature flag activates a new behavior, that's evaluated at runtime. But the foundational defaults remain stable in code.
```
Code Defaults        →   Persisted Config      →   Runtime Evaluation
(app.config.ts)          (Admin UI changes)        (Feature flags, ABAC)
       ↓                        ↓                          ↓
Version controlled       Scoped overrides          Dynamic evaluation
Deployed with app        Global/tenant/user        Context-aware
```
This hierarchy enables scenarios like blue/green deployments of configurations. Define a "blue" variant with your stable settings and a "green" variant with your next release. Route traffic between them without redeploying code.
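A sketch of how such a resolver might work, with illustrative layer names: later layers override earlier ones, and each resolved value remembers its provenance.

```typescript
// Layered configuration resolution (illustrative API, not a real library).
type Layer = 'code' | 'persisted' | 'runtime';

interface ConfigSource {
  layer: Layer;
  values: Record<string, unknown>;
}

function resolveConfig(sources: ConfigSource[]) {
  const resolved: Record<string, { value: unknown; source: Layer }> = {};
  for (const { layer, values } of sources) {
    for (const [key, value] of Object.entries(values)) {
      resolved[key] = { value, source: layer }; // later layers win
    }
  }
  return resolved;
}

const config = resolveConfig([
  { layer: 'code', values: { pageSize: 20, showFilters: true } },
  { layer: 'persisted', values: { pageSize: 50 } },      // admin override
  { layer: 'runtime', values: { showFilters: false } },  // feature flag
]);
// config.pageSize → { value: 50, source: 'persisted' }
```

Recording the source layer alongside each value is what makes the hierarchy auditable: when a setting surprises you, you can see exactly which layer produced it.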
## Pluggable Middleware: Protocol Agnosticism
Here's where things get particularly interesting for modern multi-channel applications. When you define actions with clear schemas and execution logic, you can expose them through multiple protocols simultaneously.
The pattern I recommend is a middleware layer that wraps your core service actions with cross-cutting concerns like authentication, authorization, validation, and auditing. This middleware then plugs into different adapters for different protocols:
- An HTTP adapter exposes actions as REST endpoints
- A CLI adapter makes them available from the command line
- An AI adapter transforms them into tools for language models
```
┌────────────────────────────────────────────────────────────────────┐
│                          Service Registry                          │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │
│  │ configActions    │  │ orderActions     │  │ mediaActions     │  │
│  │  ├─ get          │  │  ├─ getOrders    │  │  ├─ play         │  │
│  │  ├─ set          │  │  ├─ cancelOrder  │  │  ├─ pause        │  │
│  │  └─ delete       │  │  └─ exportOrders │  │  └─ seek         │  │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘  │
└────────────────────────────────────────────────────────────────────┘
                                  │
                  ┌───────────────┼───────────────┐
                  ▼               ▼               ▼
            ┌───────────┐   ┌───────────┐   ┌───────────┐
            │   HTTP    │   │    CLI    │   │    AI     │
            │  Adapter  │   │  Adapter  │   │  Adapter  │
            └───────────┘   └───────────┘   └───────────┘
```
The same exportOrders action becomes POST /api/orders/export, myapp orders export --format csv, and the AI tool export_orders(format: "csv"). One implementation, three interfaces, zero duplication.
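As a sketch, an HTTP adapter can be little more than a mapping from registered actions to routes. The registry and route shapes here are assumptions; a real adapter would plug into your web framework and run the middleware chain (auth, validation, auditing) before executing:

```typescript
// Illustrative HTTP adapter: each action becomes a POST route.
interface RegisteredAction {
  id: string;
  execute: (params: any) => Promise<unknown>;
}

function httpAdapter(service: string, actions: RegisteredAction[]) {
  return actions.map((action) => ({
    method: 'POST' as const,
    path: `/api/${service}/${action.id}`,
    // The framework would parse the request body and pass it here
    handler: async (body: any) => action.execute(body),
  }));
}

const routes = httpAdapter('orders', [
  { id: 'exportOrders', execute: async ({ format }) => ({ exported: true, format }) },
]);
// routes[0].path === '/api/orders/exportOrders'
```

A CLI or AI adapter follows the same pattern against the same registry, which is what keeps the three interfaces from drifting apart.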
## Access Control: Beyond Simple Roles
While we're discussing middleware, let's address authorization. Traditional role-based access control (RBAC) assigns permissions to roles, and users inherit permissions through their roles. It works, but it's often too coarse for modern applications.
The pattern I advocate combines hierarchical scopes with attribute-based conditions. Scopes establish the basic hierarchy: system-admin inherits from account-admin, which inherits from tenant-admin, and so on. But individual capabilities, features, or actions can require additional conditions based on attributes.
```typescript
{
  access: {
    scopes: ['tenant-admin', 'account-admin', 'system-admin'],
    conditions: [
      { attribute: 'tenant.plan', operator: 'in', value: ['business', 'enterprise'] },
      { attribute: 'user.mfaEnabled', operator: 'equals', value: true },
    ],
  },
}
```
This configuration means: you need at least tenant-admin scope, AND your tenant must be on a business or enterprise plan, AND you must have MFA enabled. It's expressive enough for complex enterprise requirements while remaining declarative and auditable.
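Evaluating those conditions is straightforward. Here's a sketch of a condition checker; the operators and the dotted attribute-path lookup are illustrative:

```typescript
// Attribute-based condition evaluation (illustrative operators).
interface Condition {
  attribute: string;
  operator: 'equals' | 'in';
  value: unknown;
}

// Resolve a dotted path like 'tenant.plan' against a context object
function getAttribute(context: Record<string, any>, path: string): unknown {
  return path.split('.').reduce((obj, key) => obj?.[key], context);
}

function checkConditions(conditions: Condition[], context: Record<string, any>): boolean {
  return conditions.every((c) => {
    const actual = getAttribute(context, c.attribute);
    if (c.operator === 'equals') return actual === c.value;
    return Array.isArray(c.value) && c.value.includes(actual);
  });
}

const allowed = checkConditions(
  [
    { attribute: 'tenant.plan', operator: 'in', value: ['business', 'enterprise'] },
    { attribute: 'user.mfaEnabled', operator: 'equals', value: true },
  ],
  { tenant: { plan: 'business' }, user: { mfaEnabled: true } },
);
```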
## Capability Sets: Organized Complexity
As applications grow, individual capabilities benefit from being grouped into capability sets. A set is a collection of related capabilities that share a controller and state, and can communicate through private internal events.
Imagine an order management set containing order-history, order-tracking, order-returns, and order-notifications capabilities. Each capability maintains its isolation—they don't import each other. But the set controller can maintain shared state (like cached order data) and facilitate internal communication that wouldn't make sense to expose publicly.
The set also becomes a convenient unit for:
- Enabling/disabling related capabilities together
- Applying consistent access rules
- Versioning and distributing as a package
- Configuring AI tool availability
```
@myorg/order-management/
├── controller/          # Shared state and lifecycle
├── actions/             # Set-level actions
├── events/
│   ├── internal.ts      # Private events within the set
│   └── public.ts        # Events other sets can subscribe to
└── capabilities/
    ├── order-history/
    ├── order-tracking/
    ├── order-returns/
    └── order-notifications/
```
Capabilities within the set can emit private events that only siblings can subscribe to. This prevents other parts of the application from depending on implementation details while still enabling internal coordination.
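One way to sketch that boundary is a per-set event scope that delivers `internal:`-prefixed events only to siblings and namespaces everything else onto the public bus. The prefix convention and class names here are illustrative, not a prescribed API:

```typescript
// Illustrative set-level event scope: private delivery for siblings,
// namespaced forwarding for everything else.
class SetEventScope {
  private internal = new Map<string, ((payload: any) => void)[]>();

  constructor(
    private setId: string,
    private publicEmit: (type: string, payload: any) => void,
  ) {}

  emit(type: string, payload: any): void {
    if (type.startsWith('internal:')) {
      // Delivered only to sibling capabilities within this set
      for (const handler of this.internal.get(type) ?? []) handler(payload);
    } else {
      // Forwarded to the application-wide bus, namespaced by set id
      this.publicEmit(`${this.setId}:${type}`, payload);
    }
  }

  subscribeInternal(type: string, handler: (payload: any) => void): void {
    const list = this.internal.get(type) ?? [];
    list.push(handler);
    this.internal.set(type, list);
  }
}

const publicEvents: string[] = [];
const scope = new SetEventScope('order-management', (type) => { publicEvents.push(type); });
const internalEvents: string[] = [];
scope.subscribeInternal('internal:cache-refreshed', () => internalEvents.push('hit'));
scope.emit('internal:cache-refreshed', {});            // stays inside the set
scope.emit('order-exported', { orderId: 'ord-1' });    // visible to other sets
```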
## AI-Readiness: The Dual Interface Pattern
I've touched on AI integration several times, and for good reason—it's becoming table stakes for modern applications. The architectural decisions you make today determine how naturally AI assistants can interact with your system tomorrow.
The dual interface pattern means every action you define can be invoked by both humans (through viewers) and AI (through tool definitions). But there's nuance here. Not every action should be available to AI. Bulk operations, destructive actions, or those requiring human judgment might need to be gated:
```typescript
{
  features: {
    'export-history': {
      ai: {
        enabled: true, // Safe for AI to trigger
        readOnly: true,
      },
    },
    'bulk-delete': {
      ai: {
        enabled: false, // Requires human oversight
      },
    },
    'cancel-order': {
      ai: {
        enabled: true,
        requiresConfirmation: true, // AI must get user approval
      },
    },
  },
}
```
This granular control means you can embrace AI integration without surrendering safety. The AI assistant can query orders, export reports, and track shipments. But canceling orders requires confirmation, and bulk deletions are human-only operations.
## The Persistence Question
All of this architectural elegance needs somewhere to live beyond memory. The pattern here is a pluggable persistence provider interface that different storage backends can implement:
- Local file storage for development
- Blob/table storage for serverless deployments
- SQL databases for enterprise requirements
The persistence layer handles configuration storage, state synchronization, and audit logging. But because it's behind an interface, you can swap implementations without changing application code. Start with file-based storage in development, deploy to Azure Table Storage in production, migrate to SQL when requirements change—all without touching your capabilities.
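A sketch of what that provider interface might look like, alongside an in-memory implementation for development. The method names are assumptions, not a standard API:

```typescript
// Pluggable persistence provider interface (illustrative).
interface PersistenceProvider {
  get(key: string): Promise<unknown | undefined>;
  set(key: string, value: unknown): Promise<void>;
  delete(key: string): Promise<void>;
}

// Development implementation: a Map behind the same async contract,
// so swapping in table storage or SQL later changes nothing upstream.
class InMemoryProvider implements PersistenceProvider {
  private store = new Map<string, unknown>();
  async get(key: string) { return this.store.get(key); }
  async set(key: string, value: unknown) { this.store.set(key, value); }
  async delete(key: string) { this.store.delete(key); }
}

// Application code depends only on the interface
async function saveCapabilityConfig(
  provider: PersistenceProvider,
  capabilityId: string,
  config: object,
): Promise<void> {
  await provider.set(`config:${capabilityId}`, config);
}
```

Keeping every implementation behind the same async contract is the whole trick: even the file-based development provider returns promises, so production backends slot in without signature changes.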
## AI-Assisted Development: When Your Architecture Speaks to Your Tools
Here's something I find genuinely exciting: the same structural clarity that enables runtime AI integration also dramatically amplifies AI-assisted development. Think about it—we've defined capabilities with explicit manifests, actions with typed schemas, events with documented contracts, and mounting configurations that declare where everything lives. This isn't just good for humans reading the code. It's exactly the kind of structured, self-describing codebase that AI coding tools thrive on.
Consider what happens when you ask an AI coding assistant to "add a refund action to the order management capability" in a traditional codebase. The AI needs to understand your project structure, infer where actions live, guess at your patterns for defining operations, figure out how you handle validation, and hope it doesn't break something in the process. The context window fills with exploratory reads, and the result is often a best-effort approximation that requires significant human correction.
Now consider the same request in a capability-based architecture. The AI can immediately locate the capability manifest, which explicitly declares the capability's structure:
```typescript
// The manifest tells the AI exactly what this capability contains
export default defineCapability({
  id: '@myorg/order-management/order-returns',
  version: '2.1.0',
  actions: './actions', // Actions live here
  viewers: './viewers', // UI components here
  events: {
    emits: ['return-initiated', 'refund-processed'],
    subscribes: ['order-updated'],
  },
  features: {
    'bulk-returns': { /* ... */ },
    'refund-processing': { /* ... */ },
  },
});
```
The AI doesn't guess—it reads. It knows actions are in ./actions, follows the established pattern from existing actions, generates a new action with the correct schema structure, and updates the manifest. The structural conventions eliminate ambiguity.
But the benefits go deeper than navigation. Because actions have explicit parameter and result schemas, the AI can generate type-safe implementations with confidence. It knows what the action receives, what it should return, and can even infer appropriate error handling from existing patterns:
```typescript
// AI generates this by following established patterns
refundOrder: {
  id: 'refundOrder',
  name: 'Refund Order',
  description: 'Process a full or partial refund for an order',
  params: z.object({
    orderId: z.string().describe('The order to refund'),
    amount: z.number().optional().describe('Partial refund amount; omit for full refund'),
    reason: z.enum(['customer-request', 'damaged', 'wrong-item', 'other']),
  }),
  result: z.object({
    refundId: z.string(),
    amount: z.number(),
    status: z.enum(['pending', 'processed', 'failed']),
  }),
  access: {
    scopes: ['tenant-admin', 'account-admin'], // Inferred from sibling actions
  },
  ai: {
    enabled: true,
    requiresConfirmation: true, // AI recognizes this is a sensitive operation
    description: 'Processes a refund for a customer order. Use when customer requests money back.',
  },
  execute: async (params, context) => {
    // Implementation follows patterns from existing actions
  },
}
```
Notice how the AI can infer access control patterns from sibling actions, recognize that refunds are sensitive operations requiring confirmation, and even generate AI-specific metadata. The architecture teaches the AI how to write code that belongs.
The isolation principle pays dividends here too. When an AI modifies a capability, the blast radius is inherently contained. The AI can refactor the entire internal implementation of order-returns without risking breakage in order-tracking or order-notifications. There are no hidden dependencies to accidentally sever, no implicit contracts to violate. The capability boundary is a safety boundary.
Event contracts provide another form of guidance. When the AI sees that a capability subscribes to order-updated events, it understands the integration point without needing to trace import statements through the codebase:
```typescript
// AI understands the integration contract explicitly
externalHandlers: {
  'order-updated': async (event, context) => {
    // AI can generate handlers knowing exactly what payload to expect
    const { orderId, status, timestamp } = event.payload;
    // ...
  },
}
```
For scaffolding entirely new capabilities, the structured approach is transformative. A prompt like "create a customer loyalty capability with points tracking and reward redemption" gives the AI enough to generate a complete, well-structured capability skeleton: manifest, action definitions with schemas, event declarations, viewer stubs, and even test files following your project's patterns. What might take a developer an hour of boilerplate becomes a starting point generated in seconds.
Perhaps most importantly, this architecture makes AI-generated code reviewable. When every action has a declared schema, when access controls are explicit in configuration, when events are typed and documented—the human reviewer can quickly verify correctness. The AI's output isn't a black box of interconnected mutations; it's a discrete unit with clear contracts that can be evaluated in isolation.
I've started thinking of well-structured capability architectures as a form of "AI-legible" code. Just as we write code for humans to read (with the computer as a secondary audience), we're now writing code for AI to read, understand, and extend. The patterns that make code maintainable for humans—explicit contracts, isolation, self-documenting structure—turn out to be exactly what makes code extensible by AI.
This creates a virtuous cycle. AI helps you build capabilities faster. Those capabilities follow patterns that make future AI assistance more effective. Your codebase becomes increasingly amenable to AI collaboration over time, rather than accumulating the kind of implicit complexity that confounds both humans and machines.
## Practical Implications and Where This Leads
If you implement these patterns, you'll notice several things. First, your velocity on new capabilities increases. New capabilities don't fight with existing code—they snap into place. Second, your test coverage improves naturally. Isolated capabilities with clear action contracts are straightforward to test. Third, your application becomes genuinely multi-modal. The same codebase serves web users, CLI power users, and AI assistants.
Looking forward, these patterns position you well for scenarios that are rapidly emerging. Imagine AI agents that don't just respond to queries but proactively take actions on users' behalf. With explicit action definitions, proper access controls, and confirmation requirements, you've already built the foundation. Or consider the rise of ambient computing, where your application exists across watches, glasses, voice assistants, and traditional screens. Surface and experience abstractions let you mount capabilities appropriately for each context.
It turns out that the principles underlying resilient, maintainable applications—isolation, explicit contracts, event-driven communication, declarative configuration—are exactly what you need for AI-native, multi-modal, highly adaptable systems. It's not about predicting the future so much as building in a way that doesn't foreclose it.
If you've been grappling with application complexity, I'd encourage you to start small. Take one feature and refactor it into an isolated capability with explicit actions. Add event-based communication with one neighbor. Observe how the dynamics change. Then gradually extend the pattern outward. The architecture I've described isn't an all-or-nothing proposition—it's a direction of travel that pays dividends incrementally.
And the next time you're asked to add order history export functionality? It'll be a capability you define once and mount wherever it's needed. No surgery on houses of cards required.