System Overview
Kinetiq is a global education platform with real-time collaboration and multi-language support. Current scale:
- 500+ tutors, 30 countries
- 3,500+ completed sessions
- 150+ languages supported
- 99.9% uptime in production
This post covers the technical infrastructure and key architectural decisions.
Architecture Layers
┌─────────────────────────────────────────────────────┐
│ Frontend: Next.js 15.5.4 + React 19.1.0             │
│ - Server Components for data-heavy pages            │
│ - React Query 5.90.2 for server state               │
│ - Zustand 5.0.8 for client state                    │
│ - TypeScript 5.x                                    │
└─────────────────────────────────────────────────────┘
                      ▼ HTTPS/WSS
┌─────────────────────────────────────────────────────┐
│ Backend: .NET 10 Minimal APIs                       │
│ - Dapper ORM for queries                            │
│ - SignalR for WebSocket connections                 │
│ - PostgreSQL 16 (primary data)                      │
│ - Redis 7 (cache, sessions, pub/sub)                │
└─────────────────────────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────┐
│ Real-Time Services                                  │
│ - LiveKit (self-hosted WebRTC SFU)                  │
│ - Yjs CRDT (collaborative whiteboard + code editor) │
│ - Deepgram (real-time STT, <500ms latency)          │
│ - Azure Cognitive Services (translation)            │
│ - OpenAI GPT-4 + Whisper (content generation)       │
└─────────────────────────────────────────────────────┘
                           ▼
┌─────────────────────────────────────────────────────┐
│ Infrastructure: Kubernetes + Docker                 │
│ - Dapr 1.16 (distributed app runtime)               │
│ - Cloudflare R2 (video/file storage)                │
│ - Serilog → Seq (structured logging)                │
│ - Zipkin (distributed tracing)                      │
└─────────────────────────────────────────────────────┘
Key Technical Decisions
1. .NET 10 Minimal APIs Over Node.js
Decision: Use .NET 10 for backend instead of Node.js/Express.
Rationale:
- Performance: TechEmpower benchmarks show .NET 10 Minimal APIs at 7M req/s vs Node.js (Fastify) at 1.2M req/s
- Concurrency: SignalR handles 100K+ concurrent WebSocket connections on a single instance
- CPU-intensive workloads: Better for AI prompt processing and real-time translation orchestration
- Memory: Lower memory footprint under load than an equivalent Node.js service
Production metrics:
- 100K concurrent WebSocket connections at 40% CPU
- P50 API response: 8ms
- P95 API response: 45ms
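To make the backend shape concrete, here is a minimal sketch of how a Minimal API endpoint, a Dapper query, and a SignalR hub with the Redis backplane fit together. The names (SessionHub, SessionDto, the /api/sessions route, the connection-string keys) are illustrative, not the actual Kinetiq code.

// Program.cs -- minimal sketch with illustrative names
using Dapper;
using Microsoft.AspNetCore.SignalR;
using Npgsql;

var builder = WebApplication.CreateBuilder(args);

// SignalR with a Redis backplane so WebSocket traffic fans out across instances
builder.Services.AddSignalR()
    .AddStackExchangeRedis(builder.Configuration.GetConnectionString("Redis")!);

// Pooled PostgreSQL connections, queried with Dapper
builder.Services.AddNpgsqlDataSource(builder.Configuration.GetConnectionString("Postgres")!);

var app = builder.Build();

// Minimal API endpoint: fetch a session by id with a hand-written SQL query
app.MapGet("/api/sessions/{id:guid}", async (Guid id, NpgsqlDataSource db) =>
{
    await using var conn = await db.OpenConnectionAsync();
    var session = await conn.QuerySingleOrDefaultAsync<SessionDto>(
        "SELECT id, tutor_id AS TutorId, started_at AS StartedAt FROM sessions WHERE id = @id",
        new { id });
    return session is null ? Results.NotFound() : Results.Ok(session);
});

app.MapHub<SessionHub>("/hubs/session"); // real-time session events over WebSockets

app.Run();

record SessionDto(Guid Id, Guid TutorId, DateTime StartedAt);

class SessionHub : Hub
{
    // Clients join a per-session group so events are broadcast to one room only
    public Task JoinSession(string sessionId) =>
        Groups.AddToGroupAsync(Context.ConnectionId, sessionId);
}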
2. Yjs CRDT for Collaborative Editing
Decision: Use Yjs (Conflict-free Replicated Data Types) for real-time collaboration.
Rationale:
- Conflict-free: Multiple users can edit simultaneously without merge conflicts
- Eventual consistency: All clients converge to identical state
- Offline-capable: Operations queue locally and sync when reconnected
- Performance: Operations are lightweight, minimal network overhead
Implementation:
Whiteboard (tldraw):
import { Tldraw, createTLStore } from 'tldraw'
import { YjsEditor } from '@tldraw/yjs'
import * as Y from 'yjs'
import { WebsocketProvider } from 'y-websocket'

// Shared Yjs document, synced over y-websocket for the current room
const doc = new Y.Doc()
const provider = new WebsocketProvider('wss://kinetiq.one/sync', roomId, doc)

// Bind the tldraw store to a shared Y.Map so edits replicate to every client
const store = createTLStore()
new YjsEditor(store, doc.getMap('tldraw'))
Code editor (Monaco):
import * as Y from 'yjs'
import { WebsocketProvider } from 'y-websocket'
import { MonacoBinding } from 'y-monaco'

// Shared text type backing the Monaco model
const doc = new Y.Doc()
const yText = doc.getText('monaco')
const provider = new WebsocketProvider('wss://kinetiq.one/sync', roomId, doc)

// Bind the shared text to an existing Monaco editor instance (`editor`);
// cursors and selections are shared via the provider's awareness API
new MonacoBinding(yText, editor.getModel(), new Set([editor]), provider.awareness)
Production metrics:
- Zero sync conflicts since deployment (6 months, 3,500+ sessions)
- P50 sync latency: 12ms
- P95 sync latency: 85ms (cross-continent)
3. Self-Hosted LiveKit Over Managed Services
Decision: Self-host LiveKit WebRTC SFU instead of using Twilio/Agora/Vonage.
Rationale:
- Cost: Managed services charge $0.004-0.015/minute/participant; at our session volume that adds up to $2K+/month
- Control: Full control over SFU configuration, recording format, storage location
- Scalability: Horizontal scaling in Kubernetes with auto-healing
- Integration: Direct recording to Cloudflare R2 (zero egress fees)
Infrastructure:
- LiveKit deployed as Kubernetes StatefulSet
- Auto-scaling based on active rooms
- Health checks + liveness probes
- Recording pipeline: LiveKit → Cloudflare R2 → Deepgram/Whisper
Cost comparison:
- Managed service: ~$2,000/month
- Self-hosted (Kubernetes + compute): ~$200/month
- Savings: $1,800/month
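On the integration side, the backend issues the room-join tokens that clients present to the SFU. The sketch below hand-rolls one with System.IdentityModel.Tokens.Jwt, following LiveKit's documented access-token format (HS256-signed JWT, iss = API key, sub = participant identity, a "video" grant object); in practice a LiveKit server SDK does the same job, and the class and method names here are illustrative.

// Illustrative sketch: mint a LiveKit room-join token on the backend
using System.IdentityModel.Tokens.Jwt;
using System.Security.Claims;
using System.Text;
using System.Text.Json;
using Microsoft.IdentityModel.Tokens;

static class LiveKitTokens
{
    public static string CreateRoomToken(string apiKey, string apiSecret, string room, string identity)
    {
        var signingKey = new SymmetricSecurityKey(Encoding.UTF8.GetBytes(apiSecret));
        var credentials = new SigningCredentials(signingKey, SecurityAlgorithms.HmacSha256);

        // "video" grant: which room the participant may join and what they may do there
        var videoGrant = JsonSerializer.Serialize(new
        {
            room,
            roomJoin = true,
            canPublish = true,
            canSubscribe = true
        });

        var token = new JwtSecurityToken(
            issuer: apiKey, // iss = LiveKit API key
            claims: new[]
            {
                new Claim(JwtRegisteredClaimNames.Sub, identity),
                new Claim("video", videoGrant, JsonClaimValueTypes.Json)
            },
            notBefore: DateTime.UtcNow,
            expires: DateTime.UtcNow.AddHours(2), // short-lived: one tutoring session
            signingCredentials: credentials);

        return new JwtSecurityTokenHandler().WriteToken(token);
    }
}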
4. PostgreSQL + Redis Architecture
Decision: PostgreSQL 16 as primary database, Redis 7 for caching and real-time features.
PostgreSQL usage:
- User data, courses, sessions
- Full-text search with GIN indexes
- JSONB for flexible schemas (course content, user preferences)
Redis usage:
- Session storage (distributed across instances)
- Translation cache (85% hit rate, saves $680/month on Azure API calls)
- Real-time presence (online users, typing indicators)
- Rate limiting (sliding window algorithm; sketched after this list)
- Pub/Sub for SignalR backplane (multi-instance WebSocket)
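To illustrate the rate-limiting entry above: a sliding window maps naturally onto a Redis sorted set, with request timestamps as scores. The sketch below uses StackExchange.Redis; the class name, key prefix, and default limits are illustrative, and a production version would wrap the remove/count/add steps in a Lua script so they execute atomically.

// Sliding-window rate limiter backed by a Redis sorted set (illustrative names/limits)
using StackExchange.Redis;

public sealed class SlidingWindowRateLimiter
{
    private readonly IDatabase _redis;
    private readonly int _limit;
    private readonly TimeSpan _window;

    public SlidingWindowRateLimiter(IConnectionMultiplexer mux, int limit = 100, TimeSpan? window = null)
    {
        _redis = mux.GetDatabase();
        _limit = limit;
        _window = window ?? TimeSpan.FromMinutes(1);
    }

    public async Task<bool> IsAllowedAsync(string clientId)
    {
        var key = $"ratelimit:{clientId}";
        var now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
        var windowStart = now - (long)_window.TotalMilliseconds;

        // Drop requests that fell out of the window, then count what is left
        await _redis.SortedSetRemoveRangeByScoreAsync(key, 0, windowStart);
        var count = await _redis.SortedSetLengthAsync(key);

        if (count >= _limit)
            return false; // over the limit for this window

        // Record this request and keep the key from outliving the window
        await _redis.SortedSetAddAsync(key, $"{now}:{Guid.NewGuid():N}", now);
        await _redis.KeyExpireAsync(key, _window);
        return true;
    }
}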
Query optimization example:
-- Tutor search with full-text + filters
CREATE INDEX idx_tutors_search ON tutors USING GIN(to_tsvector('english', name || ' ' || bio));
CREATE INDEX idx_tutors_skills ON tutors USING GIN(skills);
-- Query time: 8s → 40ms after indexing
SELECT * FROM tutors
WHERE to_tsvector('english', name || ' ' || bio) @@ to_tsquery('python & react')
AND skills @> ARRAY['typescript', 'node.js']
LIMIT 20;
5. Translation Caching Strategy
Problem: The Azure Translator API costs $10 per 1M characters. At our volume that was $800/month, much of it spent re-translating the same phrases.
Solution: Redis caching with intelligent key strategy.
// C# caching implementation
public async Task<string> TranslateAsync(string text, string targetLang)
{
    var cacheKey = $"translate:{ComputeHash(text)}:{targetLang}";

    // Check cache first
    var cached = await _redis.StringGetAsync(cacheKey);
    if (cached.HasValue)
        return cached.ToString();

    // Miss: call Azure API
    var translated = await _azureTranslator.TranslateAsync(text, targetLang);

    // Cache for 7 days
    await _redis.StringSetAsync(cacheKey, translated, TimeSpan.FromDays(7));
    return translated;
}
Results:
- Cache hit rate: 85%
- API calls reduced from 8M/month to 1.2M/month
- Cost: $800/month → $120/month
- P50 latency: 450ms → 15ms (cache hit)
Production Metrics
Uptime & Reliability:
- 99.9% uptime (Kubernetes auto-healing)
- Zero-downtime deployments (rolling updates)
- Mean time to recovery: <2 minutes
Performance:
- API P50: 8ms, P95: 45ms
- WebSocket sync P50: 12ms, P95: 85ms
- Translation P50: 15ms (cached), P95: 450ms
- Full-text search: 40ms average
Scale:
- 100K+ concurrent WebSocket connections supported
- 500 concurrent tutoring sessions (peak)
- 3,500+ completed sessions
- 150+ languages supported
Cost Efficiency:
- Infrastructure: ~$400/month (compute + storage + CDN)
- LiveKit self-hosted: $200/month vs $2K managed
- Translation caching: $120/month vs $800 without cache
- Total savings: ~$2,500/month vs managed alternatives
Technology Stack
Frontend:
- Next.js 15.5.4 (App Router, Server Components)
- React 19.1.0
- TypeScript 5.x
- Tailwind CSS 4.0
- React Query 5.90.2
- Zustand 5.0.8
Backend:
- .NET 10 (Minimal APIs)
- Dapper ORM
- SignalR (WebSockets)
- PostgreSQL 16
- Redis 7
Real-Time:
- LiveKit (self-hosted SFU)
- Yjs CRDT + y-websocket
- tldraw 2.4.6 (whiteboard)
- Monaco Editor + y-monaco (code editor)
AI/ML:
- OpenAI GPT-4 (content generation)
- Whisper (transcription)
- Deepgram (real-time STT)
- Azure Cognitive Services (translation)
Infrastructure:
- Kubernetes + Docker
- Dapr 1.16 (distributed application runtime)
- Cloudflare R2 (object storage)
- Serilog + Seq (logging)
- Zipkin (distributed tracing)
Read More
Full case study with implementation details, challenges, and lessons learned:
wojciechowski.app/en/articles/kinetiq-case-study
Portfolio: wojciechowski.app
Questions about the architecture? Drop a comment.