Clay Roach

Posted on • Originally published at dev.to

Day 20: Service Topology Implementation with Critical Request Paths

Today I completed the Service Topology feature, replacing the previous AI Insights view with a three-panel visualization system. The work shows how AI-assisted development can deliver enterprise-level features in a single 4-hour session.

Implementation Overview

The 4-hour development session produced:

  • Service Topology visualization with interactive network graph
  • Critical Request Paths analysis using Sankey flow diagrams
  • Real-time service health indicators with RED (Rate, Errors, Duration) metrics
  • AI-powered analysis panel for selected services
  • Global analysis controls integrated into menu bar
  • Live/Demo mode toggle for data source switching

Technical Architecture

The Service Topology feature uses a three-panel layout for comprehensive system visualization.

Critical Request Paths Panel

interface CriticalPath {
  id: string
  name: string
  description?: string
  priority: 'critical' | 'high' | 'medium' | 'low'
  services: string[]
  edges: Array<{ source: string; target: string }>
  metrics: {
    requestCount: number
    avgLatency: number
    p99Latency: number
    errorRate: number
  }
}

Multi-select functionality with Cmd/Ctrl+Click enables simultaneous path comparison.
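
The selection rule behind Cmd/Ctrl+Click can be sketched as a small pure function. This is a minimal illustration, not the code from the PR: the name `nextSelection` and the `additive` flag are assumptions made for this example.

```typescript
// Hypothetical helper: returns the selected path ids after a click.
// `additive` is true when Cmd (macOS) or Ctrl was held during the click.
const nextSelection = (
  selected: readonly string[],
  clickedId: string,
  additive: boolean
): string[] => {
  if (!additive) {
    // Plain click: select only the clicked path
    return [clickedId]
  }
  // Cmd/Ctrl+Click: toggle the clicked path and keep the others
  return selected.includes(clickedId)
    ? selected.filter((id) => id !== clickedId)
    : [...selected, clickedId]
}
```

In a click handler, `additive` could be derived from the mouse event as `event.metaKey || event.ctrlKey`.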

Interactive Service Topology Graph

Node sizing uses logarithmic scaling for visual clarity:

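const calculateNodeSize = (rate: number, maxRate: number) => {
  const minSize = 30
  const maxSize = 80
  // Guard: with maxRate <= 0 the ratio below would be 0 / 0 (NaN)
  if (maxRate <= 0) return minSize
  // Log scaling keeps low-traffic services visible next to high-traffic ones
  const scaleFactor = Math.log(rate + 1) / Math.log(maxRate + 1)
  return minSize + (maxSize - minSize) * scaleFactor
}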

const getHealthColor = (errorRate: number): string => {
  if (errorRate > 0.05) return '#ff4d4f' // >5% errors
  if (errorRate > 0.01) return '#faad14' // 1-5% errors
  return '#52c41a' // <1% errors
}
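
To see why logarithmic scaling helps, the sketch below compares it with linear scaling over the same 30 to 80 size range. The function names `logNodeSize` and `linearNodeSize` and the sample rates are assumptions for illustration; the formula is the one used in `calculateNodeSize`.

```typescript
// Log-scaled sizing, same formula as calculateNodeSize in the post
const logNodeSize = (rate: number, maxRate: number, minSize = 30, maxSize = 80): number =>
  minSize + (maxSize - minSize) * (Math.log(rate + 1) / Math.log(maxRate + 1))

// Linear scaling over the same range, shown for comparison only
const linearNodeSize = (rate: number, maxRate: number, minSize = 30, maxSize = 80): number =>
  minSize + (maxSize - minSize) * (rate / maxRate)

// Hypothetical request rates, with the busiest service at 1000
const busiestRate = 1000
const comparison = [1, 10, 100, 1000].map((rate) => ({
  rate,
  log: logNodeSize(rate, busiestRate).toFixed(1),
  linear: linearNodeSize(rate, busiestRate).toFixed(1)
}))
console.table(comparison)
```

With these inputs a service at rate 10 gets a size of about 47 under log scaling, but only about 30.5 under linear scaling, where it would be nearly indistinguishable from an idle service.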

AI Analysis Panel

Each RED metric is scored from 0 to 2 against fixed thresholds, and the worst score determines the service status:

export const generateHealthExplanation = (
  serviceName: string,
  metrics: ServiceMetricsDetail
): HealthExplanation => {
  const errorSeverity = metrics.errorRate > 0.05 ? 2 : 
                        metrics.errorRate > 0.01 ? 1 : 0
  const latencySeverity = metrics.duration > 500 ? 2 : 
                          metrics.duration > 100 ? 1 : 0
  const rateSeverity = metrics.rate < 1 ? 2 : 
                       metrics.rate < 10 ? 1 : 0

  const maxSeverity = Math.max(errorSeverity, latencySeverity, rateSeverity)
  const status = maxSeverity === 2 ? 'critical' : 
                 maxSeverity === 1 ? 'warning' : 'healthy'

  return {
    status,
    summary: generateSummary(serviceName, metrics, status),
    impactedMetrics: analyzeMetrics(metrics),
    recommendations: generateRecommendations(metrics, status)
  }
}
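
The helper `generateRecommendations` is called above but not shown. A minimal sketch of what such a helper might look like follows. The `ServiceMetricsDetail` shape is inferred from the fields used above, the units are assumptions, and the message wording is invented for this example; only the thresholds are taken from the code above.

```typescript
// Assumed shape, inferred from the fields read in generateHealthExplanation
interface ServiceMetricsDetail {
  rate: number      // request rate (assumed: requests per second)
  errorRate: number // fraction of failed requests, between 0 and 1
  duration: number  // latency (assumed: milliseconds)
}

type HealthStatus = 'critical' | 'warning' | 'healthy'

// Hypothetical sketch: one recommendation per metric that crosses its warning threshold
const generateRecommendations = (
  metrics: ServiceMetricsDetail,
  status: HealthStatus
): string[] => {
  if (status === 'healthy') return []
  const recommendations: string[] = []
  if (metrics.errorRate > 0.01) {
    const percent = (metrics.errorRate * 100).toFixed(1)
    recommendations.push(`Investigate failing requests (${percent}% error rate)`)
  }
  if (metrics.duration > 100) {
    recommendations.push(`Profile slow operations (${metrics.duration} ms latency)`)
  }
  if (metrics.rate < 10) {
    recommendations.push(`Check upstream callers (only ${metrics.rate} requests/s)`)
  }
  return recommendations
}
```

Keeping the thresholds identical to the severity checks means every non-healthy status produces at least one recommendation.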

Development Metrics

Quantifiable progress from today's implementation:

  • Lines of Code: 2,500 across 12 TypeScript files
  • Components Created: 8 React components
  • Test Coverage: 12 e2e tests passing, 7 skipped for compatibility
  • Development Time: 4 hours focused work
  • Refactoring Iterations: 3 major cycles

Technical Implementation Details

Sankey Diagram for Request Flow

Converting topology data to flow visualization:

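// `path` (the selected CriticalPath), `services`, and `nodes` come from the enclosing component scope
const getSankeyOption = (): EChartsOption => {
  const links = path.edges.map((edge) => {
    const sourceService = services.find(s => s.id === edge.source)
    const targetService = services.find(s => s.id === edge.target)
    // Link volume is bounded by the lower-traffic endpoint.
    // Note: `|| 100` also replaces a measured rate of 0, not only a missing one.
    const volume = Math.min(
      sourceService?.metrics?.rate || 100,
      targetService?.metrics?.rate || 100
    )
    // Color the link by the error rate of the service receiving the request
    const errorRate = targetService?.metrics?.errorRate || 0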

    return {
      source: edge.source,
      target: edge.target,
      value: volume,
      lineStyle: {
        color: getServiceColor(errorRate),
        opacity: errorRate > 0.01 ? 0.9 : 0.6
      }
    }
  })

  return {
    series: [{
      type: 'sankey',
      emphasis: { focus: 'adjacency' },
      data: nodes,
      links: links
    }]
  }
}
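
The `nodes` array passed to the series is not shown above. ECharts sankey series match each link's `source` and `target` against node `name` values, so one way to build the array is to collect every service referenced by the path's edges. The helper name `buildSankeyNodes` is an assumption for this sketch, not part of the PR.

```typescript
interface PathEdge {
  source: string
  target: string
}

// Hypothetical helper: list each service referenced by an edge exactly once,
// in the `{ name }` shape that sankey series expect for `data`
const buildSankeyNodes = (edges: PathEdge[]): Array<{ name: string }> => {
  const ids = new Set<string>()
  edges.forEach((edge) => {
    ids.add(edge.source)
    ids.add(edge.target)
  })
  return Array.from(ids, (name) => ({ name }))
}
```

Deriving nodes from the edges, rather than from the full service list, avoids unconnected nodes in the diagram. ECharts sankey also requires the links to form an acyclic graph, so a path that loops back on itself would need separate handling.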

Service Neighbor Visibility

When a service is selected, the graph shows only that service and its direct upstream and downstream neighbors:

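// `edges` (the topology's source/target pairs) comes from the enclosing component scope
const getVisibleServices = (selectedService: string, allServices: ServiceNode[]) => {
  const neighbors = new Set<string>()

  // Collect direct neighbors in both directions (callers and callees)
  edges.forEach(edge => {
    if (edge.source === selectedService) neighbors.add(edge.target)
    if (edge.target === selectedService) neighbors.add(edge.source)
  })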

  return allServices.filter(service => 
    service.id === selectedService || neighbors.has(service.id)
  )
}
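
A worked example makes the neighbor rule concrete. The sketch below applies the same rule with `edges` passed in explicitly; the function name `visibleServiceIds` and the sample topology are assumptions for illustration.

```typescript
interface TopologyEdge {
  source: string
  target: string
}

// Same one-hop neighbor rule as getVisibleServices, returning ids only
const visibleServiceIds = (selected: string, edges: TopologyEdge[]): Set<string> => {
  const visible = new Set<string>([selected])
  edges.forEach((edge) => {
    if (edge.source === selected) visible.add(edge.target)
    if (edge.target === selected) visible.add(edge.source)
  })
  return visible
}

// Hypothetical topology: frontend -> cart -> redis, frontend -> checkout -> payment
const sampleEdges: TopologyEdge[] = [
  { source: 'frontend', target: 'cart' },
  { source: 'cart', target: 'redis' },
  { source: 'frontend', target: 'checkout' },
  { source: 'checkout', target: 'payment' }
]

// Selecting `cart` keeps cart, its caller frontend, and its dependency redis
const visibleForCart = visibleServiceIds('cart', sampleEdges)
```

Services two hops away, such as `checkout` and `payment` here, are hidden until one of them is selected.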

Data Source Management

Supporting both mock and live data:

const useDataSource = () => {
  const { useMockData } = useAppStore()

  return useMemo(() => ({
    fetchTopology: useMockData 
      ? () => Promise.resolve(getMockTopologyData())
      : () => fetchRealTopologyData(),
    fetchMetrics: useMockData
      ? () => Promise.resolve(getMockMetrics())
      : () => fetchRealMetrics()
  }), [useMockData])
}
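
Because both modes return functions with the same signature, the selection logic can also be exercised outside React. The sketch below is a framework-free version under stated assumptions: `TopologyData`, `createDataSource`, and the stand-in fetchers are hypothetical and only mirror the names used in the hook.

```typescript
// Assumed minimal data shape for this sketch
interface TopologyData {
  services: string[]
}

interface DataSource {
  fetchTopology: () => Promise<TopologyData>
}

// Hypothetical stand-ins for the mock and live fetchers
const getMockTopologyData = (): TopologyData => ({ services: ['frontend', 'cart'] })
const fetchRealTopologyData = (): Promise<TopologyData> =>
  Promise.reject(new Error('no backend available in this sketch'))

// Callers receive the same interface in both modes and never branch on the mode
const createDataSource = (useMockData: boolean): DataSource => ({
  fetchTopology: useMockData
    ? () => Promise.resolve(getMockTopologyData())
    : () => fetchRealTopologyData()
})
```

A component that calls `createDataSource(true).fetchTopology()` gets mock data wrapped in a promise, so it handles loading and error states the same way as in live mode.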

Visual Documentation

Screenshots from PR #39 implementation:

Main Topology View

Screenshot: Service Topology. Critical paths, interactive topology, and AI analysis panels.

Checkout Flow Path

Screenshot: Checkout Flow. Sankey diagram showing request volumes and error rates.

Test Coverage

The e2e suite covers the following scenarios (test bodies omitted here):

describe('Service Topology Comprehensive Validation', () => {
  test('should display all Service Topology components correctly')
  test('should handle path selection in critical paths panel')
  test('should display topology graph with nodes and edges')
  test('should show service details on node click')
  test('should handle Live/Demo mode switching')
  test('should filter services based on health status')
  test('should highlight selected paths in topology')
  test('should show AI analysis for selected service')
  test('should handle multi-select with Cmd/Ctrl+Click')
  test('should maintain state across panel interactions')
  test('should handle error states gracefully')
  test('should perform smoothly with large datasets')
})

4-Hour Development Breakdown

Hour 1: Requirements analysis and component architecture
Hour 2: ECharts topology graph implementation
Hour 3: Sankey diagram and path visualization
Hour 4: AI analysis panel and test suite

Performance Considerations

Current limitations and planned optimizations:

  • Graph rendering slows with >100 nodes
  • WebSocket integration needed for real-time updates
  • Mobile viewport requires responsive design adjustments
  • Export functionality pending for diagram sharing

Implementation Insights

Effective Patterns

  • Component isolation simplified parallel development
  • Mock data first approach accelerated UI iteration
  • TypeScript interfaces prevented runtime errors
  • Effect-TS patterns provided type-safe service boundaries

Areas Requiring Refinement

  • Large dataset performance optimization
  • Real-time data streaming integration
  • Mobile-responsive layout adaptation
  • Diagram export capabilities

Next Steps

Tomorrow's implementation priorities:

  1. Connect to live OpenTelemetry data streams
  2. Implement autoencoder-based anomaly detection
  3. Optimize rendering for enterprise-scale graphs
  4. Add time-series topology evolution

Summary

Day 20 delivered a complete Service Topology implementation with critical path analysis, interactive visualization, and AI-powered insights. The 4-hour focused development session produced 2,500 lines of code across 12 TypeScript files, with 12 e2e tests passing and 7 skipped.

Progress: Day 20 of 30 complete
Feature: Service Topology with Critical Request Paths
Code: 2,500 LOC added
Tests: 12 passing, 7 skipped
PR: #39


Part of the 30-Day AI-Native Observability Platform series. Building enterprise observability with AI-assisted development and 4-hour focused workdays.
