Samson Tanimawo

Platform Engineering: Building an Internal Developer Platform That Teams Actually Use

The "Build It and They Won't Come" Problem

Our platform team spent 6 months building an internal developer platform. Beautiful service catalog, automated provisioning, self-service databases. Nobody used it.

Here's what we learned.

Why Platforms Fail

Most internal platforms fail for the same reason: they're built top-down instead of bottom-up.

Top-down: "We decided every team should use this standardized deployment pipeline."
Bottom-up: "We noticed 8 teams solving the same problem differently, so we built a shared solution."

The Paved Road Approach

Instead of mandating tools, offer a paved road. Make the right thing the easy thing.

Paved road (easy):        Off-road (hard but allowed):
─────────────────         ──────────────────────────
Standard CI/CD template   Custom pipeline
Managed Postgres          Self-managed DB
Shared observability      Own monitoring stack
Pre-configured K8s        Custom infrastructure

The key: off-road is allowed but unsupported. You break it, you own it.
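
One way to make "allowed but unsupported" concrete is to label each choice a service makes. The sketch below is only an illustration: the `PAVED_ROAD` mapping, its option labels, and the `support_status` function are hypothetical names that mirror the table above, not a real platform API.

```python
# Hypothetical sketch: categories and option labels mirror the paved-road table.
PAVED_ROAD = {
    "ci_cd": "standard-template",
    "database": "managed-postgres",
    "observability": "shared",
    "infrastructure": "preconfigured-k8s",
}


def support_status(choices: dict[str, str]) -> dict[str, str]:
    """Label each choice 'supported' (paved road) or 'unsupported' (off-road)."""
    status = {}
    for category, choice in choices.items():
        on_road = PAVED_ROAD.get(category) == choice
        status[category] = "supported" if on_road else "unsupported"
    return status


if __name__ == "__main__":
    # A team that keeps the standard pipeline but runs its own database.
    team_choices = {"ci_cd": "standard-template", "database": "self-managed"}
    print(support_status(team_choices))
```

Nothing here blocks the off-road choice. It only records who owns the outcome, which is the point of a paved road.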

What Our IDP Looks Like

# service.yaml — the only file developers need to create
apiVersion: platform/v1
kind: Service
metadata:
  name: checkout-api
  team: payments
  tier: critical
spec:
  language: python
  framework: fastapi

  dependencies:
    - postgres:14
    - redis:7
    - rabbitmq:3

  scaling:
    min: 3
    max: 20
    metric: cpu
    target: 70

  environments:
    staging:
      replicas: 1
    production:
      replicas: 3
      multi_az: true
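
Before anything is provisioned, a file like this would typically be validated. Here is a minimal sketch of that step, assuming the file has already been parsed into a dict (for example with PyYAML's `yaml.safe_load`, which is not in the standard library). The `validate_service` function and its rules are assumptions for illustration; only the field names come from the example above.

```python
# Hypothetical validator: field names follow the service.yaml example;
# the rules themselves are illustrative assumptions.
REQUIRED_METADATA = ("name", "team", "tier")


def validate_service(spec: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if spec.get("kind") != "Service":
        errors.append("kind must be 'Service'")

    metadata = spec.get("metadata", {})
    for field in REQUIRED_METADATA:
        if not metadata.get(field):
            errors.append(f"metadata.{field} is required")

    scaling = spec.get("spec", {}).get("scaling", {})
    lo, hi = scaling.get("min"), scaling.get("max")
    if lo is not None and hi is not None and lo > hi:
        errors.append("scaling.min must not exceed scaling.max")

    target = scaling.get("target")
    if target is not None and not 0 < target <= 100:
        errors.append("scaling.target must be a percentage in (0, 100]")
    return errors


# The example service from above, as an already-parsed dict.
checkout_api = {
    "apiVersion": "platform/v1",
    "kind": "Service",
    "metadata": {"name": "checkout-api", "team": "payments", "tier": "critical"},
    "spec": {"scaling": {"min": 3, "max": 20, "metric": "cpu", "target": 70}},
}
print(validate_service(checkout_api))  # prints [] (no problems found)
```

Returning a list of problems instead of raising on the first one lets a developer fix everything in a single pass.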

From this single file, the platform provisions:

  • Git repo with CI/CD pipeline
  • Kubernetes namespace and RBAC
  • Database and connection secrets
  • Monitoring dashboards (golden signals)
  • Alerting rules
  • Log aggregation
  • Service mesh entry
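
The fan-out from one file to many resources can be sketched as a function that expands the spec into a plan. This is a simplified illustration: `provisioning_plan`, the `kind:name` plan format, and the one-secret-per-dependency rule are assumptions, not the platform's actual implementation.

```python
# Hypothetical sketch: resource kinds mirror the list above; the naming
# scheme and plan format are assumptions for illustration.
def provisioning_plan(spec: dict) -> list[str]:
    """Expand a parsed service spec into an ordered list of resources to create."""
    meta = spec["metadata"]
    name, team = meta["name"], meta["team"]

    plan = [
        f"git-repo:{team}/{name}",
        f"ci-pipeline:{name}",
        f"k8s-namespace:{name}",
        f"rbac:{team}",
    ]
    # One managed resource plus one connection secret per declared dependency.
    for dep in spec["spec"].get("dependencies", []):
        engine = dep.split(":")[0]  # "postgres:14" -> "postgres"
        plan.append(f"{engine}:{name}")
        plan.append(f"secret:{name}-{engine}")

    plan += [
        f"dashboard:{name}",
        f"alerts:{name}",
        f"log-aggregation:{name}",
        f"mesh-entry:{name}",
    ]
    return plan


checkout_api = {
    "metadata": {"name": "checkout-api", "team": "payments"},
    "spec": {"dependencies": ["postgres:14", "redis:7", "rabbitmq:3"]},
}
for resource in provisioning_plan(checkout_api):
    print(resource)
```

Building a plan first, and only then applying it, keeps provisioning reviewable: the plan can be shown to the developer or diffed against what already exists.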

The Developer Experience Metrics

We track these to know if the platform is working:

Metric                                  Before     After
Time from idea to production deploy     2 weeks    4 hours
Time to provision a new environment     3 days     12 minutes
Deploy frequency                        weekly     5x/day
Change failure rate                     18%        4%
Developer satisfaction (quarterly NPS)  -10        +52
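
Three of these numbers are simple to compute once deploys and survey answers are recorded. Here is a minimal sketch, assuming a hypothetical `Deploy` record with a date and a failure flag. The NPS formula is the standard one: percentage of promoters (scores 9-10) minus percentage of detractors (scores 0-6). The sample data at the end is made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deploy:
    day: date
    failed: bool  # True if the deploy caused an incident or rollback


def deploys_per_day(deploys: list[Deploy]) -> float:
    """Average deploys per calendar day across the observed date range."""
    if not deploys:
        return 0.0
    days = [d.day for d in deploys]
    span = (max(days) - min(days)).days + 1
    return len(deploys) / span


def change_failure_rate(deploys: list[Deploy]) -> float:
    """Fraction of deploys that failed, between 0.0 and 1.0."""
    if not deploys:
        return 0.0
    return sum(d.failed for d in deploys) / len(deploys)


def nps(scores: list[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        return 0.0
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100 * (promoters - detractors) / len(scores)


# Made-up sample data, for illustration only.
history = [
    Deploy(date(2024, 1, 1), failed=False),
    Deploy(date(2024, 1, 1), failed=True),
    Deploy(date(2024, 1, 2), failed=False),
    Deploy(date(2024, 1, 2), failed=False),
]
print(deploys_per_day(history))      # 2.0
print(change_failure_rate(history))  # 0.25
print(nps([10, 9, 8, 3]))            # 25.0
```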

The Self-Service Portal

Our portal has exactly four actions:

  1. Create Service — Generates everything from service.yaml
  2. View My Services — Dashboard of health, deploys, costs
  3. Request Resource — Database, queue, cache (auto-provisioned)
  4. Get Help — Links to docs + Slack channel

That's it. Four buttons. If you need more than four buttons, your platform is too complex.

Adoption Strategy

We didn't mandate adoption. We seduced teams into it:

  1. Weeks 1-4: Pilot with the friendliest team. Fix everything.
  2. Weeks 5-8: Add two more teams. Fix more things.
  3. Weeks 9-12: Share success stories at engineering all-hands.
  4. Week 13 onward: Other teams start asking to join.

By month 6, 80% of teams had migrated voluntarily. The remaining 20% had legitimate edge cases we accommodated.

What Not to Build

  • Don't build a service mesh if you have fewer than 20 services
  • Don't build a custom scheduler if standard K8s works
  • Don't build a custom secret manager — use Vault or cloud-native
  • Don't build a custom CI system — use GitHub Actions/GitLab CI

Build the glue, not the tools.

If you want a platform that includes AI-powered operations from day one, check out what we're building at Nova AI Ops.


Written by Dr. Samson Tanimawo
BSc · MSc · MBA · PhD
Founder & CEO, Nova AI Ops. https://novaaiops.com
