
Platform Engineering vs DevOps: What Actually Changed (and What Didn't)

What platform engineering actually means in practice. Golden paths, self-service infrastructure, internal developer portals, documentation that works, and when platform engineering is overhead you can't afford.

February 20, 2026 · 14 min read · Oronts Engineering Team

The DevOps to Platform Engineering Shift

DevOps said "you build it, you run it." Every developer manages their own infrastructure, deploys their own services, configures their own monitoring. In theory, this creates ownership. In practice, it creates 15 different ways to deploy a service, 8 different monitoring setups, and every team reinventing the same CI/CD pipeline.

Platform engineering is the correction. Instead of every team building infrastructure from scratch, a platform team provides opinionated, self-service tools that make the right thing easy and the wrong thing hard.

The shift is real but oversold. For teams under 30 engineers, platform engineering is often overhead. For teams over 50, it's necessary. This article covers the practical patterns. For how we deploy infrastructure, see our IaC guide and Kubernetes guide.

What Platform Engineering Actually Is

Platform engineering is not a tool. It's not Backstage. It's not Kubernetes. It's an approach: build internal tools that make developers productive without requiring them to become infrastructure experts.

| DevOps Approach | Platform Engineering Approach |
| --- | --- |
| Every team writes their own Dockerfile | Platform provides a base image per language |
| Every team configures their own CI/CD | Platform provides a pipeline template, team fills in parameters |
| Every team sets up monitoring | Platform provides observability-as-a-service with standard dashboards |
| Every team manages their own database | Platform provides database provisioning via a form or API |
| New service setup takes 2 days | New service setup takes 15 minutes using a golden path |

The key metric: time to first deploy for a new service. If it takes a developer 2 days to go from "I need a new service" to "it's running in staging," your platform has a problem. If it takes 15 minutes, your platform is working.

Golden Paths

A golden path is an opinionated template for creating a new service. It includes everything a developer needs: project structure, CI/CD pipeline, Dockerfile, Kubernetes manifests, monitoring configuration, and documentation.

golden-path-typescript-api/
├── src/
│   ├── index.ts                    # Entry point with health check
│   ├── routes/                     # Route definitions
│   └── services/                   # Business logic
├── test/
│   ├── unit/
│   └── integration/
├── Dockerfile                      # Optimized multi-stage build
├── .github/workflows/
│   └── ci-cd.yaml                  # Build, test, push, deploy
├── kubernetes/
│   ├── base/
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   └── kustomization.yaml
│   └── overlay/
│       ├── staging/
│       └── production/
├── monitoring/
│   ├── alerts.yaml                 # Standard alert rules
│   └── dashboard.json              # Grafana dashboard template
├── docs/
│   └── runbook.md                  # Operational runbook template
├── .env.example
├── package.json
├── tsconfig.json
└── README.md

A developer runs platform create-service --name my-api --template typescript-api, answers a few questions (service name, team, database needed, public or internal), and gets a fully functional project with CI/CD, monitoring, and deployment manifests. First deploy in 15 minutes.
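The scaffolding step can be sketched as a pure function from the developer's answers to the list of files to generate. This is a minimal sketch, not the actual platform CLI: `ServiceAnswers`, `planScaffold`, the name validation rule, and the two conditional file names (`src/services/database.ts`, `kubernetes/base/ingress.yaml`) are assumptions made for illustration. The base file list mirrors the template tree above.

```typescript
// Hypothetical sketch of the scaffolding step; not the real `platform` CLI.
interface ServiceAnswers {
    name: string;
    team: string;
    needsDatabase: boolean;
    visibility: 'public' | 'internal';
}

interface ScaffoldPlan {
    owner: string;
    files: string[];
}

// Files every generated service receives, taken from the template tree.
const BASE_FILES: readonly string[] = [
    'src/index.ts',
    'Dockerfile',
    '.github/workflows/ci-cd.yaml',
    'kubernetes/base/deployment.yaml',
    'kubernetes/base/service.yaml',
    'kubernetes/base/kustomization.yaml',
    'monitoring/alerts.yaml',
    'monitoring/dashboard.json',
    'docs/runbook.md',
    'README.md',
];

function planScaffold(answers: ServiceAnswers): ScaffoldPlan {
    // Assumption: service names follow DNS-label rules (lowercase, hyphens, max 63 chars).
    if (!/^[a-z]([a-z0-9-]{0,61}[a-z0-9])?$/.test(answers.name)) {
        throw new Error(`Invalid service name: ${answers.name}`);
    }
    const files = [...BASE_FILES];
    if (answers.needsDatabase) {
        files.push('src/services/database.ts'); // assumed file name
    }
    if (answers.visibility === 'public') {
        files.push('kubernetes/base/ingress.yaml'); // assumed file name
    }
    return {
        owner: answers.team,
        files: files.map((path) => `${answers.name}/${path}`),
    };
}

const plan = planScaffold({
    name: 'my-api',
    team: 'payments',
    needsDatabase: true,
    visibility: 'internal',
});
console.log(plan.files.join('\n'));
```

Keeping the plan separate from the file writes makes the template easy to test: the golden path's own CI can assert on the plan without touching the filesystem.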

Golden Path Principles

| Principle | Why |
| --- | --- |
| Opinionated defaults | Don't offer 5 database options. Pick one (PostgreSQL) and make it easy. |
| Overridable | Advanced teams can customize. But the defaults should cover 80% of cases. |
| Maintained | When the platform updates (new base image, new security policy), all services using the golden path get the update. |
| Documented | The template itself is documentation. Comments in manifests explain why each configuration exists. |
| Tested | The golden path is a product. It has tests, CI, and versioning. |
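The "Maintained" and "Tested" principles imply that the platform knows which template version each service was generated from. The sketch below shows one way to find services that have fallen behind. It assumes each service records a `templateVersion` in MAJOR.MINOR.PATCH form; the `ServiceRecord` shape and the function names are hypothetical.

```typescript
// Hypothetical sketch: find services generated from an older golden path version.
interface ServiceRecord {
    name: string;
    templateVersion: string; // assumed format: MAJOR.MINOR.PATCH
}

function parseVersion(version: string): [number, number, number] {
    const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
    if (!match) {
        throw new Error(`Not a MAJOR.MINOR.PATCH version: ${version}`);
    }
    return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Compare numerically, part by part, so that 1.9.0 sorts before 1.10.0.
function isOlder(candidate: string, reference: string): boolean {
    const a = parseVersion(candidate);
    const b = parseVersion(reference);
    for (let i = 0; i < 3; i++) {
        if (a[i] !== b[i]) {
            return a[i] < b[i];
        }
    }
    return false;
}

function findOutdated(services: ServiceRecord[], currentVersion: string): ServiceRecord[] {
    return services.filter((service) => isOlder(service.templateVersion, currentVersion));
}

const outdated = findOutdated(
    [
        { name: 'checkout-api', templateVersion: '1.10.0' },
        { name: 'email-worker', templateVersion: '1.9.0' },
    ],
    '1.10.0',
);
console.log(outdated.map((service) => service.name));
```

A report like this is only the detection half. How updates reach existing services (automated pull requests, shared base images, or manual migration) is a separate design decision.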

What Makes a Good vs Bad Golden Path

GoodBad
Creates a working service in 15 minutesCreates a skeleton that needs 2 days of configuration
Includes CI/CD, monitoring, deploymentOnly includes project structure
Reflects current best practicesReflects the original author's preferences from 2 years ago
Updated when platform changesNever updated after creation
Has 1-2 options (TypeScript API, Python worker)Has 15 options for every possible combination

Self-Service Infrastructure

Developers should not need to file a ticket to get a database, a Redis instance, or a new Kubernetes namespace. Self-service means they can provision what they need through a UI, CLI, or API.

// Platform CLI: provision a PostgreSQL database
// $ platform db create --name my-api-db --size small --env staging
interface DatabaseRequest {
    name: string;
    size: 'small' | 'medium' | 'large';  // Opinionated: 3 sizes, not arbitrary specs
    environment: 'dev' | 'staging' | 'production';
    team: string;
    backupRetention: number;  // Default: 7 days staging, 30 days production
}

// Behind the scenes: Terraform runs, creates RDS instance,
// stores credentials in Secrets Manager, creates K8s secret,
// adds monitoring dashboard, notifies the team
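Before Terraform runs, the platform has to turn the CLI input into a complete request. The sketch below shows one way to apply the retention defaults mentioned in the comment above. `DatabaseRequestInput`, `resolveDatabaseRequest`, and the validation rules are assumptions for illustration, and the 7-day default for dev is a guess, since only staging and production defaults are stated.

```typescript
// Hypothetical sketch: fill in defaults and validate a database request.
type Environment = 'dev' | 'staging' | 'production';
type DatabaseSize = 'small' | 'medium' | 'large';

interface DatabaseRequest {
    name: string;
    size: DatabaseSize;
    environment: Environment;
    team: string;
    backupRetention: number; // days
}

// On input, backupRetention is optional; the platform supplies the default.
type DatabaseRequestInput = Omit<DatabaseRequest, 'backupRetention'> & {
    backupRetention?: number;
};

const DEFAULT_RETENTION_DAYS: Record<Environment, number> = {
    dev: 7, // assumption: no dev default is stated in the article
    staging: 7,
    production: 30,
};

function resolveDatabaseRequest(input: DatabaseRequestInput): DatabaseRequest {
    const backupRetention =
        input.backupRetention ?? DEFAULT_RETENTION_DAYS[input.environment];
    if (!Number.isInteger(backupRetention) || backupRetention < 1) {
        throw new Error('backupRetention must be a positive whole number of days');
    }
    if (input.team.trim() === '') {
        throw new Error('team is required so the database has an owner');
    }
    return { ...input, backupRetention };
}

const resolved = resolveDatabaseRequest({
    name: 'my-api-db',
    size: 'small',
    environment: 'staging',
    team: 'payments',
});
console.log(resolved.backupRetention); // 7
```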

What to Self-Service

| Resource | Self-Service? | Why / Why Not |
| --- | --- | --- |
| Database (PostgreSQL) | Yes | Standard resource, opinionated sizes |
| Redis cache | Yes | Standard resource |
| Kubernetes namespace | Yes | Low risk, team-scoped |
| S3 bucket | Yes | Standard resource |
| Domain / DNS record | Yes (with approval) | Low risk but needs naming governance |
| IAM roles / permissions | No (request-based) | Security risk, needs review |
| VPC / networking changes | No (platform team) | Cluster-wide impact |
| New cloud account | No (platform team) | Cost and security governance |

The boundary: self-service for resources scoped to a team. Request-based for resources that affect the whole organization.
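The table above can be encoded as data so the CLI, the portal, and any API enforce the same boundary. This is a sketch under assumed names: the `ResourceKind` identifiers and the `ProvisioningMode` labels are hypothetical, and the mapping simply restates the table.

```typescript
// Hypothetical sketch: the self-service boundary as a lookup table.
type ProvisioningMode =
    | 'self-service'
    | 'self-service-with-approval'
    | 'request'
    | 'platform-team';

type ResourceKind =
    | 'postgres'
    | 'redis'
    | 'k8s-namespace'
    | 's3-bucket'
    | 'dns-record'
    | 'iam-role'
    | 'vpc-change'
    | 'cloud-account';

// Record<ResourceKind, ...> makes the compiler reject a new resource kind
// that has no policy entry.
const PROVISIONING_POLICY: Record<ResourceKind, ProvisioningMode> = {
    'postgres': 'self-service',
    'redis': 'self-service',
    'k8s-namespace': 'self-service',
    's3-bucket': 'self-service',
    'dns-record': 'self-service-with-approval',
    'iam-role': 'request',
    'vpc-change': 'platform-team',
    'cloud-account': 'platform-team',
};

function provisioningMode(kind: ResourceKind): ProvisioningMode {
    return PROVISIONING_POLICY[kind];
}

// True only when a developer can get the resource with no human in the loop.
function canProvisionImmediately(kind: ResourceKind): boolean {
    return provisioningMode(kind) === 'self-service';
}

console.log(canProvisionImmediately('redis'));    // true
console.log(canProvisionImmediately('iam-role')); // false
```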

Internal Developer Portals

An internal developer portal is a catalog of all services, their owners, their APIs, their runbooks, and their health status. Backstage (by Spotify) is the most well-known, but it's not the only option.

What a Portal Should Show

| View | Content | Who Uses It |
| --- | --- | --- |
| Service catalog | All services, owners, tech stack, links to repos | Everyone |
| API documentation | OpenAPI/GraphQL specs, auto-generated | Frontend teams, partners |
| Runbooks | Operational procedures per service | On-call engineers |
| Dependencies | Who depends on what | Architecture reviews |
| Health status | Current status, recent incidents | Ops, management |
| Cost | Monthly cost per service/team | Finance, management |
| Golden paths | Templates for new services | Developers |

Backstage vs Alternatives

| Option | Effort | Best For |
| --- | --- | --- |
| Backstage | High (6+ weeks to set up, ongoing maintenance) | Large orgs (100+ engineers), dedicated platform team |
| Custom portal (Next.js + API) | Medium (2-4 weeks MVP) | Mid-size teams, specific needs |
| Enhanced README + wiki | Low (days) | Small teams (< 30 engineers) |
| Notion/Confluence | Low | Non-technical stakeholders need access |

For teams under 30 engineers, Backstage is overkill. A well-organized Git repository with README files, a shared Notion wiki, and a simple service catalog spreadsheet covers 80% of the need.

For teams over 50 engineers, the catalog problem becomes real. Services get created and forgotten. Owners leave and nobody knows who maintains what. A portal with ownership tracking and health dashboards becomes essential.
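Ownership tracking does not require a full portal to start. The sketch below flags catalog entries that look forgotten: no owner, an owner team that no longer exists, or no recent deploy. The `CatalogEntry` shape, the function name, and the 180-day staleness threshold are assumptions; the same check works on a spreadsheet export.

```typescript
// Hypothetical sketch: flag services that appear to be orphaned or forgotten.
interface CatalogEntry {
    service: string;
    ownerTeam: string | null;
    lastDeployedAt: Date | null;
}

interface Finding {
    service: string;
    reasons: string[];
}

const DAY_MS = 24 * 60 * 60 * 1000;

function findNeglectedServices(
    entries: CatalogEntry[],
    activeTeams: ReadonlySet<string>,
    now: Date,
    staleAfterDays = 180, // assumed threshold, tune per organization
): Finding[] {
    const findings: Finding[] = [];
    for (const entry of entries) {
        const reasons: string[] = [];
        if (entry.ownerTeam === null) {
            reasons.push('no owner recorded');
        } else if (!activeTeams.has(entry.ownerTeam)) {
            reasons.push(`owner team "${entry.ownerTeam}" no longer exists`);
        }
        if (entry.lastDeployedAt === null) {
            reasons.push('never deployed');
        } else {
            const ageDays = (now.getTime() - entry.lastDeployedAt.getTime()) / DAY_MS;
            if (ageDays > staleAfterDays) {
                reasons.push(`no deploy in over ${staleAfterDays} days`);
            }
        }
        if (reasons.length > 0) {
            findings.push({ service: entry.service, reasons });
        }
    }
    return findings;
}
```

Running a check like this on a schedule and posting the findings to the platform team's channel is one low-cost way to keep the catalog honest before investing in a portal.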

The Documentation Problem

Nobody reads your wiki. This is not a people problem. It's a location problem. Documentation that lives separately from the code it describes becomes stale within weeks.

Documentation That Works

| Type | Where It Lives | Why |
| --- | --- | --- |
| API documentation | Auto-generated from code (OpenAPI, GraphQL introspection) | Always current |
| Runbooks | In the service repo (docs/runbook.md) | Deployed with the code |
| Architecture decisions | ADR files in the repo (docs/adr/) | Version-controlled |
| Onboarding | In the golden path template | Every new service starts with it |
| Platform capabilities | Portal or platform CLI --help | Discoverable at point of need |

Architecture Decision Records (ADRs)

Every significant technical decision gets an ADR:

# ADR-003: Use PostgreSQL as the primary database

## Status: Accepted

## Context
We need a database for the new service. Options considered: PostgreSQL, MySQL, DynamoDB.

## Decision
PostgreSQL 15 via the platform's self-service database provisioning.

## Consequences
- Standard tooling (backups, monitoring, migrations) works out of the box
- Team doesn't need DynamoDB expertise
- Slightly higher latency than DynamoDB for key-value access patterns (acceptable)

ADRs prevent re-debating the same decisions. When a new team member asks "why PostgreSQL?", the answer is in the repo, not in someone's memory.

Measuring Developer Experience

If you're investing in a platform, measure whether it's actually helping:

| Metric | What It Measures | Good Target |
| --- | --- | --- |
| Time to first deploy | How long from "new service idea" to "running in staging" | < 30 minutes |
| Deployment frequency | How often teams deploy to production | Multiple times per day |
| Lead time for changes | Time from commit to production | < 1 hour |
| Change failure rate | Percentage of deployments causing incidents | < 5% |
| MTTR | Mean time to recover from incidents | < 30 minutes |
| Developer satisfaction | Survey score (quarterly) | > 4/5 |
| Support ticket volume | Platform-related requests per week | Decreasing trend |

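Deployment frequency, lead time for changes, change failure rate, and MTTR are the four DORA metrics. Time to first deploy, developer satisfaction, and support ticket volume are platform-specific. Track all of them. If time-to-first-deploy is improving but developer satisfaction is dropping, the platform is adding complexity without value.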
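Three of these metrics can be derived from deployment records alone. The sketch below computes deployment frequency, lead time for changes, and change failure rate over a reporting window. The `Deployment` shape and function names are assumptions, and using the median for lead time is a choice made here so that one slow release does not dominate the number. MTTR is left out because it needs incident data rather than deployment data.

```typescript
// Hypothetical sketch: derive delivery metrics from deployment records.
interface Deployment {
    committedAt: Date;
    deployedAt: Date;
    causedIncident: boolean;
}

interface DeliveryMetrics {
    deploysPerDay: number;
    medianLeadTimeMinutes: number; // NaN when there are no deployments
    changeFailureRate: number;     // fraction between 0 and 1
}

function median(values: number[]): number {
    if (values.length === 0) {
        return NaN;
    }
    const sorted = [...values].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 === 0
        ? (sorted[mid - 1] + sorted[mid]) / 2
        : sorted[mid];
}

function summarize(deployments: Deployment[], windowDays: number): DeliveryMetrics {
    if (windowDays <= 0) {
        throw new Error('windowDays must be positive');
    }
    const leadTimes = deployments.map(
        (d) => (d.deployedAt.getTime() - d.committedAt.getTime()) / 60000,
    );
    const failures = deployments.filter((d) => d.causedIncident).length;
    return {
        deploysPerDay: deployments.length / windowDays,
        medianLeadTimeMinutes: median(leadTimes),
        changeFailureRate: deployments.length === 0 ? 0 : failures / deployments.length,
    };
}
```

Comparing the output against the targets in the table (lead time under 60 minutes, failure rate under 0.05) gives a simple per-team scorecard that can be tracked quarter over quarter alongside the survey results.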

When Platform Engineering Is Overhead

Platform engineering is not free. A platform team costs 2-5 full-time engineers. Golden paths need maintenance. Self-service tools need development and support. A portal needs content.

Don't Build a Platform When

  • Your team is under 20 engineers (the overhead exceeds the benefit)
  • You have fewer than 5 services (not enough standardization opportunity)
  • You're a startup that might pivot (the platform will be wasted)
  • Your engineering process is already fast (if time-to-deploy is already 30 minutes, you don't need a platform team to improve it)

Do Build a Platform When

  • You have 50+ engineers deploying 10+ services
  • New service creation takes more than a day
  • Teams are reinventing the same infrastructure patterns
  • On-call is painful because every service has different monitoring
  • Developer satisfaction is low because of infrastructure friction

The minimum viable platform for a 30-50 person team:

  1. One golden path template (the most common service type)
  2. CI/CD pipeline template (shared, parameterized)
  3. Standard monitoring (Grafana dashboards auto-provisioned)
  4. Service catalog (even if it's just a spreadsheet)
  5. One-page platform guide ("how to create a new service")

That's it. No Backstage, no custom portal, no self-service infrastructure. Just templates and standards. Add complexity when the team outgrows the simple approach.

Common Pitfalls

  1. Building a platform before you have standardization. If every service uses a different language, framework, and deployment method, a platform can't help. Standardize first, then platform.

  2. Backstage before 50 engineers. Backstage is powerful but complex. For smaller teams, the setup and maintenance cost exceeds the benefit.

  3. Golden paths that are never updated. A template from 2 years ago with outdated dependencies and deprecated patterns does more harm than good. Maintain it like a product.

  4. Self-service everything. IAM roles and networking changes should not be self-service. The blast radius of a mistake is too large.

  5. Measuring activity, not outcomes. "We built 15 platform features" is not success. "Time-to-first-deploy dropped from 2 days to 30 minutes" is success.

  6. Ignoring developer satisfaction. A platform that forces developers into patterns they hate will be bypassed. Talk to your users (the developers) regularly.

  7. No documentation. A self-service platform without documentation is just a different kind of black box.

Key Takeaways

  • Platform engineering is not a tool, it's an approach. Build internal tools that make developers productive without requiring infrastructure expertise. The golden path is the core product.

  • Time to first deploy is the key metric. If a developer can go from idea to running service in 15 minutes, the platform is working. If it takes 2 days, it's not.

  • Start simple. One golden path, one CI/CD template, standard monitoring. Add complexity when the simple approach isn't enough. Most teams under 30 engineers don't need Backstage.

  • Documentation lives with the code. Auto-generated API docs, runbooks in the repo, ADRs for decisions. Wiki pages become stale. In-repo docs stay current.

  • Measure outcomes, not activity. DORA metrics plus developer satisfaction. If the numbers aren't improving, the platform investment isn't working.

  • The platform team is a product team. Their users are developers. Their product is the internal platform. They need feedback loops, user research, and iteration just like any product team.

We build internal platforms and developer tooling as part of our cloud services and consulting practice. If you need help with platform engineering strategy, talk to our team or request a quote. See also our methodology page for how we approach engineering culture.
