Kubernetes for SaaS: When It's Right, When ECS Wins, and What We Chose
Kubernetes vs ECS vs Lambda for SaaS platforms. Multi-tenant isolation, deployment strategies, networking, cost optimization, and the honest decision framework from running all three in production.
The Kubernetes Decision
Every engineering team eventually asks: should we use Kubernetes? The honest answer is: it depends on what you're running, how many services you have, and whether you can afford the operational overhead.
We run three different compute strategies in production. One platform runs on Kubernetes (7+ services, Pimcore, OpenSearch, workers). Another runs on ECS Fargate + Lambda (serverless-first, event-driven). A third uses a mix of both. Each was the right choice for its context.
This article covers the decision framework and the implementation patterns for each approach. For how we manage the infrastructure as code behind these deployments, see our IaC guide. For the application architectures that run on top, see our system architecture guide.
The Honest Comparison
| Criteria | Kubernetes (EKS/AKS/GKE) | ECS Fargate | Lambda |
|---|---|---|---|
| Operational complexity | High (cluster upgrades, networking, RBAC) | Medium (task definitions, service mesh) | Low (just deploy functions) |
| Cold start | None (pods are always running) | None (tasks are always running) | 100ms-5s (depends on runtime/package) |
| Scaling speed | Minutes (pod scheduling + node scaling) | Seconds (task launch) | Milliseconds (concurrent invocations) |
| Cost at idle | High (minimum 2-3 nodes running always) | Medium (pay per running task) | Zero (pay per invocation) |
| Cost at scale | Low (efficient packing, spot instances) | Medium (less efficient packing) | Can be high (per-invocation pricing) |
| Stateful workloads | Good (PVCs, StatefulSets) | Limited (EFS only) | Not supported |
| Long-running processes | Unlimited | Unlimited | 15 min max |
| Ecosystem | Enormous (Helm, operators, service mesh) | AWS-native | AWS-native |
| Multi-cloud | Yes (same manifests, different providers) | AWS only | AWS only |
| Team skill requirement | High (K8s expertise needed) | Medium (AWS knowledge) | Low (just write functions) |
| Best for | Complex multi-service systems, stateful workloads | Simple microservices, containers without K8s overhead | Event-driven, API endpoints, scheduled tasks |
The Real Cost Breakdown
For a typical SaaS platform with 5 services:
| Component | Kubernetes (EKS) | ECS Fargate | Lambda + API Gateway |
|---|---|---|---|
| Compute (monthly) | ~$600 (3 nodes t3.large + pods) | ~$450 (5 services, 0.5 vCPU each) | ~$50-500 (depends on traffic) |
| Control plane | $73/month (EKS fee) | Free | Free |
| Load balancer | $25/month (ALB) | $25/month (ALB) | Included in API GW |
| Networking (NAT) | $45/month | $45/month | $45/month |
| Monitoring | $50-200/month | $50-200/month | $50-200/month |
| Total (low traffic) | ~$800-1,000/month | ~$570-720/month | ~$200-800/month |
| Total (high traffic) | ~$1,500-3,000/month | ~$2,000-4,000/month | ~$3,000-10,000/month |
Kubernetes is cheapest at scale (efficient bin-packing, spot instances, reserved capacity). Lambda is cheapest at low traffic (pay nothing at idle). ECS Fargate is the middle ground.
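The crossover between these pricing models can be sketched with a toy break-even model. Every number below is an illustrative assumption (a fixed platform base cost, a marginal cost per million requests, a 512 MB / 1 s average Lambda invocation), not a vendor quote:

```python
# Toy break-even model for the cost table above.
# All numbers are illustrative assumptions, not vendor quotes.

def k8s_monthly_cost(requests: int) -> float:
    """Fixed base (control plane, minimum nodes, ALB, NAT) plus a small
    marginal cost per million requests thanks to efficient bin-packing."""
    base = 800.0                  # assumed fixed monthly floor
    marginal_per_million = 2.0    # assumed packed-compute cost per 1M requests
    return base + marginal_per_million * requests / 1_000_000

def lambda_monthly_cost(requests: int) -> float:
    """Near-zero base, but a comparatively high per-invocation cost.
    Assumes 512 MB and 1 s average duration => ~0.5 GB-s per request."""
    request_fee_per_million = 0.20
    compute_fee_per_million = 0.5 * 0.0000166667 * 1_000_000  # GB-s pricing
    per_million = request_fee_per_million + compute_fee_per_million
    return per_million * requests / 1_000_000

# Lambda wins at low traffic; Kubernetes wins once the fixed base
# is amortized over enough requests.
for monthly_requests in (1_000_000, 50_000_000, 500_000_000):
    k = k8s_monthly_cost(monthly_requests)
    l = lambda_monthly_cost(monthly_requests)
    print(f"{monthly_requests:>11,} req/mo  K8s ${k:,.0f}  Lambda ${l:,.0f}")
```

Under these particular assumptions the break-even lands somewhere above 100M requests/month; your own memory settings and average duration move it dramatically, so run the numbers with your real workload profile.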
When to Choose Kubernetes
Choose Kubernetes when you have:
Complex multi-service systems. If you're running 7+ services with interdependencies, shared configuration, service discovery, and coordinated deployments, Kubernetes orchestrates this well. Individual Docker containers on ECS become hard to manage at this scale.
Stateful workloads. Databases, search engines (OpenSearch, MeiliSearch), message brokers (RabbitMQ), and cache clusters (Redis) all benefit from Kubernetes StatefulSets, PersistentVolumeClaims, and operators. Running these on ECS requires external managed services for every stateful component.
Multi-cloud requirements. Kubernetes manifests work on any cloud provider. ECS and Lambda are AWS-only. If you need to run on AWS and Azure (or might need to in the future), Kubernetes is the portable choice.
A platform team. Kubernetes requires ongoing maintenance: cluster upgrades (every 3-4 months for security patches), node group management, networking configuration (ingress controllers, network policies), and RBAC management. Without a dedicated person or team handling this, the operational overhead will slow the entire engineering organization.
Kubernetes Architecture for a PIM/Commerce Platform
```
┌─────────────────────────────────────────────────────────────┐
│                     Kubernetes Cluster                      │
│                                                             │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │     Ingress     │ │  Cert-Manager   │ │  External DNS   │ │
│ │ (Nginx/Traefik) │ │ (Let's Encrypt) │ │ (Route53 sync)  │ │
│ └────────┬────────┘ └─────────────────┘ └─────────────────┘ │
│          │                                                  │
│ ┌────────▼──────────────────────────────────────────────┐   │
│ │                      Namespaces                       │   │
│ │                                                       │   │
│ │ ┌───────────────────────────────────────────────────┐ │   │
│ │ │ production namespace                              │ │   │
│ │ │                                                   │ │   │
│ │ │  pimcore-web     (2-4 replicas)                   │ │   │
│ │ │  pimcore-worker  (1-3 replicas)                   │ │   │
│ │ │  pimcore-ops     (1 replica, maintenance)         │ │   │
│ │ │  frontend        (2-3 replicas)                   │ │   │
│ │ └───────────────────────────────────────────────────┘ │   │
│ │                                                       │   │
│ │ ┌───────────────────────────────────────────────────┐ │   │
│ │ │ data namespace                                    │ │   │
│ │ │                                                   │ │   │
│ │ │  mysql      (StatefulSet, 1 replica or managed)   │ │   │
│ │ │  redis      (StatefulSet, 1 replica or managed)   │ │   │
│ │ │  opensearch (StatefulSet, 2-3 replicas)           │ │   │
│ │ │  rabbitmq   (StatefulSet, 1-3 replicas)           │ │   │
│ │ └───────────────────────────────────────────────────┘ │   │
│ │                                                       │   │
│ │ ┌───────────────────────────────────────────────────┐ │   │
│ │ │ flux-system namespace (GitOps controller)         │ │   │
│ │ └───────────────────────────────────────────────────┘ │   │
│ └───────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```
Deployment Strategy: GitOps with Flux
We use Flux for GitOps-based deployments. The Git repository is the single source of truth. Flux reconciles the cluster state with the repository every minute.
```yaml
# flux-system/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: platform
  namespace: flux-system
spec:
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: infrastructure
  path: ./kubernetes/resources/overlay/prod
  prune: true # Remove resources deleted from Git
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: pimcore
      namespace: production
```
Benefits over `kubectl apply` or CI-driven deployments:
- Drift detection and correction. If someone changes a resource manually, Flux reverts it within 1 minute.
- Git as audit trail. Every change is a Git commit with author, timestamp, and diff.
- No cluster credentials in CI. Flux pulls from Git. CI pushes to Git. The CI pipeline never needs kubectl access.
- Rollback is git revert. Revert the commit, Flux reconciles, rollback complete.
Kustomize for Environment Overlays
```
kubernetes/resources/
├── base/
│   ├── deployments/
│   │   ├── pimcore.yaml
│   │   ├── frontend.yaml
│   │   └── worker.yaml
│   ├── services/
│   ├── configmaps/
│   └── kustomization.yaml
└── overlay/
    ├── prod/
    │   ├── patches/
    │   │   ├── pimcore-replicas.yaml   # 4 replicas
    │   │   ├── resource-limits.yaml    # Higher CPU/memory
    │   │   └── env-secrets.yaml        # Production secrets
    │   └── kustomization.yaml
    ├── staging/
    │   ├── patches/
    │   │   ├── pimcore-replicas.yaml   # 1 replica
    │   │   └── resource-limits.yaml    # Lower limits
    │   └── kustomization.yaml
    └── dev/
        └── kustomization.yaml
```
Base manifests define the common structure. Overlays patch for environment-specific differences (replicas, resource limits, secrets, domains). Same application, different configuration per environment.
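As a minimal sketch, the prod overlay's kustomization.yaml pulls in the base and applies its patches. The patch file names follow the tree above; the comments describing what each patch does are illustrative assumptions:

```yaml
# overlay/prod/kustomization.yaml (sketch; file names follow the tree above)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - path: patches/pimcore-replicas.yaml   # bumps the Deployment to 4 replicas
  - path: patches/resource-limits.yaml    # raises CPU/memory requests and limits
  - path: patches/env-secrets.yaml        # production secret references
```

Running `kustomize build overlay/prod` (or letting Flux do it) renders the base manifests with the production patches applied.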
When ECS Fargate Wins
We chose ECS Fargate + Lambda for a commerce platform instead of Kubernetes. The reasons:
Simpler operations. No cluster upgrades, no node management, no RBAC configuration. ECS handles scheduling, scaling, and health checks. The team focuses on application code, not infrastructure.
Faster scaling. ECS Fargate launches new tasks in seconds. Kubernetes needs to schedule pods, potentially wait for node scaling (minutes), and pass health checks. For traffic spikes, Fargate responds faster.
Better cost for variable workloads. Pay per running task, not per node. If traffic drops to zero at night, costs drop proportionally. Kubernetes nodes keep running (and charging) regardless of load.
```typescript
// ECS service definition (via CDK)
const service = new ecs.FargateService(this, 'ApiService', {
  cluster,
  taskDefinition,
  desiredCount: 2,
  assignPublicIp: false,
  vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
  circuitBreaker: { rollback: true }, // Auto-rollback on deployment failure
  capacityProviderStrategies: [
    { capacityProvider: 'FARGATE_SPOT', weight: 2 }, // 66% spot
    { capacityProvider: 'FARGATE', weight: 1 },      // 33% on-demand
  ],
});

// Auto-scaling
const scaling = service.autoScaleTaskCount({ minCapacity: 2, maxCapacity: 10 });
scaling.scaleOnCpuUtilization('CpuScaling', { targetUtilizationPercent: 70 });
scaling.scaleOnRequestCount('RequestScaling', {
  targetGroup,
  requestsPerTarget: 1000,
});
```
Lambda for Event-Driven Workloads
Lambda functions handle event-driven workloads that don't justify a persistent service:
```typescript
// Lambda for webhook processing
const webhookHandler = new lambda.Function(this, 'WebhookHandler', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'webhook.handler',
  code: lambda.Code.fromAsset('lambda'), // directory containing webhook.js
  timeout: cdk.Duration.seconds(30),
  memorySize: 256,
  environment: {
    TABLE_NAME: table.tableName,
    QUEUE_URL: queue.queueUrl,
  },
});

// API Gateway triggers Lambda
const api = new apigateway.RestApi(this, 'WebhookApi');
api.root.addResource('webhook').addMethod('POST',
  new apigateway.LambdaIntegration(webhookHandler)
);
```
The Hybrid: ECS + Lambda
The architecture we use most often for commerce platforms:
| Component | Runs On | Why |
|---|---|---|
| Commerce API (Vendure) | ECS Fargate | Long-running, stateful sessions |
| Worker service | ECS Fargate | Persistent queue consumer |
| Webhook handlers | Lambda | Event-driven, sporadic traffic |
| Scheduled tasks | Lambda + EventBridge | Cron-like, no persistent process needed |
| Image processing | Lambda | CPU-intensive, parallelizable |
| Search indexing | Lambda + SQS | Event-driven, bursty |
| Admin dashboard | ECS Fargate or S3+CloudFront | Static assets or SSR |
The commerce API and workers run on Fargate (persistent, long-running). Everything event-driven runs on Lambda (pay-per-use, auto-scaling). The combination is cheaper than running everything on Fargate and simpler than running everything on Kubernetes.
Multi-Tenant Isolation on Kubernetes
If you run a multi-tenant SaaS on Kubernetes, tenant isolation needs explicit configuration:
Namespace Isolation
```yaml
# Network policy: pods in the tenant-a namespace can only talk to each other
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-isolation
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              tenant: tenant-a
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              tenant: tenant-a
    - to: # Allow DNS resolution
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - port: 53
          protocol: UDP
```
Resource Quotas
Prevent one tenant from consuming all cluster resources:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    requests.cpu: "4"        # Max 4 CPU cores
    requests.memory: "8Gi"   # Max 8GB RAM
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "20"               # Max 20 pods
    services: "10"
    persistentvolumeclaims: "5"
```
The Noisy Neighbor Problem
Even with resource quotas, one tenant's I/O-heavy workload can affect others on the same node. Solutions:
| Strategy | Isolation Level | Cost Impact |
|---|---|---|
| Shared nodes, resource quotas | Soft (CPU/memory limited, I/O shared) | Lowest |
| Node affinity (dedicated node pools) | Medium (dedicated nodes per tenant) | Higher |
| Dedicated clusters | Full (completely separate infrastructure) | Highest |
For most SaaS applications, shared nodes with resource quotas is sufficient. Reserve dedicated node pools for enterprise tenants with strict isolation requirements. For the application-level isolation patterns (API middleware, query filters, policies), see our multi-tenant design guide.
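For the dedicated-node-pool tier, node affinity plus a matching taint keeps an enterprise tenant's workload on its own nodes. A minimal sketch, where the `tenant-pool` label/taint, namespace, and image names are illustrative assumptions:

```yaml
# Sketch: pin an enterprise tenant's pods to a dedicated node pool.
# Assumes the pool's nodes are labeled and tainted with tenant-pool=enterprise-a.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: tenant-enterprise-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: tenant-pool
                    operator: In
                    values: ["enterprise-a"]
      tolerations: # pairs with a NoSchedule taint on the pool's nodes
        - key: tenant-pool
          value: enterprise-a
          effect: NoSchedule
      containers:
        - name: api
          image: example/api:latest
```

The affinity pulls the tenant's pods onto the dedicated nodes; the taint/toleration pair keeps everyone else's pods off them. Both halves are needed for real isolation.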
Cost Optimization
Spot Instances (Kubernetes)
Spot instances are 60-90% cheaper than on-demand. Use them for stateless workloads that can tolerate interruption:
```yaml
# EKS managed node group with spot instances
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production
  region: eu-central-1
managedNodeGroups:
  - name: spot-workers
    instanceTypes: ["t3.large", "t3.xlarge", "m5.large"]
    spot: true
    minSize: 2
    maxSize: 10
    desiredCapacity: 3
    labels:
      node-type: spot
  - name: on-demand-workers
    instanceTypes: ["t3.large"]
    minSize: 1
    maxSize: 3
    desiredCapacity: 1
    labels:
      node-type: on-demand
```
Run stateless services (web servers, workers) on spot. Run stateful services (databases, search engines) on on-demand. Use pod anti-affinity to spread replicas across nodes so a spot interruption doesn't take down all replicas.
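In pod-spec terms, that combination looks roughly like the fragment below (part of a Deployment's `spec.template`). The `app: web` label is an assumption; the `node-type: spot` label matches the node group config above:

```yaml
# Sketch: run a stateless deployment on spot nodes while spreading
# replicas across hosts so one interruption can't take them all down.
spec:
  template:
    spec:
      nodeSelector:
        node-type: spot # label from the spot node group above
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: web
                topologyKey: kubernetes.io/hostname
```

`preferred` (rather than `required`) anti-affinity lets the scheduler co-locate replicas if it has no other option, which is usually the right trade-off for web tiers.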
Right-Sizing
Most teams over-provision. A service requesting 1 CPU and 2GB RAM might actually use 0.2 CPU and 400MB. Over-provisioning wastes money. Under-provisioning causes OOM kills.
```bash
# Check actual resource usage vs requests
kubectl top pods -n production

# Compare with resource requests in deployment manifests
kubectl get pods -n production -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources.requests}{"\n"}{end}'
```
Use Vertical Pod Autoscaler (VPA) in recommendation mode to see what your pods actually need:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: pimcore-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pimcore
  updatePolicy:
    updateMode: "Off" # Recommendation only, don't auto-apply
```
Autoscaling
Horizontal Pod Autoscaler (HPA) scales based on metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pimcore-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pimcore
  minReplicas: 2
  maxReplicas: 8
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300 # Wait 5 min before scaling down
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60 # Remove max 1 pod per minute
```
The `stabilizationWindowSeconds` setting prevents flapping (scale up, scale down, scale up). The `scaleDown` policies prevent aggressive scale-down that might cause capacity issues during the next traffic spike.
The Networking Minefield
Kubernetes networking is where most teams get stuck.
Ingress Controllers
| Controller | Best For | Complexity |
|---|---|---|
| Nginx Ingress | General purpose, most common | Low |
| Traefik | Auto-discovery, Let's Encrypt built-in | Low |
| AWS ALB Ingress | AWS-native, WAF integration | Medium |
| Istio Gateway | Service mesh, mTLS, traffic management | High |
For most SaaS platforms, Nginx Ingress + cert-manager (Let's Encrypt) is sufficient. Add a service mesh (Istio, Linkerd) only if you need mTLS between services, advanced traffic routing (canary deployments, traffic splitting), or detailed service-to-service observability.
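A minimal sketch of that combination: an Ingress annotated for cert-manager, which issues and renews the Let's Encrypt certificate automatically. The issuer name, hostname, TLS secret name, and backend port are illustrative assumptions; `pimcore-web` matches the service named in the architecture diagram:

```yaml
# Sketch: Nginx Ingress with a cert-manager issued certificate.
# Assumes a ClusterIssuer named letsencrypt-prod already exists.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pimcore
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts: ["app.example.com"]
      secretName: app-example-com-tls # cert-manager stores the cert here
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pimcore-web
                port:
                  number: 80
```

With external-dns also running (as in the diagram), creating this Ingress is enough to get DNS, TLS, and routing for a new hostname.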
DNS Resolution Issues
A common production issue: pods can't resolve external hostnames because the DNS configuration is wrong.
```bash
# Find the correct DNS service IP in your cluster
kubectl get svc -n kube-system kube-dns -o jsonpath='{.spec.clusterIP}'

# If nginx configs reference a resolver, use this IP
# Common mistake: using 10.0.0.10 when the actual DNS is at 10.2.0.10
```
If your nginx sidecar proxies requests to external services (cloud storage, external APIs), the `resolver` directive must point to the cluster's kube-dns IP, not a hardcoded value.
Common Pitfalls
- Choosing Kubernetes because "everyone uses it." If you have 3 services and a small team, ECS Fargate is simpler and cheaper. Kubernetes makes sense at 7+ services with a platform team.
- No GitOps. `kubectl apply` from a developer laptop is not a deployment strategy. Use Flux or ArgoCD for reconciliation-based deployments.
- Shared cluster without resource quotas. One tenant or one runaway pod consumes all resources. Every namespace needs resource quotas.
- All pods on on-demand instances. Spot instances are 60-90% cheaper for stateless workloads. Use them for web servers and workers.
- Over-provisioning resources. Pods requesting 2 CPU and using 0.2 CPU waste money. Use VPA recommendations to right-size.
- Aggressive autoscaling. Scaling down too fast causes capacity issues on the next spike. Use stabilization windows and gradual scale-down policies.
- No network policies. Without them, any pod can talk to any other pod in the cluster. In a multi-tenant setup, this is a security issue.
- Ignoring cluster upgrades. Kubernetes versions go end-of-life every 12-15 months. Plan quarterly upgrade windows. Falling behind creates security vulnerabilities and blocks new features.
- Mixing stateful and stateless workloads on the same nodes. An OpenSearch pod and a web server pod competing for I/O on the same node degrade both. Use node affinity to separate them.
- No sealed secrets. Committing plain secrets to Git is a security breach waiting to happen. Use Sealed Secrets, External Secrets Operator, or AWS Secrets Manager.
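With the External Secrets Operator, for example, only a reference lives in Git and the operator materializes the actual Kubernetes Secret at runtime. A minimal sketch, where the store name, secret path, and key names are illustrative assumptions:

```yaml
# Sketch: External Secrets Operator pulling from AWS Secrets Manager,
# so Git holds a reference to the secret, never its value.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: pimcore-db
  namespace: production
spec:
  refreshInterval: 1h # re-sync from Secrets Manager hourly
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: pimcore-db # Kubernetes Secret created by the operator
  data:
    - secretKey: DATABASE_PASSWORD
      remoteRef:
        key: prod/pimcore/db
        property: password
```

This manifest is safe to commit: rotating the secret in Secrets Manager propagates to the cluster on the next refresh, with no Git change required.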
Key Takeaways
- Kubernetes for complex multi-service platforms. 7+ services, stateful workloads, multi-cloud requirements, and a team that can handle the operational overhead.
- ECS Fargate for simpler container workloads. Same containers, less operational complexity. Better for teams without Kubernetes expertise.
- Lambda for event-driven workloads. Webhooks, scheduled tasks, image processing, and any workload that's bursty and short-lived. Zero cost at idle.
- The hybrid (ECS + Lambda) is often the best answer. Persistent services on Fargate, event-driven work on Lambda. Cheaper than all-Kubernetes, simpler than all-Lambda.
- GitOps with Flux gives real reconciliation, not just deployment automation. Drift detection, audit trail, and rollback via `git revert`.
- Spot instances save 60-90% on stateless workloads. Run web servers and workers on spot. Run databases and search engines on on-demand.
- Multi-tenant isolation needs network policies and resource quotas. Namespace isolation alone is not enough. Enforce network boundaries and resource limits per tenant.
We deploy and manage Kubernetes, ECS, and Lambda infrastructure as part of our cloud services. If you need help choosing a compute strategy or optimizing your existing deployment, talk to our team or request a quote. See also our Pimcore upgrade guide for Kubernetes-specific deployment patterns with Pimcore.