Container Security in Production

What actually keeps container images safe in production: distroless base images, CI scanning, cosign signing, and Pod Security Standards on EKS.

It was a Tuesday afternoon. Our branded mobile app pipeline at the creator economy platform I worked at pushed a build image to ECR, and Trivy slammed it with a wall of critical CVEs. None of them were from code I’d written. The base image had picked up a fresh kernel-headers advisory overnight, plus some glibc thing, plus a couple of OpenSSL ones for good measure. The deploy queue paused. Twelve services behind it. Three engineers in the war room asking the same question. Do we ship or do we hold?

I’ve been on the wrong side of that question more than once. So this is the opinion up front, and I’ll defend it the rest of the way down: container security in production is a Dockerfile problem first, an admission controller problem second, and a runtime tool problem a distant third. If your image is fat, unsigned, and runs as root, your fancy runtime sensor is theater. Strip the image. Sign it. Refuse to admit anything unsigned. Then we can talk about the rest.

Starting with the base image

The single highest-leverage move is to stop shipping a full Linux distro to production. A node:20 or python:3.12 image gives you a package manager, a shell, curl, a bunch of CLI tools, and a fresh set of CVEs every week. Most of those packages don’t get touched at runtime. They just sit there waiting to fail a scan.

Multi-stage builds plus distroless solve most of it.

# syntax=docker/dockerfile:1.7

# Build stage. Full toolchain here, never shipped.
FROM node:20.17-bookworm@sha256:7148b0a09c8b8a5bea9c9f5f3f2e6b9d3a3b4a3c5d6e7f8a9b0c1d2e3f4a5b6c AS build
WORKDIR /app
COPY package*.json ./
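RUN --mount=type=cache,target=/root/.npm \
    npm ci
COPY . .
# Build with devDependencies available, then prune them so only runtime deps reach the final stage.
RUN npm run build && npm prune --omit=dev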

# Final stage. Distroless, non-root, no shell.
FROM gcr.io/distroless/nodejs20-debian12:nonroot@sha256:fa3a3b0c2e1f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b
WORKDIR /app
COPY --from=build --chown=nonroot:nonroot /app/dist ./dist
COPY --from=build --chown=nonroot:nonroot /app/node_modules ./node_modules
USER nonroot
EXPOSE 3000
CMD ["dist/server.js"]

Two things in there worth pointing at. First, the @sha256: digest pin on both stages. Tags are mutable, digests aren’t. If you only pin node:20, you’ll get a different base image next Tuesday and a different vuln set with it. Second, the final stage is distroless/nodejs20-debian12:nonroot. No apt, no shell, no curl. An attacker who pops your app process can’t drop into a shell because there isn’t one to drop into. They can still do plenty of damage, but the easy paths are closed.
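
Digest pinning is easy to enforce mechanically. Here's a minimal sketch of a CI check that flags any FROM line still riding on a mutable tag. The function name unpinned_bases and the regexes are mine, not from any existing tool, and it assumes plain FROM lines (a FROM ${BASE} driven by a build arg will simply show up as unpinned).

```python
import re

# Matches "FROM [--platform=...] <image> [AS <name>]", case-insensitively.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--\S+\s+)*(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE)
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_bases(dockerfile_text: str) -> list[str]:
    """Return base image references that are not pinned by sha256 digest."""
    stages: set[str] = set()
    unpinned: list[str] = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image, alias = match.group(1), match.group(2)
        # "FROM build" points at an earlier stage and "scratch" is empty,
        # so neither of them needs a digest.
        is_stage = image.lower() in stages
        if not is_stage and image != "scratch" and not DIGEST_RE.search(image):
            unpinned.append(image)
        if alias:
            stages.add(alias.lower())
    return unpinned
```

Run it over the Dockerfile in the pipeline and fail the job if the returned list is non-empty.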

Scanning images in CI

Scan in the pipeline, fail the build, produce an SBOM. The SBOM matters because when a CVE drops two months from now, you want to grep your registry’s attestations and know in five minutes whether you’re exposed.

name: build-and-scan

on:
  push:
    branches: [main]

permissions:
  contents: read
  id-token: write
  packages: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.ECR_PUSH_ROLE }}
          aws-region: us-east-1

      - id: ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build image
        run: |
          docker build \
            --build-arg COMMIT_SHA=${{ github.sha }} \
            -t $ECR/app:${{ github.sha }} .
        env:
          # Registry host only; the repository name is appended in the command.
          ECR: ${{ steps.ecr.outputs.registry }}

      - name: Trivy scan
        uses: aquasecurity/trivy-action@<version>  # pin to a released version or commit SHA
        with:
          image-ref: ${{ steps.ecr.outputs.registry }}/app:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: '1'
          ignore-unfixed: false
          vuln-type: os,library

      - name: Generate SBOM (SPDX)
        uses: anchore/sbom-action@v0
        with:
          image: ${{ steps.ecr.outputs.registry }}/app:${{ github.sha }}
          format: spdx-json
          artifact-name: app-${{ github.sha }}.spdx.json
          # Also write the SBOM into the workspace so the signing step can attest it.
          output-file: app-${{ github.sha }}.spdx.json

      - name: Push image
        run: docker push $ECR/app:${{ github.sha }}
        env:
          ECR: ${{ steps.ecr.outputs.registry }}

Notice ignore-unfixed: false. The default in a lot of templates is to silently skip CVEs that don’t have a fixed version yet. That’s a comforting lie. An unfixed CVE is still an unfixed CVE. Either pick a base image that doesn’t carry it, or accept it explicitly with a dated allowlist that an SRE has to renew. Don’t hide it.
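
For the "dated allowlist" idea, here's a rough sketch of what the gate could look like. It assumes you run Trivy with JSON output and let a small script decide pass or fail. The Results / Vulnerabilities / VulnerabilityID / Severity keys follow Trivy's JSON report as I understand it; the allowlist shape (CVE ID mapped to an expiry date) and the name blocking_findings are made up for illustration.

```python
from datetime import date


def blocking_findings(report: dict, allowlist: dict[str, date], today: date) -> list[str]:
    """Return IDs of CRITICAL/HIGH findings with no unexpired allowlist entry."""
    blocking: set[str] = set()
    for result in report.get("Results", []):
        # Trivy omits "Vulnerabilities" (or sets it to null) for clean targets.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") not in {"CRITICAL", "HIGH"}:
                continue
            vuln_id = vuln["VulnerabilityID"]
            expires = allowlist.get(vuln_id)
            # An entry counts through its expiry date; after that the
            # finding blocks the build again until someone renews it.
            if expires is None or expires < today:
                blocking.add(vuln_id)
    return sorted(blocking)
```

The point of the expiry date is that an accepted CVE comes back and fails the build on its own. Nobody has to remember to revisit it.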

Signing and provenance

OK so here’s where most teams stop. They scan, they pass CI, they push to ECR. And then anything with the right IAM role can shove an image into that registry, and the cluster will happily pull it. Signing closes that loop. Cosign with keyless OIDC is the lowest-friction option: there are no keys to manage, and the signature is bound to your GitHub Actions workflow identity.

      - uses: sigstore/cosign-installer@v3

      - name: Sign image
        env:
          IMAGE_TAG: ${{ steps.ecr.outputs.registry }}/app:${{ github.sha }}
        run: |
          # Must run after the push: RepoDigests stays empty until the registry has assigned a digest.
          IMAGE="$(docker inspect --format='{{index .RepoDigests 0}}' "$IMAGE_TAG")"
          cosign sign --yes "$IMAGE"
          cosign attest --yes --predicate app-${{ github.sha }}.spdx.json \
            --type spdxjson "$IMAGE"

The signature is made against the image digest, not the tag. Same reason as before: tags lie. The attest call binds the SBOM to the image as an in-toto attestation, which means later you can ask the registry “what was inside this image” without rebuilding it.
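
To make the "what was inside this image" question concrete, here's a small sketch of the lookup side. It assumes you've already fetched the attestation (for example with cosign verify-attestation) and have its JSON envelope, with the in-toto statement base64-encoded under payload and the SPDX document under predicate. That layout is my understanding of cosign's output, so check it against your version. The function names are illustrative.

```python
import base64
import json


def sbom_from_envelope(envelope: dict) -> dict:
    """Unwrap the SPDX document from an attestation envelope."""
    statement = json.loads(base64.b64decode(envelope["payload"]))
    return statement["predicate"]


def find_package(sbom: dict, name: str) -> list[str]:
    """Return the versions of `name` recorded in an SPDX 2.x JSON SBOM."""
    return sorted(
        pkg.get("versionInfo", "unknown")
        for pkg in sbom.get("packages", [])
        if pkg.get("name") == name
    )
```

When the next OpenSSL advisory drops, find_package(sbom, "openssl") per image digest tells you which deployments carry an affected version, with no rebuild and no guessing.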

Verification happens at admission time. We used Kyverno on EKS because it was already in the cluster for other policies, but Sigstore Policy Controller works just as well.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-app-images
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: check-signature
      match:
        any:
          - resources:
              kinds: [Pod]
              namespaces: [prod, staging]
      verifyImages:
        - imageReferences:
            - "1234.dkr.ecr.us-east-1.amazonaws.com/app*"
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/myorg/app/.github/workflows/release.yml@refs/heads/main"
                    issuer: "https://token.actions.githubusercontent.com"

The subject pins it to a specific workflow file on a specific branch. An image signed from a fork, a different repo, or another branch carries a different identity in its certificate, so it doesn’t get admitted, no matter how valid the signature looks otherwise.
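
If the identity check feels abstract, this is roughly the comparison the policy boils down to. It's a deliberately simplified sketch, not Kyverno's implementation: the real thing also verifies the certificate chain and the transparency log entry, and supports wildcards in the subject. The helper names are hypothetical. The subject format is the one used in the policy above.

```python
GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def workflow_subject(org: str, repo: str, workflow_file: str, ref: str) -> str:
    """Build the identity string a GitHub Actions keyless signature carries."""
    return f"https://github.com/{org}/{repo}/.github/workflows/{workflow_file}@{ref}"


def identity_matches(
    cert_subject: str,
    cert_issuer: str,
    policy_subject: str,
    policy_issuer: str = GITHUB_ISSUER,
) -> bool:
    """Exact-match check: both the issuer and the workflow identity must agree."""
    return cert_issuer == policy_issuer and cert_subject == policy_subject
```

A signature from myorg/app on refs/heads/main matches. The same workflow file in someone else's fork, or on a feature branch, produces a different subject and fails the comparison.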

Runtime hardening on Kubernetes

Image is clean, signed, admitted. Now make sure the running pod can’t do anything it shouldn’t. Pod Security Standards restricted plus an explicit securityContext covers most of it.

apiVersion: v1
kind: Namespace
metadata:
  name: prod
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: prod
spec:
  replicas: 6
  selector:
    matchLabels: { app: app }
  template:
    metadata:
      labels: { app: app }
    spec:
      automountServiceAccountToken: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 65532
        runAsGroup: 65532
        fsGroup: 65532
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: 1234.dkr.ecr.us-east-1.amazonaws.com/app@sha256:fa3a3b0c2e1f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b
          imagePullPolicy: IfNotPresent
          ports: [{ containerPort: 3000 }]
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          resources:
            requests: { cpu: 250m, memory: 256Mi }
            limits: { cpu: 1, memory: 512Mi }
          volumeMounts:
            - { name: tmp, mountPath: /tmp }
      volumes:
        - name: tmp
          emptyDir: { medium: Memory, sizeLimit: 64Mi }

Drop all caps. Read-only root FS, with an emptyDir for /tmp because every Node runtime wants somewhere to write. automountServiceAccountToken: false because most app pods don’t need to talk to the Kubernetes API and the default behavior of mounting a token is a footgun. The image is pinned by digest, same as in the Dockerfile.
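
The same checklist can be linted before a manifest ever reaches the cluster. Below is a minimal sketch that walks a pod spec (already parsed into a dict, for example from the Deployment's spec.template.spec) and reports which of the settings above are missing. The function name hardening_gaps and the messages are mine. It's a pre-flight check for this article's list, not a replacement for Pod Security admission.

```python
def hardening_gaps(pod_spec: dict) -> list[str]:
    """List the hardening settings from the checklist that a pod spec lacks."""
    gaps: list[str] = []
    if pod_spec.get("automountServiceAccountToken") is not False:
        gaps.append("pod: automountServiceAccountToken is not false")

    pod_ctx = pod_spec.get("securityContext") or {}
    for container in pod_spec.get("containers", []):
        name = container.get("name", "?")
        ctx = container.get("securityContext") or {}
        # runAsNonRoot may be set at pod level and inherited by the container.
        if ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot")) is not True:
            gaps.append(f"{name}: runAsNonRoot is not true")
        if ctx.get("allowPrivilegeEscalation") is not False:
            gaps.append(f"{name}: allowPrivilegeEscalation is not false")
        if ctx.get("readOnlyRootFilesystem") is not True:
            gaps.append(f"{name}: root filesystem is writable")
        dropped = {cap.upper() for cap in (ctx.get("capabilities") or {}).get("drop", [])}
        if "ALL" not in dropped:
            gaps.append(f"{name}: capabilities do not drop ALL")
        if "@sha256:" not in container.get("image", ""):
            gaps.append(f"{name}: image is not pinned by digest")
    return gaps
```

An empty list means the spec matches the checklist. Anything else is a line-by-line to-do for the PR author.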

The CVE wall, and the wrong fix

Back to that Tuesday. The first reflex from the room was the wrong one. Someone suggested adding --ignore-unfixed to the Trivy step and getting the deploy through. I get it: the queue was stuck, and native mobile submissions had cutoff windows on the App Store side. But quieting the scanner doesn’t fix the image. It just promotes those CVEs from “loud problem in CI” to “silent problem in prod”.

Real fix was a four-hour rebuild of the base. Moved the runtime stage to distroless, dropped the apt cache and the build toolchain, pinned every layer by digest. The scan after that came back almost empty, just one informational. Total deploy slip was about an afternoon. Lesson the team kept: the moment you start arguing with the scanner about whether a CVE “really matters”, you’ve lost the plot. Make the image small enough that the scanner has nothing to argue about.

When signing saved us

When prod is breaking and you’re trying to roll back, “what’s actually deployed” needs to be a question with a one-line answer. Signed images and immutable digests give you that. Tag-chasing doesn’t.

We had a near-miss on exactly this. An old image tag, never deleted from ECR, got accidentally referenced in a Helm values override during a hotfix. The admission policy rejected it. The deploy failed loud, the engineer rebuilt with the right digest, twenty minutes lost. Without the admission check that would’ve been an old image, running in prod, with whoever-knows-what inside it.

Takeaways

Multi-stage Dockerfiles. Distroless final stage. Pin every base by digest, never by tag.
Trivy in CI, fail on critical and high, do not ignore-unfixed away the problem. Produce an SBOM per image.
Sign with cosign keyless. Verify on admission, pinned to a specific workflow identity.
Pod Security Standards restricted, non-root, read-only root FS, drop all caps, no auto-mounted service tokens.
If your security story relies on a runtime sensor catching a bad image at runtime, you’ve already lost. Catch it in CI, refuse it at admission.

Thanks for reading. If you’ve got thoughts, send them my way.