API Gateway Patterns Compared

Kong, Envoy, and a custom BFF gateway, weighed against a real build-vs-buy call where rate limiting, JWT validation, and response aggregation drove the decision.

It was a Wednesday afternoon at the combat-sports tournament platform I CTO’d in London. We had hundreds of microservices behind one ELB and a sad little nginx tier that was doing six jobs badly. Auth was bolted on. Rate limits were per-service. Aggregating a single mobile screen meant the app calling six endpoints and stitching the JSON itself. The mobile lead pinged me at 3pm. “Can you just give me one endpoint that returns the whole match card.” Yeah. That’s the day the gateway conversation actually started.

I’ll lay out the argument up front, then unpack. Kong is fine if your gateway concern is mostly plugin-driven and you don’t want to write code. Envoy is the right choice when you actually need a data plane, sidecars, or fine-grained traffic policy. A custom BFF gateway is the right choice when your domain is the gateway, meaning aggregation, per-client shaping, and weird auth flows nobody else has. The mistake I see most often is teams reaching for Kong because it’s the default Helm chart, then ending up writing custom Lua plugins to do what a NestJS BFF would have done in 200 lines.

What the gateway actually does

Forget the marketing. In production a gateway does five things and you should rank them in this order before you pick anything.

1. Terminate TLS and route.
2. Validate auth, usually a JWT or a session token.
3. Rate limit per principal, not per IP.
4. Aggregate or shape responses for specific clients.
5. Provide observability hooks: request IDs, tracing headers, structured logs.

Kong does 1 through 3 with config. Envoy does 1 through 3 and 5 with config plus xDS. Neither does 4 well. Aggregation is product-specific and there’s no plugin that knows what your mobile home screen should look like.
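To make "per principal, not per IP" concrete, here's a minimal sketch of how a rate-limit counter key might be derived. The GatewayRequest shape and the key format are my own placeholders, not anything Kong or Envoy defines. The one assumption is that the auth step has already verified the token and attached its claims to the request.

```typescript
// Hypothetical request shape: `claims` is whatever the auth step already verified.
interface GatewayRequest {
  ip: string;
  claims?: { iss: string; sub: string };
}

// Build the counter key from the authenticated principal.
// Fall back to the client IP only for anonymous traffic.
function rateLimitKey(req: GatewayRequest): string {
  if (req.claims) {
    return `principal:${req.claims.iss}:${req.claims.sub}`;
  }
  return `ip:${req.ip}`;
}

// Two users behind the same office NAT share an IP but not a budget.
const userA: GatewayRequest = {
  ip: "203.0.113.7",
  claims: { iss: "https://auth.example.com", sub: "user-1" },
};
const userB: GatewayRequest = {
  ip: "203.0.113.7",
  claims: { iss: "https://auth.example.com", sub: "user-2" },
};
const anonymous: GatewayRequest = { ip: "203.0.113.7" };

console.log(rateLimitKey(userA));
console.log(rateLimitKey(userB));
console.log(rateLimitKey(anonymous)); // ip:203.0.113.7
```

Keying on IP alone would throttle an entire office or mobile carrier as if it were one client, which is why the principal comes first.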

Where Kong fits

Kong shines when the gateway is a thin layer. JWT plugin, rate-limit plugin, ACL plugin, done. At a regulated CPG manufacturer’s internal IT org I worked at, we ran Kong in front of a small fleet of internal NestJS services. The team that owned the gateway was two people. Kong gave them a UI, declarative config, and almost zero code to maintain.

# kong.yaml - declarative config
_format_version: "3.0"
services:
  - name: standings
    url: http://standings-svc.default.svc.cluster.local:8080
    routes:
      - name: standings-public
        paths: ["/v1/standings"]
        methods: ["GET"]
    plugins:
      - name: jwt
        config:
          claims_to_verify: ["exp"]
          key_claim_name: "iss"
      - name: rate-limiting
        config:
          minute: 600
          policy: redis
          redis_host: redis.default.svc.cluster.local
          fault_tolerant: true

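The fault_tolerant flag matters. With it set to false, a Redis blip turns into 500s for your clients, because Kong refuses to proxy requests it can't count. I’ve watched that exact failure mode kill a launch window. The plugin's documented default is true, but set it explicitly so the behaviour is visible in config review. The trade-off: while Redis is down, nothing is being rate limited.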
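If the fail-open idea is new to you, here's a rough sketch of the decision in TypeScript. It is not Kong's code. CounterStore, MemoryStore, and checkLimit are hypothetical names, the store is synchronous to keep the example short, and window resets are left out entirely.

```typescript
// Hypothetical counter store; in the Kong setup above this role is played by Redis.
interface CounterStore {
  // Increments the counter for `key` and returns the new value.
  // May throw if the backing store is unreachable.
  increment(key: string): number;
}

type Decision = "allow" | "deny" | "error";

function checkLimit(
  store: CounterStore,
  key: string,
  limit: number,
  faultTolerant: boolean,
): Decision {
  try {
    return store.increment(key) <= limit ? "allow" : "deny";
  } catch {
    // Store unreachable: fail open (skip limiting) or fail closed (surface an error).
    return faultTolerant ? "allow" : "error";
  }
}

// In-memory stand-in for a healthy store.
class MemoryStore implements CounterStore {
  private counts: Record<string, number> = {};

  increment(key: string): number {
    const next = (this.counts[key] ?? 0) + 1;
    this.counts[key] = next;
    return next;
  }
}

// Stand-in for a store that is down.
const brokenStore: CounterStore = {
  increment(_key: string): number {
    throw new Error("store unavailable");
  },
};

const store = new MemoryStore();
console.log(checkLimit(store, "user-1", 600, true));        // allow
console.log(checkLimit(brokenStore, "user-1", 600, true));  // allow (limiting skipped)
console.log(checkLimit(brokenStore, "user-1", 600, false)); // error
```

The third call is the launch-window killer: every request becomes an error even though the services behind the gateway are perfectly healthy.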

Where Kong falls down is when you need response shaping. The transformer plugins are fine for header rewrites and tiny body edits. They are not a tool for assembling a mobile feed from six services. If you find yourself writing custom Lua, you’ve outgrown Kong and you don’t realize it yet.

Where Envoy earns its keep

Envoy is closer to a real proxy you can program. At the real-time trading and charting platform I architected, we ran Envoy at the edge in front of a WebSocket gateway tier, partly because we needed per-connection rate limits at the listener level and partly because xDS let us push config changes without restarting pods. Around peak market open we’d see concurrent connections in the millions, and any restart meant a reconnect storm.

# envoy.yaml - listener with JWT auth and rate limit
static_resources:
  listeners:
  - name: edge_listener
    address: { socket_address: { address: 0.0.0.0, port_value: 8443 } }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: edge
          route_config: { name: local, virtual_hosts: [ { name: be, domains: ["*"], routes: [ { match: { prefix: "/" }, route: { cluster: backend } } ] } ] }
          http_filters:
          - name: envoy.filters.http.jwt_authn
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
              providers:
                primary:
                  issuer: "https://auth.example.com"
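                  # http_uri requires a cluster; "auth_jwks" is a placeholder name that must be defined under static_resources.clusters
                  remote_jwks: { http_uri: { uri: "https://auth.example.com/.well-known/jwks.json", cluster: auth_jwks, timeout: 2s }, cache_duration: { seconds: 300 } }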
              rules: [ { match: { prefix: "/" }, requires: { provider_name: primary } } ]
          - name: envoy.filters.http.local_ratelimit
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
              stat_prefix: edge_rl
              token_bucket: { max_tokens: 200, tokens_per_fill: 200, fill_interval: 1s }
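              filter_enabled: { runtime_key: edge_rl_enabled, default_value: { numerator: 100, denominator: HUNDRED } }
              # without filter_enforced the limiter only records stats and never rejects; the runtime key name is a placeholder
              filter_enforced: { runtime_key: edge_rl_enforced, default_value: { numerator: 100, denominator: HUNDRED } }
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  # cluster definitions (for example "backend") are omitted from this excerpt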

Envoy will also handle a thing Kong won’t: per-route filter chains with conditional logic. That’s where you start paying for it, though. The config tax is real, and if your team is three engineers, Envoy is overkill. If your team owns a platform and ten product teams ship through it, Envoy pays for itself.
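One thing worth spelling out from the listener config is what the token_bucket numbers mean. Here's a rough model in TypeScript: the bucket starts full, each request takes one token, and every fill_interval adds tokens_per_fill, capped at max_tokens. Treat it as a sketch of the semantics as I understand them, not Envoy's implementation. The class and method names are made up, and the clock is passed in explicitly so the behaviour is easy to test.

```typescript
class TokenBucket {
  private readonly maxTokens: number;
  private readonly tokensPerFill: number;
  private readonly fillIntervalMs: number;
  private tokens: number;
  private lastFillMs: number;

  constructor(maxTokens: number, tokensPerFill: number, fillIntervalMs: number, nowMs: number) {
    this.maxTokens = maxTokens;
    this.tokensPerFill = tokensPerFill;
    this.fillIntervalMs = fillIntervalMs;
    this.tokens = maxTokens; // the bucket starts full
    this.lastFillMs = nowMs;
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryConsume(nowMs: number): boolean {
    // Apply every whole fill interval that has elapsed since the last refill.
    const fills = Math.floor((nowMs - this.lastFillMs) / this.fillIntervalMs);
    if (fills > 0) {
      this.tokens = Math.min(this.maxTokens, this.tokens + fills * this.tokensPerFill);
      this.lastFillMs += fills * this.fillIntervalMs;
    }
    if (this.tokens === 0) {
      return false;
    }
    this.tokens -= 1;
    return true;
  }
}

// Same shape as the config above, scaled down: 2 tokens, refilled by 2 every second.
const bucket = new TokenBucket(2, 2, 1000, 0);
console.log(bucket.tryConsume(0));    // true
console.log(bucket.tryConsume(10));   // true
console.log(bucket.tryConsume(20));   // false, bucket is empty
console.log(bucket.tryConsume(1000)); // true, one fill interval has passed
```

With max_tokens equal to tokens_per_fill, as in the listener config, the bucket allows a burst of up to 200 requests and then a steady 200 per second. Keep in mind that a local rate limit is counted per Envoy instance, not across the fleet.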

When you write the gateway yourself

This is the part most teams get wrong. Aggregation is not a gateway concern in the abstract. It’s a client concern. Your mobile app needs the match card assembled. Your web app needs something different. Your partner API needs a third shape. That’s the BFF pattern, and it lives in code.

// match-card.bff.ts - NestJS BFF endpoint
import { Controller, Get, Param, UseGuards } from '@nestjs/common';
import { JwtAuthGuard } from './auth/jwt.guard';
import { lastValueFrom, forkJoin, of, timeout, catchError } from 'rxjs';
import { HttpService } from '@nestjs/axios';

@Controller('mobile/v1/matches')
@UseGuards(JwtAuthGuard)
export class MatchCardController {
  constructor(private readonly http: HttpService) {}

  @Get(':id/card')
  async getCard(@Param('id') id: string) {
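    // Fan out to all five upstreams in parallel. Each call has its own timeout and
    // falls back to an empty payload, so one slow service cannot fail the whole card.
    const fanout$ = forkJoin({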
      match: this.http.get(`http://match-svc/matches/${id}`).pipe(timeout(400), catchError(() => of({ data: null }))),
      fighters: this.http.get(`http://fighter-svc/matches/${id}/fighters`).pipe(timeout(400), catchError(() => of({ data: [] }))),
      odds: this.http.get(`http://odds-svc/matches/${id}`).pipe(timeout(250), catchError(() => of({ data: null }))),
      rankings: this.http.get(`http://ranking-svc/matches/${id}`).pipe(timeout(300), catchError(() => of({ data: null }))),
      tickets: this.http.get(`http://ticket-svc/matches/${id}/availability`).pipe(timeout(400), catchError(() => of({ data: null }))),
    });

    const data = await lastValueFrom(fanout$);

    if (!data.match.data) {
      // match-svc failed or timed out: do NOT 500. Return the fighters we have and let the app degrade gracefully.
      return { id, status: 'partial', fighters: data.fighters.data };
    }

    return {
      id,
      match: data.match.data,
      fighters: data.fighters.data,
      odds: data.odds.data,
      rankings: data.rankings.data,
      tickets: data.tickets.data,
    };
  }
}

That’s under 40 lines and it does what no off-the-shelf gateway plugin will give you. Per-call timeouts. Per-call fallback. A single client-shaped response. The hard part isn’t the code, it’s deciding what counts as a degraded response versus a failed one. We learned that the painful way.
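One way to stop that decision from living in people's heads is to write the policy down as data. The sketch below is hypothetical and not what we shipped: classify, isMissing, and the list of required parts are names I've made up for illustration. The controller above treats a missing match as partial. Whether that should be "failed" instead is exactly the product call you need to make per client.

```typescript
type CardStatus = "ok" | "partial" | "failed";

// A part counts as missing if its upstream fell back to null or an empty list.
function isMissing(value: unknown): boolean {
  return value === null || value === undefined || (Array.isArray(value) && value.length === 0);
}

// `required` names the parts without which the response is useless to the client.
function classify(parts: Record<string, unknown>, required: string[]): CardStatus {
  if (required.some((key) => isMissing(parts[key]))) {
    return "failed";
  }
  const anyMissing = Object.keys(parts).some((key) => isMissing(parts[key]));
  return anyMissing ? "partial" : "ok";
}

// Assumed policy for the mobile card: it needs the fighters, everything else is optional.
const mobileRequired = ["fighters"];

console.log(classify({ match: { id: "m1" }, fighters: [{ name: "A" }], odds: null }, mobileRequired)); // partial
console.log(classify({ match: null, fighters: [{ name: "A" }], odds: 1.8 }, mobileRequired));          // partial
console.log(classify({ match: { id: "m1" }, fighters: [], odds: 1.8 }, mobileRequired));               // failed
```

A different client can pass a different required list against the same upstream data, which is the per-client part of the BFF argument.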

How I’d choose today

If you’re a small team, one product, mostly REST, and your auth is a standard JWT, run Kong. The config is honest and the operational surface is small. If you’re a platform team supporting many product teams with traffic policy needs, sidecars, or service mesh ambitions, run Envoy. If your gateway concerns are aggregation, per-client shaping, or weird business-specific auth flows, write the BFF. Don’t try to make a plugin do it.

Most real systems end up with at least two of these in production. We had Envoy at the edge for TLS, rate limiting, and JWT validation, with a NestJS BFF behind it for response shaping. Kong sat on the internal traffic between services. Three tools, three jobs. Nobody confused them.

Takeaways

Rank your gateway concerns: route, auth, rate limit, aggregate, observe. Pick tools per concern, not per vendor.
Kong is right for declarative, plugin-driven gateways with small ops surface. Wrong the moment you need response shaping.
Envoy earns its config tax when you have a platform team and real traffic policy needs.
Aggregation is a BFF concern. Keep it in code, per client, with per-call timeouts and graceful fallbacks.
Two gateways in production is normal. Don’t force one tool to do all five jobs.

Thanks for reading. If you’ve got thoughts, send them my way.