Kong, Envoy, and a custom BFF gateway, weighed against a real build-vs-buy call where rate limiting, JWT validation, and response aggregation drove the decision.
It was a Wednesday afternoon at the combat-sports tournament platform I CTO’d in London. We had hundreds of microservices behind one ELB and a sad little nginx tier that was doing six jobs badly. Auth was bolted on. Rate limits were per-service. Aggregating a single mobile screen meant the app calling six endpoints and stitching the JSON itself. The mobile lead pinged me at 3pm. “Can you just give me one endpoint that returns the whole match card.” Yeah. That’s the day the gateway conversation actually started.
I’ll lay out the argument up front, then unpack. Kong is fine if your gateway concern is mostly plugin-driven and you don’t want to write code. Envoy is the right choice when you actually need a data plane, sidecars, or fine-grained traffic policy. A custom BFF gateway is the right choice when your domain is the gateway, meaning aggregation, per-client shaping, and weird auth flows nobody else has. The mistake I see most often is teams reaching for Kong because it’s the default Helm chart, then ending up writing custom Lua plugins to do what a NestJS BFF would have done in 200 lines.
Forget the marketing. In production a gateway does five things and you should rank them in this order before you pick anything.
Kong does 1 through 3 with config. Envoy does 1 through 3 and 5 with config plus xDS. Neither does 4 well. Aggregation is product-specific and there’s no plugin that knows what your mobile home screen should look like.
Kong shines when the gateway is a thin layer. JWT plugin, rate-limit plugin, ACL plugin, done. At a regulated CPG manufacturer’s internal IT org I worked at, we ran Kong in front of a small fleet of internal NestJS services. The team that owned the gateway was two people. Kong gave them a UI, declarative config, and almost zero code to maintain.
# kong.yaml - declarative config
_format_version: "3.0"
services:
  - name: standings
    url: http://standings-svc.default.svc.cluster.local:8080
    routes:
      - name: standings-public
        paths: ["/v1/standings"]
        methods: ["GET"]
    plugins:
      - name: jwt
        config:
          claims_to_verify: ["exp"]
          key_claim_name: "iss"
      - name: rate-limiting
        config:
          minute: 600
          policy: redis
          redis_host: redis.default.svc.cluster.local
          fault_tolerant: true
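The fault_tolerant flag matters. With it set to false, a Redis blip makes the gateway fail requests instead of proxying them unthrottled. I’ve watched that exact failure mode kill a launch window. Kong’s documentation lists true as the default, but set it explicitly anyway so nobody flips it without noticing.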
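To make the trade-off concrete, here is a minimal TypeScript sketch of the fail-open versus fail-closed decision. It is not Kong’s implementation. `CounterStore`, `checkLimit`, `MemoryStore`, and `DownStore` are hypothetical names invented for illustration, and the counter ignores window expiry to stay short.

```typescript
// Hypothetical store interface; a real one would wrap a Redis client.
interface CounterStore {
  // Increments the counter for `key` and resolves to the new value.
  increment(key: string): Promise<number>;
}

type Decision = {
  allowed: boolean;
  reason: 'under_limit' | 'over_limit' | 'store_down_fail_open' | 'store_down_fail_closed';
};

async function checkLimit(
  store: CounterStore,
  key: string,
  limit: number,
  faultTolerant: boolean,
): Promise<Decision> {
  try {
    const count = await store.increment(key);
    return count <= limit
      ? { allowed: true, reason: 'under_limit' }
      : { allowed: false, reason: 'over_limit' };
  } catch {
    // The store is unreachable. Fail open lets traffic through unthrottled;
    // fail closed rejects every request until the store recovers.
    return faultTolerant
      ? { allowed: true, reason: 'store_down_fail_open' }
      : { allowed: false, reason: 'store_down_fail_closed' };
  }
}

// In-memory stand-in for a healthy store (no window expiry, illustration only).
class MemoryStore implements CounterStore {
  private counts = new Map<string, number>();
  async increment(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

// Stand-in for a store that is down.
class DownStore implements CounterStore {
  async increment(): Promise<number> {
    throw new Error('ECONNREFUSED');
  }
}
```

Failing open trades rate-limit enforcement for availability during the outage. Whether that is acceptable depends on what the limit protects.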
Where Kong falls down is when you need response shaping. The transformer plugins are fine for header rewrites and tiny body edits. They are not a tool for assembling a mobile feed from six services. If you find yourself writing custom Lua, you’ve outgrown Kong and you don’t realize it yet.
Envoy is closer to a real proxy you can program. At the real-time trading and charting platform I architected, we ran Envoy at the edge in front of a WebSocket gateway tier, partly because we needed per-connection rate limits at the listener level and partly because xDS let us push config changes without restarting pods. Around peak market open we’d see concurrent connections climb into the millions, and any restart meant a reconnect storm.
# envoy.yaml - listener with JWT auth and rate limit
# The "backend" and "auth_jwks" clusters are omitted here and must be defined under static_resources.clusters.
static_resources:
  listeners:
    - name: edge_listener
      address: { socket_address: { address: 0.0.0.0, port_value: 8443 } }
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: edge
                route_config: { name: local, virtual_hosts: [ { name: be, domains: ["*"], routes: [ { match: { prefix: "/" }, route: { cluster: backend } } ] } ] }
                http_filters:
                  - name: envoy.filters.http.jwt_authn
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
                      providers:
                        primary:
                          issuer: "https://auth.example.com"
                          # http_uri requires a cluster; "auth_jwks" is an assumed name for the JWKS host.
                          remote_jwks: { http_uri: { uri: "https://auth.example.com/.well-known/jwks.json", cluster: auth_jwks, timeout: 2s }, cache_duration: { seconds: 300 } }
                      rules: [ { match: { prefix: "/" }, requires: { provider_name: primary } } ]
                  - name: envoy.filters.http.local_ratelimit
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
                      stat_prefix: edge_rl
                      token_bucket: { max_tokens: 200, tokens_per_fill: 200, fill_interval: 1s }
                      filter_enabled: { runtime_key: edge_rl_enabled, default_value: { numerator: 100, denominator: HUNDRED } }
                      # Without filter_enforced the limit is evaluated but never applied.
                      filter_enforced: { runtime_key: edge_rl_enforced, default_value: { numerator: 100, denominator: HUNDRED } }
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
Envoy will also handle a thing Kong won’t: per-route filter chains with conditional logic. That’s where you start paying for it, though. The config tax is real, and if your team is three engineers, Envoy is overkill. If your team owns a platform and ten product teams ship through it, Envoy pays for itself.
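The local rate limit in the Envoy config is a token bucket, and the three numbers are easier to tune once you see how they interact. What follows is a rough TypeScript sketch of those semantics, not Envoy’s actual implementation. The `TokenBucket` class is hypothetical, and the caller passes time in explicitly so the behavior is deterministic.

```typescript
// A single token bucket: starts full, refills in whole intervals, never exceeds maxTokens.
class TokenBucket {
  private tokens: number;
  private lastFillMs: number;

  constructor(
    private readonly maxTokens: number,
    private readonly tokensPerFill: number,
    private readonly fillIntervalMs: number,
    nowMs: number,
  ) {
    this.tokens = maxTokens;
    this.lastFillMs = nowMs;
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryConsume(nowMs: number): boolean {
    // Credit every whole fill interval that has elapsed since the last refill.
    const fills = Math.floor((nowMs - this.lastFillMs) / this.fillIntervalMs);
    if (fills > 0) {
      this.tokens = Math.min(this.maxTokens, this.tokens + fills * this.tokensPerFill);
      this.lastFillMs += fills * this.fillIntervalMs;
    }
    if (this.tokens > 0) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With max_tokens and tokens_per_fill both at 200 and a one-second interval, the bucket allows a burst of 200 requests and then a sustained 200 per second. Raising max_tokens above tokens_per_fill would allow bigger bursts without raising the sustained rate.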
This is the part most teams get wrong. Aggregation is not a gateway concern in the abstract. It’s a client concern. Your mobile app needs the match card assembled. Your web app needs something different. Your partner API needs a third shape. That’s the BFF pattern, and it lives in code.
// match-card.bff.ts - NestJS BFF endpoint
import { Controller, Get, Param, UseGuards } from '@nestjs/common';
import { JwtAuthGuard } from './auth/jwt.guard';
import { lastValueFrom, forkJoin, of, timeout, catchError } from 'rxjs';
import { HttpService } from '@nestjs/axios';

@Controller('mobile/v1/matches')
@UseGuards(JwtAuthGuard)
export class MatchCardController {
  constructor(private readonly http: HttpService) {}

  @Get(':id/card')
  async getCard(@Param('id') id: string) {
    const fanout$ = forkJoin({
      match: this.http.get(`http://match-svc/matches/${id}`).pipe(timeout(400), catchError(() => of({ data: null }))),
      fighters: this.http.get(`http://fighter-svc/matches/${id}/fighters`).pipe(timeout(400), catchError(() => of({ data: [] }))),
      odds: this.http.get(`http://odds-svc/matches/${id}`).pipe(timeout(250), catchError(() => of({ data: null }))),
      rankings: this.http.get(`http://ranking-svc/matches/${id}`).pipe(timeout(300), catchError(() => of({ data: null }))),
      tickets: this.http.get(`http://ticket-svc/matches/${id}/availability`).pipe(timeout(400), catchError(() => of({ data: null }))),
    });
    const data = await lastValueFrom(fanout$);
    if (!data.match.data) {
      // upstream failed, do NOT 500. Return what we have. The app degrades gracefully.
      return { id, status: 'partial', fighters: data.fighters.data };
    }
    return {
      id,
      match: data.match.data,
      fighters: data.fighters.data,
      odds: data.odds.data,
      rankings: data.rankings.data,
      tickets: data.tickets.data,
    };
  }
}
That’s 30 lines and it does what no off-the-shelf gateway plugin will give you. Per-call timeouts. Per-call fallback. A single client-shaped response. The hard part isn’t the code, it’s deciding what counts as a degraded response versus a failed one. We learned that the painful way.
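One way to make the degraded-versus-failed decision explicit is to write it down as a policy rather than leave it scattered across handlers. This is a minimal sketch, assuming every failed call has already been mapped to null. The `REQUIRED` set and the `classify` function are hypothetical, and treating only the match service as required is an illustrative choice, not the rule we shipped.

```typescript
type Upstream = 'match' | 'fighters' | 'odds' | 'rankings' | 'tickets';
type Outcome = 'ok' | 'degraded' | 'failed';

// Assumption for illustration: only the match record is essential to the card.
const REQUIRED: ReadonlySet<Upstream> = new Set<Upstream>(['match']);

function classify(results: Record<Upstream, unknown>): { outcome: Outcome; missing: Upstream[] } {
  // An upstream counts as missing when its call was mapped to null or undefined.
  const missing = (Object.keys(results) as Upstream[]).filter(
    (name) => results[name] === null || results[name] === undefined,
  );
  if (missing.some((name) => REQUIRED.has(name))) {
    return { outcome: 'failed', missing };
  }
  return { outcome: missing.length > 0 ? 'degraded' : 'ok', missing };
}
```

A fallback like an empty array hides the difference between "the service failed" and "there is genuinely nothing to show", so a policy like this only works if failures map to a distinct sentinel.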
If you’re a small team, one product, mostly REST, and your auth is a standard JWT, run Kong. The config is honest and the operational surface is small. If you’re a platform team supporting many product teams with traffic policy needs, sidecars, or service mesh ambitions, run Envoy. If your gateway concerns are aggregation, per-client shaping, or weird business-specific auth flows, write the BFF. Don’t try to make a plugin do it.
Most real systems end up with more than one of these in production. We had Envoy at the edge for TLS, rate limiting, and JWT validation, with a NestJS BFF behind it for response shaping. Kong sat on the internal traffic between services. Three tools, three jobs. Nobody confused them.
Thanks for reading. If you’ve got thoughts, send them my way.