Why I run Zod end-to-end in NestJS instead of class-validator, how the OpenAPI document becomes the contract, and the bugs that taught me to treat schemas as the source of truth.
Wednesday afternoon. A creator on the platform I was working at had filed a support ticket that read, more or less, “all my customers got charged twice this month and the app shows two active subscriptions.” We pulled the logs. Apple’s SubscriptionRenewal server-to-server notification had been retried after our endpoint returned a 200 OK a second past the 30s deadline. Our handler had a typed DTO. It compiled. It looked clean. It had no idempotency check, because the type system never asked for one.
That’s the part of “type-safe API design” people skip. The types are the contract you ship to the frontend, sure, but they’re also the contract you ship to yourself. And if the types only describe the shape, not the invariants, the compiler is lying to you with a straight face.
This is how I run NestJS APIs now. Schema-first with Zod, OpenAPI generated from the same schemas, client SDKs generated from the OpenAPI document. One source of truth. Not three.
NestJS ships with class-validator and class-transformer. They’re fine. I used them for years across hundreds of microservices on the federation platform I CTO’d, plenty of internal services at the creator-video startup I led, and most of the NestJS pieces inside the creator-economy platform’s adjacent products.
But the moment you want the same schema to validate at runtime AND produce a TypeScript type AND drive your OpenAPI document AND generate a frontend SDK, class-validator starts fighting you. Decorators describe the validation. The type comes from the class shape. The OpenAPI metadata needs a third set of decorators. Then someone changes one and forgets the other two, and your “type-safe” API is type-safe in the sense that it compiles.
Zod gives you all four off a single schema definition. Inferred type, runtime validator, OpenAPI doc, SDK. I’d rather maintain one thing than three things pretending to be the same thing.
The core pattern is small. Schemas live in a contracts/ folder. Every controller binds a pipe that runs the schema at runtime. The inferred type flows through the handler.
```typescript
import { z } from 'zod';

export const CreateOrderInput = z.object({
  idempotencyKey: z.string().uuid(),
  customerId: z.string().uuid(),
  items: z.array(
    z.object({
      productId: z.string().uuid(),
      quantity: z.number().int().min(1).max(999),
    }),
  ).min(1),
  currency: z.enum(['USD', 'EUR', 'GBP', 'TRY']),
});
export type CreateOrderInput = z.infer<typeof CreateOrderInput>;

export const OrderResponse = z.object({
  id: z.string().uuid(),
  status: z.enum(['pending', 'paid', 'failed']),
  totalCents: z.number().int(),
  createdAt: z.string().datetime(),
});
export type OrderResponse = z.infer<typeof OrderResponse>;
```
That’s it. One file. The frontend imports the type. The backend imports the validator. The OpenAPI generator reads the schema. Nobody invents a fourth representation.
The validation pipe is a one-time write that you’ll never touch again. Mine looks like this.
```typescript
import {
  PipeTransform,
  Injectable,
  BadRequestException,
  ArgumentMetadata,
} from '@nestjs/common';
import { ZodSchema, ZodError } from 'zod';

@Injectable()
export class ZodValidationPipe implements PipeTransform {
  constructor(private readonly schema: ZodSchema) {}

  transform(value: unknown, _metadata: ArgumentMetadata) {
    try {
      return this.schema.parse(value);
    } catch (err) {
      if (err instanceof ZodError) {
        throw new BadRequestException({
          message: 'Validation failed',
          issues: err.issues.map((i) => ({
            path: i.path.join('.'),
            code: i.code,
            message: i.message,
          })),
        });
      }
      throw err;
    }
  }
}
```
The wiring in a controller is plain decorator code. No magic.
```typescript
import { Body, Controller, Post, UsePipes } from '@nestjs/common';
import { CommandBus } from '@nestjs/cqrs';
import { ZodValidationPipe } from '../pipes/zod-validation.pipe';
import { CreateOrderInput, OrderResponse } from '../contracts/order.contract';
import { PlaceOrderCommand } from './commands/place-order.command';

@Controller('orders')
export class OrdersController {
  constructor(private readonly commands: CommandBus) {}

  @Post()
  @UsePipes(new ZodValidationPipe(CreateOrderInput))
  async create(@Body() input: CreateOrderInput): Promise<OrderResponse> {
    return this.commands.execute(
      new PlaceOrderCommand(
        input.idempotencyKey,
        input.customerId,
        input.items,
        input.currency,
      ),
    );
  }
}
```
input is typed by inference from the schema, not by hand. If I add a field to the Zod object, the handler argument widens automatically and the SDK regenerates on the next CI run.
I treat the generated OpenAPI document as the contract. Not the docs we wrote. Not the Notion page. The JSON.
@anatine/zod-openapi and nestjs-zod-openapi both work. I use the former because it gives me direct control over the OpenAPI builder, and I want to commit the resulting openapi.json into the repo so PRs show schema diffs.
```typescript
import { extendApi, generateSchema } from '@anatine/zod-openapi';
import { OpenApiBuilder } from 'openapi3-ts/oas31';
import { CreateOrderInput, OrderResponse } from '../contracts/order.contract';

const builder = new OpenApiBuilder()
  .addInfo({ title: 'Orders API', version: '1.4.0' })
  .addSchema('CreateOrderInput', generateSchema(extendApi(CreateOrderInput)))
  .addSchema('OrderResponse', generateSchema(extendApi(OrderResponse)))
  .addPath('/orders', {
    post: {
      operationId: 'createOrder',
      requestBody: {
        required: true,
        content: { 'application/json': { schema: { $ref: '#/components/schemas/CreateOrderInput' } } },
      },
      responses: {
        '201': {
          description: 'Order created',
          content: { 'application/json': { schema: { $ref: '#/components/schemas/OrderResponse' } } },
        },
      },
    },
  });

export const openapiDocument = builder.getSpec();
```
CI writes openapi.json on every build. The frontend team’s client generator (we use openapi-typescript plus a thin TanStack Query wrapper) reads that file. If a PR changes a schema, the generated SDK changes, and the frontend’s typecheck fails. The contract break is visible at the diff, not at runtime.
That’s the part I want from “end-to-end type safety.” Not a fancy phrase. A failing CI job when someone removes a field.
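The "failing CI job when someone removes a field" can also be approximated directly on the JSON, before the SDK typecheck ever runs. What follows is a minimal sketch, not the setup described above: it assumes both the committed and the freshly generated documents keep their schemas under `components.schemas`, and `findRemovedFields`, `SpecLike`, and `SchemaLike` are hypothetical names rather than part of any library mentioned here.

```typescript
// Hypothetical CI helper: list properties that exist in the committed
// openapi.json but are missing from the freshly generated one.
type SchemaLike = { properties?: Record<string, unknown> };
type SpecLike = { components?: { schemas?: Record<string, SchemaLike> } };

function findRemovedFields(committed: SpecLike, generated: SpecLike): string[] {
  const before = committed.components?.schemas ?? {};
  const after = generated.components?.schemas ?? {};
  const removed: string[] = [];

  for (const [schemaName, schema] of Object.entries(before)) {
    const next = after[schemaName];
    if (next === undefined) {
      removed.push(schemaName); // the whole schema disappeared
      continue;
    }
    for (const field of Object.keys(schema.properties ?? {})) {
      if (!(field in (next.properties ?? {}))) {
        removed.push(`${schemaName}.${field}`);
      }
    }
  }
  return removed;
}

// Example: a PR drops `totalCents` from OrderResponse.
const committedSpec: SpecLike = {
  components: { schemas: { OrderResponse: { properties: { id: {}, status: {}, totalCents: {} } } } },
};
const generatedSpec: SpecLike = {
  components: { schemas: { OrderResponse: { properties: { id: {}, status: {} } } } },
};

const removedFields = findRemovedFields(committedSpec, generatedSpec);
if (removedFields.length > 0) {
  // A real CI script would also set a non-zero exit code here.
  console.error(`Breaking contract change: ${removedFields.join(', ')}`);
}
```

This only catches removals. Type changes and newly required fields still need the generated SDK's typecheck, which is why the two checks complement each other rather than replace one another.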
We once had a quieter version of this same problem on the branded-mobile-app pipeline at the creator platform I was at. Branded mobile apps, App Store submissions automated with Rails plus Python plus Fastlane on GitHub Actions. Hundreds of branded app releases a week. The pipeline had been in production for months and felt boring, in the good way.
On a Wednesday morning the pending_apple_review queue started backing up. By lunch, a few hundred customer app builds were stuck in “Waiting for Review” on App Store Connect, but our internal pipeline thought they were submitted successfully. The App Store Connect API was silently throttling our submit endpoint, returning 200 OK with a body that parsed cleanly against our DTO, but the submission was being dropped on their side.
First wrong fix: we had auto-retry on 5xx. We extended it to retry on “stuck” state too. That made it worse. Apple started seeing what looked like duplicate submissions, and a chunk of customers ended up with two competing review records and conflicting metadata. The retry treated 200 OK as truth, because the type said so.
Real fix: pulled the retry. Added a circuit breaker that verified submission state via a separate GET against the App Store Connect resource. Not via the response of the POST. Wrote a reconciliation job with an idempotency key derived from app_id + version + git_sha to dedupe.
The lesson is the same one I now bake into every NestJS schema. When the upstream is human-moderated, the response body is not the truth. The remote resource is. Your response DTO is a shape, not a fact. Three days of slipped releases for that lesson. Cheap, in retrospect.
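Two pieces of that fix are easy to sketch: a deterministic dedupe key, and a confirmation step that reads the remote resource instead of trusting the POST response. The original pipeline was Rails and Python, so this TypeScript version is an illustration under assumptions, not the code we shipped. `submissionKey`, `confirmSubmitted`, the `RemoteState` values, and the injected `fetchState` function are all hypothetical, and no real App Store Connect endpoint is modeled.

```typescript
import { createHash } from 'node:crypto';

// Deterministic dedupe key: the same app, version, and commit always
// hash to the same value, so a retried job can be recognized.
function submissionKey(appId: string, version: string, gitSha: string): string {
  return createHash('sha256').update(`${appId}:${version}:${gitSha}`).digest('hex');
}

// Hypothetical states the remote resource can report.
type RemoteState = 'missing' | 'waiting_for_review' | 'in_review';

// Confirm a submission by reading the remote resource, never by trusting
// the POST response. `fetchState` stands in for the separate GET call.
async function confirmSubmitted(
  fetchState: () => Promise<RemoteState>,
  attempts = 3,
  delayMs = 0,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    const state = await fetchState();
    if (state !== 'missing') return true;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // the caller should stop and escalate, not blindly resubmit
}
```

The key is what lets a reconciliation job tell "already submitted" apart from "never arrived" without creating a second review record.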
Three things, every NestJS service I ship.
Idempotency keys on every mutating endpoint, surfaced in the Zod schema as a required field. Not optional. Not “we’ll add it later”. If the contract requires an idempotency key, the type forces the caller to provide one. Forgetting it becomes a typecheck error, not a production incident.
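A required key in the schema only helps if the handler actually uses it. Here is a minimal sketch of the server side, assuming an in-memory map stands in for whatever durable store you run; `IdempotencyStore` and `runOnce` are hypothetical names, not a NestJS or Zod API.

```typescript
// Hypothetical in-memory store. Production code would use a durable store
// with a unique constraint on the key.
class IdempotencyStore<T> {
  private readonly results = new Map<string, Promise<T>>();

  // Run `work` at most once per key; retries receive the first call's result.
  runOnce(key: string, work: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing) return existing;

    const pending = work().catch((err) => {
      this.results.delete(key); // allow a retry after a failure
      throw err;
    });
    this.results.set(key, pending);
    return pending;
  }
}

// Usage: a retried request with the same key creates one order, not two.
let charges = 0;
const orderStore = new IdempotencyStore<{ orderId: string }>();
const placeOrder = async () => {
  charges += 1;
  return { orderId: `order-${charges}` };
};

const retryKey = '2f1c7e0a-8f0e-4a6b-9d2b-1c3e5a7b9d10';
const settled = Promise.all([
  orderStore.runOnce(retryKey, placeOrder),
  orderStore.runOnce(retryKey, placeOrder),
]);
```

Storing the promise rather than the resolved value means a retry that arrives while the first call is still in flight also gets deduplicated.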
Response schemas validated on the way out, not just on the way in. NestJS interceptors are made for this. One interceptor per controller method runs the response through OrderResponse.parse() and refuses to return drift. Costs a few microseconds. Catches the kind of silent shape regression that ate me alive on the branded-app pipeline.
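The outbound check can be sketched without any Nest wiring. This assumes only that the schema exposes a `parse` method that returns the value or throws, which Zod schemas do. `Parser`, `withResponseValidation`, and the hand-written `orderResponseParser` are hypothetical stand-ins so the snippet runs without dependencies; in a real service the same call would sit inside an interceptor.

```typescript
// Anything with a throwing `parse` method fits, including a Zod schema.
interface Parser<T> {
  parse(value: unknown): T;
}

// Wrap a handler so its result is validated before it leaves the service.
function withResponseValidation<A extends unknown[], T>(
  schema: Parser<T>,
  handler: (...args: A) => Promise<unknown>,
): (...args: A) => Promise<T> {
  return async (...args: A) => schema.parse(await handler(...args));
}

// Hand-rolled stand-in for part of OrderResponse, so the sketch has no imports.
type OrderStatus = 'pending' | 'paid' | 'failed';
const orderResponseParser: Parser<{ id: string; status: OrderStatus }> = {
  parse(value) {
    if (typeof value !== 'object' || value === null) {
      throw new Error('Response does not match OrderResponse');
    }
    const v = value as { id?: unknown; status?: unknown };
    const okStatus = v.status === 'pending' || v.status === 'paid' || v.status === 'failed';
    if (typeof v.id !== 'string' || !okStatus) {
      throw new Error('Response does not match OrderResponse');
    }
    return { id: v.id, status: v.status as OrderStatus };
  },
};

// A handler that has drifted: it returns a status the contract never listed.
const driftedHandler = async () => ({ id: 'o-1', status: 'refunded' });
const safeHandler = withResponseValidation(orderResponseParser, driftedHandler);
// Calling safeHandler() now rejects instead of returning the drifted shape.
```

Failing loudly on the server is the point: the caller gets an error the team can see, instead of a payload the generated SDK types claim is impossible.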
Schema versioning in the path. Not in headers, not negotiated, not magic. /v1/orders and /v2/orders are different schemas, different SDK exports, different deprecation timelines. The frontend team can pin to a version and move at their own pace.
Generate openapi.json in CI and commit it. Schema diffs show up in PR review.
Thanks for reading. If you’ve got thoughts, send them my way.