Async Request-Reply Pattern in Microservices

How 202 Accepted, webhooks, and WebSockets replaced a sync HTTP path that kept timing out under load on a real-time platform.

It was a Tuesday morning at the real-time trading platform I architected, and the export endpoint was timing out again. Not occasionally. About one in four calls. A backend lead pinged me on Slack with a screenshot of nginx returning 504s while the upstream service was still happily chewing on the request. The work was finishing. The HTTP connection just couldn’t wait that long. Classic case of the wrong shape of API for the workload, and I’d ack’d the original design six months earlier, so I owned the cleanup.

That cleanup is what this post is about. When a microservice operation takes longer than a polite HTTP round trip, you stop pretending it’s synchronous. You hand back a 202, give the caller a way to find out when it’s done, and you move on.

The shape of the problem

The naive version is everywhere. A client POSTs, the service does the work, returns 200 with the result. Fine for ~200ms operations. Falls apart the moment any step in the call graph crosses a load balancer timeout, a CDN timeout, a mobile carrier idle timeout, or a user’s patience.

You have three solid options when the work outlasts the request:

Return 202 Accepted and let the client poll a status endpoint.
Return 202 Accepted and call the client back via a webhook when done.
Return 202 Accepted and push the result over an open WebSocket.

Same starting move. Different finishing move. I’ve shipped all three in production and they each earn their place.

202 Accepted and polling

This is the default. Boring. Works. Easy to operate.

The server enqueues the work, returns a job handle, and exposes a status endpoint the client polls. Here’s a NestJS sketch I’d ship today.

import {
  Controller, Post, Get, Body, Param, HttpCode, Res, NotFoundException,
} from '@nestjs/common';
import { Response } from 'express';
import { Queue } from 'bullmq';
import { randomUUID } from 'crypto';
import { JobsService } from './jobs.service';
// CreateReportDto is assumed to be defined elsewhere in the module.

@Controller('reports')
export class ReportsController {
  constructor(
    private readonly jobs: JobsService,
    private readonly queue: Queue,
  ) {}

  // passthrough: true gives access to headers while Nest keeps handling
  // the response (interceptors, serialization) in its standard mode.
  @Post()
  @HttpCode(202)
  async create(
    @Body() body: CreateReportDto,
    @Res({ passthrough: true }) res: Response,
  ) {
    const jobId = randomUUID();
    await this.jobs.create(jobId, body);
    // Spread body first so a client-supplied jobId can't override ours.
    await this.queue.add('build-report', { ...body, jobId }, { jobId });

    res.setHeader('Location', `/reports/${jobId}`);
    res.setHeader('Retry-After', '2');
    return { jobId, status: 'pending' };
  }

  @Get(':id')
  async status(@Param('id') id: string) {
    const job = await this.jobs.get(id);
    // Unknown ids get a 404 so pollers stop instead of looping on a 200.
    if (!job) throw new NotFoundException(`job ${id} not found`);
    if (job.status === 'done') {
      return { status: 'done', result: job.resultUrl };
    }
    return { status: job.status, progress: job.progress ?? 0 };
  }
}

Two things that get missed and burn people. The Location header tells the client where to look, so it never has to build the status URL itself, and that keeps working across deployments and reverse proxies. The Retry-After header is a hint, not a contract. Mobile clients especially will ignore it and mash the retry button. Cap the poll on the client and add jitter.
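Here's a minimal sketch of what "cap the poll and add jitter" can look like on the client. The names pollDelayMs and pollUntilDone, the 25% jitter, the 30 second ceiling, and the 'failed' terminal status are my assumptions for illustration, not part of the controller above.

```typescript
// Hypothetical shape of the status payload; mirrors the controller sketch.
interface JobStatus {
  status: string;
  result?: string;
}

// Exponential backoff starting at the Retry-After hint, capped at capMs,
// with up to 25% random jitter added so clients don't poll in lockstep.
export function pollDelayMs(
  attempt: number,
  retryAfterMs = 2_000,
  capMs = 30_000,
  random: () => number = Math.random,
): number {
  const backoff = Math.min(capMs, retryAfterMs * 2 ** attempt);
  return backoff + Math.floor(random() * backoff * 0.25);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the job reaches a terminal state or maxAttempts is spent.
// fetchStatus and sleep are injected so the loop is easy to test.
export async function pollUntilDone(
  fetchStatus: () => Promise<JobStatus>,
  maxAttempts = 20,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchStatus();
    // 'failed' is an assumed terminal status for this sketch.
    if (job.status === 'done' || job.status === 'failed') return job;
    await sleep(pollDelayMs(attempt));
  }
  throw new Error(`job still pending after ${maxAttempts} polls`);
}
```

The hard cap on attempts matters as much as the jitter: a client that gives up and surfaces an error is cheaper than one that polls forever.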

Trade-off: every active job is a stream of GETs against your status endpoint. Cache aggressively. Don’t hit the database on every poll. Stick a Redis layer in front of it keyed by job:{id}.
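A rough cache-aside sketch of that status lookup follows. To keep it runnable on its own, StatusCache is a hypothetical two-method interface standing in for a Redis client, and the TTL values (2 seconds for in-flight jobs, an hour for finished ones) are assumptions you'd tune for your own workload.

```typescript
// Minimal key-value surface so the sketch runs without a Redis client.
// In production this would be a thin wrapper over your Redis library.
interface StatusCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

interface JobView {
  status: string;
  progress?: number;
  result?: string;
}

// Returns a reader that checks the cache under job:{id} before the database.
export function createCachedStatusReader(
  cache: StatusCache,
  loadFromDb: (id: string) => Promise<JobView | null>,
  inFlightTtlSeconds = 2,
  doneTtlSeconds = 3600,
): (id: string) => Promise<JobView | null> {
  return async (id) => {
    const key = `job:${id}`;
    const hit = await cache.get(key);
    if (hit !== null) return JSON.parse(hit) as JobView;

    const job = await loadFromDb(id);
    if (job === null) return null; // misses are not cached in this sketch

    // Finished jobs never change, so they can live in the cache much longer.
    const ttl = job.status === 'done' ? doneTtlSeconds : inFlightTtlSeconds;
    await cache.set(key, JSON.stringify(job), ttl);
    return job;
  };
}
```

A short TTL on in-flight jobs bounds staleness to roughly one poll interval, which is usually fine for a progress bar.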

Webhook callbacks

Polling is fine when you control the client. When the caller is another service, especially across a team or a partner boundary, webhooks scale better. Server finishes the work, POSTs back to a URL the caller registered.

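import { HttpService } from '@nestjs/axios';
import { Injectable, Logger } from '@nestjs/common';
import * as crypto from 'crypto';
import { lastValueFrom } from 'rxjs';

// Minimal shapes assumed by this sketch.
interface WebhookTarget {
  url: string;
  secret: string;
}
interface WebhookPayload {
  eventId: string;
  [key: string]: unknown;
}

@Injectable()
export class WebhookDispatcher {
  private readonly log = new Logger(WebhookDispatcher.name);

  constructor(private readonly http: HttpService) {}

  async send(target: WebhookTarget, payload: WebhookPayload) {
    // Serialize once and sign those exact bytes; the receiver verifies the raw body.
    const body = JSON.stringify(payload);
    const sig = crypto
      .createHmac('sha256', target.secret)
      .update(body)
      .digest('hex');

    try {
      await lastValueFrom(
        this.http.post(target.url, body, {
          headers: {
            'content-type': 'application/json',
            'x-event-id': payload.eventId,
            'x-signature': `sha256=${sig}`,
          },
          timeout: 10_000,
        }),
      );
    } catch (err) {
      // Caught values are `unknown` under strict settings, so narrow before reading.
      const reason = err instanceof Error ? err.message : String(err);
      this.log.warn({ msg: 'webhook delivery failed', url: target.url, err: reason });
      throw err; // rethrow so the queue's retry policy takes over
    }
  }
}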

The dispatch goes through BullMQ with exponential backoff, a dead letter queue after 8 attempts, and an idempotency key on the receiver side. I’m not going to paste the whole worker. The shape that matters: the receiver MUST treat every delivery as potentially duplicate, because at-least-once is the only honest guarantee.

I learned that one the hard way on a native-billing rollout at the creator-economy platform I worked at. Apple’s SubscriptionRenewal server-to-server notification retried after our endpoint went over its 30 second deadline, our handler had no idempotency check, and a chunk of customers got billed twice with two competing subscription rows. Frontend “fix” went out the same hour, hid the duplicate row, didn’t fix the duplicate charge. Real fix was a unique constraint on (apple_original_transaction_id, notification_uuid), the handler returning 200 within 5 seconds by enqueueing the work to Sidekiq, and a reconciliation job to clean up the existing damage. Refunds took 4 days because Apple’s developer support API approves per transaction. Lesson, written in scar tissue: any upstream will retry its server-to-server calls, and you don’t get a vote. Idempotency isn’t optional, it’s the contract.
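For the receiving side, here's a minimal sketch of the two checks that matter: verify the signature over the raw body, then drop anything you've already processed. It assumes the sha256=<hex> header format from the dispatcher above. createReceiver and the in-memory Set are stand-ins I made up for illustration; in production the dedupe store would be a unique index or something equally durable.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

type DeliveryOutcome = 'rejected' | 'duplicate' | 'processed';

// Recompute the HMAC over the raw request body and compare in constant time.
export function verifySignature(rawBody: string, header: string, secret: string): boolean {
  const digest = createHmac('sha256', secret).update(rawBody).digest('hex');
  const given = Buffer.from(header);
  const wanted = Buffer.from(`sha256=${digest}`);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return given.length === wanted.length && timingSafeEqual(given, wanted);
}

// Builds a handler that applies each event id at most once.
export function createReceiver(secret: string, apply: (payload: unknown) => void) {
  // In-memory stand-in for a durable dedupe store (assumption for this sketch).
  const seen = new Set<string>();

  return (rawBody: string, signature: string, eventId: string): DeliveryOutcome => {
    if (!verifySignature(rawBody, signature, secret)) return 'rejected';
    if (seen.has(eventId)) return 'duplicate'; // still answer 2xx so the sender stops retrying
    seen.add(eventId);
    try {
      apply(JSON.parse(rawBody));
    } catch (err) {
      // Forget the id so the sender's next retry gets another attempt.
      seen.delete(eventId);
      throw err;
    }
    return 'processed';
  };
}
```

Note the order: signature first, dedupe second. Checking the event id before the signature would let an unauthenticated caller poison the dedupe store.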

WebSocket notifications

When the client is a browser or a mobile app the user is actively staring at, polling feels laggy and webhooks aren’t an option, because the client has no public URL for you to call. Open a WebSocket, subscribe to the job channel, push the result.

import { WebSocketGateway, OnGatewayConnection, SubscribeMessage, MessageBody, ConnectedSocket } from '@nestjs/websockets';
import { Socket } from 'socket.io';
import { JwtService } from '@nestjs/jwt';

@WebSocketGateway({ namespace: 'jobs', cors: { origin: process.env.WEB_ORIGIN } })
export class JobsGateway implements OnGatewayConnection {
  constructor(private readonly jwt: JwtService) {}

  async handleConnection(client: Socket) {
    try {
      const token = client.handshake.auth?.token;
      const claims = await this.jwt.verifyAsync(token);
      client.data.userId = claims.sub;
    } catch {
      client.disconnect(true);
    }
  }

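  @SubscribeMessage('subscribe')
  async subscribe(@ConnectedSocket() client: Socket, @MessageBody() jobId: string) {
    // userId is only set once handleConnection has verified the token.
    const userId: string | undefined = client.data.userId;
    if (!userId || !(await this.canRead(userId, jobId))) {
      client.emit('error', { code: 'forbidden', jobId });
      return;
    }
    await client.join(`job:${jobId}`);
  }

  private canRead(userId: string, jobId: string): Promise<boolean> {
    // Placeholder: replace with a real ownership check against the job table.
    // As written it authorizes every user for every job.
    return Promise.resolve(true);
  }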
}

When the worker finishes the job, it publishes to a Redis pub/sub channel that the gateway is subscribed to, and the gateway emits to job:{jobId} rooms. Authorize on subscribe, not on emit. Drop the connection on a bad token. And please, give yourself a backoff strategy on the client.
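The fan-out step, from pub/sub message to room emit, can be sketched like this. RoomEmitter is a structural stand-in for the part of the socket.io server used here (to(room).emit(...)). The 'job.update' event name and the JobEvent shape are assumptions, and wiring this into your Redis subscriber callback is left out.

```typescript
// Structural stand-in for the slice of socket.io's Server this sketch needs.
interface RoomEmitter {
  to(room: string): { emit(event: string, payload: unknown): unknown };
}

// Assumed message shape published by the worker.
interface JobEvent {
  jobId: string;
  status: string;
  resultUrl?: string;
}

// Parses one pub/sub message and emits it to the matching job room.
// Returns false for malformed messages instead of throwing, so one bad
// publish can't take down the subscriber loop.
export function forwardJobEvent(server: RoomEmitter, rawMessage: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawMessage);
  } catch {
    return false;
  }
  if (typeof parsed !== 'object' || parsed === null) return false;

  const event = parsed as Partial<JobEvent>;
  if (typeof event.jobId !== 'string' || typeof event.status !== 'string') return false;

  // Only sockets that passed the subscribe check are in this room.
  server.to(`job:${event.jobId}`).emit('job.update', event);
  return true;
}
```

Because authorization happened at subscribe time, the emit path stays dumb: it never needs to know who is listening.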

When to pick which

Polling is great for backend-to-backend within a team. You control the cadence and the cache. Webhooks are great across team or partner boundaries, but only if the receiver does idempotency and you sign your payloads. WebSockets are great for user-facing real-time, but they cost more to operate and you need a backoff plan on the client before you ship.

You can also combine them. Return 202 with a Location for poll fallback, fire a webhook for any server caller, and push over WebSocket to the browser if one is open. The job’s source of truth lives in one place; the delivery channel is whichever the caller wired up.
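One way to sketch the combined version: persist the final state first, then notify every registered channel on a best-effort basis. completeJob, Notifier, and CompletedJob are hypothetical names for this example. The notifiers would be whatever wraps your webhook dispatcher and WebSocket fan-out.

```typescript
interface CompletedJob {
  jobId: string;
  status: 'done';
  resultUrl: string;
}

// Any push channel: a webhook dispatcher, a WebSocket fan-out, and so on.
type Notifier = (job: CompletedJob) => Promise<void>;

// Writes the job's final state, then tries each channel in turn.
// Returns the number of channels that failed so the caller can log or retry.
export async function completeJob(
  saveState: (job: CompletedJob) => Promise<void>,
  notifiers: Notifier[],
  jobId: string,
  resultUrl: string,
): Promise<number> {
  const job: CompletedJob = { jobId, status: 'done', resultUrl };

  // Persist first: the status endpoint must be right even if every push fails.
  await saveState(job);

  let failures = 0;
  for (const notify of notifiers) {
    try {
      await notify(job);
    } catch {
      // A failed push never rolls back state; polling remains the fallback.
      failures++;
    }
  }
  return failures;
}
```

The ordering is the design choice worth copying: pushes are hints that something changed, and the status endpoint is where the caller confirms it.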

Takeaways

If the operation can outlive the HTTP timeout, return 202 and a job handle. Don’t fight the request shape.
Polling is the default. Cache the status endpoint hard.
Webhooks need HMAC signatures, idempotency keys, and a real retry policy with a DLQ.
WebSockets need auth on subscribe, room-scoped emits, and client-side jittered backoff.
Any retry-capable channel will deliver duplicates. Make the receiver idempotent or get burned.
Pick by who the caller is: same team, partner, or browser.

Thanks for reading. If you’ve got thoughts, send them my way.