Testing DDD With Ubiquitous Language

How to write DDD tests that read like the domain itself, using Given-When-Then naming, aggregate invariant tests, and domain-specific test builders that double as living documentation.

OK so the test that finally cracked it open for me was named it("does the thing"). We were in a review meeting at the digital product agency I led at in London, mid way through migrating a chunk of the portfolio to DDD, and a product manager I’d been working with for months was sitting in. She asked, totally innocently, “what does Order#submit actually do?” The senior backend lead opened the test file. The test names were should work, should also work, edge case 2. Yeah. We had 600 tests and not one of them was readable by the person who literally defined the business rules.

That afternoon I rewrote one file as an experiment. By the end of the week I’d convinced the team. By the end of the quarter we had a rule on every aggregate. Tests are the cheapest, most up to date documentation you’ll ever ship. They run on every PR. They fail when the domain drifts. The only reason most teams’ tests don’t read like the domain is that nobody insisted on it.

This is about how I do that now. With code, with the language we agreed with the business people, and with the test scaffolding that makes it sustainable past the second sprint.

Tests are domain documentation

Here’s the deal. If you’ve actually done event storming with a domain expert, you’ve got verbs. submit, cancel, refund, partially fulfill, place on hold. You’ve got nouns. Order, LineItem, ShippingAddress, Payment, Refund. And you’ve got rules. “An order can’t be submitted without at least one paid line item.” “A refund can’t exceed the captured amount.”

Those rules belong in the aggregate. The tests for those rules belong in a file whose name matches the aggregate. And every test name should be something the domain expert would read out loud and say “yes, that’s the rule.”

Here’s the shape I use. TypeScript, NestJS-shaped, but the pattern is language agnostic.

// orders/__tests__/order.submit.spec.ts
import { OrderBuilder } from "./builders/order.builder";
import { OrderSubmissionError } from "../order.errors";

describe("Order: submitting", () => {
  it("given an order with one paid line item, when submitted, then it becomes submitted", () => {
    const order = OrderBuilder.aDraftOrder()
      .withPaidLineItem({ sku: "SKU-001", qty: 2 })
      .build();

    const submittedAt = new Date();
    order.submit({ submittedAt });

    expect(order.status).toBe("submitted");
    expect(order.submittedAt).toEqual(submittedAt);
  });

  it("given an order with zero line items, when submitted, then it raises OrderSubmissionError", () => {
    const order = OrderBuilder.aDraftOrder().build();

    expect(() => order.submit({ submittedAt: new Date() }))
      .toThrow(new OrderSubmissionError("cannot submit an order with no line items"));
  });

  it("given an order with an unpaid line item, when submitted, then it raises OrderSubmissionError", () => {
    const order = OrderBuilder.aDraftOrder()
      .withUnpaidLineItem({ sku: "SKU-001", qty: 1 })
      .build();

    expect(() => order.submit({ submittedAt: new Date() }))
      .toThrow(/at least one paid line item/);
  });
});

Three things to call out. The describe is the aggregate plus the operation, not the class name. Tests are grouped by behavior, not by code structure. The it names use Given-When-Then literally. And nothing about Jest, mocks, or repository plumbing leaks into these tests. They read like rules.

Domain builders are the unlock

The reason most teams’ aggregate tests devolve into should work is that constructing a valid aggregate is a 40-line ceremony. People copy paste, then they hate it, then they extract a setupOrder() helper that takes 14 parameters, then they hate that too. The fix is a builder per aggregate, living in __tests__/builders/, that speaks the ubiquitous language.

// orders/__tests__/builders/order.builder.ts
import { Order } from "../../order";
import { OrderId } from "../../order-id";
import { LineItem } from "../../line-item";
import { Money } from "../../../shared/money";

export class OrderBuilder {
  private id: OrderId = OrderId.generate();
  private customerId: string = "cust_test";
  private lineItems: LineItem[] = [];
  private status: "draft" | "submitted" = "draft";

  static aDraftOrder(): OrderBuilder {
    return new OrderBuilder();
  }

  static aSubmittedOrder(): OrderBuilder {
    return new OrderBuilder().asSubmitted();
  }

  withPaidLineItem(args: { sku: string; qty: number; unitPrice?: number }): OrderBuilder {
    this.lineItems.push(
      LineItem.fromPersisted({
        sku: args.sku,
        qty: args.qty,
        unitPrice: Money.usd(args.unitPrice ?? 1000),
        paymentStatus: "paid",
      }),
    );
    return this;
  }

  withUnpaidLineItem(args: { sku: string; qty: number }): OrderBuilder {
    this.lineItems.push(
      LineItem.fromPersisted({
        sku: args.sku,
        qty: args.qty,
        unitPrice: Money.usd(1000),
        paymentStatus: "unpaid",
      }),
    );
    return this;
  }

  forCustomer(customerId: string): OrderBuilder {
    this.customerId = customerId;
    return this;
  }

  private asSubmitted(): OrderBuilder {
    this.status = "submitted";
    return this;
  }

  build(): Order {
    return Order.rehydrate({
      id: this.id,
      customerId: this.customerId,
      lineItems: this.lineItems,
      status: this.status,
    });
  }
}

The methods on the builder are the language of the domain. Not setStatus("submitted"). aSubmittedOrder(). Not addItem({...}). withPaidLineItem({...}) or withUnpaidLineItem({...}). The naming forces every contributor to think in domain terms before they can write a test, and the test reads back as a sentence.

The trick that took me too long to learn. Builders should call a private aggregate constructor (Order.rehydrate), not the public one. Public construction goes through the domain factory which enforces all invariants. Rehydration is the path from persistence, and it’s what lets you build the “given” state of an aggregate in a test without re-running its entire history. Mix those two up and your tests start failing for reasons that have nothing to do with the rule under test.

Invariant tests live with the aggregate

You can’t write a test for “this aggregate’s persistence change doesn’t take a long write lock on a hot table.” That’s an infrastructure concern. But you can absolutely test the aggregate’s own invariants in a way that survives every refactor and every migration. And that’s what I make every team do now.

// orders/__tests__/order.invariants.spec.ts
import { OrderBuilder } from "./builders/order.builder";

describe("Order: invariants", () => {
  it("never has negative total", () => {
    const order = OrderBuilder.aDraftOrder()
      .withPaidLineItem({ sku: "SKU-1", qty: 2, unitPrice: 500 })
      .withPaidLineItem({ sku: "SKU-2", qty: 1, unitPrice: 1500 })
      .build();

    expect(order.total.amount).toBeGreaterThanOrEqual(0);
  });

  it("refund amount never exceeds captured amount", () => {
    const order = OrderBuilder.aSubmittedOrder()
      .withPaidLineItem({ sku: "SKU-1", qty: 1, unitPrice: 2000 })
      .build();

    expect(() => order.refund({ amount: 2500, reason: "customer_request" }))
      .toThrow(/refund.*exceeds captured/);
  });

  it("cancelled order rejects further line item changes", () => {
    const order = OrderBuilder.aSubmittedOrder()
      .withPaidLineItem({ sku: "SKU-1", qty: 1 })
      .build();
    order.cancel({ reason: "customer_request" });

    expect(() => order.addLineItem({ sku: "SKU-2", qty: 1, unitPrice: 1000 }))
      .toThrow(/cancelled order/);
  });
});

These are the rules a domain expert can sit next to you and read. If she shakes her head at “refund amount never exceeds captured amount” because actually merchants can issue goodwill credits above captured, the test fails and that conversation happens at the aggregate, not three weeks later in support tickets.

Test behavior, not implementation

One more war story, then a rule. The combat sports tournament platform I CTO’d in London. We had a Ranking aggregate and a projector pushing rankings into Elasticsearch. A live federation broadcast on a Saturday night. The new champion’s ranking should have updated within minutes. Eight hours later, the page still showed the old number one. The athlete tweeted a screenshot of our broken rankings page tagging the federation.

The root cause was the indexer’s circuit breaker silently staying open after a transient ES blip. The tests for Ranking were beautiful, frankly. Lots of coverage on the aggregate. But every projector test was an implementation test. expect(projector.batchSize).toBe(100). expect(esClient.bulk).toHaveBeenCalledTimes(3). None of them tested the behavior we actually cared about, which was “after a new bout result is published, the rankings projection reflects it within N seconds.”

Now the rule on every team I lead. Aggregate tests test behavior at the aggregate boundary. They don’t mock the repository. They don’t assert on private state. They call a domain method, they check the resulting domain state, and they check the domain events that were emitted. That’s it.

it("when refunded, then emits OrderRefunded with the correct amount", () => {
  const order = OrderBuilder.aSubmittedOrder()
    .withPaidLineItem({ sku: "SKU-1", qty: 1, unitPrice: 2000 })
    .build();

  order.refund({ amount: 500, reason: "partial_damage" });

  const events = order.pullDomainEvents();
  expect(events).toEqual([
    expect.objectContaining({
      type: "OrderRefunded",
      payload: expect.objectContaining({ amount: 500, reason: "partial_damage" }),
    }),
  ]);
});

If the internal state changes from a status enum to a state machine class, this test doesn’t care. If the refund method starts emitting an additional OrderAuditLogged event for compliance, this test still passes, because we’re using objectContaining and we’re checking the rule we actually wrote down with the business, not the wiring.

Takeaways

  • Test names are the cheapest, most accurate domain documentation you can ship. Given-When-Then, every time.
  • A builder per aggregate. The methods speak the ubiquitous language, not the data shape. Build through rehydrate, not the public factory.
  • Invariant tests live next to the aggregate and protect the rules the business actually agreed to. Run them next to a domain expert if you can.
  • Test behavior at the aggregate boundary. Not private state, not mocked repositories. Domain method in, domain state plus domain events out.
  • If a product person reads your test names and frowns, fix the test name first and the rule second.

Thanks for reading. If you’ve got thoughts, send them my way.

© 2026 Akin Gundogdu. All Rights Reserved.