CSS Modules, Tailwind, StyleX, and vanilla-extract compared on bundle size, DX, design tokens, and migration. Why I keep landing on Tailwind plus a typed token layer for product teams.
A Friday afternoon at a live-video creator platform I led engineering at. Five PRs went out in one batch, finishing the migration of older Vue components onto the atomic CSS system and shared design system I’d introduced over the previous quarter. CI passed. Deploy ran. About 30 minutes later support pinged. Creator profile pages had no bio. The bio was in the DOM. A global CSS reset bundled with the design system was zeroing padding-top on <section> tags and collapsing the bio container to zero height. Several creators were already on Twitter asking what was going on.
That’s the kind of bug that decides which CSS architecture you pick for the next five years.
Look, I’ve shipped large frontends on four flavors of this. Hand-rolled CSS Modules with BEM conventions. Atomic CSS hand-written. Tailwind. A typed-tokens-plus-utility approach. I’ve evaluated StyleX and vanilla-extract seriously enough to know what they’re for. The honest take is that for product teams at scale, Tailwind plus a typed design-token layer wins on the boring axes that matter. Bundle size, refactor velocity, design-system cohesion, the speed at which a new engineer can ship a screen on day three. CSS Modules and vanilla-extract have a real place. StyleX is impressive engineering and the wrong default outside its native ecosystem.
CSS Modules optimize for local scoping with zero runtime. Tailwind for a constrained token surface composed at the call site. StyleX for atomic CSS extraction at build time with type-safe style objects. vanilla-extract for typed CSS-in-TS with zero-runtime extraction. They are not interchangeable, even though every comparison post pretends they are.
Bundle size, for an app of any real size, is mostly about whether styles deduplicate across the codebase. Tailwind and StyleX both atomize. CSS Modules and vanilla-extract do not by default (vanilla-extract’s Sprinkles add-on is the opt-in exception). On a multi-route product serving millions of customers the difference shows up. Not in v1. In month 18 when half the team forgets to share a component and copies the styles.
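To make the deduplication point concrete, here is a toy model, not a measurement. It assumes each component is just a list of CSS declarations and compares how many declarations a scoped engine would emit against an atomic one. The component names and the `ComponentStyles` type are made up for illustration.

```typescript
// Toy model: a component is the list of CSS declarations it needs.
type ComponentStyles = Record<string, string[]>;

// Scoped output (CSS Modules, vanilla-extract by default) repeats
// a declaration for every component that uses it.
function scopedDeclarationCount(components: ComponentStyles): number {
  return Object.values(components).reduce((total, decls) => total + decls.length, 0);
}

// Atomic output emits each distinct declaration once, however many
// components reference it.
function atomicDeclarationCount(components: ComponentStyles): number {
  const unique = new Set<string>();
  for (const decls of Object.values(components)) {
    decls.forEach((decl) => unique.add(decl));
  }
  return unique.size;
}

const example: ComponentStyles = {
  ProfileCard: ["display:flex", "padding:16px", "border-radius:8px"],
  // The month-18 copy: same styles, different file.
  CreatorCard: ["display:flex", "padding:16px", "border-radius:8px"],
  Banner: ["display:flex", "padding:24px"],
};

const scoped = scopedDeclarationCount(example); // 8
const atomic = atomicDeclarationCount(example); // 4
```

The model ignores real-world costs like longer class attributes in the HTML, so treat the ratio as a direction, not a number to quote.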
The biggest mistake I see in large frontends is treating the CSS solution as the architecture. It isn’t. The architecture is your token layer. Whatever CSS engine sits under it is an implementation detail you can swap.
Shape I keep coming back to. A typed token module. Tailwind config, runtime helpers, and component primitives all consume it.
// src/design/tokens.ts
export const tokens = {
  color: {
    "brand.500": "var(--c-brand-500)",
    "brand.600": "var(--c-brand-600)",
    "neutral.50": "var(--c-neutral-50)",
    "neutral.900": "var(--c-neutral-900)",
    "danger.500": "var(--c-danger-500)",
    "success.500": "var(--c-success-500)",
  },
  space: { xs: 4, sm: 8, md: 12, lg: 16, xl: 24, "2xl": 32 },
  radius: { sm: 4, md: 8, lg: 12, pill: 9999 },
  shadow: {
    sm: "0 1px 2px rgba(0,0,0,0.06)",
    md: "0 4px 12px rgba(0,0,0,0.08)",
  },
} as const;

export type ColorToken = keyof typeof tokens.color;
export type SpaceToken = keyof typeof tokens.space;
export type RadiusToken = keyof typeof tokens.radius;
That file is the source of truth. Designers don’t ship a “new gray” by adding a hex code in a component. They add a token. Code review fails if you reach for a literal hex value in JSX.
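The “no literal hex” rule is easy to state and easy to forget, so it helps to automate it. Below is a minimal sketch of the kind of check a lint step could run. `findHexLiterals` and `HexFinding` are hypothetical names, and the regex is a heuristic: it will also flag things like an `href="#face"` anchor, so a real rule would want an allowlist.

```typescript
// Hypothetical helper: flags raw hex colors in a source string.
type HexFinding = { line: number; value: string };

// Matches #rgb, #rgba, #rrggbb and #rrggbbaa, and nothing longer.
const HEX_COLOR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;

function findHexLiterals(source: string): HexFinding[] {
  const findings: HexFinding[] = [];
  source.split("\n").forEach((text, index) => {
    // String.match with a global regex returns every match on the line.
    (text.match(HEX_COLOR) ?? []).forEach((value) => {
      findings.push({ line: index + 1, value });
    });
  });
  return findings;
}

const sample = [
  'const a = { color: "#fff" };',
  'const b = tokens.color["brand.500"];',
  'const c = { background: "#1A2B3C" };',
].join("\n");

const findings = findHexLiterals(sample);
// Reports line 1 ("#fff") and line 3 ("#1A2B3C"); the token lookup passes.
```

Wired into CI, a non-empty result fails the build, which keeps the rule from depending on reviewer attention.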
Then the Tailwind preset reads from the token module. The whole point is that the engine is replaceable.
// tailwind.config.ts
import type { Config } from "tailwindcss";
import { tokens } from "./src/design/tokens";

export default {
  content: ["./src/**/*.{ts,tsx,mdx}"],
  theme: {
    colors: tokens.color,
    spacing: Object.fromEntries(
      Object.entries(tokens.space).map(([k, v]) => [k, `${v}px`]),
    ),
    borderRadius: Object.fromEntries(
      Object.entries(tokens.radius).map(([k, v]) => [k, `${v}px`]),
    ),
    boxShadow: tokens.shadow,
    extend: {},
  },
  corePlugins: { preflight: false },
} satisfies Config;
preflight: false is intentional. A global reset on a partially-migrated codebase is the foot-gun that broke my profile pages. The design system owns its boundary, not the universe.
Next pattern that pays rent. Typed primitives accept tokens, not raw CSS. The caller writes intent, the primitive resolves to classes.
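import { tokens, type ColorToken, type SpaceToken } from "@/design/tokens";
import { clsx } from "clsx";
import type { ReactNode } from "react";

type Align = "start" | "center" | "end" | "stretch";

type StackProps = {
  gap?: SpaceToken;
  align?: Align;
  bg?: ColorToken;
  children: ReactNode;
  className?: string;
};

// Class names are spelled out in full so Tailwind's content scanner can find them.
const gapClass: Record<SpaceToken, string> = {
  xs: "gap-xs", sm: "gap-sm", md: "gap-md",
  lg: "gap-lg", xl: "gap-xl", "2xl": "gap-2xl",
};

// An interpolated `items-${align}` never appears as a complete string in the
// source, so Tailwind would not generate those classes. Use a lookup instead.
const alignClass: Record<Align, string> = {
  start: "items-start", center: "items-center",
  end: "items-end", stretch: "items-stretch",
};

export function Stack({ gap = "md", align = "stretch", bg, children, className }: StackProps) {
  return (
    <div
      className={clsx("flex flex-col", gapClass[gap], alignClass[align], className)}
      style={bg ? { background: tokens.color[bg] } : undefined}
    >
      {children}
    </div>
  );
}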
<Stack gap="lg" bg="neutral.50"> reads like a spec. A junior on the team can ship a screen on day three without a single hex code, a single px value, or a Slack message asking which gray is “the right gray.” That is the architecture. The fact that it’s Tailwind underneath is incidental.
Back to the Friday bio outage. First move was a hotfix. A teammate pushed a padding-top override scoped to the bio container. It fixed the bio. Within an hour, three more reports landed for similarly-collapsed sections elsewhere on the profile and the discovery surface. The reset’s reach was bigger than we’d traced.
The actual fix was a rollback of the design-system bundle. We re-extracted the class additions from the five PRs without the global reset, scoped the reset to a data-design-system attribute boundary, and re-shipped two days later. We also added a visual-regression CI step against the most-trafficked routes, with diff thresholds set deliberately tight. The damage was about two hours of broken profile pages, and several creators publicly tweeted screenshots before the rollback landed.
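For a sense of what “scoped to an attribute boundary” means mechanically, here is a minimal sketch that prefixes each reset selector with the boundary attribute. `scopeRule` and `ResetRule` are hypothetical names, not the code we shipped, and the comma split is naive: it would break on selectors like `:is(a, b)`.

```typescript
// Hypothetical sketch: rewrite a global reset rule so it only applies
// inside elements marked with the design-system attribute.
const BOUNDARY = "[data-design-system]";

type ResetRule = { selector: string; declarations: Record<string, string> };

function scopeRule(rule: ResetRule, boundary: string = BOUNDARY): string {
  // Prefix every selector in a comma-separated list with the boundary.
  const scopedSelector = rule.selector
    .split(",")
    .map((part) => `${boundary} ${part.trim()}`)
    .join(", ");
  const body = Object.entries(rule.declarations)
    .map(([property, value]) => `${property}: ${value};`)
    .join(" ");
  return `${scopedSelector} { ${body} }`;
}

const scopedReset = scopeRule({
  selector: "section, article",
  declarations: { "padding-top": "0" },
});
// "[data-design-system] section, [data-design-system] article { padding-top: 0; }"
```

A legacy `<section>` outside the boundary never matches, which is the whole point: migrated surfaces opt in, everything else is left alone.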
The architectural lesson is the rule baked into the Tailwind config above. A design system is a boundary, not a global. If your CSS solution can paint outside its boundary by default, it will.
Different product. I was working on a React and TypeScript visual designer for a creator-economy platform’s branded-mobile-app builder. Creators dragged components together and got their own themed mobile apps. The team had started down a path where every component’s className was a deeply generic computed type, “to make sure designers don’t pass the wrong token.” We hit a bug where a button themed as brand.500 rendered as neutral.500. The types said it compiled.
What was actually wrong was a typo in the token-mapping table. The types were checking the shape, not the truth. The fix was a build-time integration check. A tiny script loaded the token module, rendered a known fixture into JSDOM, asserted the computed background matched the expected CSS variable. Half a day. Caught the regression. Generic-heavy types didn’t help because the bug wasn’t a type bug.
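Here is a stripped-down sketch of that idea. To keep it self-contained it swaps the JSDOM render for a plain lookup comparison, so it is a simpler technique than the one described above. Every name in it (`colorVars`, `findThemeMismatches`, the button slots) is a hypothetical stand-in. The part that carries over is that the expected values are written down independently of the mapping table.

```typescript
// All names here are hypothetical stand-ins, not the original project's code.
const colorVars: Record<string, string> = {
  "brand.500": "var(--c-brand-500)",
  "neutral.500": "var(--c-neutral-500)",
};

// Slot -> token table, the kind of place where the typo lived.
type ThemeTable = Record<string, string>;

// Resolve each slot through the table and compare against values recorded
// independently (for example, copied from the design spec).
function findThemeMismatches(theme: ThemeTable, expected: Record<string, string>): string[] {
  const problems: string[] = [];
  for (const slot of Object.keys(expected)) {
    const token = theme[slot];
    const actual = token === undefined ? undefined : colorVars[token];
    if (actual !== expected[slot]) {
      problems.push(`${slot}: expected ${expected[slot]}, got ${actual ?? "nothing"}`);
    }
  }
  return problems;
}

const expectedButton = {
  primary: "var(--c-brand-500)",
  muted: "var(--c-neutral-500)",
};

// Type-checks fine, because "neutral.500" is a perfectly valid token.
const brokenTable: ThemeTable = { primary: "neutral.500", muted: "neutral.500" };

const problems = findThemeMismatches(brokenTable, expectedButton);
// One entry: "primary: expected var(--c-brand-500), got var(--c-neutral-500)"
```

A build script would exit non-zero when `problems` is non-empty. The rendered-fixture version catches more, since it also exercises the component, but the shape of the assertion is the same.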
Spend your type budget on user-facing contracts. Spend your visual-regression budget on the rest. Don’t try to make the type system do what a fixture render can do in five seconds.
CSS Modules when the team is small, the design surface is narrow, and the priority is “engineers who don’t want to think about CSS at all.” Underrated for a five-person team.
vanilla-extract when the team is deep in TypeScript and ships a component library other teams consume. You get typed style objects with no runtime, and the export-as-CSS story is genuinely good.
Skip StyleX unless you’re already inside its native ecosystem. Good engineering, real DX cost on the way in, and Tailwind plus a token layer gets you most of the benefit with a shorter on-ramp.
Don’t do the big bang. Pick a leaf surface. Move it. Watch CI for a week. Move the next one. The bio outage above was a five-PR batch on a Friday. That part wasn’t a CSS engine problem. That part was a process problem. Visual-regression snapshots are cheaper than the Twitter cleanup.
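If “tight diff threshold” sounds abstract, this is roughly the core of what a snapshot comparison computes. It is a minimal sketch over raw RGBA buffers, assuming both screenshots are already decoded to the same dimensions. `mismatchRatio`, `passesVisualCheck`, and the 0.1% default are illustrative choices, not what any particular tool uses.

```typescript
// Fraction of pixels that differ between two same-sized RGBA buffers.
// A pixel counts as changed if any channel differs by more than `tolerance`.
function mismatchRatio(before: Uint8Array, after: Uint8Array, tolerance = 0): number {
  if (before.length !== after.length || before.length % 4 !== 0) {
    throw new Error("buffers must be RGBA data of identical size");
  }
  const pixelCount = before.length / 4;
  if (pixelCount === 0) return 0;

  let changed = 0;
  for (let i = 0; i < before.length; i += 4) {
    for (let channel = 0; channel < 4; channel++) {
      if (Math.abs(before[i + channel] - after[i + channel]) > tolerance) {
        changed++;
        break; // count each pixel once
      }
    }
  }
  return changed / pixelCount;
}

// Hypothetical gate: fail the route if more than `maxRatio` of pixels moved.
function passesVisualCheck(before: Uint8Array, after: Uint8Array, maxRatio = 0.001): boolean {
  return mismatchRatio(before, after) <= maxRatio;
}

// Two-pixel example: the second pixel's blue channel changes.
const baseline = new Uint8Array([0, 0, 0, 255, 10, 10, 10, 255]);
const candidate = new Uint8Array([0, 0, 0, 255, 10, 10, 200, 255]);
const ratio = mismatchRatio(baseline, candidate); // 0.5
```

A collapsed bio container moves far more than 0.1% of a profile page, so even a crude gate like this would have stopped the Friday deploy.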
Thanks for reading. If you’ve got thoughts, send them my way.