How we carved a native mobile release flow out of a giant Rails monolith using Strangler Fig, bounded contexts, and a database split that did not blow up logins.
A Thursday afternoon at the creator-economy platform I worked at. The branded-mobile-app pipeline was being run, in part, by a small army of mobile-CX engineers submitting Apple and Google builds by hand. The Rails monolith owned everything around it: customer records, billing, content, push tokens. The release flow itself was a tangle of background jobs, ad-hoc rake tasks, and a Google Sheet nobody admitted to maintaining.
We wanted to ship native iOS and Android apps for thousands of creator-owned brands without that team. So we pulled the release flow out. Strangler Fig, bounded context drawn carefully, database split only when forced.
Opinion up front: do not decompose because microservices are fashionable. Decompose when one slice has a different cadence, workload shape, and blast radius than the rest.
The monolith-first counterargument is correct more often than people admit. If you cannot describe the seams in your domain on a whiteboard without arguing for an hour, you do not have bounded contexts. You have wishful thinking. Cutting that into services gives you the same confusion plus a network in the middle.
I have done greenfield microservices at scale. The combat-sports tournament platform I CTO’d in London ran on hundreds of services with Kafka as the backbone. Right call there because we knew the domain cold and the load shape demanded it. None of that was true on the creator-economy monolith. Fifteen years old, many repositories. Decomposition there was triage, not architecture astronautics.
Three signals said “this does not belong here”. First, release cadence: the monolith deployed a few times a day, while native apps ship on Apple’s and Google’s calendar, hours to days behind. Coupling those timelines meant every monolith deploy carried risk for mobile. Second, workload shape: the work was IO-bound on external APIs with long-tailed retries, and its Sidekiq jobs left other work queued behind 30-second Apple calls. Third, the team running it was different: mobile-CX, Fastlane, signing certs, a separate on-call. Native release became its own bounded context. Everything else stayed put.
Before any service got spun up, we sat down with the mobile lead and named the language. Not the database tables. The language. What is an “app version”? A “release”? A “submission”? When a creator changes their app icon, who owns it?
That conversation took a couple of weeks. Most valuable two weeks of the project, and the part most teams skip when they “go microservices”. It is why they end up with a distributed monolith six months later.
The output was a context map. BrandedApp, Release, Submission, ReviewStatus belonged to the new mobile-release context. Creator, Subscription, Customer stayed in the monolith. Every cross-context call went through a thin published interface, not direct table reads. At the digital product agency I led engineering at, we migrated a portfolio of legacy projects to Domain-Driven Design. The pattern that survived: bounded contexts first, ubiquitous language next, code last.
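A context map is easier to defend when it exists as data rather than as a photo of a whiteboard. What follows is a minimal sketch, not what we shipped: the entity names come from the map above, while `CONTEXT_MAP`, `ownerOf`, and `accessMode` are hypothetical names invented for illustration.

```typescript
// Hypothetical sketch: the context map written down as data.
type Context = 'mobile-release' | 'monolith';

type Entity =
  | 'BrandedApp'
  | 'Release'
  | 'Submission'
  | 'ReviewStatus'
  | 'Creator'
  | 'Subscription'
  | 'Customer';

// Every entity has exactly one owning context.
const CONTEXT_MAP: Record<Entity, Context> = {
  BrandedApp: 'mobile-release',
  Release: 'mobile-release',
  Submission: 'mobile-release',
  ReviewStatus: 'mobile-release',
  Creator: 'monolith',
  Subscription: 'monolith',
  Customer: 'monolith',
};

function ownerOf(entity: Entity): Context {
  return CONTEXT_MAP[entity];
}

// Table-level access is allowed only inside the owning context.
// Everything else has to go through the published interface.
function accessMode(
  caller: Context,
  entity: Entity,
): 'direct' | 'published-interface' {
  return ownerOf(entity) === caller ? 'direct' : 'published-interface';
}
```

Because `Entity` is a closed union, adding a new concept without assigning it an owner fails to compile, which is roughly the conversation you want to be forced into.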
We did not rewrite. We strangled. Rails stayed in front. New code went into a NestJS service. Routes moved one at a time, behind a feature flag.
import { Injectable } from '@nestjs/common';
import { FastifyReply, FastifyRequest } from 'fastify';
import { FeatureFlags } from '../flags/feature-flags.service';
import { HttpProxy } from '../proxy/http-proxy.service';

@Injectable()
export class BrandedAppRouter {
  constructor(
    private readonly flags: FeatureFlags,
    private readonly proxy: HttpProxy,
  ) {}

  async route(req: FastifyRequest, reply: FastifyReply) {
    // Assumes an upstream layer always sets this header to a single value.
    const creatorId = req.headers['x-creator-id'] as string;

    // The flag is evaluated per creator, so traffic moves one creator at a time.
    const useNewService = await this.flags.isEnabled(
      'mobile_release.use_new_service',
      { creatorId },
    );

    const target = useNewService
      ? process.env.MOBILE_RELEASE_SVC_URL
      : process.env.RAILS_MONOLITH_URL;
    if (!target) {
      // A missing env var should fail loudly instead of proxying to "undefined".
      throw new Error('branded_app_route: upstream URL is not configured');
    }

    try {
      return await this.proxy.forward(req, reply, target, {
        timeoutMs: 8000,
        retries: 0, // fail fast; fallback is a flag flip, not a retry
      });
    } catch (err) {
      req.log.error({ err, creatorId, target }, 'branded_app_route_failed');
      throw err;
    }
  }
}
No retries on the new path. The point is to fail fast and fall back to Rails by flipping the flag. A retry would hide problems we needed to see.
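To make "fall back by flipping the flag" concrete, here is a stripped-down sketch of the routing decision with an in-memory flag store. `InMemoryFlags`, `pickTarget`, and the two URLs are hypothetical stand-ins; a real flag service would be backed by something persistent, and its API will differ.

```typescript
// Hypothetical in-memory stand-in for a feature-flag service.
class InMemoryFlags {
  private readonly enabledCreators = new Set<string>();
  private killSwitch = false;

  enableFor(creatorId: string): void {
    this.enabledCreators.add(creatorId);
  }

  // One flip sends every creator back to Rails on their next request.
  disableAll(): void {
    this.killSwitch = true;
  }

  isEnabled(creatorId: string): boolean {
    return !this.killSwitch && this.enabledCreators.has(creatorId);
  }
}

interface Targets {
  newService: string;
  rails: string;
}

// Pure routing decision: the flag alone picks the upstream.
function pickTarget(
  flags: InMemoryFlags,
  creatorId: string,
  targets: Targets,
): string {
  return flags.isEnabled(creatorId) ? targets.newService : targets.rails;
}

// Placeholder URLs, for illustration only.
const exampleTargets: Targets = {
  newService: 'http://mobile-release.internal',
  rails: 'http://rails.internal',
};
const exampleFlags = new InMemoryFlags();
exampleFlags.enableFor('creator_42');
```

With this setup, `pickTarget(exampleFlags, 'creator_42', exampleTargets)` returns the new service URL, and the Rails URL again as soon as `disableAll()` has been called. Nothing in the proxy needs to know about the rollback.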
You can split the code in a week. Splitting the database is where projects die.
We did not split the Aurora cluster on day one. The new service read and wrote, for months, against the same PostgreSQL writer the monolith used, but only on tables it owned (releases, submissions, review_status, branded_app_builds). No cross-context joins. Reads of Creator or Subscription went through a thin Rails-side HTTP endpoint returning a stable DTO.
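A "stable DTO" only stays stable if the consumer refuses anything else. The sketch below shows the kind of parsing the new service could do on the Rails response. The field names (`id`, `displayName`, `subscriptionActive`) and the `parseCreatorSummary` helper are assumptions for illustration, not the actual contract.

```typescript
// Hypothetical DTO shape; the real contract had its own fields.
interface CreatorSummaryDto {
  id: string;
  displayName: string;
  subscriptionActive: boolean;
}

// Validate an untrusted JSON payload against the contract.
function parseCreatorSummary(raw: unknown): CreatorSummaryDto {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('creator summary: expected an object');
  }
  const { id, displayName, subscriptionActive } = raw as Record<
    string,
    unknown
  >;
  if (typeof id !== 'string') {
    throw new Error('creator summary: id must be a string');
  }
  if (typeof displayName !== 'string') {
    throw new Error('creator summary: displayName must be a string');
  }
  if (typeof subscriptionActive !== 'boolean') {
    throw new Error('creator summary: subscriptionActive must be a boolean');
  }
  // Copy only the contracted fields so extra Rails columns never leak in.
  return { id, displayName, subscriptionActive };
}
```

Dropping unknown fields is deliberate: if the new service never sees a column, it cannot quietly start depending on it.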
Only once that contract held did we move the mobile-release tables to a separate schema, then later a separate cluster. The cutover went through three deploys: shadow read, dual-write, then flip.
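-- Backfill rows written since the cutover window opened.
-- Safe to re-run: rows whose id already exists are skipped.
BEGIN;

INSERT INTO mobile_release.branded_app_builds
SELECT * FROM public.branded_app_builds
WHERE created_at >= :cutover_start
ON CONFLICT (id) DO NOTHING;

-- Compatibility view so Rails can keep reading under a public.* name.
CREATE OR REPLACE VIEW public.branded_app_builds_v AS
SELECT * FROM mobile_release.branded_app_builds;

COMMIT;

-- Refresh planner statistics for the freshly filled table.
ANALYZE mobile_release.branded_app_builds;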
A separate deploy switched Rails to read from the view. The new service read the schema directly. Rollbackable in seconds.
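The three phases (shadow read, dual-write, flip) are easier to reason about as one read path and one write path that change behaviour with the phase. This is a minimal sketch of one plausible reading of those phases, with in-memory maps standing in for the old and new tables. `CutoverStore`, `Phase`, `Build`, and the mismatch counter are illustrative names, not our production code.

```typescript
type Phase = 'shadow-read' | 'dual-write' | 'flipped';

interface Build {
  id: string;
  status: string;
}

class CutoverStore {
  phase: Phase;
  mismatches = 0;
  private readonly oldTable: Map<string, Build>;
  private readonly newTable: Map<string, Build>;

  constructor(
    oldTable: Map<string, Build>,
    newTable: Map<string, Build>,
    phase: Phase,
  ) {
    this.oldTable = oldTable;
    this.newTable = newTable;
    this.phase = phase;
  }

  write(build: Build): void {
    if (this.phase === 'flipped') {
      this.newTable.set(build.id, build);
      return;
    }
    // Before the flip, the old table stays the source of truth.
    this.oldTable.set(build.id, build);
    if (this.phase === 'dual-write') {
      this.newTable.set(build.id, build);
    }
  }

  read(id: string): Build | undefined {
    if (this.phase === 'flipped') {
      return this.newTable.get(id);
    }
    // Serve from the old table, compare against the new one on the side.
    const primary = this.oldTable.get(id);
    const shadow = this.newTable.get(id);
    // Crude comparison, good enough for same-shaped rows in a sketch.
    if (JSON.stringify(primary) !== JSON.stringify(shadow)) {
      this.mismatches += 1;
    }
    return primary;
  }
}
```

The mismatch counter is the signal that matters: in this sketch you would only move from one phase to the next once it stays flat, and moving back is a change to `phase`, not a data migration.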
Thanks for reading. If you’ve got thoughts, send them my way.