TRILLION THOUGHTS
Architecture · 12 February 2026 · 7 min read

From monolith to modular: the cuts that actually pay off

Most monolith decompositions fail at boundary selection. Here's the heuristic we use to find the cuts worth making — and the ones that look attractive but cost more than they return.

Trillion Thoughts Engineering

"We need to break up the monolith" is one of the most expensive sentences in software. Sometimes it's right. Often it's right in spirit and wrong in detail — the team picks the wrong boundaries, ships them, and discovers they've turned a slow monolith into a slow distributed system.

The heuristic that actually helps

Forget bounded contexts and DDD diagrams for a moment. The single most useful question to ask before extracting a service is:

"What happens if these two pieces of code are in different repositories, owned by different teams, deployed on different schedules — and have to evolve in lockstep anyway?"

If the answer is "nothing different, they don't have to evolve in lockstep", you have a real boundary. Cut.

If the answer is "we'd have to coordinate every change, version the contract, and run two-phase deploys", you don't have a boundary. You have two functions that talk to each other a lot. Leave them alone.
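The coordination tax the question probes for can be made concrete. A minimal sketch, with hypothetical names: renaming a field across a service boundary forces a versioned contract and a two-phase deploy, while inside one process it's an afternoon's refactor.

```python
# Hypothetical sketch: renaming "total" to "grand_total" across a service
# boundary. The producer must emit both names, and the consumer must read
# either, until every party has upgraded — the lockstep smell in code.

def serialize_order_v1(order: dict) -> dict:
    return {"total": order["amount"]}

def serialize_order_v2(order: dict) -> dict:
    # Phase 1 of the two-phase deploy: emit both names until every
    # consumer has been upgraded, then drop the old one in phase 2.
    return {"total": order["amount"], "grand_total": order["amount"]}

def read_order(payload: dict) -> float:
    # Tolerant reader: accept either name during the migration window.
    return payload.get("grand_total", payload.get("total"))
```

If every field rename looks like this, the boundary is imaginary and the two pieces belong in one deployable.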

Cuts that almost always pay

  • Asynchronous, long-running work. Email pipelines, report generation, scheduled exports. The boundary is "this work doesn't need to share a database transaction with the request that triggered it." Almost free win.
  • Different scaling profiles. The image-processing path needs 16 GB of RAM and elastic capacity; the rest of your app needs 1 GB and steady traffic. Splitting them is a cost story, not an architecture story.
  • Different security boundaries. The piece of the system that handles PII or financial data benefits from running on a more locked-down host with different audit requirements.
  • Public APIs. If something is a contract you sell to third parties, give it its own deployment so internal refactors don't accidentally break paying customers.

Cuts that look attractive and aren't

  • "Microservice per domain entity" — UserService, OrderService, ProductService. This is the most common version of getting it wrong. Domain entities are heavily linked. You'll spend your life writing N+1 RPC calls.
  • "One service per team" — sounds neat, but it lets Conway's law dictate your architecture. The right move is the inverse: define the services by their seams first, then size the teams to fit them.
  • "Anything that's slow" — slowness is a database problem more often than it's a service-boundary problem. Split a slow service and you usually get two slow services and a network hop between them.

The strangler fig, revisited

The strangler-fig pattern (route some traffic to the new service, keep the old service running, gradually shift) is correct and well-known. The part people skip is the gradual shift. Most teams ship the new service and then never finish the migration, ending up with two systems doing the same job.

Three rules from the field:

  1. Decide the deprecation date before you start. Write it in the design doc. Tell the customer.
  2. Add a deprecation log line on every call to the old path. Make the cost of leaving it behind visible.
  3. Reserve a single sprint for the cutover. Not "as time allows". If you don't put it on the calendar, it doesn't happen.

The unglamorous truth

Most monoliths can stay monoliths for longer than people think. The ones we've helped successfully decompose share a few traits: clear ownership, async boundaries, an actual revenue or scaling reason, and the discipline to finish what they started. The ones that failed all believed that "splitting things up" was inherently good.

Distribution is a tax. Pay it on purpose, for things that earn it back.