The Refactor That Never Ships
Every engineering org has at least one. The "we're going to clean this up next quarter" project that's been on the roadmap for four years. The greenfield rewrite that's been 70% done since 2023. The migration nobody will call finished, because finishing would force a decision about the legacy system still running in production. We talk about these projects like they're underway — there's a Linear board, there are champions, there are quarterly updates — but somehow the old system is still there, the new one is still partial, and the question of which is the source of truth has gotten worse, not better. These refactors didn't fail. They never had a real chance, because they were optimistic from the start.
The Architecture of False Progress
From the outside, a stalled refactor looks like a project that needs more resources. Throw another engineer at it, give it a quarter of focused attention, hold a working session — surely it'll break loose. It rarely does, because the problem isn't underinvestment. The problem is that the project was structured in a way that makes false progress easy and real completion hard.
Real completion requires deleting the old system. False progress only requires building more of the new one.
So the new system grows. New endpoints. New tables. New consumers. Each addition is celebrated as a step forward. The old system, meanwhile, is untouched — because touching it requires coordinating across the teams that depend on it, and those teams have other priorities, and "we'll handle the cutover when the new system is ready" has been the answer for two years now. The new system is "ready" by every measure except the only one that matters: whether the old system can be turned off.
This is the architecture of false progress. The work that gets done is the work that's easy to do without anyone else's permission. The work that doesn't get done is the work that requires a real decision.
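One way to make that distinction concrete is to report progress as what still depends on the old system, not as what has been built in the new one. The following is a minimal sketch, assuming a hypothetical list of consumers and a flag for which system each one currently calls. The `Consumer` type and the team names are illustrative, not taken from any real tracker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consumer:
    """A team or service that depends on the system being replaced (hypothetical)."""
    name: str
    uses_old_system: bool


def remaining_on_old(consumers: list[Consumer]) -> list[str]:
    """Names of consumers that still block turning the old system off."""
    return [c.name for c in consumers if c.uses_old_system]


def can_turn_off_old(consumers: list[Consumer]) -> bool:
    """The only completion measure used here: nothing depends on the old system."""
    return not remaining_on_old(consumers)


if __name__ == "__main__":
    # Illustrative data only.
    consumers = [
        Consumer("billing", uses_old_system=True),
        Consumer("reporting", uses_old_system=False),
        Consumer("search", uses_old_system=True),
    ]
    print("Blocking shutdown:", remaining_on_old(consumers))
    print("Can turn off old system:", can_turn_off_old(consumers))
```

Nothing in this measure moves when a new endpoint or table is added. It only moves when a consumer leaves the old system, which is the work that requires someone else's agreement.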
The Optimism Built In
Most stalled refactors are doomed at the kickoff meeting. You can usually tell from the language.
The pitch sounds something like: "We'll build the new system in parallel, gradually migrate consumers over, and eventually retire the old one." Every word in that sentence is a hand-wave. Gradually migrate: by whom? Eventually retire: when, exactly, and triggered by what? The plan has the shape of a plan but contains no commitments.
The original justification is often aesthetic, not measurable. The old code is "messy." The architecture is "outdated." The framework is "legacy." None of these are problems you can declare solved. There's no specific user-facing pain that gets fixed when the refactor finishes, which means there's no specific moment when anyone says "okay, we're done." So the project drifts into the indefinite middle, where the new system is good enough to use, the old system is bad enough to complain about, and nobody has a reason to force a decision.
Compare that to a refactor born from a measurable trigger: "Our payment processing has had three production incidents this quarter caused by the legacy adapter; we're moving to the new gateway by end of Q2 because the next incident will cost us a customer." That refactor has a finish line. You'll know when it's done because the legacy adapter is gone from the codebase and the incident pattern has stopped. The aesthetic refactor has no equivalent ending.
The Parallel-System Trap
The cost of an unfinished refactor is rarely accounted for honestly.
On the surface, you've got slightly slower velocity for the team owning the refactor — they're splitting attention. That's the visible cost. The invisible costs are larger.
Every other team that touches either system pays a small tax. They have to ask which one to use, route around half-implemented features, work around behavior that exists in one system but not the other. New hires have to learn both. Bug reports come in and the first ten minutes of triage is figuring out which codebase the bug actually lives in. The on-call rotation has to know both systems intimately, because failures can come from either.
These costs compound silently. They don't show up on a dashboard. The team that owns the refactor doesn't pay them. They're paid in distributed fragments by everyone else, which makes the project look much cheaper than it actually is.
And because the costs are distributed, no single person is incentivized to push for completion. The team owning the refactor has the new system working well enough to defend their work. The teams paying the tax don't have the authority to declare the project finished. Leadership sees a quarterly update with green checkmarks and assumes things are on track. So the parallel system survives indefinitely. Five years from now, both systems will still be there, and the original engineers who started the refactor will be gone, and the next generation will inherit a codebase with two ways to do everything and no documentation explaining why.
The Three Preconditions
Every refactor needs three things at the start, or it shouldn't start.
1. A measurable trigger for "done." Not "the new system replaces the old" — that's a wish. "All writes go through the new system; the old read path is retired by Q3" — that's a finish line. If you can't write the trigger as a sentence with a date, you don't have a refactor; you have a feeling.
2. A scheduled cutover with an allocated downtime budget. The refactor that's "going to be backwards-compatible forever" is the one that never ends. Migration windows are unpopular, but they're the moment the old system actually dies. Put them on the calendar at the start, not when you "feel ready."
3. A name for the person who deletes the old code. This sounds trivial. It isn't. The old code stays in production until somebody is responsible for removing it, and that responsibility is almost never assigned. Name them on day one.
If you can't commit to all three at the start, you don't have a refactor — you have a parallel system you're going to maintain alongside the old one indefinitely. That's not a project; it's a tax. Once you see it that way, the honest call is often to kill the project and stop paying.
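The three preconditions above can be written down as a checklist that a kickoff document either satisfies or does not. This is a rough sketch under stated assumptions: the `RefactorPlan` fields, the example dates, and the owner name are all hypothetical, and a real team would likely track these in whatever planning tool it already uses.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RefactorPlan:
    """Hypothetical record of the commitments made at kickoff."""
    done_trigger: str = ""                  # e.g. "All writes go through the new system"
    done_by: Optional[date] = None          # the date attached to that trigger
    cutover_date: Optional[date] = None     # the scheduled migration window
    downtime_budget_minutes: Optional[int] = None
    deletion_owner: str = ""                # the person who deletes the old code


def missing_preconditions(plan: RefactorPlan) -> list[str]:
    """Return the preconditions the plan has not committed to (empty if all three are met)."""
    missing = []
    if not plan.done_trigger.strip() or plan.done_by is None:
        missing.append("measurable trigger for done, with a date")
    if plan.cutover_date is None or plan.downtime_budget_minutes is None:
        missing.append("scheduled cutover with a downtime budget")
    if not plan.deletion_owner.strip():
        missing.append("named person who deletes the old code")
    return missing


if __name__ == "__main__":
    # Illustrative plan: a trigger with a date, but no cutover window and no owner.
    plan = RefactorPlan(
        done_trigger="All writes go through the new system",
        done_by=date(2026, 9, 30),
    )
    for gap in missing_preconditions(plan):
        print("Missing:", gap)
```

An empty result does not mean the refactor will ship. It only means the plan contains the commitments that make shipping possible.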
So Be Honest
The hardest thing about this isn't technical. It's organizational courage. Killing a refactor that's been going on for two years means admitting that two years of effort produced a parallel system rather than a replacement. Nobody wants to be the one who says that out loud.
But the alternative is worse. The parallel system isn't going to spontaneously resolve itself. Every quarter you don't kill it or finish it, it gets more expensive to keep maintaining and harder to ever delete. The longest-lived parallel systems I've seen are the ones where the original champions left the company without ever forcing the cutover, and the next generation inherited two systems and no mandate to consolidate.
It's a similar pattern to the one I wrote about in Premature Optimization Is a Lie We Tell Ourselves: a comfortable narrative ("we'll refactor it next quarter") lets a team avoid the harder work indefinitely while still feeling like it's making the right call. The fix is the same. Be the person who forces the decision. Either ship the refactor (really ship it, with a date and a deletion) or stop the project, write down the lessons, and put the engineering capacity on something that can finish.
If you're staring at a refactor that's been "almost done" for too long, the question of whether to push it through or kill it is usually the harder problem. Get in touch if it's worth talking through.