
We let 5 AI agents refactor the same codebase

Same repo, same task, five very different diffs.

Marcus Tan Founding Engineer, Y Build
Published May 24, 2026
13 min read

We handed five coding agents the same legacy repo and the same refactor brief, then diffed the results. The spread in approach — and in how much they broke — was the interesting part.

What we measured

We tracked three things for each agent: tests passing after the refactor, lines touched, and whether the public API stayed stable. One agent rewrote half the app; another made a surgical ten-line change. We show all five diffs.
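For readers who want to score their own runs, here is a minimal sketch of how two of those metrics could be computed. It is not the harness used for this experiment. It assumes each agent's result is available as a unified diff, and, purely for illustration, that the code is Python (the repo's language isn't stated here). The function names `lines_touched`, `public_names`, and `removed_public_names` are hypothetical.

```python
import ast
import re

# Matches a unified-diff hunk header such as "@@ -12,3 +12,4 @@".
# A missing count means 1, per the unified diff format.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def lines_touched(diff_text: str) -> int:
    """Count added plus removed lines across all hunks of a unified diff."""
    touched = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                new_left -= 1
                touched += 1
            elif line.startswith("-"):
                old_left -= 1
                touched += 1
            elif not line.startswith("\\"):  # skip "\ No newline at end of file"
                old_left -= 1  # a context line counts toward both sides
                new_left -= 1
            continue
        match = HUNK_HEADER.match(line)
        if match:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
    return touched


def public_names(source: str) -> set[str]:
    """Top-level functions and classes without a leading underscore.

    Assumption: "public API" means module-level names; a real project
    may define it differently (for example via __all__).
    """
    tree = ast.parse(source)
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and not node.name.startswith("_")
    }


def removed_public_names(before: str, after: str) -> set[str]:
    """Public names present before the refactor but missing after it."""
    return public_names(before) - public_names(after)


# Made-up example: an agent renames a public function.
BEFORE = "def load():\n    return 1\n\ndef _helper():\n    pass\n"
AFTER = "def load_config():\n    return 1\n\ndef _helper():\n    pass\n"
SAMPLE_DIFF = "\n".join([
    "--- a/app.py",
    "+++ b/app.py",
    "@@ -1,2 +1,2 @@",
    "-def load():",
    "+def load_config():",
    "     return 1",
])

print(lines_touched(SAMPLE_DIFF))           # 2
print(removed_public_names(BEFORE, AFTER))  # {'load'}
```

Counting by hunk length rather than by leading `+` or `-` alone keeps file headers out of the total and still counts a removed line whose content happens to start with `--`. An empty result from `removed_public_names` is only a rough proxy for a stable API, since it ignores signature changes.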


Marcus has shipped 40+ production apps with AI tooling and runs the Build Lab experiments — the timed, reproducible head-to-heads behind every Compare and Lab note. Previously built developer platforms at two YC startups.

