--- Summary:

  • Outsourcing coding to AI does not have to mean a drop in quality; if code and feature quality falls after introducing agents, that should be treated as a failure of process design rather than an inherent fault of the tools. The author's core point: shipping worse code with agents is a choice, and we can just as well choose to ship better code.
  • The author frames "better code" in terms of technical debt: much of that debt is not intellectually hard but "conceptually simple yet time-consuming" cleanup work, such as unifying an API, renaming a concept throughout the code, merging duplicated functionality, or splitting an oversized file. The expensive part is usually not understanding the problem but finding the time to do the work.
  • Coding agents are particularly well suited to this kind of tedious refactoring. The workflow: hand the task to an asynchronous agent running in a branch or worktree, then review the result in a Pull Request - merge it if it's good, keep prompting if it's close, and throw it away if it's bad. This dramatically lowers the cost of fixing small issues and eliminating code smells.
  • AI also widens the range of options considered. It won't always produce the most innovative answer, but it often surfaces the "obvious and safe" choices a team overlooked - the well-proven Boring Technology that is common in its training data. That helps reduce the long-term technical debt caused by poor choices at the planning stage.
  • More importantly, agents lower the barrier to exploratory prototyping and experimental validation. For a question like whether Redis is a good fit for a high-concurrency activity feed, there is no need to decide by gut feel: an agent can quickly build a simulation and load test, several candidate solutions can be tested in parallel, and the technology choice can be backed by experimental results.
  • Finally, the author stresses the "compound engineering loop": agents follow instructions, so teams should run a retrospective at the end of each project and distill what worked into reusable instructions and processes for future runs. As that experience accumulates, code quality improvements compound, making it possible to keep raising quality and keep shipping new features at the same time.

--- Full Article:

Many developers worry that outsourcing their code to AI tools will result in a drop in quality, producing bad code that’s churned out fast enough that decision makers are willing to overlook its flaws.

If adopting coding agents demonstrably reduces the quality of the code and features you are producing, you should address that problem directly: figure out which aspects of your process are hurting the quality of your output and fix them.

Shipping worse code with agents is a choice. We can choose to ship code that is better instead.

Avoiding taking on technical debt

I like to think about shipping better code in terms of technical debt. We take on technical debt as the result of trade-offs: doing things “the right way” would take too long, so we work within the time constraints we are under and cross our fingers that our project will survive long enough to pay down the debt later on.

The best mitigation for technical debt is to avoid taking it on in the first place.

In my experience, a common category of technical debt fixes is changes that are simple but time-consuming.

  • Our original API design doesn’t cover an important case that emerged later on. Fixing that API would require changing code in dozens of different places, making it quicker to add a very slightly different new API and live with the duplication.
  • We made a poor choice naming a concept early on - teams rather than groups for example - but cleaning up that nomenclature everywhere in the code is too much work so we only fix it in the UI.
  • Our system has grown duplicate but slightly different functionality over time which needs combining and refactoring.
  • One of our files has grown to several thousand lines of code which we would ideally split into separate modules.

All of these changes are conceptually simple but still need time dedicated to them, which can be hard to justify given more pressing issues.
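
As a hypothetical illustration of the first example above (the names are invented, not taken from the article), the debt and the fix might look something like this:

```python
def build_report(user_id, include_archived):
    # Stand-in for the real report-building logic.
    return {"user": user_id, "archived_included": include_archived}

# The original API, designed before the "archived" case emerged.
def get_report(user_id):
    return build_report(user_id, include_archived=False)

# The debt: a slightly different duplicate bolted on later, because updating
# every existing caller of get_report() felt too expensive at the time.
def get_report_including_archived(user_id):
    return build_report(user_id, include_archived=True)

# The fix is conceptually trivial - fold the option into get_report(user_id,
# include_archived=False) and update every call site - but it is exactly the
# kind of tedious, mechanical change that is hard to find time for.
```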

Coding agents can handle these for us

Refactoring tasks like this are an ideal application of coding agents.

Fire up an agent, tell it what to change and leave it to churn away in a branch or worktree somewhere in the background.

I usually use asynchronous coding agents for this, such as Gemini Jules, OpenAI Codex web, or Claude Code on the web. That way I can run those refactoring jobs without interrupting my flow on my laptop.

Evaluate the result in a Pull Request. If it’s good, land it. If it’s almost there, prompt it and tell it what to do differently. If it’s bad, throw it away.

The cost of these code improvements has dropped so low that we can afford a zero tolerance attitude to minor code smells and inconveniences.

AI tools let us consider more options

Any software development task comes with a wealth of options for approaching the problem. Some of the most significant technical debt comes from making poor choices at the planning step - missing out on an obvious simple solution, or picking a technology that later turns out not to be exactly the right fit.

LLMs can help ensure we don’t miss obvious solutions that might not otherwise have been on our radar. They’ll only suggest solutions that are common in their training data, but those tend to be the Boring Technology that’s most likely to work.

More importantly, coding agents can help with exploratory prototyping.

The best way to make confident technology choices is to prove that they are fit for purpose with a prototype.

Is Redis a good choice for the activity feed on a site which expects thousands of concurrent users?

The best way to know for sure is to wire up a simulation of that system and run a load test against it to see what breaks.

Coding agents can build this kind of simulation from a single well-crafted prompt, which drops the cost of such an experiment to almost nothing. And since they’re so cheap, we can run multiple experiments at once, testing several solutions to pick the one that best fits our problem.
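
As a purely illustrative sketch of what such a prompt might produce - assuming a local Redis instance, the redis-py client, and made-up key names and traffic numbers - the simulation can be as small as a capped list plus a pool of concurrent writers and readers:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import redis  # redis-py; assumes a Redis server on localhost:6379

r = redis.Redis(decode_responses=True)
FEED_KEY = "feed:global"   # hypothetical key for a single shared activity feed
FEED_MAX = 1000            # keep only the most recent 1,000 items


def post_event(i: int) -> None:
    # A writer pushes an event onto the head of a capped list.
    r.lpush(FEED_KEY, f"user-{i} did something at {time.time()}")
    r.ltrim(FEED_KEY, 0, FEED_MAX - 1)


def read_feed(_: int) -> None:
    # A reader fetches the 50 most recent items, like a feed page load.
    r.lrange(FEED_KEY, 0, 49)


def run(label: str, fn, ops: int, workers: int) -> None:
    # Fire `ops` operations at Redis from `workers` concurrent threads.
    start = time.time()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fn, range(ops)))
    elapsed = time.time() - start
    print(f"{label}: {ops} ops, {workers} workers, "
          f"{elapsed:.2f}s ({ops / elapsed:.0f} ops/sec)")


if __name__ == "__main__":
    run("writes", post_event, 10_000, 50)
    run("reads", read_feed, 10_000, 50)
```

Changing the data structure (a sorted set per user, a stream) or the traffic mix is another prompt away, which is what makes running several of these experiments side by side so cheap.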

Embrace the compound engineering loop

Agents follow instructions. We can evolve these instructions over time to get better results from future runs, based on what we’ve learned previously.

Dan Shipper and Kieran Klaassen at Every describe their company’s approach to working with coding agents as Compound Engineering. Every coding project they complete ends with a retrospective, which they call the compound step: they take what worked and document it for future agent runs.
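
The article doesn’t show what that documentation looks like, but as a hypothetical example the output of a compound step might be a few lines appended to whatever instructions file your agents already read (an AGENTS.md or similar):

```
# Notes from the retrospective (hypothetical examples)
- Run the full test suite before opening a Pull Request.
- New endpoints must reuse the shared pagination helper rather than
  hand-rolling limit/offset logic.
- When renaming a concept, update the migrations, the UI strings and the
  docs in the same branch.
```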

If we want the best results from our agents, we should aim to continually increase the quality of our codebase over time. Small improvements compound. Quality enhancements that used to be time-consuming have now dropped in cost to the point that there’s no excuse not to invest in quality at the same time as shipping new features. Coding agents mean we can finally have both.