AI writes more code.
None of it solves the problem.
A peer-reviewed analysis of 12,384 developer interactions points in one direction: AI-generated code grows in volume, declines in quality, and fails on end-to-end tasks. Here is what the research shows.
Across all measured trajectories, AI-generated code shows a consistent tendency toward increasing volume. Rather than refactoring or simplifying, agents add new code to address issues even when removal would be more appropriate. Codebases bloat over time.
80% of AI coding trajectories show measurable degradation in structural quality: increasing coupling, decreasing cohesion, and growing inconsistency in architectural patterns. The effect compounds, because each successive AI interaction builds on already-degraded foundations.
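To make "increasing coupling" concrete, here is a minimal sketch of one possible proxy: the number of distinct top-level modules a file imports, compared across two snapshots. The cited research does not describe its metric this way; the function name `import_fan_out` and both snapshots are hypothetical, and import fan-out is only one rough stand-in for coupling.

```python
import ast


def import_fan_out(source: str) -> int:
    """Count distinct top-level modules imported by one Python source file.

    Assumption: import fan-out is used here as a rough coupling proxy.
    Relative imports such as `from . import x` are ignored.
    """
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return len(modules)


# Two hypothetical snapshots of the same file, before and after an edit.
before = "import os\n\ndef read(path):\n    return open(path).read()\n"
after = (
    "import os\nimport json\nfrom urllib import request\n\n"
    "def read(path):\n    return open(path).read()\n"
)

print(import_fan_out(before), import_fan_out(after))  # prints "1 3"
```

Tracking a number like this per file over successive commits would show whether dependencies accumulate, though a real audit would combine it with cohesion and consistency measures.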
In the SWE-bench evaluation, no agent completed a full iterative software engineering task end-to-end. Individual steps were sometimes performed adequately, but agents failed to maintain coherence across the full development lifecycle.
Senior developers can spot and correct AI issues. Junior developers increasingly rely on outputs they cannot evaluate. This creates a dangerous feedback loop: less experienced teams produce AI-assisted code with structural weaknesses they lack the experience to identify.
AI code carries 1.7x more issues and bugs than human-written code (CodeRabbit 2026). The industry trust gap is now too large to ignore. The frontier is no longer how fast we generate code. It is how confidently we can ship it. Human oversight is the differentiator.
The evidence is consistent: AI excels at generating snippets and boilerplate, but cannot replace engineering judgement. Teams that treat AI as a capable junior requiring review benefit. Teams that treat it as a senior engineer face escalating quality debt.
More code generated faster means more code requiring review. One team reported 30 pull requests per day with six reviewers. Quality gates must be strengthened, not relaxed. Architectural review boards and pair programming become essential safeguards.
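The review-load arithmetic behind that example can be sketched as follows. The 30 pull requests and six reviewers come from the text; the two hours of daily review time per reviewer is purely an assumption for illustration, and both function names are hypothetical.

```python
def prs_per_reviewer(prs_per_day: int, reviewers: int) -> float:
    """Average number of pull requests each reviewer handles per day."""
    if reviewers <= 0:
        raise ValueError("at least one reviewer is required")
    return prs_per_day / reviewers


def minutes_per_pr(prs_per_day: int, reviewers: int, review_hours: float) -> float:
    """Average review minutes available per pull request.

    review_hours is the daily review time per reviewer (an assumed input,
    not a figure from the source).
    """
    return review_hours * 60 / prs_per_reviewer(prs_per_day, reviewers)


load = prs_per_reviewer(30, 6)       # figures from the text
budget = minutes_per_pr(30, 6, 2.0)  # 2.0 hours is an assumption

print(load, budget)  # prints "5.0 24.0"
```

Under that assumption, each reviewer has about 24 minutes per pull request, which gives a sense of how quickly review capacity is consumed as generation speeds up.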
The developers who benefit most from AI are those with deep expertise who can evaluate, correct, and guide AI output. Investing in engineer development compounds with AI tools. Skipping that investment in favour of AI dependency creates a fragile organisation where no one truly understands the codebase.
| Source | Method | Sample | Key Metric |
|---|---|---|---|
| Baltes et al. 2026 | Qualitative analysis of developer discussions | 1,154 posts · 15 threads | 89.8% verbosity rise |
| SWE-bench Multi-Agent | Standardised agent evaluation framework | 2,294 engineering tasks | 0% end-to-end solve rate |
| Reddit / Hacker News | Thematic analysis across 15 codes | 18,421 developer posts | 84.7% quality concern |
| CodeRabbit 2026 | State of AI vs Human Code Report | Large-scale codebase audit | 1.7x more bugs in AI code |
| Structural Analysis | Code quality metric tracking | 5,600+ code files | 80% structural erosion |