MINEDEV
AI & Human Oversight Hub
AI & Human Oversight Research
MINEDEV Hub · Updated April 2026
Research Dashboard · 2026

AI writes more code.
None of it solves the problem.

Peer-reviewed analysis of 12,384 developer interactions. The data is consistent: AI-generated code grows in volume, declines in structural quality, and fails to complete tasks end-to-end. Here is what the research actually shows.

Structural Erosion
Of all measured AI coding trajectories, 80% show measurable degradation in structural quality: increasing coupling, decreasing cohesion, and growing inconsistency.
80% of AI trajectories degrade
Code Bloat
Verbosity rises in 89.8% of AI interaction trajectories. Code grows in volume without corresponding improvements in output quality.
89.8% of trajectories show rising verbosity
Dev Concern
84.7% of developers in surveyed communities express concern about AI code quality. The trust gap between AI output and developer confidence continues to widen.
84.7% express quality concerns
End-to-End Solve
In the SWE-bench evaluation, not a single AI agent completed a full iterative software engineering task end-to-end. Individual steps may be adequate, but coherence across the lifecycle fails.
0% of AI agents completed tasks
Verbosity Factor
AI-generated code is 2.2x more verbose than human-written equivalents: more code, same output. This bloat increases maintenance burden without adding functional value.
2.2x more code, same output
Erosion by Cluster
Verbosity Rise: 89.8%
Quality Concern: 84.7%
Structural Erosion: 80%
Maintainability Drop: 65%
Duplication Increase: 57%
End-to-End Solve: 0%
Agent vs Human
Solve: AI 0% · Human 82%
Code Volume (AI writes 2.2x more): AI 2.2x · Human 1x
Structural Quality (80% AI degradation): AI 20% · Human 72%
Maintainability (65% drop with AI): AI 35% · Human 68%
Developer Voice
What Developers Are Saying
Reddit & Hacker News · 18,421 posts
Senior Backend Eng.
Hacker News · H02
T+6mo
Copilot is solid for boilerplate. The moment you need architecture-level decisions, it falls apart. It suggested three different patterns in one file. The code ran, but it was a maintenance nightmare waiting to happen.
Baltes et al. 2026
Engineering Lead
Reddit · r/programming
T+4mo
Lines of code went up 40% over six months of AI-assisted development. Meaningful features shipped stayed flat. We were writing more code to do the same things, and the test suite took twice as long to run.
Baltes et al. 2026
CTO, Series B
Reddit · R11
T+2mo
Junior devs on our team rely on AI for everything. The code compiles but they cannot explain why half of it works. When production breaks, they are helpless. We are creating a generation of developers who can prompt but cannot programme.
Baltes et al. 2026
Principal Engineer
Hacker News · H14
T+5mo
I reviewed 200 pull requests with AI-generated code. The pattern: massive functions that should have been five smaller ones, duplicated logic across files, zero error handling beyond the happy path. AI does not write bad code. It writes mediocre code at scale.
Baltes et al. 2026
Tech Lead
Reddit · R18
T+3mo
Our tech debt sprint took three times longer because the AI-generated modules had no consistent patterns. One service used repositories, another direct DB calls, a third a custom ORM wrapper. All generated by the same tool over two months.
Baltes et al. 2026
Staff Engineer
Hacker News · H07
T+1mo
If you need to be an experienced engineer to use AI effectively, but you had to become experienced without AI doing all your work, how are we going to get new experienced engineers? That is not a hypothetical. That is a real problem developing right now.
Baltes et al. 2026
Global Context
Global AI Adoption in Development
[Map legend: High Adoption (50%+) · Medium (25-49%) · Lower (<25%)]
Degradation Over Time
[Chart: Verbosity Trajectory, x-axis: interaction turn. 89.8% of trajectories show rising verbosity.]
Key Insights
01
AI code grows, rarely shrinks

Across all measured trajectories, AI-generated code shows a consistent tendency toward increasing volume. Rather than refactoring or simplifying, agents add new code to address issues even when removal would be more appropriate. Codebases bloat over time.
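One way to make "rising volume" concrete is to compare the size of the code at the first and last interaction turn of each trajectory. The sketch below is illustrative only: the function names and the sample line counts are hypothetical, and this is not the measurement procedure used in the cited research.

```python
from typing import Sequence


def is_rising(line_counts: Sequence[int]) -> bool:
    """Return True when the final turn has more lines than the first turn."""
    return len(line_counts) >= 2 and line_counts[-1] > line_counts[0]


def share_rising(trajectories: Sequence[Sequence[int]]) -> float:
    """Percentage of trajectories whose code volume rose from first to last turn."""
    if not trajectories:
        return 0.0
    rising = sum(1 for t in trajectories if is_rising(t))
    return 100.0 * rising / len(trajectories)


# Hypothetical line counts per interaction turn (illustrative data, not from the study).
sample = [
    [120, 150, 210],  # grows
    [80, 95, 140],    # grows
    [200, 180, 170],  # shrinks
    [60, 60, 90],     # grows
]

print(share_rising(sample))  # 75.0
```

A first-versus-last comparison is the simplest possible definition. A stricter variant could require growth at every turn, which would report a lower share.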

02
Structural quality erodes systematically

80% of AI coding trajectories show measurable degradation in structural quality: increasing coupling, decreasing cohesion, and growing inconsistency in architectural patterns. The effect compounds, because each successive AI interaction builds on already-degraded foundations.
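Coupling can be approximated in many ways. As a rough sketch, and assuming Python source files, the number of distinct top-level modules a file imports (its import fan-out) gives a crude proxy that can be tracked across revisions. The metric choice and the `import_fan_out` helper are assumptions for illustration, not the metrics used in the structural analysis cited here.

```python
import ast


def import_fan_out(source: str) -> int:
    """Count the distinct top-level modules imported by a Python source string."""
    tree = ast.parse(source)
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Bare relative imports ("from . import x") have no module name and are skipped.
            modules.add(node.module.split(".")[0])
    return len(modules)


# Hypothetical before/after revisions of the same file (illustrative only).
before = "import os\nfrom json import loads\n"
after = (
    "import os\n"
    "import os.path\n"
    "from json import loads\n"
    "import re\n"
    "from collections import OrderedDict\n"
)

print(import_fan_out(before), import_fan_out(after))  # 2 4
```

A fan-out that climbs with each revision is a signal worth reviewing, not proof of erosion. Cohesion and pattern consistency need separate measures.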

03
Zero agents completed end-to-end tasks

In the SWE-bench evaluation, no agent completed a full iterative software engineering task end-to-end. Individual steps might be performed adequately, but agents failed to maintain coherence across the full development lifecycle.

04
The experience gap is widening

Senior developers can spot and correct AI issues. Junior developers increasingly rely on outputs they cannot evaluate. This creates a dangerous feedback loop: less experienced teams produce AI-assisted code with structural weaknesses they lack the experience to identify.

05
The 2026 shift: from speed to quality

AI code carries 1.7x more issues and bugs than human-written code (CodeRabbit 2026). The industry trust gap is now too large to ignore. The frontier is no longer how fast we generate code. It is how confidently we can ship it. Human oversight is the differentiator.

Implications
What This Means for Your Organisation
AI is a tool, not a replacement

The evidence is consistent: AI excels at generating snippets and boilerplate, but cannot replace engineering judgement. Teams that treat AI as a capable junior requiring review benefit. Teams that treat it as a senior engineer face escalating quality debt.

Code review becomes more critical, not less

More code generated faster means more code requiring review. One team reported 30 pull requests per day with six reviewers. Quality gates must be strengthened, not relaxed. Architectural review boards and pair programming become essential safeguards.
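The review load in that example is simple arithmetic: 30 pull requests across six reviewers is five reviews per person per day. The sketch below works it through. The 30-minute review time is a hypothetical assumption added for illustration and does not come from the report.

```python
def reviews_per_reviewer(prs_per_day: int, reviewers: int) -> float:
    """Average number of pull requests each reviewer handles per day."""
    if reviewers <= 0:
        raise ValueError("reviewers must be positive")
    return prs_per_day / reviewers


def review_hours_per_reviewer(
    prs_per_day: int, reviewers: int, minutes_per_review: float
) -> float:
    """Daily hours each reviewer spends reviewing, given an assumed time per review."""
    return reviews_per_reviewer(prs_per_day, reviewers) * minutes_per_review / 60.0


# Figures from the text: 30 pull requests per day, six reviewers.
print(reviews_per_reviewer(30, 6))  # 5.0

# Assumption: 30 minutes per review (hypothetical, not from the source).
print(review_hours_per_reviewer(30, 6, 30))  # 2.5
```

Under that assumption each reviewer spends 2.5 hours a day on review alone, before any increase in pull request volume.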

Invest in your engineers, not just your tools

The developers who benefit most from AI are those with deep expertise who can evaluate, correct, and guide AI output. Investing in engineer development compounds with AI tools. Skipping that investment in favour of AI dependency creates a fragile organisation where no one truly understands the codebase.

Methodology
Research Sources
Source | Method | Sample | Key Metric
Baltes et al. 2026 | Qualitative analysis of developer discussions | 1,154 posts · 15 threads | 89.8% verbosity rise
SWE-bench Multi-Agent | Standardised agent evaluation framework | 2,294 engineering tasks | 0% end-to-end solve
Reddit / Hacker News | Thematic analysis across 15 codes | 18,421 developer posts | 84.7% quality concern
CodeRabbit 2026 | State of AI vs Human Code Report | Large-scale codebase audit | 1.7x more bugs in AI code
Structural Analysis | Code quality metric tracking | 5,600+ code files | 80% structural erosion