A radical organisation design prediction: the death of permanent cross-functional teams, replaced by fluid, project- based assembly of AI-augmented generalists and on-demand human specialists for validation.
"I think that engineers will become a gig economy inside of enterprises. And I don't think it's just engineers. I think that all specialties — from finance to product management to design — all of them are gonna become like a gig economy." — Steve Yegge
Yegge isn't talking about Uber-style freelancing. He's describing an internal consulting model: AI handles baseline work across all specialties, while human experts are "reserved" for short bursts — you book an engineer for a week, a PM for a few days, a security person for a review. The specialist's job shifts from doing the work to validating AI work. Everyone becomes a junior specialist in everything via AI, but you still need seniors to validate.
The analogy: security teams and Google's Launch Coordination Engineers. They don't sit on your team permanently — you book time with them when you need their expertise. Every specialty becomes like this.
The concern (from Brendan Hopper at Commonwealth Bank): this creates a centrifuge effect. The people who rise are those good at managing multiple AIs simultaneously, handling cognitive overhead of parallel work streams, and coordinating with other humans who are also managing AIs. Who loses: people who can't or won't adapt — regardless of seniority or credentials.
Whether or not this prediction plays out, enterprises face immediate questions they can't yet answer — about governance, accountability, and how to safely adopt tools that are already reshaping how code gets written.
This document maps the governance gap, the patterns practitioners are discovering, and the hard questions organizations must answer before adoption at scale.
Development is moving from "human writes code" to "human describes intent, AI implements." The developer becomes a senior designer pitching ideas to a tireless, literal-minded junior. Efficiency gains are real and significant — but the downstream effects are largely unexamined.
- The prediction: Specialists become internal consultants, booked for validation, not execution
- The gap: No playbooks, no maturity models, no governance frameworks — yet
- The patterns: Adversarial models, verification loops, checkpointing, hierarchical agents
- The risk: Comprehension debt, haunted codebases, skill atrophy
- The advice: Learn on your own time now, be conservative at work, stay ahead of the curve
The DevOps revolution came with playbooks, maturity models, and transformation guides. Agentic AI has none of that yet — we're in a Wild West phase. Enterprises are left with vendor hype on one side and unanswered governance questions on the other.
| What exists | What's missing |
|---|---|
| Tool vendors selling productivity | Governance frameworks for AI-assisted development |
| Studies measuring AI impact | Process transformation playbooks (like DevOps, SAFe) |
| NIST AI Risk Management Framework, EU AI Act compliance | SDLC integration patterns for agentic coding |
| Scattered enterprise case studies | Certifications, training programs |
| Addy Osmani's individual practices | Team/organisation adoption frameworks |
| Headlines about 10x productivity | Realistic metrics and expectations |
| Token cost calculators | ROI frameworks accounting for review overhead |
Key stat: Only 9% of enterprises have reached a "Ready" level of AI governance maturity (Deloitte > 2025)
Early adopters pushing hardest:
- Solo devs / indie hackers (Yegge, Willison) who can absorb the chaos personally
- Startups with greenfield codebases and high-risk tolerance
- AI companies themselves (Anthropic says 90% of Claude Code is written by Claude Code — but they have exceptional AI expertise in-house)
Where most enterprises are stuck:
- GitHub Copilot autocomplete
- ChatGPT as a fancy Stack Overflow
- Maybe Cursor for the adventurous
The shadow AI problem: Enterprises can't realistically prevent engineers from using AI tools. If companies don't provide sanctioned options, developers will use them at home — on personal devices, with company code, through consumer APIs whose terms allow training on inputs. IP, trade secrets, and potentially GDPR-regulated PII leak not through malice but through convenience. For most enterprises, this is probably already happening.
Offering outdated or underpowered models doesn't solve the problem. If the sanctioned tools can't keep up with what developers can access on their own, they'll route around them. The question isn't "should we adopt?" but "how do we provide capable, current tools with proper guardrails so developers can work safely — before the unsanctioned usage causes real damage?"
People are experimenting with several approaches to manage AI-assisted development:
- Adversarial models — Have a different model (or same model with different prompt/persona) review the output. The theory is that errors don't correlate perfectly across models. Addy Osmani does this: he routinely spawns a second AI session and asks it to critique code produced by the first.
- Formal verification loops — Let the AI generate, but gate it behind things that can't be faked: compiler passes, test suites, type checks, linting. The AI can iterate until those pass. This is what agentic coding tools like Claude Code already do.
- State checkpointing — Git as the undo mechanism. Commit constantly, so you can always roll back when the AI goes off the rails. Steve Yegge's "Beads" system is basically external memory + state management for agents.
- Hierarchical agents — A "planner" agent that breaks down work, "worker" agents that execute, and a "reviewer" agent that validates. Stability comes from separation of concerns.
Unsolved problem: Drift — over many iterations, the codebase slowly becomes something no one (human or AI) fully understands. This is Yegge's "haunted codebase" problem.
Less established than developer patterns — most are hypothetical or ad-hoc:
- AI code review policies — Define what level of human review AI PRs require. Some orgs mandate senior review for all AI-generated code.
- Prompt audit trails — Capture prompts, context, and iterations as part of compliance. Treat AI interactions like you'd treat production logs.
- Ownership redefinition — Clarify who owns AI-generated code. The prompter? The reviewer? The team?
- Knowledge transfer rituals — Require authors to explain AI-generated code in PRs or docs, even if they didn't write it line-by-line.
A term gaining traction, defined as "the future cost developers will pay to understand, modify, and debug code they did not write, which was generated by a machine."
The concern: "immediate, measurable velocity gains at the individual developer level are creating a hidden, compounding liability at the system and organizational level."
From an indie game dev team study: "AI helps teams build systems more sophisticated than their independent skill level can create or maintain. This paradox — possessing functional systems the team incompletely understands — creates fragility and AI dependency."
The main proposed solution to "vibe coding chaos." Definition: "writing a 'spec' before writing code with AI. The spec becomes the source of truth for the human and the AI." (Martin Fowler)
Tools emerging:
- Kiro (AWS) — spec-driven development, agent hooks, and natural language coding assistance. Workflow: requirements → design → tasks → implementation
- GitHub Spec Kit — Templates and prompts for specification-based workflows
- Tessl — Similar structured approach with a registry of 10,000+ specs
However, Britta Böckeler cautions: "I'd rather review code than all these Markdown files. Even with all of these files and templates and prompts and workflows and checklists, I frequently saw the agent ultimately not follow all the instructions." (Martin Fowler)
METR study found "experienced devs were ~19% slower on familiar repos due to prompt/review overheads" even though developers perceived a 20-24% speedup.
"Individual developer velocity increased, but organizational throughput did not."
-
Shift to orchestration skills — "The skills that matter are shifting toward architecture, system design, prompt engineering, and quality judgment." What CTOs now hire for: "independent validation of AI, system design thinking, security awareness, product sense, and clear communication." (RedMonk)
-
Human checkpoints — Tools like Kiro, Tessl, and Spec Kit emphasize human verification at each phase. But "experienced programmers may find that over-formalized specs can cause unnecessary trouble and slow down change and feedback cycles — just as we encountered in the early stages of waterfall development." (Thoughtworks)
-
Security as a critical skill — "By 2027, expect cybersecurity becoming the #1 skill employers demand from developers" because "AI-generated code carries a vulnerability rate close to 50%, compared to 15–20% in traditionally human-written code." (DEV Community)
-
Multi-agent orchestration (but only for seniors) — "So far, the only people I've heard are using parallel agents successfully are senior+ engineers." — Gergely Orosz, via RedMonk
| Anti-pattern | Why it fails |
|---|---|
| AI implements, engineers only write tests | The human gets the boring work, the AI gets the rewarding puzzle-solving. Also: tests only catch what you anticipate — you lose the implementation feedback loop where you discover edge cases you hadn't considered. |
| Premature automation | Automating with AI before you understand the problem space manually. You can't validate what you don't understand. |
| One-shot architecture | Having AI generate entire systems in one prompt. No iteration, no learning, no refinement. |
| Debugging AI with AI | AI wrote buggy code, you ask AI to fix it, it introduces new bugs, you ask again... layers of uncomprehended fixes. |
- Accountability — Who owns code the AI wrote? The prompter? The reviewer? The AI vendor?
- Audit trails — How do you capture prompts, context, iterations, and human decision points?
- Knowledge transfer — If devs don't understand AI-generated code, how do they hand it off?
- Code review at scale — Reviewers already rubber-stamp human PRs. How do they handle 10x volume?
- Skill atrophy — If seniors stop writing code, what happens to team expertise over 2–3 years?
- On-call nightmare — 3 AM alert, AI wrote this code 6 months ago, you approved but didn't deeply understand it. Now what?
- Verification — AI reviewing AI? Tests? Production monitoring? Type systems? Nobody has a complete answer.
- Cost predictability — API costs scale with usage, but usage is hard to predict. How do you budget for AI-assisted development?
- Incident response — Your postmortems assume someone can trace through the code and explain the failure. Haunted codebases make this impossible.
- Ownership culture — DevOps works because people feel ownership. If "the AI wrote it," does psychological ownership erode?
Yegge's advice for the typical enterprise developer wondering how to navigate this:
"I'm a mid-level engineer at an insurance company who writes Java nine to five, and I got a backlog of Jira tickets. What do I do?" — Changelog podcast
"First of all, you don't do anything that your work doesn't let you do."
"You should only worry about whether you need to be using agents if you see that other people at your company are using agents above board, getting PRs in, and starting to work that way."
"Start practicing now in your hobby time, your spare time... you're not gonna learn it overnight."
"Don't be like us. Be conservative. But learn this stuff because there will come a time sooner than you think when your company is going to expect that you know how to use it."
Simon Willison takes a similar approach — relentless experimentation, but with transparency. He publishes everything he builds with AI, documents what works and what doesn't, and maintains healthy skepticism about the hype while still pushing the tools hard.
The advice for leadership is different: your developers are already using these tools, whether you've sanctioned them or not. The question is whether that usage is visible, governed, and safe.
- Provide capable tools — If you offer underpowered options, developers will route around them. Budget for frontier models, not last year's autocomplete.
- Measure what matters — Laura Tacho's research shows individual velocity gains don't always translate to organizational throughput. Track review overhead, defect rates, and comprehension debt — not just lines of code.
- Start with low-stakes systems — Addy Osmani recommends using AI for "the 70% that's not core" — boilerplate, tests, documentation, migrations. Save critical business logic for humans who understand it.
- Define ownership before you need it — When the 3 AM incident happens, who owns code the AI wrote? Decide now, not during the postmortem.
- Invest in security review — AI-generated code has higher vulnerability rates. Your security team needs capacity and training to handle the increased volume.
Hands-on, discovering patterns through daily practice
| Name | Role | Key contribution | Key Resource |
|---|---|---|---|
| Steve Yegge | Ex-Google/Amazon, Sourcegraph | Building Gas Town (multi-agent orchestrator), Beads (agent memory system), runs 10-20 agents in parallel daily | The Year the IDE Died, Vibe Coding book with Gene Kim |
| Simon Willison | Creator of Datasette, Django co-creator | Prolific experimenter, llm CLI tool builder, no-hype documentation, 80+ vibe-coded experiments published |
AI Engineer World's Fair keynote, simonwillison.net |
| Thorsten Ball | Sourcegraph/Amp, ex-Zed | Demystified agents with "How to Build an Agent" (315 lines of Go), co-designed Agent Client Protocol (ACP) with JetBrains | How to Build an Agent, Register Spill newsletter |
| Andrej Karpathy | Ex-OpenAI, Ex-Tesla AI | Coined "vibe coding" (Collins Dictionary Word of the Year 2025), ML/research perspective on paradigm shift | 2025 LLM Year in Review |
Packaging frontier discoveries for broader adoption
| Name | Role | Key Contribution | Key Resource |
|---|---|---|---|
| Addy Osmani | Engineering Leader, Google Chrome | Synthesizes Google-scale learnings, "70% problem" framing, spec-driven development patterns | Beyond Vibe Coding, Elevate Substack |
| Chip Huyen | Ex-NVIDIA, Stanford ML instructor | "AI Engineering" book (most read on O'Reilly 2025), production AI systems focus | AI Engineering (O'Reilly) |
Addressing governance, metrics, and organizational adoption
| Name | Role | Key Contribution | Key Resource |
|---|---|---|---|
| Laura Tacho | CTO, DX | AI Measurement Framework, Core 4 metrics co-author, research across 180+ companies on real AI impact | Engineering Enablement newsletter, Pragmatic Engineer podcast |
Influential platforms, not frontier discoverers
| Name | Role | Key Contribution | Key Resource |
|---|---|---|---|
| Gene Kim | Author (Phoenix Project, DevOps Handbook) | Leveraging DevOps credibility, co-authored Vibe Coding book, enterprise workshop facilitator | Vibe Coding book with Yegge |
| Gergely Orosz | The Pragmatic Engineer | Platform amplifying other voices, hosting Pragmatic Summit (Feb 2026) | Pragmatic Engineer newsletter & podcast |
| Title | Speaker | Topics |
|---|---|---|
| Steve Yegge on productive vibe coding, the death of the IDE, babysitting a fleet of AI coding agents | Steve Yegge | Vibe coding, IDE evolution, multi-agent orchestration |
| 2026: The Year The IDE Died | Steve Yegge & Gene Kim | IDE evolution, Vibe Coding book |
| Adventures in babysitting coding agents | Steve Yegge | Agents, enterprise adoption, practical advice |
The tools are here. The governance isn't. The winners will be organizations that figure out how to capture the productivity gains while building guardrails against comprehension debt, skill atrophy, and haunted codebases — before their competitors do.
"There's an art to it, and you will discover it yourself if you're just pushing on it. You don't even have to read a book."
"There's no math. There's no science. It's an art."
