The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent Google whitepaper argues that in AI-driven software development, the model itself accounts for only about 10% of system behavior. The focus should shift to harness design and context engineering, which determine most of a system's performance and reliability.

A new Google whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, argues that the *most significant shift* in software engineering today is a move away from fixating on large AI models and toward harness design and context engineering. The paper puts the model at only about 10% of system behavior, with the remaining 90% determined by how the AI is configured, guided, and integrated.

The whitepaper, titled ‘The New SDLC With Vibe Coding,’ reports that 85% of professional developers use AI coding agents regularly, that 51% do so daily, and that roughly 41% of new code was AI-generated as of early 2026. Its core assertion is that the model is the *smallest part* of an AI system and matters less than the surrounding harness, which includes prompts, tools, rules, and observability features.

Experiments cited in the paper show that changing only the harness or prompts can significantly improve an agent’s performance, even with the same model. For example, one team moved its coding agent from outside the top 30 to within the top five on Terminal Bench 2.0 by adjusting only the harness. The paper advocates a shift toward *agentic engineering*, in which verification, judgment, and context management take priority over raw model size or complexity.
At a glance
When: published early 2026
The development: The Google whitepaper highlights that the key to effective AI systems is not the size of the model but the harness and context engineering surrounding it.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
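
The split between tests and evals can be made concrete with a small sketch. Everything in it is illustrative rather than taken from the whitepaper: `slugify` stands in for agent-written code, `judge_score` is a hypothetical keyword-coverage scorer standing in for a rubric or model-graded eval, and the 0.8 threshold is an arbitrary assumption.

```python
import re


def slugify(title: str) -> str:
    """Stand-in for agent-generated code under review (hypothetical)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def deterministic_test() -> bool:
    """A test: one exact input, one exact expected output."""
    return slugify("Hello, World!") == "hello-world"


def judge_score(summary: str, required_terms: list[str]) -> float:
    """Hypothetical eval scorer: fraction of required terms the text covers.

    A real eval might use a rubric or a model-graded judge instead.
    """
    text = summary.lower()
    hits = sum(1 for term in required_terms if term.lower() in text)
    return hits / len(required_terms)


def ci_gate(summary: str, required_terms: list[str], threshold: float = 0.8) -> bool:
    """Pass only if the test passes AND the eval score meets the threshold."""
    return deterministic_test() and judge_score(summary, required_terms) >= threshold
```

The test checks behavior that has a single right answer; the eval scores output that can only be judged by degree. The gate requires both, which is the condition the infographic attaches to agentic engineering.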
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness (prompts · tools · context · hooks · sandboxes · observability): ~90%, and it is your surface area, not the provider’s
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
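
One way to picture "Agent = Model + Harness" is as two separate objects, where only the harness changes between runs. The sketch below is a minimal illustration under stated assumptions: the `Harness` and `Agent` classes, their fields, and the toy `echo_model` are all hypothetical names invented here, not an API from the whitepaper or from any Google tool.

```python
from dataclasses import dataclass, field
from typing import Callable

# Assumption: a "model" is just a function from prompt text to completion text.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Hypothetical harness: everything around the model that the team controls."""
    system_prompt: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    context: list[str] = field(default_factory=list)
    log: list[str] = field(default_factory=list)  # minimal observability

    def build_prompt(self, task: str) -> str:
        parts = [self.system_prompt, *self.context, f"Task: {task}"]
        if self.tools:
            parts.append("Tools: " + ", ".join(sorted(self.tools)))
        return "\n".join(parts)


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        prompt = self.harness.build_prompt(task)
        self.harness.log.append(prompt)  # record what the model actually saw
        return self.model(prompt)


def echo_model(prompt: str) -> str:
    """Toy model: reports how many lines of guidance it received."""
    return f"saw {len(prompt.splitlines())} lines"


# Same model, two harnesses.
bare = Agent(echo_model, Harness("You are a coding agent."))
rich = Agent(echo_model, Harness(
    "You are a coding agent.",
    tools={"run_tests": lambda cmd: "ok"},
    context=["Repo uses pytest.", "Never edit generated files."],
))
```

Running `bare.run("fix bug")` and `rich.run("fix bug")` sends the same model two different prompts, which is the sense in which a missing tool or a vague rule is a configuration problem rather than a model problem.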
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx · high OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx · low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing, with cheap models for the easy work.
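
The CapEx versus OpEx trade-off comes down to simple break-even arithmetic, sketched below. All figures are made-up cost units chosen only so that the per-feature ratio is 3×, the low end of the range quoted above; the function names, the complexity score, and the model labels in `route_model` are hypothetical.

```python
import math


def total_cost(upfront: float, per_feature: float, features: int) -> float:
    """Cumulative cost: one-time setup (CapEx) plus recurring cost per feature (OpEx)."""
    return upfront + per_feature * features


def break_even_features(capex_a: float, opex_a: float,
                        capex_b: float, opex_b: float) -> int:
    """Smallest feature count at which option B costs no more than option A.

    Assumes B pays more upfront but less per feature than A.
    """
    if opex_a <= opex_b:
        raise ValueError("option B never catches up: its per-feature cost is not lower")
    return max(0, math.ceil((capex_b - capex_a) / (opex_a - opex_b)))


def route_model(task_complexity: float, cutoff: float = 0.5) -> str:
    """Hypothetical router: send easy work to a cheaper model."""
    return "cheap-model" if task_complexity < cutoff else "frontier-model"


# Illustrative numbers only (arbitrary cost units, not from the paper).
VIBE = {"upfront": 0, "per_feature": 300}
AGENTIC = {"upfront": 2000, "per_feature": 100}

n = break_even_features(VIBE["upfront"], VIBE["per_feature"],
                        AGENTIC["upfront"], AGENTIC["per_feature"])
```

With these assumed numbers the two approaches cost the same at 10 features, and the agentic setup is cheaper for every feature after that. Different inputs move the crossover point, so the sketch shows the shape of the argument, not a prediction.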
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR); verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Why Harness and Context Engineering Are Game-Changers

This shift in focus from model size to harness design and context engineering changes where organizations should invest. The paper's argument implies that teams can build better, more reliable AI systems by putting effort into configuration, tooling, and context management rather than only chasing larger models. That can reduce costs, improve robustness, and give more precise control over AI behavior, which matters most in enterprise applications and safety-critical systems.

The Evolution of AI Development and the Rise of Agentic Engineering

Historically, AI development emphasized training larger models with more parameters. The whitepaper challenges that paradigm by arguing that the *behavior* of AI systems hinges more on how they are integrated and guided. It places vibe coding (quick prompts with minimal oversight) at one end of a spectrum and a more disciplined approach, agentic engineering, at the other, built on formal specs, automated verification, and context management. The paper notes that as of early 2026, AI tools are embedded deeply in software workflows, with roughly 41% of new code generated by AI, which raises the stakes for effective harness design.

“The model is only 10% of what determines behavior; the harness is 90%. The real work lies in configuration, context, and verification.”

— Addy Osmani

What Aspects of Harness and Context Are Still Being Explored

While the whitepaper emphasizes the importance of harness design and context engineering, specific best practices, tools, and frameworks for optimal implementation are still evolving. The precise impact of different configurations across diverse applications remains under study, and industry adoption of these principles is ongoing.

Next Steps for AI Development and Industry Adoption

Organizations are expected to reevaluate their AI strategies, investing more in harness design, context management, and verification processes. Further research and case studies will likely emerge, providing clearer guidelines and tools for effective implementation. Industry standards may evolve to prioritize configuration and scaffolding as core competencies in AI system development.

Key Questions

Why is the model only 10% of the system behavior?

According to the whitepaper, the majority of an AI system’s behavior depends on how the model is integrated, configured, and guided through prompts, tools, and verification—collectively called the harness.

How can focusing on harness design improve AI performance?

Better harness design lets organizations adjust AI behavior, reduce errors, and improve reliability without retraining the model or switching to a larger or more complex one.

What is agentic engineering?

Agentic engineering pairs formal specs with automated verification (tests, evals, and CI gates), context management, and tools to control AI behavior systematically. It sits at the disciplined end of the spectrum that starts with vibe coding.

Does this mean larger models are obsolete?

No, but the whitepaper suggests that the value of larger models is limited unless accompanied by effective harness design and context engineering. The focus should shift to how models are used and guided.

What are the economic implications of this shift?

Focusing on harness and context can lower operational costs, improve system reliability, and reduce token waste, making AI development more cost-effective in the long run.

Source: ThorstenMeyerAI.com

This content is for general information only and is not financial, tax or legal advice. Consult a qualified professional for decisions about your money.