
TL;DR

A recent Google whitepaper argues that in AI-assisted software development, the model accounts for only about 10% of system behavior. The rest comes from the harness, verification, and context engineering, which the paper treats as the bigger levers for performance and cost-efficiency.

The whitepaper argues that the most critical part of AI-driven software development is not the model but the harness and verification processes around it. That changes how organizations should allocate resources and build AI systems, and the authors frame it as a fundamental change in the software development lifecycle (SDLC).

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that the AI model constitutes only about 10% of the overall system’s behavior. The remaining 90% is determined by the harness — the prompts, tools, context policies, and observability mechanisms that surround the model. This insight challenges the common focus on acquiring or upgrading models, suggesting instead that organizations should invest more in configuration, testing, and context engineering.
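To make the "model plus harness" split concrete, here is a minimal Python sketch. It is an illustration only and not code from the whitepaper: the `Harness` class, the `CALL name arg` reply convention, the character-based context budget, and `stub_model` are all hypothetical names and assumptions. The point it tries to show is that the same stand-in model behaves differently depending on which tools and context the harness supplies.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration; not taken from the whitepaper or any SDK.
ModelFn = Callable[[str], str]


@dataclass
class Harness:
    system_prompt: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    max_context_chars: int = 2000                  # a crude context policy
    log: list[str] = field(default_factory=list)   # a crude observability hook

    def build_prompt(self, task: str, context: list[str]) -> str:
        # Context policy: keep the most recent snippets that fit the budget.
        kept: list[str] = []
        used = 0
        for snippet in reversed(context):
            if used + len(snippet) > self.max_context_chars:
                break
            kept.append(snippet)
            used += len(snippet)
        kept.reverse()
        tool_list = ", ".join(sorted(self.tools)) or "none"
        return "\n".join([self.system_prompt, f"Tools: {tool_list}", *kept, f"Task: {task}"])

    def run(self, model: ModelFn, task: str, context: list[str]) -> str:
        prompt = self.build_prompt(task, context)
        self.log.append(f"prompt_chars={len(prompt)}")
        reply = model(prompt)
        # Tool dispatch (assumed convention): "CALL name arg" runs a registered tool.
        if reply.startswith("CALL "):
            parts = reply.split(" ", 2)
            if len(parts) != 3 or parts[1] not in self.tools:
                return "error: malformed or unknown tool call"
            self.log.append(f"tool={parts[1]}")
            return self.tools[parts[1]](parts[2])
        return reply


def stub_model(prompt: str) -> str:
    # Stand-in for a real model: it requests a tool only if the harness advertised it.
    return "CALL upper hello" if "upper" in prompt else "I cannot do that."


bare = Harness(system_prompt="You are a coding agent.")
equipped = Harness(system_prompt="You are a coding agent.", tools={"upper": str.upper})

print(bare.run(stub_model, "shout hello", []))      # I cannot do that.
print(equipped.run(stub_model, "shout hello", []))  # HELLO
```

Both runs call the same `stub_model`; only the harness differs. A real harness would add sandboxing, hooks, and richer logging, which this sketch leaves out.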

Data presented in the paper shows that small harness changes can move results substantially. Its headline example is an agent that went from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, with the underlying model unchanged. The authors take this as evidence that configuration and scaffolding are the primary levers for improving AI system reliability and effectiveness.

The paper also discusses the economic implications, arguing that the perceived low cost of vibe coding (quick prompts, minimal review) is misleading. Over time, it can lead to higher costs due to increased token consumption, maintenance, and security vulnerabilities. Conversely, disciplined engineering — involving structured context, testing, and verification — offers lower marginal costs at scale.
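The crossover argument can be worked through with simple arithmetic. The sketch below assumes a linear cost model (an upfront cost plus a constant cost per feature), which is a simplification and not the paper's own model. Every number in it is hypothetical and in arbitrary cost units; the 3x per-feature ratio is chosen only to sit at the low end of the 3–10× range the paper cites.

```python
import math


def cumulative_cost(upfront: float, per_feature: float, features: int) -> float:
    """Total spend after shipping `features` features under a linear cost model."""
    return upfront + per_feature * features


def break_even_features(
    upfront_a: float, per_feature_a: float, upfront_b: float, per_feature_b: float
) -> int:
    """Smallest feature count at which option B costs no more than option A."""
    if per_feature_b >= per_feature_a:
        raise ValueError("B never catches up: its per-feature cost is not lower than A's")
    gap = upfront_b - upfront_a
    if gap <= 0:
        return 0
    return math.ceil(gap / (per_feature_a - per_feature_b))


# Hypothetical figures, arbitrary units.
VIBE_UPFRONT, VIBE_PER_FEATURE = 0, 300          # low CapEx, high OpEx
ENG_UPFRONT, ENG_PER_FEATURE = 4000, 100         # high CapEx, low OpEx

n = break_even_features(VIBE_UPFRONT, VIBE_PER_FEATURE, ENG_UPFRONT, ENG_PER_FEATURE)
print(n)                                                    # 20
print(cumulative_cost(VIBE_UPFRONT, VIBE_PER_FEATURE, 50))  # 15000
print(cumulative_cost(ENG_UPFRONT, ENG_PER_FEATURE, 50))    # 9000
```

Under these assumed figures the two approaches cost the same at 20 features, and the disciplined approach is cheaper from then on. Real costs are unlikely to be linear, so the break-even point should be read as a rough planning aid, not a forecast.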

At a glance

Report · When: published May 2026

The development: Google’s new whitepaper reveals that the core of AI software systems is not the model itself but the surrounding harness and verification processes, redefining best practices in SDLC.

The Model Is Only 10% — The New SDLC With Vibe Coding

AI Dispatch · Field Notes

Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified

Vibe Coding

Casual prompts · “does it seem to work?” · disposable code · high risk

Structured AI-Assisted

Detailed prompts + constraints · manual testing · features in real codebases

Agentic Engineering

Formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.

The idea worth building your strategy around

Agent = Model + Harness

MODEL: ~10%

HARNESS: ~90% · prompts · tools · context · hooks · sandboxes · observability

The ~90% is your surface area, not the provider’s

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.

“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding

Low CapEx · High OpEx

Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering

High CapEx · Low OpEx

Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
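One way to picture the model-routing lever is a small rule-based router. This is a hedged sketch, not a method described in the paper: the model names, the per-task costs, the keyword list, and the file-count threshold are all made-up assumptions, and a production router would more likely use a classifier or past eval results than keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str          # hypothetical model label, not a real product name
    cost_per_task: int  # hypothetical cost in arbitrary units


CHEAP = Route("small-model", 1)
STRONG = Route("large-model", 20)

# Assumed heuristic: these words hint that a task needs the stronger model.
HARD_KEYWORDS = ("refactor", "architecture", "concurrency", "security")


def route(task: str, files_touched: int) -> Route:
    """Send easy work to the cheap model and risky or wide changes to the strong one."""
    text = task.lower()
    if files_touched > 3 or any(keyword in text for keyword in HARD_KEYWORDS):
        return STRONG
    return CHEAP


def total_cost(tasks: list[tuple[str, int]], always_strong: bool = False) -> int:
    """Sum per-task cost, either routed or with every task sent to the strong model."""
    if always_strong:
        return STRONG.cost_per_task * len(tasks)
    return sum(route(task, files).cost_per_task for task, files in tasks)


tasks = [
    ("rename a variable", 1),
    ("fix typo in README", 1),
    ("refactor auth module", 6),
]

print(total_cost(tasks))                      # 22
print(total_cost(tasks, always_strong=True))  # 60
```

With these invented numbers, routing cuts spend from 60 to 22 units. The saving only holds if the cheap model actually succeeds on the easy tasks, which is why routing is usually paired with evals.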

85%

of devs use AI coding agents (51% daily)

41%

of all new code is AI-generated

~90%

of agent behavior is the harness, not the model

+19%

longer on some tasks (METR) — verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

thorstenmeyerai.com

Why Focusing on Harness and Verification Transforms AI Development

This shift in focus from the model to the harness and verification processes fundamentally changes how organizations should approach AI development. It suggests that long-term success and cost-efficiency depend on investing in configuration, testing, and context management, rather than solely chasing the latest models. This insight could lead to more reliable, secure, and scalable AI systems, and redefine best practices in the industry.


The Evolution of AI in Software Engineering

The whitepaper builds on the growing adoption of AI coding agents, where data shows that 85% of developers use these tools regularly, and 41% of new code is AI-generated. Historically, the focus has been on model improvements, but recent experiments, including those cited by the authors, demonstrate that harness tuning can outperform model upgrades in performance metrics. This aligns with broader industry trends emphasizing verification, structured context, and configuration as key to scaling AI systems effectively.

Previously, the dominant narrative centered on the rapid evolution of models like GPT and Claude, but emerging evidence suggests that the real bottleneck and opportunity lie in how these models are integrated, controlled, and verified within larger systems.

“The biggest shift in software engineering isn’t a new language or framework; it’s moving from writing code to expressing intent and trusting machines to execute that intent.”
— Addy Osmani

Unclear Aspects of Implementation and Industry Adoption

It is not yet clear how quickly organizations will shift their focus from model upgrades to harness configuration and verification. The long-term impact on AI development costs, security, and reliability remains to be empirically validated across diverse industry sectors and use cases.

Next Steps for AI System Design and Industry Adoption

Organizations are expected to reevaluate their AI development strategies, prioritizing investments in configuration, testing, and context engineering. Future research and case studies will likely explore how this focus impacts system performance, security, and total cost of ownership over time. Industry leaders may also develop new tools and best practices to facilitate this shift.

Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper explains that the model’s behavior is heavily influenced by the surrounding harness — including prompts, tools, context, and verification mechanisms — which determine the system’s overall performance more than the model itself.

How does this shift affect AI development costs?

While vibe coding appears cheaper initially, it can lead to higher long-term costs due to increased token consumption, maintenance, and security issues. Disciplined engineering costs more upfront but reduces marginal costs at scale.

What practical steps should organizations take?

Organizations should focus on improving their harness — by refining prompts, tools, context management, and verification processes — and adopt structured testing and evaluation practices to ensure reliable AI performance.
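The split between tests and evals ("tests verify the deterministic; evals verify the rest") can be sketched in a few lines of Python. Everything here is illustrative and assumed: `slugify` stands in for ordinary deterministic code, `canned_generate` stands in for a model call, the keyword grader is a deliberately naive rubric, and the 0.9 threshold is an arbitrary choice for the CI gate.

```python
from typing import Callable


def slugify(title: str) -> str:
    # Deterministic code: same input, same output, so an exact assertion works.
    return "-".join(title.lower().split())


def test_slugify() -> None:
    assert slugify("The New SDLC") == "the-new-sdlc"


def eval_pass_rate(
    generate: Callable[[str], str],
    cases: list[tuple[str, str]],
    grade: Callable[[str, str], bool],
) -> float:
    """Fraction of eval cases whose generated output the grader accepts."""
    passed = sum(1 for prompt, expected in cases if grade(generate(prompt), expected))
    return passed / len(cases)


def ci_gate(pass_rate: float, threshold: float = 0.9) -> bool:
    """Block the merge unless the eval pass rate meets the threshold."""
    return pass_rate >= threshold


# Stand-in for a model call: canned answers keyed by prompt.
CANNED = {
    "Summarize the main claim": "The harness matters more than the model.",
    "Name one harness component": "Tools and prompts.",
    "What does vibe coding skip?": "It skips verification.",
    "What do evals check?": "Not sure.",
}


def canned_generate(prompt: str) -> str:
    return CANNED.get(prompt, "")


def keyword_grade(output: str, keyword: str) -> bool:
    # Naive rubric: the output must mention the expected keyword.
    return keyword.lower() in output.lower()


cases = [
    ("Summarize the main claim", "harness"),
    ("Name one harness component", "tools"),
    ("What does vibe coding skip?", "verification"),
    ("What do evals check?", "non-deterministic"),
]

test_slugify()
rate = eval_pass_rate(canned_generate, cases, keyword_grade)
print(rate)           # 0.75
print(ci_gate(rate))  # False
```

The unit test either passes or fails outright, while the eval yields a rate that is compared against a threshold. In practice the grader would be a stricter rubric or a second model, and the case set would be far larger.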

Does this mean models are no longer important?

Models remain foundational, but their importance is now understood to be only part of the system. The surrounding configuration and verification processes are equally, if not more, critical for achieving desired outcomes.

Source: ThorstenMeyerAI.com


Author

Startup Sofa Team
