By Alex Ivan — 12 May 2026

Memo: The Compute-to-Surplus Pipeline Is a Product Spec. Here's How to Ship Against It.

Last week's memo named the structural problem in agentic commerce – stronger agents extract surplus from weaker ones, and the losers can't tell. This week is for the operators. If you are a PM at a company whose product will, within twelve months, host transactions between AI agents – and almost all of you are – here is the framework, the dashboards, and the surface area you should already be sketching.

A reader of last week's memo wrote in to make a fair point: naming the compute-to-surplus pipeline is the easy part. Living with it is the hard part. The reader runs product at a mid-market B2B platform – the kind of place where agents are about to start transacting on behalf of buyers and sellers, where the roadmap committee meets every other Thursday, and where "build the asymmetry visibility layer" reads, in a planning doc, as approximately one-third of a real ticket.

That is the right complaint. So this memo is for that PM, and the hundred thousand others in roughly the same seat.

The thesis of this memo is narrow and load-bearing: the compute-to-surplus pipeline is not a thought experiment. It is a product specification. The products that ship against it in 2026 will define the category for a decade. The products that don't will spend 2028 explaining to their board why the marketplace they own is bleeding margin to a counterparty layer they never instrumented.

What follows is the framework, as I would write it on the whiteboard during a planning offsite.

Why your existing product instincts are about to fail you, in one paragraph

For roughly fifteen years, the dominant product question has been some flavor of "how do we reduce friction for the human user?" Funnels, onboarding flows, novice-to-expert ramps, empty-state design, recommendation engines. All of it presupposes a human at the keyboard whose skill, intent, and attention are the variable to optimize against. The Project Deal data is the first piece of clean empirical evidence that, in agent-mediated transactions, that variable does not move the outcome. User instructions to the agent had no statistically significant effect on price or sale likelihood. What moved the outcome was which model the agent was running on. If you are a PM whose product roadmap is built around making humans more skillful, more engaged, or more attentive, your roadmap is solving a problem that is becoming structurally irrelevant for an increasing share of your transactions. This is not hypothetical. It is measurable, today, in a published experiment with real money.

The new variable is the capability gradient between the two agents in any transaction. Your job is to instrument it, expose it, and ship against it.

The four primitives

There are four operator-level concepts that, taken together, are the spec. Memorize them. Use the names. Naming these clearly in your roadmap docs is half the fight, because right now nobody in your org has language for any of this.

1. Capability gradient. The slope between the strongest and weakest agent on any given transaction. In Project Deal, this was Opus 4.5 vs. Haiku 4.5 – roughly a 5-10x inference cost ratio – and it translated to $2.68 of seller surplus per item plus two extra closed deals per participant. Your product's equivalent of "session length" or "DAU" is now the gradient distribution across your transaction base. You should know, today, what the median, p10, and p90 of the capability gradient look like on your platform. If you don't, you are flying blind on the variable that determines who is winning and losing inside your funnel.

2. The silent-loss problem. This is the operator-facing name for the finding that broke the experiment. Users on the losing side of the gradient could not detect their loss. Not in the moment, not in aggregate, and not retrospectively when shown both runs side-by-side. The PM consequence is brutal: you cannot rely on user feedback as a signal for whether your product is treating your users well. Your NPS will be fine. Your CSAT will be fine. Your support tickets will not spike. The damage will show up six quarters later as cohort retention erosion that nobody on your team can attribute to a feature, because the feature is the one you didn't ship.

3. Counterparty instrumentation. The product-analytics category that does not exist yet but has to. Every analytics tool your team uses today – Amplitude, Mixpanel, Heap, Pendo, the in-house thing your data team built – logs what your user did. None of them log what the counterparty's agent did to your user. In an agent-mediated transaction, that is the entire game. You need to know, per transaction: what model was on the other side, what capability tier it represents, how many turns the negotiation took, and what the outcome distribution looks like for users matched against that tier versus the median tier. This is a net-new schema. No vendor sells it. You are going to build it, or your competitor will.

4. The agent-tier contract. The shippable UX surface that closes the loop with the user. Think of it as the SSL padlock of agent commerce – a small, persistent, unspoofable disclosure that tells the user, before and after each transaction, what capability tier represented them and what tier represented the counterparty. It is not a settings page. It is a status indicator. The product that ships this first, on a real marketplace, with real numbers, sets the convention for the entire category. The product that ships it second is following the standard.

These four – gradient, silent-loss, counterparty instrumentation, tier contract – are the spec. The rest of this memo is what you do with them.
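To make primitive 3 concrete, here is a minimal sketch in Python of the per-transaction record and the tier-versus-median comparison it describes. Everything named here is an assumption for illustration: the field names, the integer tier scale (1 = weakest, 3 = strongest), and the user_outcome metric are hypothetical, not a schema from Project Deal or from any vendor.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, median_low


@dataclass(frozen=True)
class CounterpartyRecord:
    """One agent-mediated transaction, seen from our user's side."""
    counterparty_model: str   # what model was on the other side
    counterparty_tier: int    # hypothetical scale: 1 = weakest, 3 = strongest
    turns: int                # how many turns the negotiation took
    user_outcome: float       # hypothetical metric, e.g. surplus kept by our user


def outcome_vs_median_tier(records):
    """Return {tier: mean user outcome minus the mean outcome at the median tier}."""
    by_tier = defaultdict(list)
    for r in records:
        by_tier[r.counterparty_tier].append(r.user_outcome)
    # median_low always returns a tier that actually occurs in the data.
    median_tier = median_low([r.counterparty_tier for r in records])
    baseline = mean(by_tier[median_tier])
    return {tier: mean(vals) - baseline for tier, vals in sorted(by_tier.items())}


# Made-up sample rows, purely to exercise the function.
records = [
    CounterpartyRecord("model-a", 1, 4, 10.0),
    CounterpartyRecord("model-b", 2, 6, 8.0),
    CounterpartyRecord("model-b", 2, 5, 6.0),
    CounterpartyRecord("model-c", 3, 9, 4.0),
]
deltas = outcome_vs_median_tier(records)
print(deltas)
```

A negative delta for a tier means users matched against that tier did worse than users matched against the median tier. That is the silent-loss signal your existing analytics cannot see.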

What goes on the roadmap this quarter

If you take the spec seriously, three things move up your priority stack and one thing moves down. In rough order:

Move up: counterparty logging, immediately, even if you do nothing with it for a quarter. You cannot reason about a variable you are not capturing. Every agent-mediated transaction on your platform should, starting in the current sprint, write a row that includes which model represented each side. If your platform doesn't expose that signal directly – many won't, today – then capture the cheapest available proxy: provider, model family, declared tier, latency profile, anything. The data is worthless if you wait for the schema to be perfect. The data compounds if you start logging now and clean it later. This is a one-engineer, two-week project, and it is the highest-leverage thing on this list.
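As a rough sketch of what that row could look like, assuming an append-only JSON Lines log: the column names, the four proxy fields, and the provider and model-family values below are hypothetical placeholders, chosen only to mirror the proxies listed above. Missing signals are stored as None rather than dropped, so the schema can be cleaned later without losing rows.

```python
import io
import json
from datetime import datetime, timezone

# Hypothetical proxy fields, mirroring the list in the paragraph above.
PROXY_FIELDS = ("provider", "model_family", "declared_tier", "median_latency_ms")


def build_row(transaction_id, our_side, their_side):
    """Build one log row; unknown signals are stored as None, never dropped."""
    row = {
        "transaction_id": transaction_id,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
    for side_name, side in (("ours", our_side), ("theirs", their_side)):
        for field in PROXY_FIELDS:
            row[f"{side_name}_{field}"] = side.get(field)
    return row


def append_row(stream, row):
    """Append the row as one JSON line (log now, clean later)."""
    stream.write(json.dumps(row, sort_keys=True) + "\n")


log = io.StringIO()  # stand-in for a real file or table writer
row = build_row(
    "txn-001",
    our_side={"provider": "provider-x", "model_family": "family-small", "declared_tier": 2},
    their_side={"median_latency_ms": 840},  # only a latency proxy is available
)
append_row(log, row)
print(log.getvalue())
```

The design choice worth copying is the explicit None: a row with three unknowns still records that an agent-mediated transaction happened, which is the count you need first.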

Move up: a single internal dashboard called "the gradient." One number. Median capability gradient across your last 10,000 transactions, with p10 and p90. Refreshed daily. Visible in the same Slack channel where you post DAU. The point of the dashboard is not to act on it yet. The point is that within sixty days, your CEO, your head of growth, and your head of trust & safety all start asking questions about it. That is when you have won the internal argument that the variable matters. You will not win that argument with a memo. You will win it with a chart that does not move except when something interesting is happening.
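The number behind that dashboard could be computed roughly as follows. This is a sketch under one loud assumption: the gradient is defined here as the absolute gap between two integer tier scores, which is my illustrative choice and not a definition from Project Deal. The percentile method ("inclusive") is likewise just one reasonable option.

```python
from statistics import median, quantiles

WINDOW = 10_000  # "last 10,000 transactions", per the memo


def gradient(tier_a, tier_b):
    """Hypothetical gradient: absolute gap between the two agents' tier scores."""
    return abs(tier_a - tier_b)


def gradient_summary(tier_pairs, window=WINDOW):
    """Median, p10 and p90 of the gradient over the most recent `window` pairs."""
    recent = tier_pairs[-window:]
    values = [gradient(a, b) for a, b in recent]
    if len(values) < 2:
        raise ValueError("need at least two transactions to compute percentiles")
    deciles = quantiles(values, n=10, method="inclusive")  # 9 cut points
    return {"p10": deciles[0], "median": median(values), "p90": deciles[-1]}


# Made-up data: gradients 0 through 10, one transaction each.
tier_pairs = [(g, 0) for g in range(11)]
summary = gradient_summary(tier_pairs)
print(summary)
```

Run daily against the logging table, this yields the three values to post next to DAU.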

Move up: a "tier disclosure" prototype, even if you don't ship it. Build the agent-tier contract UX, in Figma or in code, before you have to. Two screens: pre-transaction ("you are about to be represented by a Tier 2 agent against a Tier 1 counterparty") and post-transaction ("your agent recovered approximately 38% of the available surplus on this trade"). Show it to ten users. Do not ship it. The point of the prototype is to answer the question that decides this entire category for your company: do users care? Project Deal's finding is that they cannot tell on their own. Your prototype tests whether, when told, they can act on it. If they can, the disclosure is the wedge. If they can't, you are in a different and harder business than you thought.
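If you prototype in code rather than Figma, the two screens reduce to two strings. A minimal sketch follows, with the caveat that "available surplus" is the hard part: the function below assumes you already have an estimate of it (a hypothetical input), which in practice your platform may not be able to observe directly.

```python
def pre_transaction_notice(own_tier, counterparty_tier):
    """Copy for the pre-transaction screen."""
    return (f"You are about to be represented by a Tier {own_tier} agent "
            f"against a Tier {counterparty_tier} counterparty.")


def surplus_share(user_surplus, available_surplus):
    """Fraction of the available surplus the user's agent captured, clamped to [0, 1]."""
    if available_surplus <= 0:
        raise ValueError("available_surplus must be positive")
    return max(0.0, min(1.0, user_surplus / available_surplus))


def post_transaction_notice(user_surplus, available_surplus):
    """Copy for the post-transaction screen."""
    pct = round(surplus_share(user_surplus, available_surplus) * 100)
    return (f"Your agent recovered approximately {pct}% of the available "
            f"surplus on this trade.")


# Made-up numbers chosen to reproduce the example copy in the memo.
pre = pre_transaction_notice(own_tier=2, counterparty_tier=1)
post = post_transaction_notice(user_surplus=3.8, available_surplus=10.0)
print(pre)
print(post)
```

Clamping the share keeps the disclosure readable when the surplus estimate is noisy. Whether to show a clamped number or flag the estimate as unreliable is exactly the kind of question the ten-user test should answer.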

Move down: most of what's currently labeled "AI features" on your roadmap. Nine months of every B2B product roadmap I have seen in the last year is some flavor of "put a chat box on it." That work is not wrong, but it is not the work that determines whether your product survives the agent transition. The chat box is a feature. The instrumentation is the moat. If you have one engineer to allocate, put them on the moat.

The argument you will lose three times before you win it

You will walk into roadmap review with a version of this and someone – usually the most senior person in the room – will say one of three things. Each of them is wrong in a specific way, and you should be ready.

"This is too speculative. The agent economy isn't real yet."

The agent economy was operationally real on April 28, 2026, the day Microsoft made Agent 365 generally available and Salesforce published its headless API. It is becoming financially real every quarter that MoonPay's agent debit card and the Manfred-style auto-incorporation services produce a transaction volume number. The right counter is not to argue about whether agents will transact on your platform. The right counter is to ask whether anyone in the room can tell you, today, what percentage of your existing transactions involved an LLM on at least one side. Almost no PM can answer that question. The fact that nobody can answer it is the problem.

"We can wait for the standard to emerge."

In categories where the differential is invisible to the loser, standards do not emerge from consumer demand. They emerge from regulation, and regulation arrives late. Best-execution reporting in equities arrived after a generation of retail investors had been quietly disadvantaged. The PM who waits for the standard ships into a market where the convention has already been written by whoever moved first. You do not want to be the second product to disclose tier. You want to be the first.

"This isn't really product work, it's policy or trust & safety."

This is the most dangerous version, because it sounds reasonable. The reason it is wrong is that the surface area – the dashboards, the disclosures, the schema, the UX – is product work in every meaningful sense. T&S can write the policy. Legal can review the disclosure language. But the artifact the user sees, the latency budget the disclosure has to fit inside, the way the tier indicator interacts with the rest of the interface – that is product. If you cede the artifact to another function, the artifact will be shipped twelve months late and look like a compliance dialog. The user will dismiss it. The opportunity will close.

What "good" looks like, twelve months out

A year from now, the products that have shipped against the compute-to-surplus pipeline will share a small number of features. Internal dashboards that track gradient distribution across the transaction base. A disclosure UX – probably persistent, probably small, probably color-coded by tier – that tells a user what capability is on each side of any agent-mediated action. A pricing or product mechanism that lets a user, knowing the gradient, choose to upgrade their representation for a specific transaction. (The "premium agent for this negotiation only" SKU is going to exist, and it is going to be enormously profitable.) A trust signal in the marketplace's brand that says, in effect, we measure this so you don't have to.

The products that have not shipped against the pipeline will share one feature: a churn cohort that nobody can explain.

The bottom line

The compute-to-surplus pipeline is going to be the most important variable in your product within two years and it is currently being tracked by approximately zero of your dashboards. The fix for that is not philosophical. It is the four primitives, the three roadmap moves, and the willingness to walk into a planning meeting and argue that an internal chart and a Figma prototype are the most valuable things on next quarter's plan.

Last week's memo ended with the line "the Project Deal data is not the end of an experiment. It is the start of the spec." This memo is the spec. The rest is shipping.

The PMs who internalize this in May 2026 are going to look, in May 2028, like they had a crystal ball. They didn't. They had a published experiment, a four-word concept, and the discipline to put one engineer on the logging table while everyone else was still arguing about whether it mattered.

That is the entire trick.

Companion to last week's memo on Project Deal. The capability gradient, silent-loss problem, counterparty instrumentation, and agent-tier contract are operator-level reframings of the structural finding originally published by Anthropic in April 2026. All numerical figures cited here originate from Anthropic's Project Deal writeup. Industry context on Microsoft Agent 365, Salesforce headless, and MoonAgents Card draws from May 2026 reporting.