The Router Is The Vendor

On June 16, 2026, Microsoft made Copilot Cowork generally available to every enterprise customer running Microsoft 365. The product was pitched as an agentic system that runs long, multi-step workplace tasks across company files, apps, and data systems, returning a finished result rather than a suggestion. The marketing carried Microsoft's name. The engineering inside, live or in testing, came from three other companies.

At general availability the platform ran on Anthropic's Opus 4.8 and Sonnet 4.6. Frontier-tier customers could route to OpenAI's GPT-5.5. Microsoft announced that its own model, called Cowork 1, was coming soon. Two days later, reporting confirmed Microsoft was actively testing a hosted version of China's DeepSeek V4 on Azure as a cheap engine for routine tasks, with cost reductions of up to 90 percent targeted for that lane. The GA announcement also quietly moved Cowork off the flat $30 per-user Copilot license into a metered, usage-based credit model.

So what did the customer actually pick when they signed the GA contract? They picked a router. The vendor name was the label on the wrapper.

This is the procurement shift that quietly changes everything about how a board should think about choosing between AI vendors over the next twelve months. The model you license is no longer the asset that matters. The routing policy is.

The model identity dissolved into a routing decision

When Anthropic shipped Claude Fable 5 on June 9, 2026, the headline number was $10 per million input tokens and $50 per million output tokens. GPT-5.5 had been the anchor at $5 input and $30 output, so Fable 5 doubled the frontier's input price ceiling and raised the output ceiling by two thirds. Two days later, CNBC reported (citing the WSJ) that OpenAI was preparing significant price cuts to win consumers and enterprises back from Anthropic. Anthropic's annualized run rate had jumped from $9 billion at the end of 2025 to $47 billion by May, a 422 percent increase in five months, mostly on the strength of Claude Code. The model market now moves in days, not quarters.
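To make those list prices concrete, here is a minimal sketch of the per-call arithmetic. The dictionary keys are illustrative labels rather than real API model identifiers, and the workload shape (8,000 tokens in, 1,000 tokens out) is an assumption chosen only for the example.

```python
# List prices quoted above, in USD per million tokens.
# The keys are illustrative labels, not real API model identifiers.
PRICES = {
    "fable-5": {"input": 10.00, "output": 50.00},
    "gpt-5.5": {"input": 5.00, "output": 30.00},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call at list price."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def pct_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100


# Assumed workload shape: 8,000 input tokens and 1,000 output tokens per call.
fable = call_cost("fable-5", 8_000, 1_000)  # 0.08 + 0.05 = 0.13 USD
gpt = call_cost("gpt-5.5", 8_000, 1_000)    # 0.04 + 0.03 = 0.07 USD
ratio = fable / gpt                          # about 1.86x per call

# Run-rate move quoted above: $9B to $47B.
growth = pct_increase(9, 47)                 # about 422 percent
```

On that assumed shape the premium call costs a little under twice the anchor, not exactly double, because the output price rose less than the input price. The real ratio depends entirely on your own input-to-output mix.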

If you signed a "preferred vendor" contract with one of the labs in Q1 of this year, your contract is already mispriced. Either the lab cut prices and you are still paying the old number, or the lab raised the ceiling and your competitor is consuming a cheaper engine for the same workload. Either way, you have locked yourself into the wrong side of a curve that bends faster than your renewal cycle.

The platforms know this. AWS Bedrock now lists close to 100 models, from providers and model families including Mistral, Google, NVIDIA, OpenAI, MiniMax, Moonshot, and Qwen. Microsoft now routes between Anthropic, OpenAI, its own forthcoming Cowork 1, and (soon) DeepSeek inside a single product. Google's enterprise pitch leans on a multi-billion-dollar commitment to Anthropic, its own Gemini stack, open-source Gemma, and whatever lives on Vertex from third parties.

None of the hyperscalers are picking a winner. They are building the muxer. They have read the same pricing data the rest of us have, and they have concluded that the durable economic position is to own the routing layer rather than bet on any single supplier underneath it.

The buyer used to pick the model. Now the buyer picks the policy.

I have sat in enough vendor-evaluation meetings over the last eighteen months to have a strong opinion about this. When a board asks "should we go with OpenAI or Anthropic?", the question feels weighty. It carries the signal that someone in the room is being decisive. It triggers procurement, legal, security, and an RFP that ends in a six-figure annual commit and a CISO sign-off.

The question is the wrong one to answer.

The decision a board actually has to make is who, inside the enterprise, holds the authority to set a routing policy across the firm's AI workload. A routing policy is a stack of rules. This task class goes to that model with this cost ceiling. If the model fails, fall back to the next one in line. If cost overruns a threshold, throttle and alert. If the input contains PII, route only to vendors with the right data-residency commitment. If the output gets cited in an audit later, the lineage has to prove which model returned which token.
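The rule stack above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not a production design: the names Rule, Policy, route, cheap-model, and premium-model are all hypothetical, the health check is passed in as a plain function, and the lineage log records one entry per routing decision, which is coarser than the per-token lineage an audit may ultimately require.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    models: list[str]      # preferred model first, then fallbacks in order
    max_cost_usd: float    # per-call cost ceiling for this task class


@dataclass
class Policy:
    rules: dict[str, Rule]
    pii_approved: set[str]  # models whose vendor meets the residency commitment
    lineage: list[dict] = field(default_factory=list)

    def route(self, task_class: str, contains_pii: bool,
              est_cost_usd: float, is_up: Callable[[str], bool]) -> dict:
        rule = self.rules[task_class]
        if est_cost_usd > rule.max_cost_usd:
            # Cost overruns the threshold: throttle and alert instead of routing.
            decision = {"action": "throttle_and_alert", "model": None}
        else:
            # Keep models that are allowed for this input and currently healthy,
            # preserving the fallback order from the rule.
            eligible = [m for m in rule.models
                        if (not contains_pii or m in self.pii_approved) and is_up(m)]
            decision = ({"action": "route", "model": eligible[0]} if eligible
                        else {"action": "reject", "model": None})
        # Record every decision so an audit can see which model was chosen.
        self.lineage.append({"task_class": task_class, "pii": contains_pii, **decision})
        return decision


# Hypothetical policy: one task class, two placeholder model names.
policy = Policy(
    rules={"summarize": Rule(models=["cheap-model", "premium-model"], max_cost_usd=0.05)},
    pii_approved={"premium-model"},
)
all_up = lambda model: True
decision = policy.route("summarize", contains_pii=True, est_cost_usd=0.01, is_up=all_up)
```

In this example the PII flag skips the cheap model and lands on the approved one. The point of the sketch is ownership: every rule in it is a line someone inside the firm can read, change, and be accountable for.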

None of that is in the OpenAI versus Anthropic question. All of it sits in the architecture one layer above the model.

I have watched mid-market firms commit to a single lab for "consistency" reasons and then discover six months later that 80 percent of their use cases were commodity reasoning that DeepSeek V4 will handle for one tenth the cost, 15 percent needed a coding model where Sonnet 4.6 dominates, and 5 percent needed an audit-bearing reasoning chain where Opus 4.8 or Fable 5 was worth the premium. They paid the premium for the entire workload. The savings they left on the table funded a competitor's next product.
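A rough worked version of that split, in relative terms. The 80/15/5 mix and the one-tenth figure come from the paragraph above; the relative cost of the coding tier (0.50) is a placeholder I am assuming for illustration, and the shares are treated as shares of token volume.

```python
# Workload mix from the paragraph above, treated as shares of token volume.
MIX = {"commodity": 0.80, "coding": 0.15, "audit": 0.05}

# Unit cost per tier relative to the premium model (premium = 1.0).
# commodity = 0.10 follows the "one tenth the cost" figure above;
# coding = 0.50 is an assumed placeholder, not a quoted price.
RELATIVE_COST = {"commodity": 0.10, "coding": 0.50, "audit": 1.00}


def blended_cost(mix: dict[str, float], cost: dict[str, float]) -> float:
    """Volume-weighted average unit cost across tiers."""
    return sum(share * cost[tier] for tier, share in mix.items())


routed = blended_cost(MIX, RELATIVE_COST)  # 0.08 + 0.075 + 0.05 = 0.205
single_vendor = 1.0                        # everything on the premium model
savings = 1 - routed / single_vendor       # about 0.795
```

Under those assumptions the routed workload costs about a fifth of the single-vendor one. Change the placeholder and the number moves, but the shape of the result does not.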

The choice that's being made for you

Here is the unsettling part. If you do not make this decision, the platform you bought has already made it for you.

Look at what Microsoft did with Copilot Cowork's billing. The product became metered. A Copilot Credit gets consumed per agent action. Microsoft said cheap models would handle routine tasks (potentially at 90 percent cost reduction once DeepSeek V4 is hosted), and premium models would handle complex work. Microsoft decides which is which. The customer pays the credit. The customer cannot see the unit cost of the underlying token.

That is the entire enterprise software industry now. You do not procure a vendor. You procure a routing policy authored by someone else. The policy is written in the platform's engineering, not in your contract. If Microsoft decides that "summarize this document" runs on DeepSeek V4 because the cost per token is one tenth of Opus, you are now consuming a Chinese open-source model through a Microsoft wrapper. You are also funding Microsoft's margin between the cheap inference underneath and the premium credit price on top. Both of those facts are true and both belong on your risk register.

Is that bad? It depends on your governance posture. Is it what you thought you bought? Almost certainly not.

The same dynamic plays out at AWS Bedrock and at Google Vertex. When you bought Bedrock, you bought a router. The router can swap underlying models. The router can re-price the underlying models. The router can introduce a new lab tomorrow and deprecate one next quarter, and your application code will keep compiling. The router is the actual product you procured. The model is a SKU inside it.

The geopolitical layer makes this sharper. In May 2026, the Pentagon awarded eight classified AI contracts to OpenAI, Google, SpaceX, and a handful of others. Anthropic was excluded. Whatever the merits of that decision, it tells you something useful: which models a regulated firm can even consider is now decided by national-security policy, not by procurement. If your vendor is a router, you can adapt to that overnight. If your vendor is a single lab, you cannot.

The four questions worth asking when choosing between AI vendors

You are still picking a vendor. The question is which layer you are picking at. There are four questions worth asking, and they replace the old ones almost entirely.

The first question is what the unit of routing is. Some platforms route at the task type, separating summarization from coding from reasoning. Some route at the agent step, where one model writes the plan and a cheaper one fills in the actions. Some route at the cost threshold, escalating anything over a per-call cents target to a more capable model or to a human. The platform that lets you set the unit lets you own the routing. The vendor that hides the unit owns you.

The second question is how visible the cost is in production. Anthropic paused a credit-overhaul change on June 15 after customer pushback about transparency, which tells you transparency is a sore point even for the labs themselves. Microsoft's metered Copilot Credit pricing is opaque enough that several Microsoft-shop CIOs I have spoken to in the last two weeks said they cannot reliably project monthly Cowork spend before turning it on. If you cannot watch the meter run in real time, you cannot govern the spend, and you cannot defend the spend at your next board review when the line item triples without explanation.
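What "watching the meter run" means can be sketched very simply. The class below is a hypothetical illustration, assuming you can obtain a dollar cost per call from your own routing layer; SpendMeter, its thresholds, and the model labels are my own names, not any platform's API.

```python
from collections import defaultdict


class SpendMeter:
    """Running spend tracker with an early-warning threshold (hypothetical sketch)."""

    def __init__(self, monthly_budget_usd: float, alert_fraction: float = 0.8):
        self.budget = monthly_budget_usd
        self.alert_at = alert_fraction * monthly_budget_usd
        self.total = 0.0
        self.by_model = defaultdict(float)  # spend broken out per model

    def record(self, model: str, cost_usd: float) -> str:
        """Add one call's cost and return the current budget status."""
        self.by_model[model] += cost_usd
        self.total += cost_usd
        return self.status()

    def status(self) -> str:
        if self.total >= self.budget:
            return "over_budget"
        if self.total >= self.alert_at:
            return "alert"
        return "ok"


# Assumed figures: a $100 monthly budget with an alert at 80 percent of it.
meter = SpendMeter(monthly_budget_usd=100.0)
first = meter.record("cheap-model", 50.0)     # total 50  -> "ok"
second = meter.record("premium-model", 35.0)  # total 85  -> "alert"
third = meter.record("premium-model", 20.0)   # total 105 -> "over_budget"
```

None of this works if the platform only exposes credits consumed. The per-model breakdown is the part that opaque metering takes away.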

The third question is how reversible the choice is. Vendor lock-in used to mean integration depth, the SAP-grade switching costs measured in years. Vendor lock-in in AI is measured in prompts. If your prompts, evals, RAG sources, and tool definitions are written against one provider's API surface, you have embedded a switching cost into a layer your procurement team never saw. Anthropic's tool-use schema is not OpenAI's tool-use schema, and neither matches Google's. The pain shows up in the migration. The contract turns out to be the easy part.
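One way to keep that switching cost in a layer you control is to define tools once in a neutral shape and translate at the edge. The sketch below reflects the tool-definition formats of Anthropic's Messages API and OpenAI's Chat Completions API as I understand them from their public documentation; treat the exact field names as an assumption to verify against current docs. The neutral format and the lookup_invoice tool are hypothetical.

```python
# A provider-neutral tool definition (hypothetical in-house format).
# "parameters" is a standard JSON Schema object.
NEUTRAL_TOOL = {
    "name": "lookup_invoice",
    "description": "Fetch an invoice by its identifier.",
    "parameters": {
        "type": "object",
        "properties": {"invoice_id": {"type": "string"}},
        "required": ["invoice_id"],
    },
}


def to_anthropic(tool: dict) -> dict:
    """Anthropic-style tool entry: the JSON Schema sits under "input_schema"."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }


def to_openai(tool: dict) -> dict:
    """OpenAI Chat Completions-style entry: wrapped in a "function" object."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["parameters"],
        },
    }


anthropic_tool = to_anthropic(NEUTRAL_TOOL)
openai_tool = to_openai(NEUTRAL_TOOL)
```

The adapters are trivial, which is the point. The expensive part of a migration is everything that was written directly against one of these shapes instead of against the neutral one.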

The fourth question is who keeps the savings on the next price move. When Anthropic shipped Fable 5 at double the prior ceiling, the cost passed through to every workload that the platform silently elevated to the new tier. When OpenAI cuts prices in response, who recaptures that saving? In a routed platform, the platform vendor pockets the spread by default. The contract clause that lets the customer share in price drops is not standard. It should be a negotiated line, and it almost never is.

What this looks like in a real engagement

A services firm I worked with last quarter, roughly 200 people, ran an evaluation across three labs before standardizing. They picked the lab with the strongest reasoning scores on their internal benchmark. Six weeks later they noticed two things.

Their actual production workload was 70 percent retrieval-augmented summarization of long internal documents, and the model they picked was the worst of the three options on that specific task class. They had benchmarked the wrong thing because the benchmark predated the deployment. Separately, the model they picked had shipped a price-per-output-token reduction during the pilot, and a competitor running the same model was paying 30 percent less because they had renegotiated quarterly. The client's procurement team had locked the old rate for twelve months "to lock in budget certainty." What they locked in was the overpayment: against a competitor paying 30 percent less, the client was paying roughly 43 percent more for the same tokens.

The fix had nothing to do with switching vendors. They installed a routing layer, built on AWS Bedrock's unified API surface plus a thin in-house policy engine, that lets the firm move 80 percent of the workload to whichever underlying model is cheapest at adequate quality, route the hard 20 percent to a premium model, and renegotiate every quarter. The lift was three weeks of engineering. The savings funded the engineering inside the first month, with optionality left over to ship a new internal product on DeepSeek V4 the week Microsoft confirmed Azure hosting.
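The core rule in that policy engine, "cheapest at adequate quality," fits in a few lines. This is a hedged sketch rather than the client's actual code: the catalog entries, quality scores, and prices are invented placeholders, and in practice the quality numbers would come from your own evaluation runs.

```python
from typing import Optional

# Hypothetical catalog: quality is a 0-1 score from an internal eval,
# cost is USD per million tokens. All values are placeholders.
CATALOG = [
    {"name": "model-a", "quality": 0.92, "cost_per_mtok": 30.0},
    {"name": "model-b", "quality": 0.85, "cost_per_mtok": 6.0},
    {"name": "model-c", "quality": 0.70, "cost_per_mtok": 1.0},
]


def pick_model(candidates: list[dict], min_quality: float) -> Optional[str]:
    """Return the cheapest model that meets the quality bar, or None if none does."""
    adequate = [c for c in candidates if c["quality"] >= min_quality]
    if not adequate:
        return None
    return min(adequate, key=lambda c: c["cost_per_mtok"])["name"]


routine = pick_model(CATALOG, min_quality=0.65)     # "model-c": cheapest that clears the bar
hard = pick_model(CATALOG, min_quality=0.90)        # "model-a": only one that clears the bar
impossible = pick_model(CATALOG, min_quality=0.99)  # None: escalate to a human
```

Renegotiating every quarter then becomes a data update to the catalog, not an engineering project.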

That outcome was available to every firm in their cohort. None of the other firms had bought it. They had all bought a vendor.

The architecture move that survives the price war

You can predict the next eighteen months of this market with reasonable confidence. Prices will continue to fall on commodity inference and rise on frontier reasoning. New models will appear roughly monthly. Hyperscalers will continue absorbing labs into routed platforms; AWS Bedrock's expansion from 60 to nearly 100 models in twelve months is the leading indicator, not an outlier. National-security carve-outs will continue, defining for regulated firms which labs are even on the menu. Open-source models from outside the US (DeepSeek from China, Qwen from Alibaba, Moonshot, MiniMax) will keep gaining ground on cost and will keep raising governance questions about data residency and supply chain.

In that environment, the durable architecture move is the one that keeps you indifferent to the vendor question. You should be able to run a cost-only model swap with zero application changes. You should be able to add a new model from a new lab in a single sprint. You should be able to walk away from any single lab without rewriting a single prompt.

Most enterprises cannot do any of those things right now. The reason is not that the technology is hard, because it genuinely is not. The reason is that the procurement decision was made one layer too low, the engineering followed the procurement, and now the engineering encodes the wrong commitment. A routing layer on top of Bedrock or Vertex, plus a policy engine you control, plus an evaluation harness that runs against live traffic rather than against a curated test set, costs less to build than the difference between the right model and the wrong model on a single quarter of workload. It pays for itself before your next earnings call.
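An evaluation harness that runs against live traffic can start as something this small. The sketch assumes you can replay a sampled prompt against a candidate model and score the pair of answers; in_sample, shadow_eval, and the judge function are hypothetical names, and the stand-in models in the usage lines are plain string functions.

```python
import hashlib
from typing import Callable, Iterable, Optional


def in_sample(request_id: str, rate: float) -> bool:
    """Deterministically place a request in the sample based on a hash of its ID."""
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate


def shadow_eval(requests: Iterable[tuple[str, str]],
                incumbent: Callable[[str], str],
                candidate: Callable[[str], str],
                judge: Callable[[str, str, str], float],
                rate: float = 0.1) -> Optional[float]:
    """Replay a sample of live prompts against both models and average the judge score."""
    scores = []
    for request_id, prompt in requests:
        if not in_sample(request_id, rate):
            continue
        scores.append(judge(prompt, incumbent(prompt), candidate(prompt)))
    return sum(scores) / len(scores) if scores else None


# Stand-in models and a trivial judge, for illustration only.
live = [("req-1", "summarize q2 report"), ("req-2", "draft reply")]
exact_match = lambda prompt, a, b: 1.0 if a == b else 0.0
score = shadow_eval(live, str.upper, str.upper, exact_match, rate=1.0)
```

Hash-based sampling keeps the same requests in the sample across runs, so a score change reflects the model and not the draw. A real judge would be a rubric or a human review queue, not string equality.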

Conclusion

Architecting routing is the work. Buying a vendor is the false choice that distracts you from doing it. Every off-the-shelf AI platform you procure right now ships with a hidden routing policy you did not author and cannot inspect. Every "preferred vendor" contract you sign locks in a price curve that is bending under you in real time. Every prompt your team writes against one provider's API surface is a small ratchet of switching cost that will not appear in your renewal review and will not stop ratcheting until someone with the authority to redesign the architecture stops it.

You do not need a vendor. You need a routing strategy, a cost meter your CFO can read in real time, a switching cost you control rather than one that controls you, and an evaluation harness that runs against production traffic rather than against a slide deck. That is an architecture problem, not a procurement problem, and it cannot be solved by signing a bigger contract with one of the labs.

This is what Agor AI Advisory builds for executives who are done treating AI as a license they buy and ready to treat it as an operating system they own. We sit on your side of the table, design the routing layer that keeps your options open across every shift in the model market, and install the governance that lets your board actually see what your AI workload is doing and what it is costing. The next twelve months will reward the firms that architect this and punish the firms that procure their way through it.
