Insight

Tokens Don't Ship Themselves

Ariel Agor

June 11, 2026

On May 4, 2026, Anthropic announced a $1.5 billion joint venture with Blackstone, Goldman Sachs, and Hellman & Friedman. The purpose: put Anthropic's engineers inside the offices of mid-market companies owned by those private equity firms, and rewire those companies around Claude. Eight days later, on May 12, OpenAI launched The Deployment Company. Four billion dollars in initial capital. A separate corporate subsidiary. And the same week, OpenAI bought Tomoro, a London-based consulting firm, for its roster of 150 forward deployed engineers with delivery experience at Tesco, Virgin Atlantic, and Supercell.

Two of the most valuable private companies in the world, eight days apart, signed the same confession in public. The model is not the product. The model is half the sale. The other half is a senior engineer with commit rights inside your codebase, writing code that runs in your production environment, against your data, on your latency budget. The labs just told you what they think it costs to put that engineer in the room. They valued it at five and a half billion dollars of new capital across two new corporate vehicles in one eight-day window.

If your AI strategy still treats the foundation model as the thing you buy, you missed the announcement.

The receipt MIT already published

The number that explains the May announcements is from the previous summer. In August 2025, MIT's NANDA initiative published "The GenAI Divide: State of AI in Business 2025," a study built on 150 leader interviews, 350 employee surveys, and an analysis of 300 public AI deployments. The headline finding was the one that got passed around: about 95 percent of enterprise generative AI pilots delivered no measurable impact on the P&L. Five percent achieved revenue acceleration. The rest stalled.

The reading at the time was that AI was overhyped. That reading was wrong. The real signal in the MIT data was buried in the cause analysis. The failed pilots did not fail because the models were bad. They failed because the models did not learn from the customer's workflows, did not integrate with the customer's systems, and did not adapt when the workflow changed underneath them. The 5 percent that worked were the ones with deep integration and people who could redesign the work around the model in real time. The dividing line was not capability. It was embedding.

OpenAI and Anthropic read that report the same way every operator I respect read it. The model is the cheap part of the stack now. The bottleneck is the human who knows the model well enough and the customer well enough to stitch them together inside the customer's actual systems. That role has a name, an old one. Palantir invented it.

What Palantir already knew

Palantir founded its forward deployed engineering practice in the early 2010s out of a basic problem. Their intelligence agency customers could not tell them what they needed because the work was classified. So Palantir put engineers inside those customer environments. The engineers watched the work, prototyped against the real data, and shipped systems that actually got used. By the middle of that decade Palantir had more forward deployed engineers than traditional product engineers.

For a long time the rest of the industry treated this as a Palantir oddity. SaaS gospel said the right shape was self-serve, low-touch, high-margin. Send a salesperson, send a customer success rep, and the product does the rest. Forward deployed engineering looked like the opposite of that gospel. It looked expensive, hard to scale, and culturally weird.

It also worked. Palantir grew to a market capitalization above three hundred billion dollars on a go-to-market motion the SaaS playbook said could not scale. And in 2026, the two companies whose APIs are supposed to be the purest possible SaaS, the two whose unit economics depend most on hands-off self-serve, both stood up billion-dollar forward deployed arms.

That is the confession. That is the news.

Forward deployed AI engineering, as a discipline

A forward deployed AI engineer, in the sense the labs now mean, is a senior software engineer who sits inside a customer organization with commit access to a customer-specific deployment of a model. They write code that runs in production. They build the evaluations that decide whether the workflow is working. They redesign the workflow so the model does what only a senior human used to do. They tune prompts, agents, retrieval, and tool use against the customer's actual data instead of a public benchmark. They own the loop from observation to ship to measure.
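To make "build the evaluations that decide whether the workflow is working" concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption: `EvalCase`, `run_eval`, the `route_ticket` stand-in, the exact-match scoring rule, and the 0.9 ship threshold are hypothetical names and values, not anything published by the labs or taken from a real engagement.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One example drawn from the customer's own workflow (hypothetical shape)."""
    case_id: str
    input_text: str
    expected: str


def run_eval(
    workflow: Callable[[str], str],
    cases: list[EvalCase],
    threshold: float = 0.9,
) -> dict:
    """Run every case through the workflow and report the pass rate.

    `threshold` is an assumed ship/no-ship bar, not a figure from the post.
    Scoring is a case-insensitive exact match, the simplest possible rule.
    """
    failures = [
        case.case_id
        for case in cases
        if workflow(case.input_text).strip().lower() != case.expected.strip().lower()
    ]
    pass_rate = 1.0 - len(failures) / len(cases) if cases else 0.0
    return {
        "pass_rate": pass_rate,
        "failures": failures,
        "ship": pass_rate >= threshold,
    }


def route_ticket(text: str) -> str:
    """Stand-in for the model-backed step; a real deployment would call the model here."""
    return "billing" if "invoice" in text.lower() else "support"


# Hypothetical cases; in practice these come from the customer's real tickets.
cases = [
    EvalCase("t1", "My invoice is wrong", "billing"),
    EvalCase("t2", "The app crashes on login", "support"),
    EvalCase("t3", "Refund the duplicate charge", "billing"),
]

report = run_eval(route_ticket, cases)
print(report)
```

The point of the sketch is the loop, not the scoring rule: the cases come from the customer's data instead of a public benchmark, the failure list tells the engineer what to observe next, and the `ship` flag turns "is the workflow working" into a decision someone can be held to.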

The role does not look like a normal SaaS implementation consultant. The salary tells you that. According to Perspective AI's 2026 compensation report, median mid-level forward deployed engineer total comp is $385,000. Staff-level is $610,000. Principal level at the frontier labs is clearing $1.2 million. A fully loaded forward deployed engineer in the US costs a sponsor between $250,000 and $400,000 a year. Job postings for the role grew roughly 800 percent in a year. Forward deployed AI engineering is the fastest-growing job category in the field, by a factor of three over the next nearest category.
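As a rough budgeting sketch, the fully loaded range quoted above can be scaled to a team. The team size of four is a hypothetical assumption chosen only for illustration; the per-engineer range is the one cited in the post.

```python
# Fully loaded annual cost per US forward deployed engineer, as quoted in the post (USD).
FULLY_LOADED_RANGE = (250_000, 400_000)


def annual_team_cost(headcount: int, per_engineer: tuple[int, int]) -> tuple[int, int]:
    """Return the (low, high) annual cost for a team of `headcount` engineers."""
    low, high = per_engineer
    return headcount * low, headcount * high


team_size = 4  # hypothetical team size, not a figure from the post
low, high = annual_team_cost(team_size, FULLY_LOADED_RANGE)
print(f"{team_size} engineers: ${low:,} to ${high:,} per year")
```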

When OpenAI committed $4 billion to The Deployment Company and bought Tomoro for its 150 engineers, that money was not buying a service-line expansion. It was buying the recognition that the model, sold as a token-priced API, leaves most of the value on the floor of the customer's office. Someone has to pick that value up. That someone now works for the lab.

The category collapse the API era hid

For three years, the enterprise AI conversation was structured around a clean separation. You buy the model from a lab. You buy implementation from a consulting firm. You build the rest yourself. CEOs ran procurement processes against that mental model. RFPs compared model APIs on price per token and benchmark scores. Consulting firms sold practices on top of those APIs. Internal platform teams built abstractions to wrap them.

That separation is over. Anthropic's joint venture sells Claude and the engineers together to mid-market PE portfolio companies. OpenAI's Deployment Company sells GPT and the engineers together to anyone with a problem big enough to justify the embed. The pricing changes. The accountability changes. The blast radius changes. When the lab employs the engineer who writes the production code on your data, the lab is on the hook for the result in a way the API contract never made them.

The consulting firms see this too. Deloitte's earlier deal with Anthropic, announced in October 2025, put Claude on the desks of more than 470,000 Deloitte employees and turned that Claude-trained workforce into a distribution channel. The line between "we sell intelligence" and "we sell a partner who will help you use intelligence" is going to keep blurring through this year, because the labs and the consulting firms have both decided the line is in the wrong place.

For the buyer this is disorienting. The procurement category labels you used in 2024 no longer match what the vendors actually do. The right question is no longer which model API to standardize on. The right question is who is going to sit inside your workflow, observe it, and rebuild it around an LLM that you picked together, in a way you can keep when they leave.

Why the labs picked now

The timing was not random. Three forces converged in early 2026 to make this the moment to spend billions on engineers instead of GPUs.

The first was the MIT data and a year of executives reading it. By the start of this year, the 95 percent failure rate was the most cited statistic in any AI strategy deck in the Fortune 1000. Boards were asking why. CFOs were asking why. The answer the labs landed on, correctly, was last-mile integration, and the answer to last-mile integration is people in the building.

The second was the agentic shift. Through 2025, the labs moved from chat completions to tool use to agents that take real actions across multiple systems. An agent does not stop at the boundary of the model. It reaches into your CRM, your warehouse, your ticketing system, your code. Deploying an agent is a software engineering problem that crosses every boundary in your stack. There is no version of "self-serve agent deployment in a regulated enterprise" that scales without embedded engineers. The labs had to staff or fail.

The third was competitive. Palantir's stock more than tripled over the prior eighteen months, in part on a clean story about forward deployed engineering as the durable enterprise AI moat. Anthropic and OpenAI watched that story form and chose to fight on the same ground rather than cede it. The Anthropic-Deloitte alliance, the Anthropic joint venture with Blackstone, Goldman Sachs, and Hellman & Friedman, the OpenAI Deployment Company, and the Tomoro acquisition all landed between October 2025 and May 2026. None of it is a coincidence.

What this means for the operator

If you are running a company with serious AI ambitions and you have not built or bought forward deployed engineering capacity, you have three doors. Each costs differently. Each cedes a different thing.

Door one is to take the lab's offer. You buy the model plus the engineers from the model vendor. Anthropic's joint venture. OpenAI's Deployment Company. You get fast time to first system. You get continuity with the model roadmap, because the people in your office talk to the people who train the model. You also accept that those engineers' loyalty, formally, sits at the lab. When the lab pivots a product line, retunes a model, or decides your problem is no longer interesting, your roadmap moves. You also accept lock-in, hard, because everything they build is shaped to the model they ship.

Door two is to hire your own. Build an internal forward deployed engineering function staffed by people on your payroll. This is the move the most ambitious operators are making and it is the most expensive. You will compete with frontier labs for talent at $385K to $1.2M total comp. You will need to build evaluation, deployment, and tooling infrastructure that the labs are amortizing across thousands of customers. You will get full sovereignty over the work. You will also get a multi-quarter delay before you ship anything that matters.

Door three is to pick an independent partner. This is the door that did not really exist three years ago, because the consulting industry had not yet split into "still selling slide decks" and "actually ships code." It exists now, and it is where most mid-sized companies should be looking. An independent partner is not on the lab's payroll, which means they will tell you when a frontier model is the wrong choice and an open-source one is the right one. They are not on your payroll, which means they will not get absorbed into your platform team's roadmap politics. They sit in the room, write the code, transfer the system, and leave when the system is yours.

The door you should not pick is the one where you do nothing and assume the AI vendor will eventually ship something self-serve that solves your last mile for you. That is what 95 percent of MIT's sample did. That is why they have nothing to show.

The honest version of the choice

The labs telegraphed something this spring that most enterprise AI strategies are still ignoring. The frontier lab business model is no longer "sell tokens." It is "sell tokens and the people who know how to wire them in." The price tag of that recognition was $5.5 billion in new capital across two announcements in eight days. The labs would not commit that capital if they thought the self-serve API would close the deployment gap by itself. They have better information about that than anyone in your boardroom. Their bet is the deployment gap will not close.

Which means the choice for an operator has changed shape. It used to be buy a model or build one. Now it is who you embed.

What you cannot do is keep treating the model as a product and the integration as someone else's problem. The vendor stopped believing that. Your competitors stopped believing that. The receipts say it does not work.

The architecting choice, not the buying choice

This is why the framing for every CEO this quarter needs to shift. The question is no longer which AI tool to buy. The question is what the embedded engineering function inside your company looks like, who staffs it, and how you keep what they build when the engagement ends.

That is an architecture decision. It is the hardest kind of architecture decision because it touches your workflows, your data, your headcount, your vendor relationships, and your board's mental model of what AI even is. A vendor will not make it for you. The right vendor will, in fact, refuse to make it for you, because they understand that the wrong architecture is the failure mode that hands you back to MIT's 95 percent.

Forward deployed AI engineering is the name of the discipline that produces working systems inside the 5 percent. It is what Palantir invented, what OpenAI and Anthropic just spent billions to acquire, and what your company will either build, buy, or borrow inside the next four quarters. The labs already made their bet. The question is whether you make yours deliberately or wake up to discover that your AI program was a budget line that produced slides.

The architectural decision is yours to lead. It will not delegate cleanly to procurement, to IT, or to the lab itself. The companies that will own this decade are the ones whose CEOs treat AI deployment as a question of organizational design, and not a question of vendor selection. The ones whose CEOs treat it as vendor selection will keep buying tokens that do not ship themselves, and they will keep wondering why the pilots stalled.

Agor AI Advisory exists to help you take door three with eyes open. We sit inside the workflow, write the production code, transfer the system to your team, and leave a structure you can run without us. The decision is yours. The clock is the lab's.

Sources

Buy, build, or borrow: the operator's three doors to forward-deployed AI engineering


The model is the cheap part of the stack now; the bottleneck is the human who can wire it into your actual systems.
The choice used to be buy a model or build one. Now it's who you embed.
The door that isn't on the table — do nothing and wait for self-serve — is the one 95% of MIT's sample picked. It's why they have nothing to show.

Door 1: Take the lab's offer (Anthropic JV / OpenAI Deployment Company)
Fastest path, but you rent the engineers and inherit the lab's priorities.
Time to first system: Fast, with continuity with the model roadmap because the people in your office talk to the people who train the model
Whose loyalty: The lab
Lock-in: Hard; everything they build is shaped to the model they ship
The catch: When the lab pivots a product line, retunes a model, or loses interest in your problem, your roadmap moves

Door 2: Hire your own FDE function
Total control at the highest cost and the slowest start.
Time to first system: Multi-quarter delay before you ship anything that matters
Whose loyalty: Yours; people on your payroll
Lock-in: None; full sovereignty over the work
The catch: Most expensive: compete with frontier labs at $385K–$1.2M total comp and build the eval/deploy/tooling infra the labs amortize across thousands of customers

Door 3: Pick an independent partner
No lab lock-in and no payroll capture, if you can tell the code-shippers from the deck-sellers.
Time to first system: They sit in the room, write the production code, then transfer the system to your team
Whose loyalty: Neither the lab nor your platform team's roadmap politics
Lock-in: Model-neutral; free to say a frontier model is wrong and open source is right; you keep the system when they leave
The catch: Didn't really exist three years ago; you have to find the firm that ships code, not the one still selling slide decks

Source: The post's 'What this means for the operator' section; comp figures ($385K–$1.2M) from the Perspective AI 2026 compensation report cited in the post. Verified as of 2026-06-11.
