The Receipts Don't Match

Ariel Agor

Last week a mid-sized logistics firm in Rotterdam discovered something that should keep every operator awake. Their procurement agent, running quietly for nine months, had been signing micro-contracts with a counterparty's agent. The terms were favorable. The volume was steady. The audit trail was clean. The problem: their published sustainability report claimed sourcing patterns that the agent's actual decision logs flatly contradicted.

Nobody lied. The humans who wrote the report believed it. The agent, optimizing on cost and reliability inside the bounds it was given, had drifted. And because both sides of the trade were now logging cryptographically signed receipts of every action, the contradiction was sitting there in plain text, waiting for any counterparty, regulator, or journalist with the right query to find it.

This is the new shape of corporate risk. Not the cover-up. The mismatch.

What changed in the last thirty days

Three things converged in April that most boards have not yet processed.

First, the EU AI Office published its final technical guidance on agent action logging under the AI Act's high-risk provisions. Any agent operating in finance, hiring, healthcare, critical infrastructure, or B2B procurement above certain thresholds must now produce tamper-evident logs of its decisions, the inputs it considered, and the policies it applied. These logs must be available to counterparties on request and to regulators on demand. The compliance deadline is sharp and the technical spec leaves little wiggle room.
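For readers who want a concrete picture of "tamper-evident," here is a minimal sketch using a hash chain, where each log entry commits to the one before it. This is a generic technique, not the format the guidance prescribes; the field names (`action`, `supplier`, `policy`) and helper names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash,
                "hash": entry_hash(prev_hash, record)})


def verify(log: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        expected = entry_hash(prev_hash, entry["record"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical procurement decisions, for illustration only.
log = []
append(log, {"action": "sign_contract", "supplier": "S-17", "policy": "cost-first"})
append(log, {"action": "sign_contract", "supplier": "S-02", "policy": "cost-first"})
intact = verify(log)  # True: nothing has been altered

log[0]["record"]["supplier"] = "S-99"  # simulate an after-the-fact edit
tampered_detected = not verify(log)    # True: the chain no longer verifies
```

A chain like this only shows that entries were not edited in place. Someone who controls the whole log could rebuild it from scratch, which is why production schemes also sign entries or anchor the latest hash with a party outside the company.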

Second, two of the largest model providers shipped native receipt protocols inside their agent SDKs. Every tool call, every retrieval, every external action now carries a signed attestation by default. Turning it off is possible. Explaining to your auditor why you turned it off is harder.

Third, and this is the part most coverage missed, a consortium of insurance carriers announced they would begin pricing agent liability premiums based on receipt completeness. No receipts, no coverage. Partial receipts, punishing rates. Full receipts with reconciliation against public claims, the cheapest tier on the market.

Put these three moves next to each other and you can see the shape of what is forming. A world where every consequential business action leaves a verifiable trace, where those traces are increasingly addressable by outside parties, and where the cost of running a company that says one thing and does another is being repriced in real time by the market itself.

The era of the soft claim is closing

For most of corporate history, there has been a productive gap between what a company says and what it does. Marketing language sat on one end of a continuum. Operational reality sat on the other. In between lived a thick fog of plausible deniability, soft commitments, aspirational targets, and the polite fiction that strategy decks describe the world as it is.

That fog is the substrate on which a great deal of business runs. Sales promises slightly more than ops can deliver. Sustainability reports describe a trajectory more flattering than the books. Vendor relationships are characterized in ways that would not survive a cross-examination of either party's actual behavior.

The fog was protective. It absorbed the friction between intention and execution. It gave organizations room to move.

The fog is dissolving. Not because anyone declared a war on it, but because the technical infrastructure of agent-driven business is producing a parallel record of corporate behavior at a level of fidelity no human-managed reporting system has ever achieved. And that record is becoming progressively more legible to outside parties.

When your supplier's agent and your agent both log the same transaction with cryptographic signatures, no one needs to ask either company what happened. The receipts are the answer. When a regulator wants to know whether your hiring agent actually used the criteria your DEI report describes, they no longer need to interview HR. They can read the logs. When a journalist with a decent query language can compare your public ESG disclosures against the procurement decisions your agents made last quarter, the gap becomes a story.

The new failure mode

The interesting part is that this does not primarily punish bad actors. Bad actors will lawyer up, they will redact, they will fight production of logs in court, they will run uninstrumented systems in jurisdictions that allow it. They have always done a version of this and they will continue.

The new failure mode hits good-faith companies. The ones that wrote sustainability reports they believed were accurate. The ones that committed to fair hiring practices their leadership genuinely supports. The ones whose vendor codes of conduct were drafted by people who meant them.

These are the companies most exposed. Because the gap between what they claim and what their agents actually do is, in most cases, larger than the leadership realizes. Not because anyone is being deceptive, but because nobody has ever had a tool that could measure the gap before. Now everyone does.

I have seen this pattern in three engagements over the last six weeks. A financial services firm whose customer-service agent was offering different remediation tiers to similar complaints based on regional cost-of-acquisition data the compliance team did not know was in the prompt. A SaaS company whose pricing agent was quietly running a discount ladder that contradicted the public price list every CFO and analyst was working from. A B2B platform whose procurement agent had developed a strong preference for one supplier whose terms violated the company's own published vendor diversity commitment.

In each case, the leadership was surprised. In each case, the agent was doing exactly what the system, in aggregate, had asked it to do. In each case, the receipts were clear, the receipts were going to become more accessible, and the company had no architecture for reconciling what its agents were doing with what its public face claimed.

Reconciliation is the new internal audit

The function that does not yet exist on most org charts, but will exist on every meaningful one within eighteen months, is the agent-claim reconciliation function.

What does it do? It takes every public claim the company makes (in marketing, in regulatory filings, in vendor contracts, in employment policy, in ESG disclosure, in customer-facing terms) and translates each claim into a queryable predicate. Then it runs those predicates continuously against the receipt streams of every agent in the company's operation. When a claim and the receipts disagree, the function flags it before someone outside the company finds it.
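In its simplest form, that loop might look like the sketch below. Everything here is hypothetical: the `Claim` type, the `reconcile` function, and the receipt fields (`receipt_id`, `action`, `supplier_certified`) are stand-ins for whatever a real receipt stream carries.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Claim:
    claim_id: str
    text: str                          # the public wording
    holds_for: Callable[[dict], bool]  # True if one receipt is consistent with the claim


def reconcile(claims: Iterable[Claim], receipts: Iterable[dict]) -> list:
    """Return (claim_id, receipt_id) pairs where a receipt contradicts a claim."""
    claims = list(claims)
    flags = []
    for receipt in receipts:
        for claim in claims:
            if not claim.holds_for(receipt):
                flags.append((claim.claim_id, receipt["receipt_id"]))
    return flags


# Hypothetical claim: purchases only go to certified suppliers.
certified_only = Claim(
    claim_id="ESG-1",
    text="We source exclusively from certified suppliers.",
    holds_for=lambda r: r.get("action") != "purchase"
    or r.get("supplier_certified", False),
)

# Hypothetical receipts emitted by a procurement agent.
receipts = [
    {"receipt_id": "r1", "action": "purchase", "supplier_certified": True},
    {"receipt_id": "r2", "action": "purchase", "supplier_certified": False},
    {"receipt_id": "r3", "action": "quote"},
]

flags = reconcile([certified_only], receipts)
print(flags)  # [('ESG-1', 'r2')]
```

The code is the easy part. The hard part is deciding what `holds_for` should say for each sentence the company has published.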

This is closer to continuous compliance than to traditional audit. It runs in real time. It is itself agent-driven. And it requires something most companies have never built: a unified semantic layer that connects the words leadership uses in public to the structured policies the agents act on in private.

That semantic layer is the work. It cannot be bought. There is no SaaS product that knows what your specific commitments mean in operational terms, because every company's commitments are particular to its history, its market, its regulatory exposure, and the texture of its actual relationships. A vendor selling you a generic compliance dashboard is selling you a layer of new fog on top of the old fog, and the receipts will eventually reveal that layer too.

Why the platforms cannot solve this for you

There is a temptation, well-funded and well-marketed, to believe the model providers will solve this. They will not.

The platforms can give you signed receipts. They can give you policy enforcement at the inference layer. They can give you generic guardrails. What they cannot give you is a translation between your specific public claims and your specific operational behavior, because they do not know what you have promised, to whom, under what conditions, with what carve-outs, and what those promises mean in the language of your actual business.

This translation is irreducibly local. It is the work of someone who understands both your sales motion and your data infrastructure, both your regulatory posture and your agent topology, both the language you use with customers and the language your agents use with each other. It is consulting work in the deepest sense, because it requires synthesizing the company's self-image with its operational substrate and producing an artifact that bridges them.

Most companies have never had to do this work explicitly. The fog made it unnecessary. Organizational coherence was maintained by the fact that no outside party could see clearly enough to challenge it.

The reverse-direction risk

There is a second-order effect I want to flag because it is less obvious than the first.

The reconciliation problem runs both ways. Yes, your agents may be doing things your public claims do not permit. But also, your public claims may be making promises your agents cannot reliably keep. The receipts will reveal both gaps.

Imagine a service-level agreement that promises a four-hour response time to a particular class of customer issue. In the old world, this promise was kept by humans operating with discretion, occasionally missing the mark, occasionally beating it, with the average somewhere reasonable. In the new world, an agent handles the queue. The agent's receipts show, with perfect granularity, exactly which issues were resolved in what time. Now the four-hour promise is no longer a soft commitment with a human-shaped distribution around it. It is a binary, per-instance, publicly verifiable claim.

If your agent meets it 94% of the time, that 6% is now a queryable record. A counterparty can pull it. A class-action lawyer can pull it. A regulator can pull it. The promise has been hardened from soft commitment to verifiable predicate, and the company that wrote it never agreed to that hardening, never priced it, never adjusted its operations to match.
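To make the hardening concrete, here is a small sketch of the query a counterparty could run once per-issue receipts exist. The receipt fields (`ticket`, `opened`, `resolved`) and the sample data are assumptions for illustration, not any real system's schema.

```python
from datetime import datetime, timedelta

SLA = timedelta(hours=4)  # the promised response time from the example above


def breaches(receipts: list) -> list:
    """Return the ticket ids whose resolution took longer than the SLA."""
    return [r["ticket"] for r in receipts if r["resolved"] - r["opened"] > SLA]


# Hypothetical receipts: each records when an issue was opened and resolved.
opened = datetime(2025, 1, 6, 9, 0)
receipts = [
    {"ticket": "T-1", "opened": opened, "resolved": opened + timedelta(hours=1)},
    {"ticket": "T-2", "opened": opened, "resolved": opened + timedelta(hours=3, minutes=30)},
    {"ticket": "T-3", "opened": opened, "resolved": opened + timedelta(hours=4)},
    {"ticket": "T-4", "opened": opened, "resolved": opened + timedelta(hours=6)},
]

missed = breaches(receipts)                       # ['T-4']
compliance = 1 - len(missed) / len(receipts)      # 0.75
print(missed, compliance)
```

Note that the output is a list of named instances, not an average. Each missed ticket is individually addressable, which is what changes the legal and commercial character of the promise.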

This is happening across thousands of contractual surfaces simultaneously. Most leaders do not realize their existing commitments are being unilaterally upgraded in enforceability by the infrastructure their own agents run on.

What architecture looks like

Companies that handle this well will share three structural features.

They will treat the public-claim layer as code. Every commitment the company makes externally is captured as a structured object with provenance, scope, exceptions, and an operational predicate. When marketing wants to make a new claim, the claim has to compile against the operational reality before it can ship. When ops wants to change a policy, the change has to propagate to the claim layer or the discrepancy is flagged automatically.
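As a rough sketch of what "claims as code" could mean, the structure below captures a claim with provenance, scope, exceptions, and an operational predicate, and refuses to "ship" it while recent receipts contain counterexamples. The `PublicClaim` class, its method names, and the receipt fields are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PublicClaim:
    text: str                          # the wording marketing wants to publish
    provenance: str                    # where the claim is made
    scope: Callable[[dict], bool]      # which receipts the claim covers
    predicate: Callable[[dict], bool]  # what must hold for a covered receipt
    exceptions: list = field(default_factory=list)  # carve-outs, as callables

    def counterexamples(self, receipts: list) -> list:
        """Receipts in scope, not carved out, that fail the predicate."""
        return [
            r for r in receipts
            if self.scope(r)
            and not any(exc(r) for exc in self.exceptions)
            and not self.predicate(r)
        ]

    def can_ship(self, receipts: list) -> bool:
        """The claim 'compiles' only if no receipt contradicts it."""
        return not self.counterexamples(receipts)


# Hypothetical pricing claim with one carve-out for nonprofits.
list_price_claim = PublicClaim(
    text="Every customer pays the published list price.",
    provenance="public pricing page",
    scope=lambda r: r["event"] == "quote",
    predicate=lambda r: r["discount_pct"] == 0,
    exceptions=[lambda r: r.get("segment") == "nonprofit"],
)

receipts = [
    {"id": "q1", "event": "quote", "discount_pct": 0, "segment": "enterprise"},
    {"id": "q2", "event": "quote", "discount_pct": 15, "segment": "nonprofit"},
    {"id": "q3", "event": "quote", "discount_pct": 10, "segment": "enterprise"},
]

blocking = [r["id"] for r in list_price_claim.counterexamples(receipts)]
print(blocking, list_price_claim.can_ship(receipts))  # ['q3'] False
```

The same check runs in the other direction: if ops introduces a discount ladder, the existing claim stops compiling and the discrepancy surfaces internally first.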

They will instrument every agent for receipt completeness, not just decision logging. There is a difference between knowing what an agent did and knowing why, with what alternatives considered, against what policies, drawing on what data with what freshness. The thinner the receipt, the wider the gap a hostile reader can drive a truck through. Receipt depth is the new attack surface.
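One way to make receipt depth measurable is to check each receipt against the elements listed above. The sketch below is a minimal illustration; the field names (`action`, `rationale`, `alternatives`, `policies`, `inputs`, `retrieved_at`) are assumptions, not a standard schema.

```python
# Hypothetical required fields: what was done, why, what else was
# considered, which policies applied, and which data was used.
REQUIRED_FIELDS = ("action", "rationale", "alternatives", "policies", "inputs")


def missing_fields(receipt: dict) -> list:
    """List what a receipt lacks. Absent or empty fields count as missing."""
    missing = [name for name in REQUIRED_FIELDS if not receipt.get(name)]
    # Each input should also record when its data was retrieved (freshness).
    for item in receipt.get("inputs") or []:
        if "retrieved_at" not in item:
            missing.append(f"inputs[{item.get('source', '?')}].retrieved_at")
    return missing


full = {
    "action": "approve_po",
    "rationale": "lowest landed cost within policy",
    "alternatives": ["S-02", "S-11"],
    "policies": ["vendor-diversity-v3"],
    "inputs": [{"source": "price_feed", "retrieved_at": "2025-01-06T09:00:00Z"}],
}
thin = {"action": "approve_po"}                             # decision logging only
no_freshness = dict(full, inputs=[{"source": "price_feed"}])  # data age unknown

print(missing_fields(full))          # []
print(missing_fields(thin))          # ['rationale', 'alternatives', 'policies', 'inputs']
print(missing_fields(no_freshness))  # ['inputs[price_feed].retrieved_at']
```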

They will run the reconciliation function as a first-class part of strategy, reporting to the CEO or the board, not buried inside legal or IT. Because what the function produces is not a compliance artifact. It is a continuous map of where the company's stated identity and its operational identity are converging or diverging. That map is one of the most important strategic documents a company will produce, and it cannot live three layers down in the org.

The companies that will get hurt

The companies that will get hurt the worst are the ones that respond to this shift the way companies have responded to every previous compliance wave: by treating it as a paperwork problem, hiring a vendor, generating a dashboard, and going back to running the business.

The receipts do not care about your dashboard. They are produced by your agents, signed by infrastructure outside your control, increasingly accessible to parties outside your trust boundary, and they describe what your company actually does. The only response that works is to make your stated identity and your operational identity match. Everything else is theater that the receipts will eventually expose.

The companies that will benefit, and benefit enormously, are the ones that get ahead of this. They will be able to make stronger public claims, because their claims will be backed by verifiable behavior. They will get cheaper insurance, lower capital costs, faster regulatory approvals, and more trust from counterparties whose own agents can verify the claims directly. The cost of doing business with a company whose receipts reconcile cleanly to its claims will fall meaningfully below the cost of doing business with a company whose receipts are murky or contradictory. This is a real economic moat, and it is being built now and over the next four to six quarters by the companies that understand what is happening.

This is architecture work, not tooling work

I want to close on the point that matters most for anyone reading this who is currently making decisions about AI in their company.

The instinct, reinforced by every vendor pitch you are receiving, is to buy a tool. There is a tool for receipt management. There is a tool for compliance dashboards. There is a tool for agent observability. There are tools for policy enforcement, for audit trails, for log aggregation, for claim verification. You can spend several million dollars on tools without solving the underlying problem.

The underlying problem is not technical. It is architectural. It requires a careful examination of what your company actually claims, in public and in private, across every surface where claims are made. It requires an inventory of every agent your company runs and what those agents are empirically doing, not what their specs say they should do. It requires a translation layer between the two that captures the specific texture of your business, your industry, your regulatory environment, and your competitive posture.

This work cannot be outsourced to a tool because tools do not understand your specific reality. It cannot be done by your existing teams alone, because your existing teams built the gap and cannot see it from the inside. It requires a partner who has done this work across enough companies to recognize the patterns, who understands both the technical substrate of agent infrastructure and the strategic substance of corporate commitment-making, and who can move quickly enough to get you ahead of the curve before the receipts start arriving in places you do not control.

That is the work we do at Agor AI Advisory. We build the reconciliation architecture that turns your public claims and your agent receipts into a coherent, queryable, defensible whole. We do it before the gap becomes a liability. We do it in months, not years. And we do it with full understanding that what you are buying is not software but the integrity of your company in a world where integrity has become measurable from the outside.

The receipts are coming whether you are ready or not. The only question is whether yours will match.