At what granularity should the Cost Attribution Layer be designed?

Three granularity levels are required. Step level: every model call, tool invocation, and data retrieval is attributed to the step in the workflow that generated it. Task class level: step costs are aggregated to the task class they serve — all document extraction steps attributed to the document extraction task class, all classification steps to the classification task class. Job-to-be-done level: task class costs are aggregated to the business function they contribute to — all task classes serving the lead qualification function attributed to that job-to-be-done. The step level is the primary governance instrument: it makes routing decisions testable and [Exception Architecture](https://arcoventure.studio/lexicon/exception-architecture) encoding consequences measurable. The job-to-be-done level is the reporting instrument: it connects the agentic cost structure to the business functions it serves, enabling direct ROI measurement of each autonomous function against its human-staffed equivalent. Build all three levels from the same underlying trace data using the [Deterministic Logging](https://arcoventure.studio/lexicon/deterministic-logging) infrastructure that populates the [Operational Ledger](https://arcoventure.studio/lexicon/operational-ledger).
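
The roll-up described above can be sketched in a few lines. This is a minimal illustration, not a prescribed schema: the `TraceCost` record, its field names, and the sample step names are all assumptions made for the example. The point it shows is that all three levels are aggregations of the same trace events, so their totals always reconcile.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceCost:
    """One cost-bearing event from the trace log (hypothetical schema)."""
    step_id: str          # workflow step that generated the cost
    task_class: str       # task class the step serves
    job_to_be_done: str   # business function the task class contributes to
    cost_micro_usd: int   # integer micro-dollars, to avoid float rounding


def roll_up(events):
    """Aggregate the same events at all three granularity levels."""
    by_step = defaultdict(int)
    by_task_class = defaultdict(int)
    by_job = defaultdict(int)
    for e in events:
        by_step[e.step_id] += e.cost_micro_usd
        by_task_class[e.task_class] += e.cost_micro_usd
        by_job[e.job_to_be_done] += e.cost_micro_usd
    return dict(by_step), dict(by_task_class), dict(by_job)


# Made-up sample events for illustration only.
events = [
    TraceCost("parse_pdf", "document_extraction", "lead_qualification", 1200),
    TraceCost("extract_fields", "document_extraction", "lead_qualification", 3400),
    TraceCost("score_lead", "classification", "lead_qualification", 800),
]
by_step, by_task_class, by_job = roll_up(events)
```

Because each level is derived from one event list rather than maintained separately, the step, task class, and job-to-be-done totals cannot drift apart.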

How does cost attribution connect to Intelligence Arbitrage routing decisions?

Cost attribution is the input that makes [Intelligence Arbitrage](https://arcoventure.studio/lexicon/intelligence-arbitrage) data-driven rather than intuitive. The routing decision (“route this task class to a cheaper model”) should be triggered by cost attribution data showing that the task class consumes disproportionate cost relative to its contribution to output quality. Without cost attribution at the task class level, the routing decision is made on impression. With it, the decision is made on evidence: for example, a task class that accounts for 34% of total token spend while contributing only 8% of escalation risk is the highest-priority routing optimisation candidate. After the routing change is made, the [Cost Attribution Layer](https://arcoventure.studio/lexicon/cost-attribution-layer) confirms whether the change produced the expected cost reduction. If the cheaper model meets the [Quality Threshold](https://arcoventure.studio/lexicon/quality-threshold) but does not reduce cost as expected, perhaps because it requires more tokens per call to produce equivalent output, the routing decision must be revisited. Cost attribution makes this feedback loop explicit and rapid.
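
The feedback loop can be illustrated with a small check. This is a sketch under stated assumptions: `RouteStats`, `review_routing_change`, the quality scale, and all figures are hypothetical, and a real review would use measured averages from the attribution data. It shows the case the paragraph warns about, where a model that is cheaper per token ends up costing more per execution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteStats:
    """Observed averages for one task class on one model (hypothetical fields)."""
    tokens_per_execution: float
    price_per_1k_tokens: float  # any consistent currency unit
    quality_score: float        # same scale as the quality threshold

    @property
    def cost_per_execution(self) -> float:
        return self.tokens_per_execution * self.price_per_1k_tokens / 1000


def review_routing_change(before, after, quality_threshold):
    """Return 'keep' only if quality holds and per-execution cost fell."""
    if after.quality_score < quality_threshold:
        return "revert: below quality threshold"
    if after.cost_per_execution >= before.cost_per_execution:
        return "revisit: no cost reduction"
    return "keep"


before = RouteStats(tokens_per_execution=1200, price_per_1k_tokens=10.0, quality_score=0.95)
# Cheaper per token, but needs far more tokens to reach equivalent output.
after = RouteStats(tokens_per_execution=3500, price_per_1k_tokens=4.0, quality_score=0.93)
verdict = review_routing_change(before, after, quality_threshold=0.90)
```

With these made-up figures the cheaper model clears the threshold, yet per-execution cost rises from 12.0 to 14.0, so the verdict is to revisit the routing decision.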

What is Architecture Cost Drift and how should the Steward respond to it?

Architecture Cost Drift is the cumulative economic delta between the current architecture’s cost profile and the v0 baseline, measured continuously by the [Cost Attribution Layer](https://arcoventure.studio/lexicon/cost-attribution-layer). Positive drift, meaning the system costs less per unit than the v0 baseline, confirms that Steward optimisations are compounding correctly: [Exception Architecture](https://arcoventure.studio/lexicon/exception-architecture) updates are reducing the [Escalation Rate](https://arcoventure.studio/lexicon/escalation-rate) for encoded exception classes, [Intelligence Arbitrage](https://arcoventure.studio/lexicon/intelligence-arbitrage) routing changes are reducing per-execution cost for targeted task classes, and the [Operational Arbitrage](https://arcoventure.studio/lexicon/operational-arbitrage) is expanding as the system matures. Negative drift, meaning cost per unit is rising without a corresponding improvement in output quality, signals that an architectural review is required. That review should establish which task classes are generating the drift, whether recent routing changes are performing as expected, and whether any [Intervention Threshold](https://arcoventure.studio/lexicon/intervention-threshold) calibration decisions have inadvertently increased escalation frequency and the associated human-in-the-loop cost. Architecture Cost Drift alerts surface in the [Audit Surface](https://arcoventure.studio/lexicon/audit-surface) governance digest alongside [Escalation Rate](https://arcoventure.studio/lexicon/escalation-rate) summaries so the Steward can review economic and operational signals in the same daily session.
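
One way to make the drift signal concrete is a per-task-class comparison against the baseline. The sketch below is illustrative only: the function names, the 2% tolerance band for "stable", and the cost figures are assumptions. It does follow the sign convention used here, where positive drift means cost per unit has fallen below the v0 baseline.

```python
def drift_fraction(baseline_cost, current_cost):
    """Fractional drift versus the v0 baseline.

    Sign convention from the text: positive means the system now costs
    LESS per unit than the baseline, negative means it costs more.
    """
    if baseline_cost <= 0:
        raise ValueError("baseline cost per unit must be positive")
    return (baseline_cost - current_cost) / baseline_cost


def drift_status(baseline_cost, current_cost, tolerance=0.02):
    """Classify drift; the 2% 'stable' band is an assumed, tunable value."""
    d = drift_fraction(baseline_cost, current_cost)
    if d > tolerance:
        return "positive"
    if d < -tolerance:
        return "negative"
    return "stable"


# Cost per execution in micro-dollars, keyed by task class (made-up figures).
v0_baseline = {"document_extraction": 4600, "classification": 800, "summarisation": 2000}
current = {"document_extraction": 3900, "classification": 1000, "summarisation": 2010}

report = {tc: drift_status(v0_baseline[tc], current[tc]) for tc in v0_baseline}
```

In this example `classification` shows negative drift (per-execution cost up 25% on baseline), which is the kind of task class an architectural review would start with.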

When should the Cost Attribution Layer be built — before or after the first production deployment?

Before. The v0 baseline — the per-step, per-task-class cost profile of the system at first production deployment — is the reference against which every subsequent Steward optimisation is measured. A [Cost Attribution Layer](https://arcoventure.studio/lexicon/cost-attribution-layer) built six months into production has no baseline to compare against: it can report the current cost profile but cannot confirm whether it represents improvement or deterioration relative to the original design. The cost of building the [Cost Attribution Layer](https://arcoventure.studio/lexicon/cost-attribution-layer) before deployment is a one-time design investment. The cost of not having a v0 baseline is permanent: every optimisation decision made without it is measured against an impression rather than evidence, and the compounding effect of correct decisions is invisible to the [Steward](https://arcoventure.studio/lexicon/stewardship-model) who made them.

How does the Cost Attribution Layer feed the Audit Surface?

The [Audit Surface](https://arcoventure.studio/lexicon/audit-surface) is the Steward’s daily governance digest — the structured summary of system health that the Steward reviews to confirm the autonomous business is operating within design parameters. The [Cost Attribution Layer](https://arcoventure.studio/lexicon/cost-attribution-layer) contributes the economic component of that digest: Architecture Cost Drift status (positive, stable, or negative), the task classes with the highest cost-per-execution variance from the v0 baseline, and the routing optimisation candidates identified by the [Intelligence Arbitrage](https://arcoventure.studio/lexicon/intelligence-arbitrage) signal — task classes consuming disproportionate compute relative to their contribution to output quality. Without the Cost Attribution Layer, the [Audit Surface](https://arcoventure.studio/lexicon/audit-surface) shows the Steward operational signals — [Escalation Rate](https://arcoventure.studio/lexicon/escalation-rate), [Intervention Threshold](https://arcoventure.studio/lexicon/intervention-threshold) anomalies, [Proof of Action](https://arcoventure.studio/lexicon/proof-of-action) trail completeness — but not the economic signals that confirm whether those operational decisions are compounding as [Operational Arbitrage](https://arcoventure.studio/lexicon/operational-arbitrage) or accumulating as cost drift. The two layers together give the Steward a complete governance view: the system is operating correctly and it is operating economically. Neither is sufficient alone.

The Cost That Compounds

The Cost Attribution Layer is the architectural component that traces the operational cost of every agentic step, task class, and job-to-be-done against the business’s v0 baseline. It lets the Steward evaluate the economic impact of each architectural optimisation decision with precision, rather than judging the compounding effect by impression or by total spend. The economics of agentic deployment can fail in two structurally different directions. The first is obvious: the system costs more than anticipated because token spend was not modelled correctly. The second is invisible: the system costs less than the human-staffed equivalent it replaced, but Steward optimisations that were supposed to reduce the per-unit cost have not compounded as expected, and the business cannot identify which decisions worked and which did not.

Both failures have the same root cause: cost is visible only at the aggregate level. Knowing total monthly token spend is accounting. Knowing cost per execution per step per task class is governance. The distinction matters because Steward optimisations (routing a T1 task class to a cheaper model, encoding a recurring exception into the Exception Architecture, redesigning a workflow step to reduce model calls) each have specific economic consequences at the step level that are invisible in aggregate spend. The Intelligence Arbitrage decision that routes a task class to the cheapest capable model is only verifiable at the step level: total spend may fall for reasons unrelated to the routing change, and total spend may rise while per-unit cost falls because volume increased. The Cost Attribution Layer is the instrument that separates these effects.
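
A short worked example shows how total spend can rise while per-unit cost falls. The decomposition below is a standard volume and unit-cost split, offered as one possible way to separate the two effects; the function name and all figures are made up for illustration.

```python
def decompose_spend_change(volume_before, unit_cost_before, volume_after, unit_cost_after):
    """Split a change in total spend into a volume effect and a unit-cost effect.

    volume effect    = (volume_after - volume_before) * unit_cost_before
    unit-cost effect = volume_after * (unit_cost_after - unit_cost_before)
    The two effects sum exactly to the change in total spend.
    """
    volume_effect = (volume_after - volume_before) * unit_cost_before
    unit_cost_effect = volume_after * (unit_cost_after - unit_cost_before)
    return volume_effect, unit_cost_effect


# Made-up figures: unit costs in micro-dollars per execution.
v1, u1 = 10_000, 20_000   # before the routing change
v2, u2 = 18_000, 15_000   # after: more volume, cheaper per execution

volume_effect, unit_cost_effect = decompose_spend_change(v1, u1, v2, u2)
total_change = v2 * u2 - v1 * u1
```

Here total spend rises by 70,000,000 micro-dollars (70 dollars), yet the unit-cost effect is -90,000,000: the routing change saved 90 dollars, and volume growth added 160. Aggregate spend alone would have suggested the optimisation failed.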

The cost failure that compounds before it is visible

The agentic cost failure pattern that has surprised even well-funded teams is structurally simple: usage scales faster than the cost model was designed for, and the token bill arrives before the attribution layer exists to diagnose which task classes are driving it. The Inference Floor has made capable models accessible enough to build with but not yet cheap enough to ignore at production volume. At low usage, the cost structure is invisible because every task class appears affordable. At scale, the task classes that consume disproportionate compute become the dominant cost driver, and without step-level attribution they are indistinguishable from the rest of the bill. A business can make something people want and still find that the tokens cost more than the revenue. Both things are possible simultaneously, and without a Cost Attribution Layer, the path from one to the other is invisible until it has already happened.

This failure is preventable with cost attribution designed from the start. A business that tracks token spend per task class from the first week of production can identify which task classes are generating disproportionate cost before usage scales, route T1 task classes to cheaper models bounded by the Quality Threshold, and model the cost structure at the volume levels that adoption would produce. A business that tracks only total spend discovers the cost problem when the invoice arrives — at which point the routing optimisations that would have addressed it are retrofits rather than designed-in properties of the system.
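
Modelling the cost structure at projected volume can be as simple as multiplying per-task-class cost by expected usage. The sketch below assumes a flat monthly price per customer, which is an assumption of this example rather than something stated above; the function name, task class names, and figures (in cents) are also hypothetical.

```python
def monthly_margin_per_customer(price, cost_per_execution, executions_per_customer):
    """Return (margin, cost by task class) for one customer-month.

    Assumes a flat price per customer and usage-driven cost per task class.
    """
    cost_by_class = {
        tc: cost_per_execution[tc] * n for tc, n in executions_per_customer.items()
    }
    return price - sum(cost_by_class.values()), cost_by_class


price = 2000  # cents per customer per month (made-up)
cost_per_execution = {"classification": 1, "document_extraction": 4, "report_generation": 30}

early_usage = {"classification": 100, "document_extraction": 50, "report_generation": 10}
scaled_usage = {"classification": 400, "document_extraction": 200, "report_generation": 90}

early_margin, early_costs = monthly_margin_per_customer(price, cost_per_execution, early_usage)
scaled_margin, scaled_costs = monthly_margin_per_customer(price, cost_per_execution, scaled_usage)

# The task class that dominates cost at the projected volume.
dominant = max(scaled_costs, key=scaled_costs.get)
```

At early usage every task class looks affordable and the margin is 1400 cents. At the projected usage the margin is -1900 cents, and `report_generation` alone accounts for 2700 of the 3900 cents of cost, which makes it the first candidate for routing review.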

The v0 baseline and why it matters

The Cost Attribution Layer’s primary governance function is not current cost reporting — it is baseline comparison. The v0 baseline is the per-step, per-task-class cost profile of the system at the moment of the first production deployment. Every Steward optimisation made after that point is measured against it. A routing change that routes a T1 task class to a cheaper model should reduce the per-execution cost for that task class. The Cost Attribution Layer confirms whether it did, by how much, and whether the reduction was maintained across the next execution cycle or reverted. Without the baseline, the Steward has impressions. With it, the Steward has evidence.

Architecture Cost Drift — the cumulative economic delta between the current architecture’s cost profile and the v0 baseline, measured continuously — is the signal the Cost Attribution Layer generates for the Steward. A system drifting toward higher cost per unit without a corresponding increase in output quality is accumulating cost at the architectural layer rather than capturing Operational Arbitrage. A system drifting toward lower cost per unit is compounding the Operational Arbitrage as the Steward makes the architectural decisions the Exception Architecture and routing decisions were designed to enable. Architecture Cost Drift alerts are the economic component of the Audit Surface governance digest — the same digest the Steward reviews daily alongside Escalation Rate summaries and threshold anomalies.

Connecting cost attribution to the Operational Ledger

The Cost Attribution Layer and the Operational Ledger must be built from the same underlying trace data or they diverge and lose their comparability over time. Deterministic Logging is the technical foundation of both: causation-level records of every decision are what allow cost to be attributed to the specific step and task class that generated it, and those same records populate the Proof of Action trail that the Operational Ledger compounds. The Operational Ledger accumulates operational intelligence — what was learned from each execution cycle. The Cost Attribution Layer accumulates economic intelligence — what each execution cycle cost. Both are indexed against the same step identifiers, task class identifiers, and job-to-be-done identifiers.
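
A minimal sketch of the shared-trace idea: both views are derived from one list of logged events and keyed by the same identifiers. The `TraceEvent` schema, the `cost_view` and `ledger_view` functions, and the sample data are hypothetical, and the actual Deterministic Logging record format is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    """One logged decision (hypothetical schema for illustration)."""
    step_id: str
    task_class_id: str
    job_id: str
    outcome: str          # e.g. "completed" or "escalated"
    cost_micro_usd: int


def key(e):
    """Shared index: (step, task class, job-to-be-done)."""
    return (e.step_id, e.task_class_id, e.job_id)


def cost_view(trace):
    """Economic view: total cost per shared key."""
    totals = {}
    for e in trace:
        totals[key(e)] = totals.get(key(e), 0) + e.cost_micro_usd
    return totals


def ledger_view(trace):
    """Operational view: outcomes observed per shared key."""
    outcomes = {}
    for e in trace:
        outcomes.setdefault(key(e), []).append(e.outcome)
    return outcomes


trace = [
    TraceEvent("extract_fields", "document_extraction", "lead_qualification", "completed", 3400),
    TraceEvent("extract_fields", "document_extraction", "lead_qualification", "escalated", 5100),
    TraceEvent("score_lead", "classification", "lead_qualification", "completed", 800),
]
costs, ledger = cost_view(trace), ledger_view(trace)
```

Because both views come from the same events, every key in one exists in the other, so an escalation seen in the operational view can always be matched to what it cost.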

When the Steward identifies a recurring exception class in the Operational Ledger and encodes it into the Exception Architecture, the Cost Attribution Layer should show the economic consequence: fewer escalations for that class and lower per-execution cost for the downstream task classes the escalation was blocking. The compounding is visible at both layers simultaneously, or it is visible at neither. An Exception Architecture update that does not produce a measurable cost reduction in the Cost Attribution Layer within the next measurement cycle should be reviewed. The likely causes are that the encoding was incomplete, that the routing decision has not been updated to reflect the new exception class, or that the Quality Threshold for the newly autonomous task class requires recalibration.

The Operator’s Verdict

The cost that compounds is the cost that is measured. The optimisation that is not attributed to a specific step cannot be confirmed to have worked. Design the Cost Attribution Layer before the first production deployment — not because the cost will be high at launch, but because the baseline established at launch is the instrument through which every subsequent Steward decision is held accountable. For commercial proof of what correctly attributed agentic cost looks like at production scale, the Vercel Perspective documents the specific cost-to-output ratios that step-level attribution makes possible.

Technology changes what is possible. Attribution determines whether what is possible is also what is profitable.

KEY TAKEAWAY

What is the Cost Attribution Layer and why is it required for Steward governance of agentic economics?

The Cost Attribution Layer is the architectural component that traces the operational cost of every agentic step, task class, and job-to-be-done against the system’s v0 baseline — making the economic impact of Steward optimisation decisions measurable rather than estimated. Without it, cost governance operates at the aggregate level: the Steward knows total monthly token spend but cannot identify which task classes are generating disproportionate cost, whether routing optimisations have reduced per-unit cost as intended, or whether Architecture Cost Drift is positive or negative. With it, the Steward can evaluate each architectural decision economically: a routing change that routes a T1 task class to a cheaper model bounded by the Quality Threshold should show a measurable reduction in per-execution cost for that class in the next measurement cycle. The v0 baseline — the per-step, per-task-class cost profile at first production deployment — is the reference against which all subsequent optimisations are measured. Architecture Cost Drift alerts surface in the Audit Surface governance digest alongside Escalation Rate summaries, giving the Steward economic and operational signals in a single daily review. Key metric: cost attribution at the step level makes Intelligence Arbitrage routing decisions testable — a routing change should produce a measurable reduction in per-execution cost for the targeted task class within the next measurement cycle. A reduction that cannot be measured at the task class level cannot be confirmed to have occurred.