Mechanism
Trust weight calibration is the feedback mechanism that keeps a language model's influence over an agent's state proportional to that model's demonstrated reliability. In the disclosed architecture every language model occupies the structural role of an untrusted proposal generator: no model output is authoritative, and every output is a candidate semantic mutation that the agent-resident infrastructure must validate before it can affect any agent field. When more than one language model produces competing candidate mutations for the same field or operation, the arbitration engine resolves the conflict by trust-weighted evaluation. The trust weight is the per-model value the arbitration engine applies in that evaluation, and trust weight calibration is the process that updates that value as evidence accumulates.
The trust weight is not a static assignment fixed by a model's author or operator. It is a dynamic value that reflects the model's recent performance within the agent's governance context. The trust score for each language model is maintained per agent and per domain, so that a model recognized as highly reliable for one category of proposal may carry a lower weight for another category in which it has performed poorly. Calibration is the running adjustment of these per-agent, per-domain weights, and decay is the gradual erosion of weight in the absence of fresh evidence.
How the Trust Weight Is Used
Trust weight calibration has meaning only in relation to the arbitration engine that consumes the weight. When the mutation engine produces a conflict set (two or more candidate mutations that target the same field and cannot be applied simultaneously), the arbitration engine scores each candidate on a plurality of evaluation dimensions: semantic coherence with the agent's current state, consistency with the agent's intent field, alignment with the agent's policy reference, and compatibility with the agent's lineage trajectory. The per-dimension scores are multiplied by the originating model's trust weight to produce a trust-adjusted composite score, and the candidate with the highest trust-adjusted composite score is selected as the arbitration winner.
The trust weight is therefore a multiplier on a model's voice in conflict resolution, not a gate on whether an individual output is admitted. A candidate that survives validation but originates from a model with a low accumulated trust weight is discounted relative to a candidate from a model with a higher weight. If no candidate achieves a trust-adjusted composite score above a configurable minimum threshold, the arbitration engine may reject all candidates and request new proposals, or escalate the conflict to a governance authority for manual resolution.
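The selection rule described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `Candidate`, `composite_score`, and `arbitrate`, the dimension keys, the averaging of per-dimension scores, and the default threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Short keys for the four evaluation dimensions named in the text (hypothetical labels).
DIMENSIONS = ("coherence", "intent", "policy", "lineage")


@dataclass(frozen=True)
class Candidate:
    model_id: str
    scores: dict  # per-dimension scores, assumed here to lie in [0, 1]


def composite_score(candidate: Candidate, trust_weight: float) -> float:
    """Mean of the per-dimension scores, multiplied by the model's trust weight.

    Averaging the dimensions is an assumption of this sketch; the text only
    states that the scores are multiplied by the trust weight.
    """
    mean = sum(candidate.scores[d] for d in DIMENSIONS) / len(DIMENSIONS)
    return mean * trust_weight


def arbitrate(conflict_set, trust_weights, min_score=0.3) -> Optional[Candidate]:
    """Return the winning candidate, or None if no candidate exceeds min_score.

    A None result stands for the "reject all and re-request, or escalate" path.
    """
    if not conflict_set:
        return None
    scored = [
        (composite_score(c, trust_weights.get(c.model_id, 0.0)), c)
        for c in conflict_set
    ]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score > min_score else None
```

Under these assumptions, a candidate with higher raw scores can still lose to a candidate from a model with a higher accumulated trust weight, which is the discounting behavior the paragraph describes.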
Outcome-Based Adjustment
The first calibration mechanism is outcome-based adjustment. When a mutation proposed by a language model is accepted and later evaluated as correct, meaning that the mutation did not produce integrity violations, did not require governance intervention, and did not contribute to negative outcomes, the model's trust weight for the relevant domain is increased. When a mutation proposed by a language model is accepted but later evaluated as incorrect, meaning that the mutation contributed to integrity degradation, governance intervention, or negative outcomes, the model's trust weight is decreased.
The adjustment is asymmetric: the decrease applied for an incorrect accepted proposal may be larger than the increase applied for a correct one, reflecting the higher cost of accepting incorrect proposals into governed state. The adjustment also extends backward in time: a proposal that was validated and accepted, then subsequently determined to have introduced errors, inconsistencies, or policy violations, produces a retroactive trust penalty against its originating model.
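One way to picture the asymmetric and retroactive adjustment is the sketch below. The class `TrustLedger`, its step sizes, the starting weight, the clamping to [0, 1], and the choice to revoke an earlier credit when a retroactive penalty is applied are all hypothetical; the disclosure does not depend on any particular arithmetic.

```python
from dataclasses import dataclass, field


@dataclass
class TrustLedger:
    """Per-agent store of trust weights keyed by (model_id, domain)."""

    reward: float = 0.02   # hypothetical increase for an accepted-and-correct proposal
    penalty: float = 0.10  # hypothetical, larger decrease for an accepted-and-incorrect one
    initial: float = 0.5   # hypothetical starting weight for an unseen (model, domain) pair
    weights: dict = field(default_factory=dict)

    def get(self, model_id: str, domain: str) -> float:
        return self.weights.get((model_id, domain), self.initial)

    def _apply(self, model_id: str, domain: str, delta: float) -> float:
        # Clamp to [0, 1] so the weight stays a bounded multiplier.
        new = min(1.0, max(0.0, self.get(model_id, domain) + delta))
        self.weights[(model_id, domain)] = new
        return new

    def record_outcome(self, model_id: str, domain: str, correct: bool) -> float:
        """Adjust the weight once an accepted proposal has been evaluated."""
        return self._apply(model_id, domain, self.reward if correct else -self.penalty)

    def retroactive_penalty(self, model_id: str, domain: str) -> float:
        """Penalize a proposal first credited as correct and later found faulty.

        Revoking the earlier credit in addition to the penalty is an assumption.
        """
        return self._apply(model_id, domain, -(self.reward + self.penalty))
```

With these example values, one incorrect accepted proposal costs a model as much weight as five correct ones earn.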
Temporal Decay
The second calibration mechanism is temporal decay. Trust weights decay over time in the absence of new evidence, reflecting the principle that a model's reliability demonstrated at one point in time may not persist indefinitely. The disclosed motivations for decay are distribution shift, model updates, and changes in the operational context, each of which can render past performance an unreliable predictor of present performance. Decay ensures that a model must continue to earn its weight through fresh accepted-and-correct proposals rather than coasting indefinitely on historical standing.
The decay rate is configurable per domain and per model category. This per-domain, per-category configurability lets a deployment apply rapid decay where the operating environment drifts quickly and slower decay where a model's behavior in a domain is stable. Decay and outcome-based adjustment operate together: a model whose recent proposals are correct accrues weight faster than decay erodes it, while a model that stops contributing, or that contributes incorrectly, sees its weight fall.
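As an illustration only, decay could be modeled as exponential with a half-life looked up per domain and per model category. The exponential form, the function `decayed_weight`, the table `HALF_LIFE_DAYS`, and every value in it are assumptions of this sketch, since the disclosure leaves the decay arithmetic open.

```python
# Hypothetical half-lives in days, keyed by (domain, model_category).
HALF_LIFE_DAYS = {
    ("pricing", "general"): 7.0,      # fast-drifting environment: rapid decay
    ("formatting", "general"): 90.0,  # stable behavior: slow decay
}
DEFAULT_HALF_LIFE_DAYS = 30.0


def decayed_weight(weight: float, days_since_evidence: float,
                   domain: str, model_category: str) -> float:
    """Return the weight after a period with no new evidence.

    The weight halves every half-life; fresh outcome-based adjustments would
    be applied on top of the decayed value.
    """
    half_life = HALF_LIFE_DAYS.get((domain, model_category), DEFAULT_HALF_LIFE_DAYS)
    return weight * 0.5 ** (days_since_evidence / half_life)
```

Under this model, a weight survives only if correct proposals arrive often enough for their increases to offset the loss since the previous evidence.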
Calibration on a Sealed Record
The calibration feedback loop operates on the agent's lineage, and the lineage entries that feed it are tamper-resistant. Every arbitration decision is recorded as a first-class semantic event whose record includes the identities of the competing language models, the candidate mutations they produced, the trust weights applied, the per-dimension scores computed, the selection or reconciliation logic applied, and the identity of the winning or reconciled candidate. Each arbitration event record is cryptographically signed and sealed into the agent's lineage chain using the sealing mechanism described in Chapter 1, and the sealed event cannot be retroactively altered, deleted, or reordered.
This immutability is what gives calibration its integrity: the trust-weight feedback loop operates on a tamper-resistant historical record, preventing an adversary from manipulating the arbitration history to inflate or deflate a particular model's trust score. Validation records for rejected proposals are likewise persisted in lineage, so the evidence base for calibration includes both accepted and rejected outcomes and is available for governance audit.
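The Chapter 1 sealing mechanism is not reproduced in this article. As a rough stand-in, the sketch below assumes a simple hash chain in which each arbitration event is bound to its predecessor, with HMAC-SHA256 substituting for the cryptographic signature. The functions `seal` and `verify` and the record layout are hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder predecessor digest for the first record


def seal(chain: list, event: dict, key: bytes) -> dict:
    """Append an event to the chain, binding it to the previous record's digest."""
    prev = chain[-1]["digest"] if chain else GENESIS
    payload = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode()).hexdigest()
    # HMAC stands in for a real digital signature in this sketch.
    signature = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    record = {"prev": prev, "payload": payload, "digest": digest, "signature": signature}
    chain.append(record)
    return record


def verify(chain: list, key: bytes) -> bool:
    """Check that no record was altered or reordered and that every tag matches.

    Detecting removal of trailing records would need an external anchor,
    which this sketch omits.
    """
    prev = GENESIS
    for record in chain:
        expected = hashlib.sha256((prev + record["payload"]).encode()).hexdigest()
        expected_sig = hmac.new(key, expected.encode(), hashlib.sha256).hexdigest()
        if record["prev"] != prev or record["digest"] != expected:
            return False
        if not hmac.compare_digest(record["signature"], expected_sig):
            return False
        prev = record["digest"]
    return True
```

A calibration routine built on such a chain would call `verify` before reading any arbitration event, so that trust weights are only ever computed from records that pass the integrity check.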
Cross-Agent Trust Recalibration
Because arbitration decisions are first-class persisted events, calibration is not confined to a single agent's local history. When an agent's arbitration history reveals a pattern, for example that a particular language model consistently produces proposals that fail validation or are overridden in arbitration, that pattern can be propagated to other agents that use the same model, enabling network-wide trust recalibration. An arbitration event in which one model's proposal was selected over another's, and in which the selected proposal subsequently proved correct, increases the selected model's trust weight for similar future proposals and decreases the overridden model's weight.
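A simplified view of pattern propagation is sketched below. The event fields, the outcome labels, the functions `failure_rate` and `propagate`, and the threshold, minimum-evidence, and damping parameters are all assumptions; the per-domain key is omitted for brevity.

```python
# Hypothetical outcome labels that count against a model.
FAILURE_OUTCOMES = {"failed_validation", "overridden"}


def failure_rate(events, model_id):
    """Fraction of a model's arbitration events that ended in failure, or None."""
    relevant = [e for e in events if e["model_id"] == model_id]
    if not relevant:
        return None
    failed = sum(1 for e in relevant if e["outcome"] in FAILURE_OUTCOMES)
    return failed / len(relevant)


def propagate(events, model_id, peer_weights,
              threshold=0.6, min_events=5, damping=0.9):
    """Return new peer weights, scaled down for model_id if a failure pattern exists.

    peer_weights maps agent_id -> {model_id: weight}; the input is not mutated.
    """
    count = sum(1 for e in events if e["model_id"] == model_id)
    rate = failure_rate(events, model_id)
    if count < min_events or rate is None or rate < threshold:
        return {agent: dict(w) for agent, w in peer_weights.items()}
    return {
        agent: {m: (v * damping if m == model_id else v) for m, v in w.items()}
        for agent, w in peer_weights.items()
    }
```

Requiring a minimum number of events before propagating is a design choice of this sketch, intended to keep a single local outlier from shifting weights across the network.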
Calibration also composes with the system's anti-gaming substrate. When the evaluation pipeline detects evidence that mastery signals have been gamed, the trust weight assigned to language model proposals that reference the compromised evidence is reduced, causing the arbitration engine to prefer alternative proposals or to reject the affected unlock proposal entirely. A security or gaming event thus feeds the same per-model trust weight that ordinary outcome-based adjustment maintains.
Prior-Art Context
Conventional treatments of language-model reliability characterize a model's reliability as a static property fixed at one point in time. The disclosed mechanism instead maintains a continuous, per-agent, per-domain trust weight that updates with every arbitration outcome and decays in the absence of fresh evidence. Where ensemble methods aggregate model outputs through voting or averaging to produce a blended result that inherits the authority of multiple models, the disclosed architecture treats each model's output as an independent proposal and applies the trust weight only in conflict resolution, never to confer authority on the model itself.
The distinguishing structural contribution is the coupling of per-model trust calibration to a sealed, auditable arbitration lineage. Outcome-based adjustment, asymmetric and retroactive penalties, and temporal decay are computed from immutable arbitration and validation records rather than from a separate, mutable scoring store, which is what allows the trust-weight feedback loop to resist manipulation and to support cross-agent propagation.
Disclosure Scope
The cognition filing (U.S. Application No. 19/647,395 and its international counterpart) discloses trust weight calibration and decay in its LLM skill-gating chapter. The disclosed mechanism comprises the dynamic per-agent and per-domain trust weight that the arbitration engine applies to each language model's proposal; the outcome-based adjustment that increases the weight for accepted-and-correct proposals and decreases it, asymmetrically and retroactively, for accepted-and-incorrect proposals; and the temporal decay of the weight in the absence of new evidence, at a rate configurable per domain and per model category. This article describes that disclosed mechanism.
The scope extends to embodiments in which the calibration feedback operates on cryptographically sealed arbitration and validation events in the agent's lineage, in which arbitration patterns are propagated for network-wide trust recalibration, and in which security or anti-gaming events feed the same per-model trust weight. The disclosure does not depend on the specific arithmetic by which weights are increased, decreased, or decayed; it is the structural arrangement coupling per-model, per-domain trust calibration to a tamper-resistant arbitration lineage that the disclosure intends to capture.