Model-Agnostic Inference Governance
by Nick Clark | Published March 27, 2026
Inference control is positioned as a structural layer above the underlying generator. It does not retrain, fine-tune, or otherwise modify the model. The governed substrate may be a transformer-based large language model, a classical machine-learning classifier, a symbolic rule engine, a constraint solver, or any combination of the foregoing operating in ensemble. The control surface remains invariant; only the bound proposer changes.
Mechanism
The model-agnostic property of the inference-control layer derives from a strict separation between proposal generation and proposal admission. A proposer, of arbitrary internal construction, emits a candidate transition consisting of a proposed semantic state delta together with associated lineage metadata. The control surface receives the candidate through a defined adapter interface and submits it to a deterministic admissibility evaluation. The evaluation does not require, inspect, or assume any property of the proposer's internal representation. It treats the proposer as an opaque function returning structured output.
The adapter interface specifies four obligations on the proposer side: emission of a typed candidate record; emission of a confidence or probability estimate where available, or a sentinel value where not; emission of provenance fields sufficient to reconstruct the proposing context; and acceptance of an admit, reject, or decompose verdict returned by the control surface. No further obligations are placed on the proposer. A transformer with billions of parameters, a decision tree of depth three, and a hand-written if-then ladder are all admissible proposers under the same interface, evaluated against the same policy reference, producing comparable lineage records.
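The four proposer-side obligations can be sketched as a typed contract. This is a minimal illustration, not a normative schema; the names `Candidate`, `Verdict`, and `Proposer` are hypothetical, and `None` stands in for the sentinel confidence value mentioned above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional, Protocol


class Verdict(Enum):
    ADMIT = "admit"
    REJECT = "reject"
    DECOMPOSE = "decompose"


@dataclass(frozen=True)
class Candidate:
    delta: Any                   # proposed semantic-state delta
    confidence: Optional[float]  # calibrated estimate, or None as the sentinel
    provenance: dict[str, str]   # fields sufficient to reconstruct the proposing context


class Proposer(Protocol):
    """Any engine satisfying these two methods is an admissible proposer."""

    def propose(self) -> Candidate: ...
    def accept_verdict(self, verdict: Verdict) -> None: ...
```

Under this sketch, a billion-parameter transformer and a three-line rule ladder differ only in how they populate `Candidate`; the control surface sees the same structure from both.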
Within the control surface, the admissibility function operates over canonical fields populated by the adapter. These fields include the prior committed semantic state, the candidate delta, the applicable policy bindings, the trust slope vector accumulated across recent transitions, and the integrity constraints active for the current operating context. The function executes as a finite composition of typed predicates. Each predicate is total, deterministic, and side-effect free. Verdicts are recorded with their inputs, the predicate trace, and the resulting state, so that any admission or rejection can be reconstructed and audited independently of the proposer that triggered it.
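The admissibility function described above can be illustrated as a short-circuiting composition of named predicates that records its own trace. The names and the reject-on-first-failure policy are illustrative assumptions; the structural point is that every predicate is pure and every verdict carries the trace that produced it.

```python
from dataclasses import dataclass
from typing import Callable

# Canonical fields populated by the adapter (prior state, delta,
# policy bindings, trust slope, integrity constraints) are modeled
# here as a plain dict for brevity.
Fields = dict
Predicate = Callable[[Fields], bool]


@dataclass(frozen=True)
class Evaluation:
    verdict: str
    trace: tuple[tuple[str, bool], ...]  # (predicate name, result) pairs


def evaluate(fields: Fields, predicates: list[tuple[str, Predicate]]) -> Evaluation:
    """Finite composition of total, deterministic, side-effect-free predicates."""
    trace: list[tuple[str, bool]] = []
    for name, pred in predicates:
        result = pred(fields)
        trace.append((name, result))
        if not result:
            return Evaluation("reject", tuple(trace))
    return Evaluation("admit", tuple(trace))
```

Because the trace is recorded alongside the verdict, any admission or rejection can be replayed and audited without consulting the proposer.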
Decomposition is supported as a third verdict alongside admit and reject. When a candidate is neither cleanly admissible nor cleanly inadmissible but contains an admissible substructure, the control surface may return a decompose verdict carrying a structural rewrite. The rewrite is itself expressed in the canonical schema and is re-submitted through the same evaluation, ensuring that derived candidates are subject to the same governance as originals. Decomposition does not invoke the proposer; it operates entirely on structured material already present in the control surface.
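The re-submission path for decomposed candidates can be sketched as a small loop. The `evaluate` callback and the tuple return shape are assumptions for illustration; the essential properties shown are that rewrites pass through the same evaluation as originals and that the proposer is never invoked.

```python
def govern(candidate, evaluate, depth_bound=3):
    """Re-submit structural rewrites through the same evaluation path.

    `evaluate` returns ("admit", None), ("reject", None), or
    ("decompose", rewrite). Rewrites operate on material already in
    the control surface; the proposer is not called.
    """
    for _ in range(depth_bound + 1):
        verdict, rewrite = evaluate(candidate)
        if verdict != "decompose":
            return verdict, candidate
        candidate = rewrite            # derived candidate, canonical schema
    return "reject", candidate         # decomposition depth bound exhausted
```

The depth bound here anticipates the operating parameter described in the next section: recursive rewrites terminate by construction.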
Operating Parameters
The control surface exposes a bounded set of operating parameters through the policy reference. The admissibility threshold governs the minimum predicate-trace score required for an admit verdict. The decomposition depth bound limits the number of recursive rewrites permitted on a single originating candidate, preventing unbounded expansion. The trust slope window defines the number of prior transitions considered when computing the cumulative trust trajectory. The integrity constraint set enumerates the invariants that must hold across every admitted transition, expressed as predicates over the canonical fields.
Each parameter is declared in the policy reference, versioned, and bound at evaluation time. Changes to parameter values produce a new policy version; transitions evaluated under different versions are tagged with their governing version in the lineage record. There is no implicit parameter inheritance across deployments, and there are no parameters internal to the control surface that are not surfaced through the policy reference. The behavior of the layer is therefore fully determined by the policy version active at evaluation time together with the canonical inputs supplied by the adapter.
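The bounded parameter set and its versioning discipline can be sketched with an immutable record. Field names here are hypothetical; what the sketch captures is that every behavioral knob is surfaced through the policy reference and that any change yields a new, distinctly versioned policy rather than an in-place mutation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PolicyReference:
    version: str
    admissibility_threshold: float          # min predicate-trace score for admit
    decomposition_depth_bound: int          # max recursive rewrites per candidate
    trust_slope_window: int                 # prior transitions in the trust trajectory
    integrity_constraints: tuple[str, ...]  # named invariants over canonical fields


def revise(policy: PolicyReference, new_version: str, **changes) -> PolicyReference:
    """Any parameter change produces a new policy version; the old one is untouched."""
    return replace(policy, version=new_version, **changes)
```

Because the record is frozen, lineage entries can safely tag transitions with the governing `version` and rely on it never changing underneath them.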
Alternative Embodiments
The control surface is not specific to natural-language generation. In a first embodiment, the bound proposer is a large language model emitting next-token or next-segment candidates, and the canonical delta is a textual span together with its embedding fingerprint. In a second embodiment, the proposer is a gradient-boosted classifier emitting a class label and calibrated probability; the canonical delta is the class assignment together with the feature vector that produced it. In a third embodiment, the proposer is a forward-chaining rule engine emitting a fact insertion; the canonical delta is the proposed fact together with the rule identifier and binding environment. In each case, the control surface evaluates the candidate without modification.
Composite embodiments are also contemplated. An ensemble proposer may submit candidates from multiple underlying engines through the same adapter, with the engine identifier carried in the provenance fields. The control surface treats each candidate as a first-class submission and applies the same admissibility evaluation. Cross-engine adjudication, where present, is performed downstream of admission and operates on already-admissible candidates rather than on raw proposer output. This preserves the invariant that no unevaluated material enters committed state.
The layer may further be deployed as an in-process library, as a remote service invoked over a typed RPC interface, or as a co-resident sidecar to an existing inference runtime. The deployment topology does not alter the semantics of evaluation. It affects only latency, isolation, and operational boundaries, none of which are part of the structural definition of the control surface.
Composition
The model-agnostic inference-control layer composes with the other structural components of the cognitive architecture without modification. Admitted transitions are committed to the semantic state through the same commit pathway used by every other structural component. Rejected and decomposed candidates are recorded in the lineage with their full evaluation trace, providing identical observability regardless of the proposer that produced them. The trust slope vector maintained by the control surface is shared with the forecasting and reflection components, allowing downstream components to adjust their behavior based on the recent admissibility trajectory of the bound proposer.
Because the layer is structurally separated from the proposer, replacement or upgrade of the underlying engine is a parameter change, not an architectural change. Migrating from one model family to another, or substituting a symbolic engine for a neural one, requires only that the replacement satisfy the adapter contract. The policy reference, lineage schema, and downstream components remain untouched.
Implementation Considerations
Adapter implementation for a transformer-based proposer typically intercepts the proposer at the logits or sampled-token boundary. The intercepted candidate is wrapped into the canonical schema together with provenance fields capturing the model identifier, sampling temperature, and recent context fingerprint. The wrapped candidate is submitted to the control surface synchronously, and the returned verdict is consumed by the proposer's decoding loop. An admit verdict permits the token to be appended to the active generation; a reject verdict instructs the decoder to discard the token and resample under modified constraints; a decompose verdict supplies a structural rewrite that bypasses additional sampling.
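The decoding-loop integration described above can be sketched as follows. `sample_token` and `evaluate` are hypothetical stand-ins for the proposer's sampler and the control surface; the sketch shows only the verdict-driven control flow, not resampling under modified constraints, which a real adapter would implement in the sampler.

```python
def governed_decode(sample_token, evaluate, prompt, max_steps=256):
    """Per-token admission gate inside a decoding loop (illustrative).

    `evaluate` returns ("admit", None), ("reject", None), or
    ("decompose", rewrite), where rewrite is a list of tokens.
    """
    output = []
    for _ in range(max_steps):
        token = sample_token(prompt, output)
        verdict, rewrite = evaluate(token, output)
        if verdict == "admit":
            output.append(token)
            if token == "<eos>":
                break
        elif verdict == "decompose":
            output.extend(rewrite)     # structural rewrite bypasses sampling
        # reject: discard the token; the next iteration resamples
    return output
```

Note that every token reaches `output` only through a verdict, preserving the invariant that no unevaluated material enters committed state.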
Adapter implementation for a classical machine-learning proposer is structurally simpler. The classifier or regressor produces a prediction, which is wrapped into the canonical schema with its calibrated confidence and feature provenance. No decoding loop is required; the returned verdict either commits the prediction to the agent's semantic state or records a rejection event and triggers fallback handling. Adapter implementation for a rule engine is similarly direct: the rule firing is wrapped into the canonical schema, evaluated, and either committed or rejected.
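The classifier-side adapter reduces to a single wrapping step. This is a minimal sketch with hypothetical field names; it mirrors the canonical schema used in the earlier examples, with the feature vector carried as provenance rather than as part of the delta.

```python
def wrap_prediction(label, calibrated_confidence, feature_vector, model_id):
    """Wrap one classifier prediction into the canonical candidate schema.

    No decoding loop is involved: the wrapped record is evaluated once,
    then either committed or recorded as a rejection with fallback handling.
    """
    return {
        "delta": {"class_assignment": label},
        "confidence": calibrated_confidence,
        "provenance": {
            "engine": model_id,
            "features": tuple(feature_vector),  # feature provenance
        },
    }
```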
Latency budget is a deployment concern, not a structural one. The admissibility evaluation is a bounded composition of typed predicates, and its cost is independent of the size or complexity of the proposer. Where proposer latency dominates, the additive cost of evaluation is negligible. Where evaluation latency must be minimized, predicate composition can be staged so that cheap predicates run first and expensive predicates run only when earlier predicates have not already produced a verdict.
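The staging strategy can be sketched as a two-tier evaluation, with the tiers supplied by the policy author. The tier split and return shape are illustrative assumptions; the point is simply that a verdict from a cheap predicate short-circuits the expensive tier.

```python
def staged_evaluate(fields, cheap, expensive):
    """Run cheap predicates first; invoke expensive ones only if no
    verdict has been produced by the earlier tier."""
    for name, pred in cheap:
        if not pred(fields):
            return ("reject", name)
    for name, pred in expensive:
        if not pred(fields):
            return ("reject", name)
    return ("admit", None)
```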
Prior Art Distinction
Conventional approaches to inference governance fall into two categories: post-generation filtering applied after the proposer has produced a complete output, and proposer-internal alignment achieved through training-time interventions such as supervised fine-tuning, reinforcement learning from human feedback, or constitutional methods. The first category cannot prevent the consumption of resources or the contamination of intermediate state by problematic generations; it operates too late. The second category produces governance properties that are entangled with the specific proposer in which they are induced and that do not transfer when the proposer is replaced.
The structural distinction of the present mechanism is that governance is neither downstream of generation nor entangled with it. It is a per-step admission gate operating above the proposer through a typed adapter, evaluated by a deterministic function over canonical fields, and bound exclusively by a versioned policy reference. The mechanism is not a retrained model. It is a control surface whose properties are defined by the policy reference and whose applicability is defined by the adapter contract.
Invariance Properties
Several invariance properties of the control surface follow from its structural definition. Determinism: for fixed canonical inputs and a fixed policy version, the verdict is identical across runs and across deployments. Idempotence: re-evaluating an already-admitted candidate under the same inputs produces the same admit verdict and does not perturb the semantic state. Proposer-independence: replacing the bound proposer while preserving the adapter contract does not change the verdict for canonical inputs that the new proposer happens to produce.
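The determinism property lends itself to a direct property check: for fixed canonical inputs and a fixed policy version, repeated evaluation must yield an identical verdict. The helper below is a hypothetical harness, not part of the control surface itself.

```python
def check_determinism(evaluate, fields, runs=3):
    """Assert that repeated evaluation over fixed inputs is verdict-stable."""
    verdicts = [evaluate(fields) for _ in range(runs)]
    assert all(v == verdicts[0] for v in verdicts), "non-deterministic evaluation"
    return verdicts[0]
```

An analogous harness for idempotence would re-submit an already-admitted candidate and assert that the semantic state is unchanged.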
Lineage equivalence is a further invariance: every admitted, rejected, and decomposed candidate produces a lineage record of identical schema, regardless of the proposer of origin. Downstream analysis, audit, and certification therefore operate on a homogeneous data substrate. The heterogeneity of underlying engines is fully absorbed at the adapter boundary and does not propagate into the lineage.
Policy locality is the final invariance worth noting. Behavioral changes are produced only by modifications to the policy reference. There are no global mutable parameters internal to the control surface, no learned components updated in place, and no implicit state that persists across evaluations beyond the explicitly declared trust slope window. This locality enables deterministic rollback: reverting the policy reference to a prior version restores the prior behavior of the control surface in full.
Disclosure Scope
Disclosure of the model-agnostic inference-control layer encompasses the adapter interface and its obligations on the proposer side; the canonical field schema consumed by the admissibility function; the deterministic predicate composition that produces admit, reject, and decompose verdicts; the bounded operating parameters surfaced through the policy reference; and the lineage schema that records every evaluation. The disclosure extends to all proposer types satisfying the adapter contract, all deployment topologies preserving the evaluation semantics, and all composite ensembles in which the control surface mediates admission of candidates from heterogeneous underlying engines.