LLM as Structurally Untrusted Proposal Generator

by Nick Clark | Published March 27, 2026

A proposal originating from an untrusted source, including a language model, an unverified peer agent, or an external party, is treated as evidence to be evaluated rather than as a command to be executed. Structural separation between the proposing role and the committing role prevents privilege escalation, removes the implicit trust that current integration patterns confer on plausible output, and converts the question of admission from a stylistic judgment into a deterministic gating decision. The architecture is defined in Chapter 7 of the Cognition Patent and operates as a structural component of the agent's cognitive substrate rather than as a downstream filter applied after generation.

Mechanism

The mechanism partitions the agent into two structurally distinct roles. The proposal role is permitted to generate candidate mutations of agent state but is denied any capability to commit those mutations. The commit role is the only role with authority to write to the canonical fields of the agent, and it accepts only inputs that have been routed through validation. A language model, regardless of its sophistication or measured accuracy, is assigned exclusively to the proposal role. Its output is a candidate mutation, not a state transition. The candidate is materialized as a structured object whose schema includes the proposed delta, the lineage of prior fields it reads from, the policy reference it claims to satisfy, and a content hash that allows downstream stages to detect tampering between proposal and validation.
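
The candidate object described above can be sketched as a small immutable record. This is a minimal illustration, not the schema from the disclosure: the names `Candidate`, `propose`, and `content_hash`, the field names, and the choice of SHA-256 over a canonical JSON encoding are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, replace
from typing import Any, Dict, Iterable, Tuple


def content_hash(delta: Dict[str, Any], reads: Tuple[str, ...], policy_ref: str) -> str:
    """Hash a canonical JSON encoding of the proposal contents (assumed scheme)."""
    payload = json.dumps(
        {"delta": delta, "reads": list(reads), "policy_ref": policy_ref},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Candidate:
    delta: Dict[str, Any]      # proposed field -> proposed value
    reads: Tuple[str, ...]     # lineage: canonical fields the proposal read from
    policy_ref: str            # policy the proposal claims to satisfy
    digest: str                # content hash fixed at proposal time

    def is_intact(self) -> bool:
        """True if the contents still match the hash recorded at proposal time."""
        return self.digest == content_hash(self.delta, self.reads, self.policy_ref)


def propose(delta: Dict[str, Any], reads: Iterable[str], policy_ref: str) -> Candidate:
    """Proposal role: build a candidate. Nothing here writes to agent state."""
    delta = dict(delta)
    reads = tuple(reads)
    return Candidate(delta, reads, policy_ref, content_hash(delta, reads, policy_ref))


candidate = propose({"status": "active"}, ["status"], "policy/v1")
# Simulate tampering between proposal and validation: the delta changes,
# but the recorded digest does not.
tampered = replace(candidate, delta={"status": "admin"})
```

A validation stage built on this sketch would call `is_intact()` before evaluating anything else and discard any candidate whose contents no longer match the recorded digest.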

Validation operates as a deterministic evaluation function over the candidate. The function consults the agent's policy reference to retrieve the constraints applicable to the affected fields, consults the integrity field to retrieve any active retention or coherence rules, and consults the capability envelope to determine whether the proposed mutation lies within the agent's authorized range of action. Each of these consultations is itself a structured read against canonical state, and each is recorded in the lineage of the candidate. When validation completes, the candidate is annotated with a verdict: admissible, inadmissible, or contingent. Contingent verdicts are routed to arbitration, where competing proposals from multiple language models or from a language model and a non-generative source are resolved by a tie-breaking function whose rules are themselves declared in policy.
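
A minimal sketch of such a validation function follows. It assumes that policy constraints and integrity rules can be modeled as per-field predicates, that the capability envelope is a set of writable field names, and that a candidate is contingent when it passes every check but another pending candidate targets one of the same fields. All names (`Verdict`, `validate`, `contested`) are hypothetical.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Set, Tuple


class Verdict(Enum):
    ADMISSIBLE = "admissible"
    INADMISSIBLE = "inadmissible"
    CONTINGENT = "contingent"


# A rule inspects the proposed value and the current canonical state.
Rule = Callable[[Any, Dict[str, Any]], bool]


def validate(
    delta: Dict[str, Any],
    state: Dict[str, Any],
    policy: Dict[str, Rule],
    integrity: Dict[str, Rule],
    capabilities: Set[str],
    contested: Set[str],
) -> Tuple[Verdict, List[str]]:
    """Deterministically evaluate a candidate delta; return verdict and reads made."""
    consulted: List[str] = []  # each consultation is recorded for the lineage
    for field in sorted(delta):  # sorted for a reproducible evaluation order
        value = delta[field]

        consulted.append(f"policy:{field}")
        constraint = policy.get(field)
        if constraint is None or not constraint(value, state):
            return Verdict.INADMISSIBLE, consulted

        consulted.append(f"integrity:{field}")
        rule = integrity.get(field)
        if rule is not None and not rule(value, state):
            return Verdict.INADMISSIBLE, consulted

        consulted.append(f"capability:{field}")
        if field not in capabilities:
            return Verdict.INADMISSIBLE, consulted

    # Assumption: overlap with another pending candidate makes the verdict contingent.
    if contested & set(delta):
        return Verdict.CONTINGENT, consulted
    return Verdict.ADMISSIBLE, consulted


state = {"balance": 100}
policy = {"balance": lambda v, s: isinstance(v, int) and v >= 0}
integrity = {"balance": lambda v, s: abs(v - s["balance"]) <= 50}
capabilities = {"balance"}
```

In this sketch a field with no declared policy constraint is rejected rather than admitted by default, which is one conservative reading of "constraints applicable to the affected fields".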

The structural property that distinguishes this mechanism from conventional output filtering is that the proposal never enters agent state until the commit role acts. Conventional integration patterns mutate state and then attempt to reconcile, accepting that some mutations will be wrong and relying on later checks to detect drift. Here, the canonical fields are immutable with respect to proposal-role activity. The agent's observable behavior cannot reflect a proposal that has not been admitted, because no read path traverses the candidate buffer. This eliminates the class of failure in which corrupted state appears internally consistent because it was written before validation occurred.
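
The separation between the candidate buffer and canonical state can be illustrated with a small class. This is a sketch under stated assumptions: `CommitRole`, `submit`, `view`, and `commit` are hypothetical names, the validator is a plain predicate, and Python cannot enforce a real capability boundary, so the underscore-prefixed attributes only stand in for isolation that a runtime would have to provide.

```python
from types import MappingProxyType
from typing import Any, Callable, Dict, Mapping

Validator = Callable[[Dict[str, Any], Mapping[str, Any]], bool]


class CommitRole:
    """Holds canonical state; the only component that writes to it."""

    def __init__(self, initial: Dict[str, Any], validator: Validator) -> None:
        self._canonical: Dict[str, Any] = dict(initial)
        self._pending: Dict[int, Dict[str, Any]] = {}  # candidate buffer
        self._validator = validator
        self._next_ticket = 0

    def view(self) -> Mapping[str, Any]:
        """Read-only view of canonical state. It never reads the candidate buffer."""
        return MappingProxyType(self._canonical)

    def submit(self, delta: Dict[str, Any]) -> int:
        """Accept a candidate into the buffer. Canonical state is untouched."""
        ticket = self._next_ticket
        self._next_ticket += 1
        self._pending[ticket] = dict(delta)
        return ticket

    def commit(self, ticket: int) -> bool:
        """Validate first, then write. A rejected candidate is simply dropped."""
        delta = self._pending.pop(ticket)
        if not self._validator(delta, self._canonical):
            return False
        self._canonical.update(delta)
        return True


def only_known_modes(delta: Dict[str, Any], state: Mapping[str, Any]) -> bool:
    return set(delta) <= {"mode"} and delta.get("mode") in {"idle", "active"}


role = CommitRole({"mode": "idle"}, only_known_modes)
```

Because `view()` reads only `_canonical`, a submitted but unadmitted candidate has no way to influence what observers see, which is the property the paragraph above describes.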

Operating Parameters

The proposal-to-commit pipeline is parameterized along several declared dimensions. The first is the trust level assigned to each proposing source, expressed as a categorical label rather than a numeric score. A language model operating without external grounding is assigned the lowest trust level; a language model operating against a retrieval index whose contents are signed by a known authority is assigned an intermediate level; a deterministic non-generative source is assigned the highest level. Trust level governs the depth of validation applied: the lowest trust level requires complete validation across all applicable constraints, while higher levels may skip checks for constraints that the source structurally guarantees.
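
One way to picture the relationship between trust level and validation depth is as a set difference: the checks to run are all applicable checks minus those the source structurally guarantees. The sketch below assumes three labels matching the levels described above; the check names (`schema`, `range`, `provenance`, `cross_field`) and the contents of the `GUARANTEED` table are invented for illustration.

```python
from enum import Enum
from typing import Dict, FrozenSet


class TrustLevel(Enum):
    # Categorical labels, not numeric scores.
    UNGROUNDED_MODEL = "ungrounded_model"   # lowest
    GROUNDED_MODEL = "grounded_model"       # intermediate (signed retrieval index)
    DETERMINISTIC = "deterministic"         # highest (non-generative source)


# Hypothetical check names; a real policy would declare its own.
ALL_CHECKS: FrozenSet[str] = frozenset({"schema", "range", "provenance", "cross_field"})

# Checks assumed to be structurally guaranteed by each kind of source.
GUARANTEED: Dict[TrustLevel, FrozenSet[str]] = {
    TrustLevel.UNGROUNDED_MODEL: frozenset(),
    TrustLevel.GROUNDED_MODEL: frozenset({"provenance"}),
    TrustLevel.DETERMINISTIC: frozenset({"schema", "provenance"}),
}


def required_checks(
    level: TrustLevel, applicable: FrozenSet[str] = ALL_CHECKS
) -> FrozenSet[str]:
    """Checks that must still run for a source at the given trust level."""
    return applicable - GUARANTEED[level]
```

Under this table the lowest level always receives the full set, and no level removes a check unless the table names it as guaranteed.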

The second parameter is the breadth of the candidate. A proposal that affects a single field is validated against constraints local to that field; a proposal that affects multiple fields is validated against the cross-field invariants declared in policy. The third parameter is the temporal window within which a candidate remains eligible for commit. Candidates that are not committed within their window are discarded, preventing replay of stale proposals against state that has since evolved. The fourth parameter is the arbitration policy applied when multiple candidates target overlapping fields, ranging from strict first-admitted-wins ordering to weighted merging where the policy declares a fold function over the candidate set.
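
The temporal window and the two ends of the arbitration range can be sketched together. The example assumes each admitted candidate carries an admission time and an expiry time as plain floats; `Admitted`, `live`, `first_admitted_wins`, `fold_merge`, and the sample fold `max_per_field` are hypothetical names, and the fold shown is an unweighted stand-in for whatever merge function a policy would declare.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Admitted:
    delta: Dict[str, Any]
    admitted_at: float   # when validation admitted the candidate
    expires_at: float    # end of the window in which it may be committed


def live(candidates: List[Admitted], now: float) -> List[Admitted]:
    """Discard candidates whose commit window has closed."""
    return [c for c in candidates if now <= c.expires_at]


def first_admitted_wins(candidates: List[Admitted], now: float) -> Dict[str, Any]:
    """For each contested field, keep the value from the earliest-admitted candidate."""
    result: Dict[str, Any] = {}
    for c in sorted(live(candidates, now), key=lambda c: c.admitted_at):
        for field, value in c.delta.items():
            result.setdefault(field, value)
    return result


Fold = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]


def fold_merge(candidates: List[Admitted], now: float, fold: Fold) -> Dict[str, Any]:
    """Merge live candidates with a policy-declared fold function."""
    return reduce(fold, (c.delta for c in live(candidates, now)), {})


def max_per_field(acc: Dict[str, Any], delta: Dict[str, Any]) -> Dict[str, Any]:
    """Example fold: keep the largest proposed value for each field."""
    merged = dict(acc)
    for field, value in delta.items():
        merged[field] = max(value, merged.get(field, value))
    return merged


a = Admitted({"priority": 3}, admitted_at=1.0, expires_at=10.0)
b = Admitted({"priority": 5, "retries": 2}, admitted_at=2.0, expires_at=10.0)
stale = Admitted({"priority": 9}, admitted_at=0.5, expires_at=4.0)
```

Evaluated at `now=5.0`, the `stale` candidate has left its window and cannot win arbitration even though it was admitted first.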

Each parameter is declared in the agent's policy reference and is auditable without inspection of the language model itself. An external reviewer can determine, from policy alone, what the agent will accept from a given source, under what conditions, and with what arbitration behavior, without needing to reproduce the model's weights or sampling behavior.

Alternative Embodiments

The mechanism admits embodiments that vary along the placement of the validation function and the granularity of the proposal interface. In a co-located embodiment, the proposal role and the commit role are processes within a single runtime, separated by capability boundaries enforced by the runtime itself. In a distributed embodiment, the roles are separate services communicating over a signed message channel, and the validation function executes within the commit-role service so that the proposal-role service never observes admitted state. In a federated embodiment, multiple commit-role services subscribe to the same proposal stream and admit candidates independently according to their own policy references, allowing different downstream agents to maintain divergent but individually consistent views of the same generative output.
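
The federated embodiment can be illustrated in a few lines: several commit roles consume the same proposal stream and each admits according to its own policy. The sketch assumes a policy is a single predicate over a delta and leaves out transport, signing, and lineage; `federate` and the policy names are hypothetical.

```python
from typing import Any, Callable, Dict, List

Policy = Callable[[Dict[str, Any]], bool]


def federate(
    stream: List[Dict[str, Any]], policies: Dict[str, Policy]
) -> Dict[str, Dict[str, Any]]:
    """Apply one proposal stream to several independent commit roles."""
    views: Dict[str, Dict[str, Any]] = {name: {} for name in policies}
    for delta in stream:
        for name, admits in policies.items():
            # Each subscriber decides independently; no view reads another's state.
            if admits(delta):
                views[name].update(delta)
    return views


stream = [{"temp": 20}, {"temp": 95}, {"mode": "eco"}]
policies: Dict[str, Policy] = {
    "strict": lambda d: set(d) <= {"temp"} and d.get("temp", 0) <= 50,
    "lenient": lambda d: True,
}
views = federate(stream, policies)
```

The two resulting views differ, yet each is consistent with the policy that produced it.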

The proposal interface itself can be embodied at different granularities. A coarse-grained interface accepts whole structured objects and validates them as a unit. A fine-grained interface accepts field-level deltas and validates each delta independently, allowing partial admission of a proposal whose individual components have different validity. A streaming interface accepts incremental tokens from the language model and validates against partial-proposal predicates, enabling the commit role to abort generation early when the partial proposal is structurally inadmissible.
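
The fine-grained and streaming interfaces can each be sketched in a few lines. Both functions below are illustrative assumptions: `admit_fine_grained` models field checks as single-argument predicates and rejects fields with no declared check, and `consume_stream` models a partial-proposal predicate as a function over the text generated so far.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

FieldCheck = Callable[[Any], bool]


def admit_fine_grained(
    delta: Dict[str, Any], checks: Dict[str, FieldCheck]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Validate each field independently; return (admitted, rejected) parts."""
    admitted: Dict[str, Any] = {}
    rejected: Dict[str, Any] = {}
    for field, value in delta.items():
        check = checks.get(field)
        if check is not None and check(value):
            admitted[field] = value
        else:
            rejected[field] = value
    return admitted, rejected


def consume_stream(
    tokens: Iterable[str], prefix_ok: Callable[[str], bool]
) -> Optional[str]:
    """Accumulate tokens; stop pulling from the source once the prefix is inadmissible."""
    text = ""
    for token in tokens:
        text += token
        if not prefix_ok(text):
            return None  # abort early; remaining tokens are never requested
    return text


checks: Dict[str, FieldCheck] = {
    "name": lambda v: isinstance(v, str),
    "age": lambda v: isinstance(v, int) and 0 <= v < 150,
}

pulled = []


def token_source():
    """Stand-in for a model's token stream that records what was requested."""
    for tok in ["SET ", "x", "; DROP ", "TABLE"]:
        pulled.append(tok)
        yield tok


def no_drop(prefix: str) -> bool:
    return "DROP" not in prefix
```

Passing a generator to `consume_stream` means generation stops at the first inadmissible prefix, because the loop never requests the next token.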

The mechanism also admits embodiments that vary by the source of the proposal. The architecture is source-agnostic at the structural level: a proposal originating from an external party, a peer agent of unknown provenance, or a newly joined agent in a multi-agent system is processed through the same pipeline as a proposal originating from an internal language model. The trust level differs, and therefore the validation depth differs, but the structural separation between proposal and commit is invariant.

Composition with Adjacent Mechanisms

The untrusted-proposal mechanism composes with capability awareness, with integrity governance, and with the agent's lineage substrate. Capability awareness contributes the envelope against which proposed mutations are checked for feasibility, ensuring that an admitted proposal is not only policy-compliant but also executable on the substrate. Integrity governance contributes the retention and coherence rules that constrain which mutations preserve the agent's normative trajectory. The lineage substrate records every candidate, every verdict, and every commit, producing a complete and reproducible record of how the agent's state evolved under the influence of generative input.

Composition with these adjacent mechanisms is what converts the untrusted-proposal pipeline from a defensive filter into a structural property of the agent. Filters can be removed or bypassed; structural properties cannot be bypassed without rewriting the architecture. An auditor reviewing an agent that uses this mechanism can verify, from the lineage alone, that no admitted state derives from an unvalidated proposal.
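
The audit described above can be sketched as a single pass over the lineage. The record layout is an assumption: each entry is tagged `candidate`, `verdict`, or `commit`, and verdict entries carry the verdict string. `Entry` and `audit` are hypothetical names, and the check uses nothing but the log itself.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Entry:
    kind: str          # "candidate", "verdict", or "commit"
    candidate_id: str
    detail: str = ""   # for "verdict" entries: the verdict value


def audit(lineage: List[Entry]) -> List[str]:
    """Return ids of commits not preceded by an 'admissible' verdict on a recorded candidate."""
    recorded: Set[str] = set()
    admissible: Set[str] = set()
    violations: List[str] = []
    for entry in lineage:
        if entry.kind == "candidate":
            recorded.add(entry.candidate_id)
        elif entry.kind == "verdict":
            if entry.candidate_id in recorded and entry.detail == "admissible":
                admissible.add(entry.candidate_id)
        elif entry.kind == "commit":
            if entry.candidate_id not in admissible:
                violations.append(entry.candidate_id)
    return violations


clean = [
    Entry("candidate", "c1"),
    Entry("verdict", "c1", "admissible"),
    Entry("commit", "c1"),
    Entry("candidate", "c2"),
    Entry("verdict", "c2", "inadmissible"),
]
broken = clean + [Entry("commit", "c2"), Entry("commit", "c3")]
```

An empty result means every commit in the log traces back to a recorded candidate with an admissible verdict; the proposing model is never consulted.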

Distinction from Prior Art

Prior approaches to language-model integration fall into two categories. The first is direct integration, in which model output is consumed as authoritative and any safety check is applied as a post-hoc filter on the resulting behavior. This approach treats the model as trusted by default and relies on the filter's completeness to catch failures. The second is sandboxed integration, in which model output is executed in an isolated environment and its effects are observed before being merged into agent state. This approach treats the model as untrusted but conflates execution with proposal, requiring the runtime to actually run the proposed action in order to evaluate it.

The mechanism described here is structurally distinct from both. It treats the model as untrusted by default, like sandboxed integration, but does not require execution to evaluate the proposal. Validation is performed against declared policy and canonical state, not against observed behavior. This distinction is material because it allows the agent to reject proposals whose execution would be irreversible, costly, or hazardous, without ever performing the action.
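
The difference from sandboxed integration can be made concrete: the proposed action is plain data, and the decision is taken by reading declared policy, so the executor is never called for a rejected proposal. The action names, the `POLICY` layout, and the functions `evaluate`, `handle`, and `execute` are all invented for this sketch.

```python
from typing import Any, Callable, Dict, List

Action = Dict[str, Any]

# Hypothetical declared policy: which actions are allowed, which are irreversible.
POLICY = {
    "allowed": {"read_file", "append_log"},
    "irreversible": {"delete_volume"},
}

executed: List[str] = []


def execute(action: Action) -> None:
    """Stand-in for a real side effect; it only records that it ran."""
    executed.append(action["name"])


def evaluate(action: Action, policy: Dict[str, set]) -> bool:
    """Decide from declared policy alone. The action is never run here."""
    name = action["name"]
    return name in policy["allowed"] and name not in policy["irreversible"]


def handle(action: Action, policy: Dict[str, set], executor: Callable[[Action], None]) -> bool:
    """Run the executor only for actions that evaluation has already admitted."""
    if not evaluate(action, policy):
        return False
    executor(action)
    return True
```

Calling `handle` with a `delete_volume` proposal returns `False` and leaves `executed` empty, so the hazardous action is rejected without ever being performed.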

Disclosure Scope

The disclosure covers the structural separation of proposal and commit roles, the deterministic validation function, the arbitration of competing candidates, the parameterization by trust level and breadth, and the alternative embodiments described above. It covers any system in which a language model or other generative source produces structured candidates that must traverse a policy-governed validation stage before affecting canonical agent state, regardless of the runtime environment, the model architecture, or the deployment topology. It covers source-agnostic application of the same pipeline to proposals originating from external parties or peer agents.

Implementations that route language-model output directly into agent state without a structurally enforced commit boundary, or that perform validation only against observed execution rather than against declared policy, fall outside the disclosure. Implementations that apply the pipeline to a subset of fields while permitting unmediated writes to other fields fall outside the disclosure with respect to those unmediated fields. The structural property is that the agent's observable behavior cannot reflect an unvalidated proposal, and the disclosure scope tracks that property.

The disclosure further covers the treatment of partial admissions, in which a proposal affecting multiple fields is admitted with respect to some fields and rejected with respect to others, with the lineage recording precisely which subset entered canonical state. It covers the audit construction in which a verifier reconstructs the validation verdicts from the lineage alone, without access to the proposing model, and confirms that no admitted state derives from an inadmissible candidate. It covers the structural separation between the role that may propose and the role that may commit as an invariant of the architecture, regardless of whether the two roles are realized as separate processes, separate services, separate trust domains, or separate capability tokens within a single runtime. The structural separation is the protected property; the realization is an implementation choice.

Finally, the disclosure covers the case in which the proposing source is itself an agent of the same architecture, joining a multi-agent system without prior trust establishment. In that case the receiving agent applies the same pipeline to proposals received from the joining peer as it would apply to language-model output, with the trust level governed by the peer's attestations rather than by a static configuration. The pipeline is invariant under the source; the trust calibration is parameterized by source attestation, and the parameterization is itself declared in policy and recorded in lineage.
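
Trust calibration by attestation can be sketched as an ordered rule table declared in policy, with each decision appended to the lineage. The attestation names, the trust labels, and the function `calibrate` are assumptions for illustration, and the sketch takes the attestations as already verified; checking signatures is outside its scope.

```python
from typing import Any, Dict, FrozenSet, List, Set, Tuple

TrustRule = Tuple[FrozenSet[str], str]

# Hypothetical policy declaration, ordered from most to least demanding.
TRUST_RULES: List[TrustRule] = [
    (frozenset({"signed_identity", "policy_conformance"}), "intermediate"),
    (frozenset({"signed_identity"}), "low"),
]


def calibrate(
    peer_id: str,
    attestations: Set[str],
    rules: List[TrustRule],
    lineage: List[Dict[str, Any]],
    default: str = "lowest",
) -> str:
    """Map a peer's attestations to a trust label and record the decision."""
    level = default
    for required, label in rules:
        if required <= attestations:  # every required attestation is present
            level = label
            break
    lineage.append(
        {"peer": peer_id, "attestations": sorted(attestations), "trust": level}
    )
    return level


lineage: List[Dict[str, Any]] = []
```

A peer that presents no attestations falls through to the default label and therefore receives the same complete validation as an ungrounded language model.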