Pre-Generation vs Post-Generation Distinction
by Nick Clark | Published March 27, 2026
The distinction between content sourced from the substrate and content generated at inference is made explicit at the call site. Downstream consumers see the source distinction as a first-class property of every value they receive, rather than having to infer provenance from heuristic post-processing.
Mechanism
The pre-generation distinction is defined in Chapter 8 of the cognition patent. Every value that crosses an inference call site carries a provenance tag that identifies the value as substrate-derived, inference-generated, or composite. Substrate-derived values are those that originate in the agent's grounded canonical fields, including retrieved knowledge, tool results, sensor readings, and prior verified execution state. Inference-generated values are those produced by a generative model at the call site under evaluation. Composite values are those whose construction blended both classes; the composition itself is recorded as a structural fact rather than collapsed into a single tag.
The tag is not advisory metadata that consumers may honor or ignore at their discretion. It is a structural property of the value type, attached at the call site by the inference control layer and propagated by the type system through every subsequent operation. Stripping the tag is not an available operation. Operations that would otherwise erase provenance, such as concatenation or summarization, instead construct composite values whose provenance is the union of their constituents.
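As a minimal sketch of the propagation rule described above, the following models a tagged value whose concatenation yields the union of its constituents' provenance. The names `Origin`, `Tagged`, and `concat` are illustrative assumptions, not identifiers from the disclosure, and Python cannot enforce the "no stripping" property as strictly as a static type system could.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    SUBSTRATE = "substrate"
    INFERENCE = "inference"


@dataclass(frozen=True)
class Tagged:
    """A value whose provenance travels with it (hypothetical type)."""
    text: str
    origins: frozenset  # the Origin members that contributed to this value

    @property
    def kind(self) -> str:
        # More than one contributing class means the value is composite.
        if len(self.origins) > 1:
            return "composite"
        return next(iter(self.origins)).value

    def concat(self, other: "Tagged") -> "Tagged":
        # Concatenation never erases provenance: the result carries the union.
        return Tagged(self.text + other.text, self.origins | other.origins)


retrieved = Tagged("Paris is the capital of France. ", frozenset({Origin.SUBSTRATE}))
generated = Tagged("It is a pleasant city to visit.", frozenset({Origin.INFERENCE}))
blended = retrieved.concat(generated)
```

Here `blended.kind` reports a composite, and both contributing classes remain visible in `blended.origins` rather than being flattened into one label.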
Because the tag is attached at the call site, downstream consumers do not have to reconstruct provenance after the fact. A retrieval-augmented generation pipeline that, in a conventional system, would receive a single string and apply a heuristic to guess which spans were retrieved and which were generated, instead receives a structured value in which the distinction is already explicit. A safety classifier that, in a conventional system, would treat substrate-derived and inference-generated content uniformly, can apply different admissibility criteria to each class without inferring origin from the content itself.
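To illustrate class-specific admissibility, the sketch below has a consumer receive spans that already carry their origin and apply a different rule to each class. The `Span` type and the rule itself (inference-generated spans may not introduce numeric claims) are hypothetical examples, not criteria taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    text: str
    origin: str  # "substrate" or "inference"


def admissible(span: Span) -> bool:
    # Hypothetical policy: substrate-derived spans pass unchanged, while
    # inference-generated spans are refused if they contain any digit.
    if span.origin == "substrate":
        return True
    return not any(ch.isdigit() for ch in span.text)


answer = [
    Span("The recommended dose is 200 mg.", "substrate"),
    Span(" Taking 800 mg is also fine.", "inference"),
]
rejected = [s.text for s in answer if not admissible(s)]
```

The classifier never inspects the text to guess where it came from; the origin is read directly from the value.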
The mechanism contrasts with post-generation approaches in which generated content is filtered, re-ranked, or evaluated by a learned model only after generation has completed. Post-generation approaches cannot prevent the inference call from consuming resources, cannot prevent the generated content from being committed to intermediate state visible to other parts of the system, and cannot reliably distinguish substrate-derived from inference-generated material once the two have been concatenated into a single output buffer. The pre-generation distinction is structurally upstream of these limitations.
Operating Parameters
The provenance tag schema is declared in the policy reference. Operators may extend the schema to distinguish among substrate sub-classes, such as separating tool results from retrieved knowledge or distinguishing high-trust from low-trust retrieval sources. The inference control layer enforces that every value crossing a call site carries a tag drawn from the declared schema; values lacking a tag are refused at the boundary.
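Boundary enforcement of this kind might look like the following sketch. The tag names in `DECLARED_SCHEMA`, the `BoundaryRefusal` exception, and the dictionary representation of a value are all assumptions made for illustration.

```python
# Hypothetical operator-extended schema with substrate sub-classes.
DECLARED_SCHEMA = frozenset({
    "substrate.retrieval.high_trust",
    "substrate.retrieval.low_trust",
    "substrate.tool",
    "inference",
    "composite",
})


class BoundaryRefusal(Exception):
    """Raised when a value may not cross the call-site boundary."""


def admit(value: dict) -> dict:
    tag = value.get("tag")
    if tag is None:
        raise BoundaryRefusal("value carries no provenance tag")
    if tag not in DECLARED_SCHEMA:
        raise BoundaryRefusal(f"tag {tag!r} is not in the declared schema")
    return value
```

A value tagged `"substrate.tool"` passes through `admit` unchanged, while an untagged value or one tagged with an undeclared class raises `BoundaryRefusal`.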
Composition rules are also declared in policy. The default rule treats any composite as carrying the strictest constraints among its constituents, but operators may declare alternative rules for specific composition operators where the default would be too restrictive or too permissive for the deployment. The rule selection is itself recorded in lineage so that the basis for any composite tag is reconstructable.
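One way to picture policy-declared composition rules is a lookup from composition operator to rule, with a strictest-wins default and a lineage record of which rule fired. The strictness ordering, the `quote` operator, and its alternative rule are hypothetical choices for this sketch.

```python
# Hypothetical ordering: a higher number means stricter constraints.
STRICTNESS = {"substrate": 0, "inference": 1}


def strictest(classes):
    # Default rule: the composite is governed by its strictest constituent.
    return max(classes, key=STRICTNESS.__getitem__)


def first_constituent(classes):
    # Hypothetical alternative for a verbatim-quote operator.
    return classes[0]


RULES = {"quote": first_constituent}  # operators without an entry use the default
lineage = []


def governing_class(operator, classes):
    rule = RULES.get(operator, strictest)
    result = rule(classes)
    # Record the rule selection so the basis for the result is reconstructable.
    lineage.append({"operator": operator, "rule": rule.__name__,
                    "inputs": list(classes), "result": result})
    return result


default_result = governing_class("concat", ["substrate", "inference"])
quote_result = governing_class("quote", ["substrate", "inference"])
```

With these assumptions, `default_result` is governed as inference-generated, `quote_result` as substrate-derived, and `lineage` names the rule responsible for each outcome.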
Visibility parameters control which downstream consumers can observe the full provenance and which see only a coarse classification. A safety classifier may need full provenance to apply class-specific rules, while a downstream summarization stage may need only the coarse substrate-versus-inference distinction. The policy reference declares per-consumer visibility, and the inference control layer projects the appropriate view at each consumer boundary.
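Per-consumer projection can be sketched as a function that returns either the full provenance record or only its coarse class. The consumer names, field names, and the choice to default unknown consumers to the coarse view are assumptions of this example.

```python
# Hypothetical per-consumer visibility declared in policy.
VISIBILITY = {"safety_classifier": "full", "summarizer": "coarse"}


def project(provenance: dict, consumer: str) -> dict:
    # Consumers not listed in policy get the least detailed view.
    level = VISIBILITY.get(consumer, "coarse")
    if level == "full":
        return dict(provenance)
    return {"tag": provenance["tag"]}


record = {"tag": "substrate", "source_id": "doc-17", "retrieval_score": 0.92}
classifier_view = project(record, "safety_classifier")
summarizer_view = project(record, "summarizer")
```

The classifier sees the source identifier and score, while the summarizer sees only that the value is substrate-derived.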
Alternative Embodiments
In a retrieval-augmented embodiment, substrate-derived values originate from a vector store or document index, and the provenance tag includes the source identifier and retrieval score. Downstream generation can be conditioned on substrate content while ensuring that any inference-generated continuation is tagged distinctly, supporting faithful citation and grounded answer construction.
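The citation behavior can be illustrated with a renderer that attaches a source marker only to substrate-derived spans. The `Span` fields, the sample identifier `doc-17`, and the bracketed citation format are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    text: str
    origin: str                      # "substrate" or "inference"
    source_id: Optional[str] = None  # set only for retrieved spans
    retrieval_score: Optional[float] = None


def render_with_citations(spans) -> str:
    parts = []
    for span in spans:
        if span.origin == "substrate":
            # Only retrieved content is attributed to a source.
            parts.append(f"{span.text} [{span.source_id}]")
        else:
            parts.append(span.text)
    return " ".join(parts)


spans = [
    Span("Water boils at 100 C at sea level.", "substrate", "doc-17", 0.92),
    Span("So pasta cooks more slowly at altitude.", "inference"),
]
rendered = render_with_citations(spans)
```

Because the generated continuation carries its own tag, it cannot be attributed to `doc-17` by accident.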
In a tool-using embodiment, substrate-derived values originate from external tool calls, and the provenance tag carries the tool identifier, call parameters, and result digest. The downstream agent can distinguish tool-grounded claims from model-generated elaboration, supporting auditable decision pipelines in regulated settings.
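A tool-result tag of the kind described might be built as follows. The field names and the choice of SHA-256 for the result digest are assumptions; the disclosure does not name a hash function.

```python
import hashlib


def tool_provenance(tool_id: str, params: dict, result: str) -> dict:
    # The digest lets an auditor confirm later that a claim matches the
    # result the tool actually returned.
    digest = hashlib.sha256(result.encode("utf-8")).hexdigest()
    return {"tag": "substrate", "tool_id": tool_id,
            "params": dict(params), "result_digest": digest}


def matches(tag: dict, claimed_result: str) -> bool:
    claimed = hashlib.sha256(claimed_result.encode("utf-8")).hexdigest()
    return claimed == tag["result_digest"]


tag = tool_provenance("fx_rates", {"pair": "EUR/USD"}, "1.0842")
```

An auditor holding `tag` can check a claimed tool result with `matches` without trusting the model's restatement of it.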
In a multi-modal embodiment, the substrate includes sensor readings whose provenance tags carry sensor identifier, timestamp, and calibration state. The inference control layer ensures that generated descriptions of sensor input are distinguishable from the sensor input itself, preventing the failure mode in which a model's hallucinated description of a scene is later treated as if it were a sensor reading.
In a multi-agent embodiment, provenance tags are preserved across inter-agent message boundaries. A receiving agent observes not only that a value was produced by a peer agent, but also whether the peer's value was itself substrate-derived or inference-generated. This supports the construction of trust gradients across cooperating agents without requiring each agent to re-verify the entire causal history of every input it receives.
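Tag preservation across an agent boundary can be sketched with a JSON wire format in which the receiver refuses messages lacking a recognized tag and derives a trust weight from the peer's original tag. The wire format, the trust weights, and the function names are all hypothetical.

```python
import json

# Hypothetical trust weights per origin class.
TRUST_BY_ORIGIN = {"substrate": 0.9, "inference": 0.4}


def to_wire(value: dict) -> str:
    return json.dumps(value, sort_keys=True)


def from_wire(wire: str, peer_id: str) -> dict:
    value = json.loads(wire)
    if value.get("tag") not in TRUST_BY_ORIGIN:
        raise ValueError("peer message lacks a recognized provenance tag")
    # The peer's tag is kept as-is; the receiver only adds what it learned.
    return {**value, "peer": peer_id, "trust": TRUST_BY_ORIGIN[value["tag"]]}


observed = from_wire(to_wire({"text": "door is open", "tag": "substrate"}), "agent-b")
guessed = from_wire(to_wire({"text": "door is probably open", "tag": "inference"}), "agent-b")
```

Both messages arrive from the same peer, yet the receiver can weight the observed reading above the guess.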
Failure Modes Prevented
The pre-generation distinction prevents three failure modes characteristic of post-generation pipelines. The first is provenance collapse, in which substrate-derived and inference-generated content are concatenated into a single output buffer and downstream consumers treat the entire buffer as having uniform provenance. Provenance collapse is the structural cause of citation failures in retrieval-augmented systems, where a generated continuation is later attributed to a retrieved source it never came from.
The second is gate evasion, in which a safety classifier trained on post-generation content fails to recognize a class of generated material because its surface form differs from the training distribution. A classifier that consults provenance instead of, or in addition to, surface features is not subject to this failure mode for the inference-versus-substrate distinction; the structural origin is observable regardless of how the content presents.
The third is silent contamination, in which inference-generated content is committed to intermediate state and consumed by another component before any post-generation gate has run. Within-loop governance evaluates each candidate transition before it is committed, so this contamination is ruled out by construction. The post-generation alternative cannot offer this guarantee because, by definition, generation has already occurred by the time its gate runs.
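Evaluation before commitment can be pictured as a state container whose only write path runs the gate first. The class name, the gate, and the policy that this particular state admits only substrate-derived values are assumptions for the sketch.

```python
class IntermediateState:
    """State that other components read; writes pass through a gate first."""

    def __init__(self, gate):
        self._gate = gate
        self._entries = []

    def commit(self, candidate: dict) -> bool:
        # The gate runs before the write, so a refused candidate is never
        # visible to any reader of this state.
        if not self._gate(candidate):
            return False
        self._entries.append(candidate)
        return True

    def snapshot(self) -> tuple:
        return tuple(self._entries)


def ledger_gate(candidate: dict) -> bool:
    # Hypothetical policy: only substrate-derived values may enter.
    return candidate["tag"] == "substrate"


state = IntermediateState(ledger_gate)
accepted = state.commit({"text": "balance: 40.00", "tag": "substrate"})
refused = state.commit({"text": "balance is likely 45.00", "tag": "inference"})
```

After both calls, `state.snapshot()` contains only the substrate-derived entry.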
Composition
The pre-generation distinction composes with the planning graph mechanism. A planning branch annotated with predominantly inference-generated content can be subjected to stricter admissibility criteria for promotion than a branch grounded in substrate-derived content. The forecasting engine can therefore prefer plans whose justification rests on grounded inputs without forbidding speculative branches outright.
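As one possible reading of "stricter admissibility criteria", the sketch below raises a branch's promotion threshold in proportion to the share of inference-generated inputs it rests on. The linear formula and the 0.5 and 0.4 constants are invented for illustration and do not come from the disclosure.

```python
def inference_fraction(tags) -> float:
    if not tags:
        raise ValueError("a branch must cite at least one input")
    return sum(tag == "inference" for tag in tags) / len(tags)


def promotion_threshold(tags) -> float:
    # Hypothetical rule: fully grounded branches need a score of 0.5,
    # fully speculative branches need 0.9.
    return 0.5 + 0.4 * inference_fraction(tags)


def promotable(score: float, tags) -> bool:
    return score >= promotion_threshold(tags)


grounded = promotable(0.7, ["substrate", "substrate"])
speculative = promotable(0.7, ["inference", "inference"])
```

With the same score, the grounded branch is promoted and the speculative one is held back, yet a speculative branch with a sufficiently high score would still qualify.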
The distinction composes with cognitive forensics. Lineage entries record the provenance tags of every value that contributed to a planning graph or a committed action. Forensic reconstruction can therefore answer not only what the agent did but on what kind of evidence it acted; an investigator can determine whether a problematic action was driven by substrate-derived inputs that may themselves have been corrupted, or by inference-generated content that the policy should have excluded from that decision class.
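A forensic query over such lineage entries might be sketched as below. The entry layout, the sample actions, and the function name are hypothetical.

```python
# Hypothetical lineage: each committed action lists its inputs and their tags.
LINEAGE = [
    {"action": "issue_refund",
     "inputs": [("order_total", "substrate"), ("customer_intent", "inference")]},
    {"action": "send_receipt",
     "inputs": [("order_total", "substrate")]},
]


def actions_resting_on(lineage, origin: str):
    # Return every action with at least one input of the given origin class.
    return [entry["action"] for entry in lineage
            if any(tag == origin for _, tag in entry["inputs"])]


speculative_actions = actions_resting_on(LINEAGE, "inference")
```

An investigator can then ask whether policy should have allowed inference-generated inputs for the decision class of each action returned.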
The distinction composes with downstream classifiers and policy gates. Because provenance is explicit at the call site, classifiers do not need to be trained to guess provenance from surface features of the content. This eliminates a class of failure modes in which a classifier is bypassed by content whose surface form does not match its training distribution but whose provenance would have triggered a different rule.
Call-Site Semantics
The call site is the structural locus at which the inference control layer attaches provenance. In the architectural model defined by the cognition patent, every invocation of a generative model passes through this layer, and every value emitted by the layer carries a tag that identifies its origin class. Values bypassing the layer are not admitted as inputs to subsequent stages; the inference control layer is the sole construction site for the tagged value type.
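The "sole construction site" property can be approximated in Python with a private key that only the layer holds. This is a sketch under that assumption: the names are hypothetical, and Python's dynamism means the restriction is a convention rather than a hard guarantee.

```python
_LAYER_KEY = object()  # held only by the inference control layer


class Tagged:
    __slots__ = ("_text", "_tag")

    def __init__(self, text, tag, _key=None):
        if _key is not _LAYER_KEY:
            raise TypeError("Tagged values are built only by the control layer")
        self._text = text
        self._tag = tag

    @property
    def text(self):
        return self._text

    @property
    def tag(self):
        return self._tag


class InferenceControlLayer:
    def __init__(self, model):
        self._model = model  # any callable from prompt to text

    def generate(self, prompt):
        return Tagged(self._model(prompt), "inference", _key=_LAYER_KEY)

    def ground(self, text):
        return Tagged(text, "substrate", _key=_LAYER_KEY)


def next_stage(value):
    # Values that bypassed the layer are not admitted.
    if not isinstance(value, Tagged):
        raise TypeError("untagged input refused")
    return value.tag


layer = InferenceControlLayer(lambda prompt: prompt.upper())
out = layer.generate("hello")
```

Because the model is passed in as a plain callable, swapping it for another leaves the tagging path unchanged.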
Because the tag is constructed at the call site, the cost of provenance attribution is paid once, at the moment provenance is unambiguous. Subsequent operations that receive the value do not need to re-derive provenance and cannot accidentally erase it. A summarization stage that receives a substrate-derived input and an inference-generated input produces a composite output whose tag reflects both sources; the consumer of that summary observes the composition rather than a flattened result.
The call-site discipline distinguishes the mechanism from approaches in which provenance is reconstructed downstream. Reconstruction approaches require either that the generative model be modified to emit attribution alongside content, which is brittle and model-specific, or that a downstream classifier learn to recognize provenance from surface features, which fails systematically on content whose surface form does not match the training distribution. The call-site discipline depends on neither of these and works uniformly across model substitutions.
Prior Art
Reinforcement learning from human feedback shapes generation through reward signals applied during training. RLHF does not produce per-value provenance at inference time; the trained model produces a single output stream, and the consumer cannot distinguish substrate-derived from inference-generated spans without external annotation. The mechanism described here operates at a different stage of the pipeline and produces a different artifact.
Post-generation filtering and re-ranking systems evaluate complete generated outputs and accept, reject, or reorder them. They cannot prevent the inference call from running, cannot intervene before partial state is exposed to other components, and cannot reliably attribute provenance once substrate and generated content have been concatenated. The pre-generation distinction prevents these failure modes structurally.
Citation and grounding systems annotate generated text with retrieved source references, but the annotation is typically a heuristic post-process applied to the generated string, and it does not constrain what the downstream consumer sees as a value type. The mechanism here makes the distinction part of the value's structural identity, not an annotation that may or may not survive subsequent transformations.
Information flow control systems in operating systems and databases provide a precedent for tagged values whose tags propagate through computation. The mechanism here generalizes the principle to the inference call site, where the tag classes correspond to substrate origin versus inference generation rather than to user labels or security clearances.
Disclosure Scope
The disclosure covers the attachment of provenance tags at the inference call site, the propagation of those tags through subsequent operations as a structural property of the value type, the construction of composite tags for blended values under policy-declared composition rules, the projection of per-consumer visibility over the full provenance, and the use of the distinction to drive admissibility criteria, planning graph evaluation, and forensic reconstruction. It covers retrieval-augmented, tool-using, multi-modal, and multi-agent embodiments, and any inference architecture in which downstream consumers must distinguish substrate-derived from inference-generated content without relying on heuristic post-processing of generated output.