Persistent Semantic State: Eliminating Prompt Reconstruction
by Nick Clark | Published March 27, 2026
The semantic-discovery substrate maintains persistent state across queries within a governance scope. Discovery objects accumulate structured memory, context, and cognitive fields as they traverse the substrate; the accumulated state is bounded by retention policy, sealed into a tamper-evident lineage, and exposed through queryable lineage operators that allow any downstream consumer to reconstruct exactly how a conclusion was reached. There is no prompt to rebuild between steps. The discovery object's persistent state is the substrate of inference, and that state is governed, retained, and auditable as a first-class property of the architecture.
Mechanism: Persistent Semantic State Across a Governance Scope
The mechanism specifies that a discovery object maintains its full accumulated context as structured, typed data fields rather than as a textual prompt that must be reconstructed and reinterpreted at each step. The object's memory field contains an ordered sequence of records — each record an immutable observation drawn from an anchor visited during traversal, sealed at the moment of observation, and addressed by content. The object's context field reflects the current synthesized understanding produced by governed mutation operators that consume memory records and emit a new context state. The object's cognitive fields are derived states computed from memory and context: confidence distributions, hypothesis sets, open questions, dependency graphs.
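The three fields described above can be sketched as typed data. This is a minimal illustration, not the substrate's actual schema: the names `MemoryRecord`, `DiscoveryObject`, `observe`, and `synthesize` are hypothetical, and SHA-256 over canonical JSON is only one assumed way to address a record by content.

```python
import hashlib
import json
from dataclasses import dataclass, field


def content_address(payload: dict) -> str:
    """Address a payload by the SHA-256 digest of its canonical JSON form."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)  # frozen: a record cannot be changed once sealed
class MemoryRecord:
    anchor_id: str
    observation: str
    address: str


def observe(anchor_id: str, observation: str) -> MemoryRecord:
    """Seal an observation at the moment it is made (hypothetical helper)."""
    payload = {"anchor_id": anchor_id, "observation": observation}
    return MemoryRecord(anchor_id, observation, content_address(payload))


@dataclass
class DiscoveryObject:
    memory: list = field(default_factory=list)     # ordered MemoryRecord sequence
    context: dict = field(default_factory=dict)    # current synthesized understanding
    cognitive: dict = field(default_factory=dict)  # derived from memory and context


def synthesize(obj: DiscoveryObject) -> None:
    """Hypothetical mutation operator: consume memory, emit a new context state."""
    obj.context = {
        "anchors_seen": sorted({r.anchor_id for r in obj.memory}),
        "supported_by": [r.address for r in obj.memory],
    }
    # Cognitive fields are recomputed from memory and context, never set directly.
    obj.cognitive = {
        "open_questions": [] if obj.memory else ["no observations yet"],
    }


obj = DiscoveryObject()
obj.memory.append(observe("anchor-7", "invoice total differs from ledger"))
obj.memory.append(observe("anchor-2", "ledger entry amended after close"))
synthesize(obj)
```

The context state carries the addresses of the records that support it, so a later step continues from typed fields and has no textual history to rebuild.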
Persistence operates within a governance scope rather than globally. A governance scope is a credentialed boundary within which the persistent state is shared across queries, retained according to a defined policy, and exposed through defined operators. Scopes are nested: an enterprise scope may contain departmental scopes, which may contain investigation-specific scopes. A discovery object's state is bound to the scope under which it was created, and a later query can read that state only if the state's scope is the scope the requesting credential resolves into, or a parent of it. State produced under one scope does not leak into a sibling scope, but a scope's state is available to its own subsequent queries as the substrate they continue from.
Bounded retention is structural. Each scope carries a retention policy as a credentialed object: maximum age of records, maximum cardinality of memory, classes of records subject to automatic redaction or summarization, and rules for the production of derived summaries that may persist after the underlying records are retired. Retention is enforced by governed operators that run on a schedule defined in the policy. Retired records are not deleted in place; they are superseded by retirement events that record what was retired, when, and under what policy clause. The lineage therefore preserves the fact of retirement even when it does not preserve the retired content.
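As a rough sketch of retirement by supersession, the lineage below is an append-only list and record content lives in a separate store. The event types `Observed` and `Retired`, the function `retire_expired`, and the clause label `"max_age"` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Observed:
    address: str
    at: datetime


@dataclass(frozen=True)
class Retired:
    address: str
    at: datetime
    policy_clause: str


def retire_expired(lineage: list, content: dict, max_age: timedelta, now: datetime) -> None:
    """Append a Retired event for each expired record and drop its content."""
    expired = [
        e for e in lineage
        if isinstance(e, Observed) and e.address in content and now - e.at > max_age
    ]
    for e in expired:
        del content[e.address]  # the content goes away
        lineage.append(Retired(e.address, now, "max_age"))  # the fact of retirement stays


now = datetime(2026, 3, 27, tzinfo=timezone.utc)
lineage = [
    Observed("a1", now - timedelta(days=40)),
    Observed("b2", now - timedelta(days=2)),
]
content = {"a1": "old observation", "b2": "recent observation"}
retire_expired(lineage, content, timedelta(days=30), now)
```

After the run, the original `Observed` event for `a1` is still in the lineage and is followed by a `Retired` event naming the clause, while the content of `a1` is no longer retrievable.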
Tamper-evidence is a property of the persistent state itself, not a separate audit subsystem. Every mutation of memory, context, or cognitive fields is sealed into a content-addressed event chain. The current state of any field is the head of its event chain, and the chain is verifiable end-to-end by any party with read access to the scope. Modifications produced outside the governed mutation operators produce a divergent address and are detectable on next read. This makes the persistent state usable as evidence: a downstream consumer can verify that a stated conclusion arose from the recorded observations through the recorded mutations and not from any out-of-band source.
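One common way to obtain this property is a hash chain, in which each event's address covers both its payload and the address of its predecessor. The sketch below assumes SHA-256 and JSON serialization; `EventChain` and `seal` are illustrative names, not part of the described substrate.

```python
import hashlib
import json


def seal(prev_address, payload: dict) -> str:
    """Derive an event address from its payload and its predecessor's address."""
    body = json.dumps({"prev": prev_address, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class EventChain:
    def __init__(self):
        self.events = []  # each event: {"address", "prev", "payload"}

    @property
    def head(self):
        """The current state of the field is the address of the latest event."""
        return self.events[-1]["address"] if self.events else None

    def append(self, payload: dict) -> str:
        prev = self.head
        address = seal(prev, payload)
        self.events.append({"address": address, "prev": prev, "payload": payload})
        return address

    def verify(self) -> bool:
        """Recompute every address from the start; any divergence fails."""
        prev = None
        for event in self.events:
            if event["prev"] != prev or seal(prev, event["payload"]) != event["address"]:
                return False
            prev = event["address"]
        return True


chain = EventChain()
chain.append({"field": "memory", "op": "observe", "value": "ledger entry amended"})
chain.append({"field": "context", "op": "synthesize", "value": "possible backdating"})
```

A payload edited outside `append` no longer matches its recorded address, so `verify()` returns `False` on the next read. Any reader can run the check without trusting the writer.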
Queryable lineage is exposed through operators that resolve from any field state back to its supporting events. A consumer holding a context field's address can query the operators to retrieve the memory records that produced it, the mutation operators that consumed them, the credentialing chain of each record, and the scope policy under which the chain ran. The operators support both shallow queries (immediate predecessors) and deep queries (the full transitive chain). The lineage is the evidentiary substrate on which downstream consumers — auditors, regulators, adversarial reviewers, the discovery object's own future steps — make decisions about whether to rely on the state.
Operating Parameters: Scope, Retention, and Lineage Depth
Governance-scope parameters specify the credentialing tier required to create a scope, the credentialing tier required to read state within it, the scope's parent (if any), and the rules for inheritance of state from the parent. A scope may be created within a parent only by a credential whose authority is bounded by the parent's authority. State produced under a parent is read-available to its children unless the parent's policy specifies otherwise; state produced under a child is not read-available to siblings or to the parent unless the child's policy explicitly publishes it.
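The read rules above can be expressed as a small predicate. The sketch makes two assumptions the text does not settle: inheritance from a parent reaches all descendants, and a child's publication reaches only its immediate parent. `Scope`, `can_read`, and both flag names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare scopes by identity, not by field values
class Scope:
    name: str
    parent: Optional["Scope"] = None
    shares_with_children: bool = True   # a parent's policy may switch this off
    publishes_to_parent: bool = False   # a child's policy may switch this on


def can_read(reader: Scope, owner: Scope) -> bool:
    """May a credential bound to `reader` read state produced under `owner`?"""
    if reader is owner:
        return True
    # Upward: a parent reads a child's state only if the child publishes it.
    if owner.parent is reader:
        return owner.publishes_to_parent
    # Downward: walk from the reader toward the root looking for the owner.
    # Assumption: a sharing parent is readable by every descendant.
    node = reader.parent
    while node is not None:
        if node is owner:
            return owner.shares_with_children
        node = node.parent
    return False  # siblings and unrelated scopes never see each other's state


enterprise = Scope("enterprise")
legal = Scope("legal", parent=enterprise)
finance = Scope("finance", parent=enterprise)
case = Scope("case-17", parent=legal)
```

With these scopes, `case` can read `enterprise` state by inheritance, `finance` cannot read `legal` state, and `enterprise` cannot read `legal` state until `legal.publishes_to_parent` is set.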
Retention parameters specify maximum age, maximum cardinality, and class-conditioned redaction rules. Maximum age may be expressed as an absolute duration or as a function of the record's class — for example, observations bearing personal identifiers may carry a shorter retention than observations bearing only structural data. Maximum cardinality is enforced by retiring the oldest records past the bound, except where a record is pinned by a current governance hold (a credentialed signal that retention must be extended for compliance, litigation, or audit reasons). Redaction rules specify transformations applied to records before retirement — a record may be retired entirely, or replaced with a structural summary that preserves the lineage's evidentiary value while removing sensitive content.
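The interaction between maximum cardinality and governance holds can be illustrated with a short function. It is a sketch under simplifying assumptions: records are represented only by their addresses, ordered oldest first, and `enforce_cardinality` is a hypothetical name.

```python
def enforce_cardinality(records: list, max_records: int, held: set) -> tuple:
    """Retire the oldest unpinned records until the bound is met.

    `records` is ordered oldest first; `held` contains addresses pinned by a
    governance hold. Returns (kept, retired), both in original order.
    """
    excess = len(records) - max_records
    retired = []
    for address in records:
        if excess <= 0:
            break
        if address in held:
            continue  # pinned records are skipped, however old they are
        retired.append(address)
        excess -= 1
    retired_set = set(retired)
    kept = [a for a in records if a not in retired_set]
    return kept, retired


kept, retired = enforce_cardinality(
    ["r1", "r2", "r3", "r4", "r5"], max_records=3, held={"r1"}
)
```

Here `r1` is the oldest record but survives because it is held, so `r2` and `r3` are retired instead. If holds pin more records than the bound allows, this sketch leaves memory over the bound and never retires a held record.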
Lineage-depth parameters specify how far back the queryable operators are required to resolve. Shallow lineage exposes only immediate predecessors; deep lineage exposes the full transitive chain to original observation; bounded lineage exposes the chain back to a defined boundary (for example, back to the most recent credentialed summary). The required depth is set by the consumer's purpose: a regulator may require deep lineage; an internal user may operate with bounded lineage for performance reasons. The lineage operators themselves enforce the depth; they will not return a state's address without also returning the address of its supporting chain at the requested depth.
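The three depth modes can be sketched as a breadth-first walk over a predecessor graph. The graph contents, the `resolve` function, and the convention that a node of kind `"summary"` marks a credentialed boundary are assumptions made for the example.

```python
from collections import deque

# Hypothetical lineage graph: address -> (kind, predecessor addresses).
LINEAGE = {
    "ctx-3": ("context", ["ctx-2"]),
    "ctx-2": ("context", ["sum-1", "obs-3"]),
    "sum-1": ("summary", ["obs-1", "obs-2"]),  # credentialed summary
    "obs-1": ("observation", []),
    "obs-2": ("observation", []),
    "obs-3": ("observation", []),
}


def resolve(address: str, depth: str = "deep") -> dict:
    """Return a state's address together with its supporting chain."""
    if depth not in ("shallow", "deep", "bounded"):
        raise ValueError(f"unknown depth: {depth}")
    _, predecessors = LINEAGE[address]
    if depth == "shallow":
        return {"state": address, "chain": list(predecessors)}
    chain, seen, queue = [], set(), deque(predecessors)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        chain.append(node)
        kind, preds = LINEAGE[node]
        if depth == "bounded" and kind == "summary":
            continue  # include the summary, but do not look behind it
        queue.extend(preds)
    return {"state": address, "chain": chain}
```

For `ctx-3`, a shallow query returns only `ctx-2`, a bounded query stops at `sum-1`, and a deep query continues to `obs-1` and `obs-2`. The function always returns the state address together with its chain, mirroring the rule that the operators never return one without the other.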
Mutation-operator parameters specify which classes of mutations are permitted within a scope, the credentialing required to invoke each, and the rate at which they may run. Some operators are inferential (they produce new context from existing memory); some are evidentiary (they incorporate new observations from anchors); some are reductive (they produce summaries that allow retirement of underlying records). Each operator is a credentialed object whose admissibility within a scope is part of the scope's policy.
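A scope's admissibility check for mutation operators might look like the following sketch. Integer credential tiers, a simple per-window run counter, and the names `OperatorPolicy`, `ScopePolicy`, and `admit` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OperatorPolicy:
    kind: str                 # "inferential", "evidentiary", or "reductive"
    min_tier: int             # credential tier required to invoke
    max_runs_per_window: int  # rate bound within one policy window


@dataclass
class ScopePolicy:
    operators: dict                           # operator name -> OperatorPolicy
    runs: dict = field(default_factory=dict)  # operator name -> runs this window

    def admit(self, operator: str, caller_tier: int) -> tuple:
        """Decide whether an invocation is admissible, and say why if not."""
        policy = self.operators.get(operator)
        if policy is None:
            return False, "operator not admissible in this scope"
        if caller_tier < policy.min_tier:
            return False, "insufficient credential tier"
        used = self.runs.get(operator, 0)
        if used >= policy.max_runs_per_window:
            return False, "rate limit reached"
        self.runs[operator] = used + 1
        return True, "admitted"


scope_policy = ScopePolicy(operators={
    "incorporate_observation": OperatorPolicy("evidentiary", min_tier=1, max_runs_per_window=2),
    "summarize_memory": OperatorPolicy("reductive", min_tier=3, max_runs_per_window=1),
})
```

An operator that is absent from the scope's table is refused before tier or rate is considered, which reflects the point that admissibility is part of the scope's policy rather than a property of the operator alone.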
Cognitive-field parameters specify which derived fields the discovery object maintains, how often they are recomputed, and the staleness bounds beyond which a field is considered out of sync with its supporting state. Derived fields are not part of the evidentiary substrate; they are conveniences computed from it. A consumer that requires evidentiary precision queries the underlying state through the lineage operators rather than reading derived fields directly.
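A staleness bound can be illustrated by stamping each derived field with the position of the event chain at which it was computed. This is only a sketch: measuring lag as a count of events, the toy `confidence` metric, and the names `DerivedField`, `is_stale`, and `read_confidence` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DerivedField:
    value: float
    computed_at: int  # number of sealed events when the field was last computed


def is_stale(derived: DerivedField, head_position: int, max_lag: int) -> bool:
    """A field is stale once the chain has moved more than max_lag events past it."""
    return head_position - derived.computed_at > max_lag


def confidence(events: list) -> float:
    """Toy derived metric: share of events that support the hypothesis."""
    if not events:
        return 0.0
    return sum(1 for e in events if e["supports"]) / len(events)


def read_confidence(derived: DerivedField, events: list, max_lag: int) -> DerivedField:
    """Return the field as-is if fresh, otherwise recompute it from the events."""
    if is_stale(derived, len(events), max_lag):
        return DerivedField(confidence(events), len(events))
    return derived


events = [{"supports": True}, {"supports": False}]
cached = DerivedField(confidence(events), computed_at=len(events))
events += [{"supports": True}] * 3
refreshed = read_confidence(cached, events, max_lag=2)
```

The cached value is a convenience. A consumer that needs evidentiary precision would read the events themselves, which is why the recomputation takes the event list rather than the previous value as input.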
Alternative Embodiments
The persistent-state mechanism admits embodiments at the storage layer that range from in-memory event logs (suitable for ephemeral scopes with short retention bounds) to durable distributed-ledger backings (suitable for long-retention scopes with cross-organizational consumers). The architectural commitment is to content-addressed event chains and governed mutation operators; the storage substrate is interchangeable so long as those properties hold.
Scope structures admit single-tenant embodiments (one scope per organization, with internal sub-scopes), multi-tenant embodiments (a shared substrate hosting many independent scopes under a federation policy), and federated embodiments (independently operated substrates exposing cross-substrate lineage operators under inter-organizational credentialing). Federated embodiments are particularly relevant where investigations or research span jurisdictions and where each jurisdiction maintains sovereignty over its own scope state.
Retention embodiments include unitary retention (a single policy for the scope), class-conditioned retention (different policies per record class), event-conditioned retention (policies that change in response to credentialed events such as the opening of a litigation hold), and consumer-conditioned retention (policies that retain or summarize differently depending on the consumer authorized to read). Mixed embodiments are common: a typical enterprise scope combines class-conditioned retention with event-conditioned holds.
Lineage-exposure embodiments range from full lineage exposure (every consumer with read access to a state can resolve its full chain) to mediated exposure (consumers receive lineage proofs without direct chain access, suitable for cases where the chain itself contains sensitive intermediate records). Mediated exposure uses cryptographic accumulators or zero-knowledge proofs to demonstrate that a state derives from a chain satisfying defined properties without disclosing the chain's contents.
Mutation-operator embodiments include deterministic operators (whose output is fully a function of inputs and is exactly reproducible), probabilistic operators (whose output depends on a sampled element recorded into the lineage so the run remains reproducible), and model-mediated operators (whose output is produced by a credentialed model whose identity, version, and parameters are recorded into the event). Each embodiment preserves the evidentiary property that the recorded chain is sufficient to reconstruct the run.
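For the probabilistic embodiment, recording the sampled element can be as simple as recording the seed of the random generator. The sketch below uses Python's standard `random.Random`; the operator `sample_hypotheses`, the `replay` function, and the event layout are hypothetical.

```python
import random


def sample_hypotheses(hypotheses: list, k: int, seed: int, lineage: list) -> list:
    """Pick k hypotheses to pursue and record everything needed to replay the pick."""
    rng = random.Random(seed)  # private generator, independent of global state
    chosen = rng.sample(hypotheses, k)
    lineage.append({
        "operator": "sample_hypotheses",
        "seed": seed,
        "inputs": list(hypotheses),
        "k": k,
        "output": chosen,
    })
    return chosen


def replay(event: dict) -> list:
    """Re-run the operator from the recorded event alone."""
    rng = random.Random(event["seed"])
    return rng.sample(event["inputs"], event["k"])


lineage = []
hypotheses = ["backdated entry", "duplicate invoice", "currency rounding", "clerical error"]
chosen = sample_hypotheses(hypotheses, k=2, seed=20260327, lineage=lineage)
```

Replaying the recorded event yields the same selection as the original run, provided the same Python version and sampling algorithm are used. A production embodiment would presumably need to record that environment as well.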
Composition With the Wider Discovery Architecture
The persistent-state mechanism composes with the anchor-and-traversal layer of the semantic-discovery architecture. Anchors emit credentialed observations into the discovery object's memory as the object visits them; the memory field is therefore continuous with the substrate's anchor records, and lineage operators on the memory side resolve into the substrate's anchor lineage. A traversal that visits a thousand anchors over a long-running investigation accumulates a memory whose every entry resolves to an underlying anchor record under credentialing chains the substrate already maintains.
Composition with the operator-intent layer is structural. Queries against persistent state are themselves operator-intent objects, credentialed according to the scope's read-tier policy. A query that asserts an adverse conclusion against a subject must pass through the operator-intent due-process credentialing chain before it can take binding effect, and the persistent state's lineage is the evidentiary substrate on which that credentialing depends. A conclusion lacking sufficient supporting lineage is inadmissible regardless of the inferential confidence the cognitive fields display.
Composition with model-mediated inference is governed. A model invoked during traversal does not see a textual prompt reconstructed from history; it sees a typed projection of the discovery object's state. The model's invocation is recorded as a mutation event with the model's credentialing chain, and the model's output is stored as a derived field with a back-reference to the inputs it consumed. This makes model contributions traceable in the same lineage substrate as direct observations and allows downstream consumers to evaluate the model's role in producing a conclusion.
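The difference between a reconstructed prompt and a typed projection can be sketched as follows. The `Projection` type, the `invoke_model` function, the event layout, and the stand-in `stub_model` are hypothetical. A real embodiment would call a credentialed model and record its credentialing chain, which this sketch reduces to an identifier, a version, and a parameter dictionary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Projection:
    """Typed view of the discovery object's state handed to the model."""
    record_addresses: tuple
    context_summary: str
    open_questions: tuple


def invoke_model(model: Callable[[Projection], str], model_id: str, model_version: str,
                 parameters: dict, projection: Projection, lineage: list) -> dict:
    """Run the model on the projection and record the run as a mutation event."""
    output = model(projection)
    event = {
        "kind": "model_invocation",
        "model_id": model_id,
        "model_version": model_version,
        "parameters": dict(parameters),
        "inputs": list(projection.record_addresses),
        "output": output,
    }
    lineage.append(event)
    # The derived field carries a back-reference to the inputs it consumed.
    return {"value": output, "derived_from": event["inputs"]}


def stub_model(p: Projection) -> str:
    """Stand-in for a real model call; it only counts what it was given."""
    return f"{len(p.record_addresses)} records reviewed; {len(p.open_questions)} questions open"


lineage = []
projection = Projection(
    record_addresses=("a1f0", "b2e9"),
    context_summary="ledger amended after close",
    open_questions=("who approved the amendment?",),
)
derived = invoke_model(stub_model, "example-model", "1.0", {"temperature": 0.0},
                       projection, lineage)
```

Because the model receives only the projection, the record addresses listed in the event are exactly the inputs it could have relied on, and a downstream consumer can trace the derived field back to them.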
Composition with the mesh-coordinates layer (where applicable) is achieved through observations whose payload includes positional or topological evidence. Positional observations are first-class memory records; their credentialing chains are part of the lineage; cognitive fields that depend on positional evidence carry the positional records as supporting predecessors and inherit their credentialing constraints.
Prior-Art Landscape
Conventional large-language-model interaction patterns reconstruct context at every step through textual prompts. As context grows, prompts become longer, slower to process, and eventually exceed the model's context window. Information is lost to truncation, summarization is lossy and uncredentialed, and the system's understanding degrades with each step. Conversation-history mechanisms provided by current LLM platforms partially mitigate this through truncation and summarization but do not produce credentialed, queryable lineage and do not bound retention as a structural commitment. They are session conveniences, not evidentiary substrates.
Vector-store and retrieval-augmented-generation (RAG) systems persist embeddings and source documents but do not maintain a structured per-discovery-object state with governed mutation, content-addressed event chains, and queryable lineage. They retrieve from a corpus into a prompt at each step, and the prompt is again the substrate of inference. Knowledge-graph systems persist structured facts and relations but typically expose neither tamper-evident lineage nor credentialed mutation operators; their state is a snapshot of asserted facts rather than an evidentiary record of how those facts came to be asserted.
Workflow and case-management systems persist state across user sessions and produce audit trails, but the state is procedural (steps completed, fields populated) rather than semantic (observations, contexts, hypotheses), and the audit trails are typically narrative rather than queryable as a typed lineage substrate. They satisfy compliance requirements within bounded administrative domains but do not provide a substrate suitable for long-running, inference-heavy discovery operations across heterogeneous evidence.
Cryptographic provenance and append-only-log systems supply the tamper-evidence property but, as with the operator-intent layer, do not by themselves supply the governance, retention, and inference primitives required for persistent semantic state. They are a necessary substrate, integrated into the present mechanism alongside scope-bounded retention, governed mutation, and queryable lineage operators that together produce a complete persistent-state architecture.
Disclosure Scope
The disclosure scope covers persistent semantic state maintained across queries within a governance scope; nested and federated scope structures with credential-bounded inheritance; bounded retention policies including unitary, class-conditioned, event-conditioned, and consumer-conditioned variants, with retirement events that supersede records without erasing the fact of retirement; tamper-evident content-addressed event chains for memory, context, and cognitive fields; queryable lineage operators with shallow, deep, and bounded resolution; governed mutation operators including deterministic, probabilistic, reductive, and model-mediated embodiments; and composition of the foregoing with anchor-and-traversal records, operator-intent credentialing, model-mediated inference, and positional or topological evidence layers. The scope encompasses storage substrates ranging from in-memory event logs to durable distributed-ledger backings, with the architectural commitment fixed on content-addressing, governed mutation, and queryable lineage rather than on a particular storage technology.
Persistent semantic state allows discovery traversals whose length is not limited by a model's context window. A research agent operating within a credentialed scope can traverse thousands of anchors over days or weeks while keeping every observation and inferred state that the scope's retention policy retains, with retention bounded by policy rather than by context-window arithmetic and with every conclusion exposed to evidentiary review through queryable lineage. This enables deep, long-running, multi-party discovery operations whose outputs are admissible to legal, regulatory, and audit consumers, a class of operation that prompt-reconstruction architectures cannot reach.