Traceable Semantic Lineage Graph: Mutation History Embedded in Agent Objects
by Nick Clark | Published March 27, 2026
The Traceable Semantic Lineage Graph (TSLG) is a directed, content-addressed graph of semantic ancestry recording the authorization, derivation, and governance continuity of every state, action, and decision produced by an agent across its operational lifetime and across delegation boundaries to other agents. Disclosed within U.S. Application 19/452,651, the lineage graph is not a sidecar audit log written by an external process but a structural field set carried in every agent object the schema admits. Each successor record is cryptographically chained to its predecessor by hash and signature, so any later observer can verify, without trusting any single node, that the recorded ancestry has not been retroactively rewritten, that authorization references resolve to policies in force at the time of recording, and that the cross-agent edges of the graph are consistent across the populations of agents that share the schema. The result is a system in which provenance is enforced by construction rather than by convention and in which tamper-evidence is a property of the data layout rather than a property of an external auditor. This white paper expands the mechanism, the parameter envelope, the alternative embodiments, the compositional surface with adjacent primitives, and the prior-art distinctions that together delineate the disclosure scope.
Mechanism
The lineage graph is realized as a set of mandatory canonical fields embedded in the schema of every agent object. When an agent originates a state, action, or decision, the resulting record carries a content-addressed identifier computed over its substantive fields and over a structured reference set pointing to the immediate predecessor records from which the new record is derived. The predecessor reference set is not a free-form list of citations but a typed enumeration: the prior-state predecessor, the authorizing-policy predecessor, the input-evidence predecessor or predecessors, and, for cross-agent transitions, the delegating-agent predecessor whose action or decision induced the present record. Each typed predecessor reference is itself a content-addressed identifier, so the graph is closed under hash verification: given any record, the entire ancestry it asserts can be walked and re-verified by any later reader.
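As a minimal sketch of the typed predecessor set and the content-addressed identifier, assume SHA-256 as the hash function and sorted-key JSON as the canonical serialization; neither choice is specified by the disclosure. The names Predecessors and record_id, and the field names, are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Predecessors:
    """Typed predecessor reference set; every value is a content address."""
    prior_state: Optional[str] = None
    authorizing_policy: Optional[str] = None
    input_evidence: Tuple[str, ...] = ()
    delegating_agent: Optional[str] = None  # set only for cross-agent transitions


def record_id(substantive: dict, preds: Predecessors) -> str:
    """Hash the substantive fields together with the typed predecessor set."""
    canonical = json.dumps(
        {
            "substantive": substantive,
            "predecessors": {
                "prior_state": preds.prior_state,
                "authorizing_policy": preds.authorizing_policy,
                "input_evidence": list(preds.input_evidence),
                "delegating_agent": preds.delegating_agent,
            },
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# A policy record, then a state record derived under that policy.
policy_id = record_id({"kind": "policy", "rule": "allow-read"}, Predecessors())
state_id = record_id(
    {"kind": "state", "value": 1},
    Predecessors(authorizing_policy=policy_id),
)
```

Because the predecessor set is part of the hashed material, changing either the substantive content or any predecessor reference yields a different identifier.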
Every record in the graph is signed by the agent that produced it, and the signature covers both the substantive fields and the predecessor reference set. A record whose signature does not verify, or whose substantive content does not hash to the identifier the predecessor edges of later records expect, is rejected by every conforming reader. There is no privileged path that bypasses verification and no administrative override that can suppress a discrepancy; conformance is uniform and structural. Because identifiers are content-addressed, any attempt to retroactively alter a record changes its identifier and breaks every edge that pointed to it, producing a divergent subgraph that fails verification at the next reader downstream. Tamper-evidence is therefore not a separately enforced property but an inevitable consequence of the address scheme combined with the closed-under-hash predecessor convention.
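The tamper-evidence argument can be illustrated with a short sketch. It assumes SHA-256 for content addressing and uses HMAC-SHA256 as a stand-in for the signature scheme, only because the Python standard library has no asymmetric signatures; HMAC is symmetric, so a real deployment would need a public-key scheme for third-party verification. The helper names make_record and verify_record are hypothetical.

```python
import hashlib
import hmac
import json


def canonical(obj) -> bytes:
    """Deterministic serialization (assumed: sorted-key compact JSON)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_record(key: bytes, substantive: dict, predecessors: dict) -> dict:
    body = {"substantive": substantive, "predecessors": predecessors}
    return {
        **body,
        "id": hashlib.sha256(canonical(body)).hexdigest(),
        # Stand-in "signature" covering substantive fields AND predecessor set.
        "sig": hmac.new(key, canonical(body), hashlib.sha256).hexdigest(),
    }


def verify_record(key: bytes, record: dict, expected_id: str) -> bool:
    """Accept only if the content hashes to the id a later edge expects
    and the signature over the same bytes verifies."""
    body = {
        "substantive": record["substantive"],
        "predecessors": record["predecessors"],
    }
    digest = hashlib.sha256(canonical(body)).hexdigest()
    sig = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return digest == expected_id and hmac.compare_digest(sig, record["sig"])


key = b"agent-a-demo-key"  # illustrative key material only
parent = make_record(key, {"value": 1}, {"prior_state": None})
child = make_record(key, {"value": 2}, {"prior_state": parent["id"]})

# Retroactively altering the parent changes its hash, so the child's
# prior-state edge no longer resolves to it.
tampered = {**parent, "substantive": {"value": 99}}
```

Here verify_record(key, parent, child["predecessors"]["prior_state"]) succeeds, while the same call on tampered fails on both the hash and the signature check.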
Cross-agent lineage is treated as a first-class structural concept rather than an afterthought. When agent A delegates a task to agent B, the record agent B produces upon executing the delegated task carries a delegating-agent predecessor reference pointing to the specific authorizing record in agent A's lineage that conferred the authority. Agent A's record, in turn, may carry input-evidence predecessor references pointing to records in agent B's lineage that supplied the inputs upon which agent A's authorizing decision rested. The two lineages thereby form a single connected directed acyclic graph spanning agent boundaries, and the policy continuity of any cross-agent transition can be verified by walking the graph through whichever boundary it crosses. No central registry is required to mediate this composition because each edge is content-addressed and each record is signed by the agent that produced it; the graph is self-describing.
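The delegation pattern described above can be sketched as a walk over a shared content-addressed store. This is an illustrative sketch, assuming SHA-256 and sorted-key JSON; the store layout, the put and walk helpers, and the payload fields are hypothetical, and signatures are omitted for brevity.

```python
import hashlib
import json


def put(store: dict, agent: str, payload: dict, preds: dict) -> str:
    """Add a record to the store under its content address."""
    body = {"agent": agent, "payload": payload, "preds": preds}
    rid = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    store[rid] = body
    return rid


def walk(store: dict, rid: str) -> list:
    """Return every record id reachable from rid, re-hashing each on the way."""
    seen, stack = [], [rid]
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        body = store[cur]
        actual = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if actual != cur:
            raise ValueError(f"record {cur[:8]} fails hash verification")
        seen.append(cur)
        for ref in body["preds"].values():
            refs = ref if isinstance(ref, list) else [ref]
            stack.extend(r for r in refs if r is not None)
    return seen


store = {}
# Agent B supplies an observation; agent A's authorizing decision rests on it.
b_obs = put(store, "B", {"kind": "state", "reading": 7}, {})
a_auth = put(store, "A", {"kind": "decision", "grant": "task-1"},
             {"input_evidence": [b_obs]})
# Agent B executes the delegated task under A's authorizing record.
b_act = put(store, "B", {"kind": "action", "task": "task-1"},
            {"prior_state": b_obs, "delegating_agent": a_auth})

ancestry = walk(store, b_act)
agents = {store[r]["agent"] for r in ancestry}
```

The walk starting from agent B's action reaches agent A's authorizing record and then returns into agent B's lineage, without consulting any registry beyond the records themselves.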
Decisions are represented in the graph distinctly from states and actions, with a typed decision record that carries, in addition to the standard predecessor references, an explicit reference to the policy or policies under which the decision was reached and an explicit reference to the evidence considered. This typing distinction matters because decisions are the points at which agent autonomy is exercised, and recording them as first-class graph nodes with policy and evidence predecessors allows later auditors to reconstruct not only what an agent did but on what basis it was authorized to do it. The graph thus carries not merely a record of behavior but a record of governance, with every authorized act traceable to the policy that authorized it and every policy traceable to the predecessor record that put it in force.
The graph is append-only by structural design. The schema defines no operation that revises or deletes a prior record; every correction is itself a new record that references the corrected predecessor through a typed correction edge. This convention preserves the historical record while still permitting forward progress, and it ensures that any discrepancy between an asserted current state and a prior record is itself an auditable graph node rather than a silent overwrite. Pruning of older subgraphs is permitted only at the storage layer below the schema, and only under operator policies that are themselves recorded in the graph, so the structural commitment is preserved.
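One way to picture the append-only convention is a store whose interface exposes only append and read operations, with corrections expressed as new records. The class below is a hypothetical sketch, assuming SHA-256 and sorted-key JSON; the "corrects" field name stands in for the typed correction edge.

```python
import hashlib
import json


class AppendOnlyLineage:
    """Exposes append and read only; there is no update or delete method."""

    def __init__(self):
        self._records = {}

    def append(self, payload: dict, preds: dict) -> str:
        # Every predecessor reference must resolve at admission.
        for ref in preds.values():
            if ref not in self._records:
                raise KeyError(f"unresolved predecessor {ref}")
        body = {"payload": payload, "preds": preds}
        rid = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self._records.setdefault(rid, body)  # re-appending identical content is a no-op
        return rid

    def correct(self, target: str, payload: dict) -> str:
        """A correction is a new record carrying a typed edge to its target."""
        return self.append(payload, {"corrects": target})

    def get(self, rid: str) -> dict:
        return self._records[rid]

    def corrections_of(self, target: str) -> list:
        return [rid for rid, body in self._records.items()
                if body["preds"].get("corrects") == target]


g = AppendOnlyLineage()
wrong = g.append({"balance": 100}, {})
fixed = g.correct(wrong, {"balance": 10})
```

After the correction, the erroneous record is still retrievable under its original identifier, and the correction is discoverable by following the edge that targets it.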
The typed enumeration of predecessor classes deserves elaboration because it is the structural feature that distinguishes the lineage graph from a generic content-addressed chain. A prior-state predecessor identifies the immediately preceding state record from which the present record's substantive state is derived; an authorizing-policy predecessor identifies the policy record under whose authority the present record's mutation is asserted to be permissible; an input-evidence predecessor identifies a record whose substantive content was consumed as evidence in the production of the present record; and a delegating-agent predecessor identifies a record produced by a different agent under whose delegation the present record was produced. Each class is a distinct field in the schema and each is independently addressable; a record may carry zero or more predecessors of each non-mandatory class, but the mandatory classes for the record's type must be present and resolvable at admission. The typing is what permits later auditors to ask focused questions of the graph (what authorized this transition, what evidence supported this decision) without traversing the union of all predecessor classes for every query.
Cross-agent edges are not merely permitted but structurally favored over inlined replication of cross-agent context. When agent A's record references agent B's record as input-evidence, the reference is a content-addressed edge into agent B's lineage rather than a copy of agent B's substantive content; the verifier walks the edge and verifies the referenced record in agent B's lineage on demand. This convention prevents lineage records from carrying redundant copies of cross-agent state that could drift from their referenced originals, and it produces a single connected graph across agent populations rather than a forest of agent-specific lineages each carrying private copies of shared context. The single-graph topology is what permits cross-agent governance questions (such as whether a delegated action was authorized under a policy version still in force in the delegating agent's lineage) to be answered by graph walks rather than by reconciliation across separately maintained logs.
Decisions, as a typed record class, carry a structural distinction from states and actions that supports governance auditing. A decision record is required to carry both an authorizing-policy predecessor and at least one input-evidence predecessor, and the schema enforces this requirement at admission; a putative decision record lacking either predecessor class is refused. The requirement reflects the substantive observation that a decision unsupported by evidence or unauthorized by policy is not a decision the system is willing to admit as such, and it makes the structural form of the record an enforceable proxy for the governance discipline the deployment intends to maintain. Auditors querying the graph for decisions in a window can be assured that every record returned carries the evidence-and-policy structure the schema requires; any record found in the graph that lacks the structure cannot have been admitted as a decision and must be classified as a state or action under the schema's typing.
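The admission rule for decision records might be sketched as follows. The record layout, the field names, and the admission_error helper are assumptions made for illustration; the disclosure does not prescribe a concrete encoding.

```python
def admission_error(record: dict, known_ids: set):
    """Return None when the record is admissible, else a short refusal reason."""
    preds = record.get("preds", {})

    # Decisions must carry both a policy predecessor and at least one
    # evidence predecessor.
    if record.get("type") == "decision":
        if not preds.get("authorizing_policy"):
            return "decision lacks authorizing-policy predecessor"
        if not preds.get("input_evidence"):
            return "decision lacks input-evidence predecessor"

    # Every reference that is present must resolve to an admitted record.
    refs = []
    for value in preds.values():
        refs.extend(value if isinstance(value, list) else [value])
    unresolved = [r for r in refs if r is not None and r not in known_ids]
    if unresolved:
        return "unresolved predecessor: " + unresolved[0]
    return None


known = {"pol-1", "ev-1"}  # placeholder identifiers, not real content addresses
ok = {"type": "decision",
      "preds": {"authorizing_policy": "pol-1", "input_evidence": ["ev-1"]}}
no_evidence = {"type": "decision",
               "preds": {"authorizing_policy": "pol-1", "input_evidence": []}}
```

With these inputs, admission_error(ok, known) returns None, while no_evidence is refused with a reason naming the missing predecessor class.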
Operating Parameters
The schema specifies a set of operating parameters governing the form, depth, and verification cost of the lineage graph in any given deployment. The hash function used for content addressing, denoted H, is a deployment-time parameter chosen from a published whitelist of cryptographically suitable functions; the choice is itself recorded in a deployment-manifest record that anchors the agent population so that all participants verify against the same address algebra. The signature scheme, denoted S, is similarly a deployment-time parameter selected from a published whitelist, and the per-agent verification keys are themselves recorded in the lineage graph as registration records whose validity is, recursively, verifiable against the deployment manifest.
The maximum predecessor fan-in per record, denoted Phi_max, bounds how many input-evidence and policy predecessors a single record may reference; typical deployments set Phi_max between sixteen and two hundred fifty-six, balancing expressiveness against verification cost. The minimum predecessor count, denoted Phi_min, is structurally fixed at one for non-genesis records, with the policy predecessor and prior-state predecessor each separately required for records of their respective types. Genesis records, which initialize an agent's lineage, are the only class permitted to carry no substantive predecessors, and their admissibility is governed by a separate genesis-attestation parameter that ties the genesis record to the deployment manifest.
The verification depth parameter, denoted D_verify, specifies the maximum number of predecessor edges a conforming reader must traverse before declaring a record acceptably verified. Deployments may set it to infinity for full verification or to a bounded value for resource-constrained readers, with the proviso that bounded verification is recorded as such in the reader's downstream records. The cross-agent edge admissibility predicate, denoted X_admit, specifies which classes of cross-agent predecessor references a record may include. In tightly coupled deployments X_admit is permissive; in federated multi-organization deployments it may be restricted to records produced by agents in a designated peer set whose membership is itself recorded in the graph.
The correction-edge policy parameter governs which classes of records may be the target of correction edges and which agents are authorized to issue corrections; this parameter is itself a record in the graph and is therefore subject to the same chained verification as the substantive records it governs. The garbage-collection policy parameter, finally, specifies the conditions under which storage-layer pruning of older subgraphs is admissible; pruning never erases the chain but may release the storage of records older than a configured horizon provided that a digest commitment to the pruned subgraph is preserved so that later challengers can prove or disprove specific historical claims against it.
The deployment-manifest record is a distinguished anchor for the parameter envelope of any agent population. It carries the chosen hash function H, the chosen signature scheme S, the fan-in bounds Phi_min and Phi_max, the verification depth D_verify, the cross-agent admissibility predicate X_admit, the correction-edge policy, and the garbage-collection horizon, and it is signed jointly by the anchoring identities of the population (which may be a single organization or a federation of organizations under a configured threshold scheme). Subsequent records in the population carry references back to the manifest under which they were produced; a record whose manifest reference does not resolve, or whose schema fields fall outside the manifest's declared envelope, is refused at admission. The manifest is therefore the formal commitment that ties the lineage graph's substantive content to the parameter discipline under which that content was produced, and amendments to the manifest are themselves manifest-amendment records subject to the same chained verification as any other record class.
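A manifest-anchored admission check could look roughly like the sketch below. It covers only the hash function and the fan-in bounds; joint signing, X_admit, and the correction and garbage-collection policies are omitted. The whitelist contents, the Manifest fields, and the refusal_reason helper are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

# Illustrative whitelist; the disclosure does not enumerate specific functions.
HASH_WHITELIST = {"sha256", "sha3_256", "blake2b"}


@dataclass(frozen=True)
class Manifest:
    hash_name: str
    phi_min: int
    phi_max: int
    d_verify: int

    def manifest_id(self) -> str:
        """Content address of the manifest under its own declared hash."""
        blob = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.new(self.hash_name, blob).hexdigest()


def refusal_reason(record: dict, manifests: dict):
    """Return None if the record fits its manifest's envelope, else a reason."""
    manifest = manifests.get(record.get("manifest"))
    if manifest is None:
        return "manifest reference does not resolve"
    if manifest.hash_name not in HASH_WHITELIST:
        return "hash function not whitelisted"
    fan_in = len(record.get("preds", []))
    if record.get("genesis"):
        return None if fan_in == 0 else "genesis record carries predecessors"
    if not manifest.phi_min <= fan_in <= manifest.phi_max:
        return f"fan-in {fan_in} outside [{manifest.phi_min}, {manifest.phi_max}]"
    return None


m = Manifest(hash_name="sha256", phi_min=1, phi_max=16, d_verify=50)
manifests = {m.manifest_id(): m}
good = {"manifest": m.manifest_id(), "preds": ["p1", "p2"]}
too_wide = {"manifest": m.manifest_id(), "preds": [f"p{i}" for i in range(17)]}
```

In this sketch good is admitted, too_wide is refused for exceeding Phi_max, and any record citing an unknown manifest identifier is refused before its fields are examined.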
The verification-bound parameter D_verify deserves additional treatment because resource-constrained readers are a structural reality of distributed agent populations. A reader that can afford only bounded verification may, for instance, set D_verify to fifty edges and verify the trailing fifty edges of any cited ancestry while accepting earlier edges on the basis of a digest commitment carried by the deepest verified record. The bounded reader's downstream records carry an explicit verification-depth disclosure so that downstream consumers know which depth was applied; a downstream consumer requiring deeper verification may walk the same ancestry to greater depth and produce a fresh record with the deeper bound applied. This compositional verification economy permits the graph to support a population of readers whose resource envelopes vary by orders of magnitude without compromising the structural integrity of the graph as a whole.
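Bounded verification can be sketched on a simple linear chain of prior-state edges. The sketch assumes SHA-256 and sorted-key JSON, and it counts re-hashed records rather than edges for simplicity; build_chain and verify_bounded are hypothetical helpers.

```python
import hashlib
import json


def rid_of(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def build_chain(n: int) -> tuple:
    """Build n records linked by prior_state; return (store, head_id)."""
    store, prev = {}, None
    for i in range(n):
        body = {"seq": i, "prior_state": prev}
        prev = rid_of(body)
        store[prev] = body
    return store, prev


def verify_bounded(store: dict, head: str, d_verify: int) -> dict:
    """Re-hash at most d_verify records from head and disclose the depth applied."""
    cur, checked = head, 0
    while cur is not None and checked < d_verify:
        body = store[cur]
        if rid_of(body) != cur:
            raise ValueError("hash mismatch")
        cur = body["prior_state"]
        checked += 1
    return {
        "verified_depth": checked,
        "complete": cur is None,
        # Address the deepest verified record commits to; taken on trust here.
        "unverified_from": cur,
    }


store, head = build_chain(120)
shallow = verify_bounded(store, head, 50)
full = verify_bounded(store, head, 10**9)
```

The shallow reader stops after fifty records and reports the first address it did not re-verify; a deeper reader can resume from that address or repeat the walk with a larger bound.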
Alternative Embodiments
Several alternative embodiments of the lineage graph mechanism are within the disclosed scope. In a first embodiment, the predecessor reference set is flattened into a single untyped predecessor list, with the type discriminations recovered by inspection of the referenced records rather than by the citing edge; this embodiment reduces schema complexity at the cost of more expensive ancestry queries. In a second embodiment, the typed predecessor structure is retained but extended with additional types for evidence-of-absence, retraction, and counter-evidence, supporting agent populations that must record not only what was decided but what was considered and rejected.
In a third embodiment, the content-addressing scheme uses a Merkle structure over the substantive fields rather than a flat hash, allowing later readers to verify selected fields without rehashing the entire record; this embodiment is suited to deployments in which records carry large evidence payloads and partial verification is desired. In a fourth embodiment, the signature scheme is replaced by a threshold signature among a designated quorum of co-signing agents, supporting populations in which decisions are jointly authorized rather than singly authorized; the lineage graph treats the threshold signature as a single edge into the joint-authorization record.
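The Merkle variant of the third embodiment might be sketched as below: a reader checks one field against the record's root using an inclusion proof, without rehashing the other fields. The sketch assumes SHA-256, a name=value leaf encoding, and duplication of the last node on odd levels; those are simplifications that a production design would need to harden, and all helper names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(name: str, value: str) -> bytes:
    # 0x00 prefix separates leaf hashes from interior hashes (0x01).
    return h(b"\x00" + name.encode() + b"=" + value.encode())


def merkle_root_and_proof(leaves: list, index: int) -> tuple:
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left)."""
    proof, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_field(root: bytes, leaf_hash: bytes, proof: list) -> bool:
    node = leaf_hash
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"\x01" + pair)
    return node == root


fields = [("agent", "A"), ("kind", "decision"), ("evidence", "large-payload")]
leaves = [leaf(n, v) for n, v in sorted(fields)]
# After sorting by name the order is agent, evidence, kind; index 2 is "kind".
root, proof = merkle_root_and_proof(leaves, 2)
```

A reader holding only root and proof can confirm the "kind" field by calling verify_field(root, leaf("kind", "decision"), proof), and a different claimed value fails the same check.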
In a fifth embodiment, cross-agent edges carry an additional zero-knowledge proof of policy compatibility, allowing later verifiers to confirm that the delegating agent's authorizing policy and the delegated agent's executing policy were structurally compatible without disclosing the policies' contents; this embodiment is suited to deployments in which policy contents are commercially or contractually sensitive. In a sixth embodiment, the graph is sharded across multiple storage substrates with cross-shard edges carrying inclusion proofs that bind the sharded subgraphs into a single verifiable graph, supporting populations whose lineage is too large to materialize in any single substrate.
In a seventh embodiment, the correction-edge convention is supplemented by a typed amendment record that supersedes its target without removing it, preserving the historical record while clarifying the operative current state; this embodiment is suited to deployments that must distinguish between a correction of fact and a substantive supersession. In an eighth embodiment, the genesis record is replaced by a set of genesis records jointly attesting an agent's initialization, allowing federated populations to bootstrap agents whose authority derives jointly from multiple anchoring organizations.
In a ninth embodiment, the lineage graph is extended with a typed observation record class that captures inputs received but not consumed as evidence of any decision, allowing later auditors to reconstruct not only what informed a decision but what was visible to the agent at the time the decision was made; this embodiment is suited to deployments in which the absence of evidence consumption is itself a governance-relevant fact. In a tenth embodiment, the typed predecessor enumeration is extended with a counterfactual-reference class allowing a record to reference an alternative ancestry that was considered but not adopted, supporting populations whose decision discipline includes formal recording of the rejected alternatives.
In an eleventh embodiment, the content-addressing scheme employs a hybrid construction in which the substantive fields are hashed under one cryptographic primitive and the predecessor reference set is hashed under a second, so that compromise of either primitive does not render the entire graph unverifiable; this embodiment supports populations whose lineage must survive cryptographic generation transitions. In a twelfth embodiment, a per-record post-quantum signature is committed alongside the standard signature, so that records remain verifiable under future cryptographic regimes; the post-quantum commitment is itself a parameter recorded in the deployment manifest and may be activated for verification by readers configured to require it.
Composition
The lineage graph composes with the other primitives of the agent schema and with adjacent layers of the architecture to produce system-level guarantees no primitive supplies in isolation. The canonical-fields primitive of the schema specifies which fields any agent object must carry; the lineage graph adds the constraint that those canonical fields include typed predecessor references and a content-addressed identifier, so the graph is realized within the canonical-fields convention rather than alongside it. The structural-validation primitive of the schema verifies that an object's fields are well-formed and policy-compliant; the lineage graph extends structural validation to verify that the object's predecessor references resolve to records that were themselves structurally valid, lifting validation from a single-record property to a graph-wide property.
Below the schema, the memory-native protocol layer disseminates agent objects across nodes; the lineage graph supplies the predecessor references that make policy-relevant subgraph routing meaningful. Above the schema, the agent execution platform consumes lineage-bearing objects as authoritative state; because every state transition observed by the platform is bound by edge to the policy that authorized it, the platform inherits a structural guarantee that no transition lacking a verifiable authorization edge can be admitted as state. This composition closes the governance loop without recourse to a separate enforcement layer.
The lineage graph also composes with the consensus primitives of the protocol layer: when a consensus record is committed for a proposed mutation, the consensus record itself becomes a predecessor of the resulting state record, so the agent's lineage carries not only what state it occupies but the consensus event that put it there. The cross-agent edges of the graph compose with the interoperability primitives of the execution platform: when agents from different vendors interoperate under the shared schema, their lineages share an address algebra and a verification convention, and cross-agent edges can be drawn between them with no special bridging code.
The composition with cross-vendor interoperability is particularly consequential because it eliminates the ordinary friction of cross-organization audit. When two agents from different vendors interoperate under the shared schema, the lineage edges crossing the vendor boundary carry the same content-addressed structure as edges internal to either vendor's population. An auditor for either organization can therefore walk a delegated action's ancestry across the boundary and verify the cross-vendor claim against the same address algebra that governs intra-vendor records, without requiring either vendor to expose internal logging infrastructure or operational telemetry. The audit surface is the lineage graph itself, in its content-addressed form, and it presents the same verification discipline regardless of which side of any vendor boundary the auditor is positioned.
Prior-Art Distinctions
Conventional agent frameworks record provenance, when they record it at all, in external audit logs maintained by infrastructure outside the agent schema. These logs are populated by middleware or application-layer code and are not part of the agent object itself; their integrity depends on the integrity of the logging infrastructure and on the diligence of the application code that emits them. The lineage graph differs in that the provenance records are mandatory canonical fields of the agent object, not external annotations, and their integrity is enforced by the same verification machinery that validates any other field of the schema.
Workflow provenance systems such as those associated with scientific computing and data lineage tools record directed acyclic graphs of computation steps but do not bind those graphs to agent identity, policy authorization, or cryptographic continuity in the form the present disclosure requires. Distributed ledger technologies record content-addressed chains of transactions but do not carry the typed predecessor structure that distinguishes prior-state, authorizing-policy, input-evidence, and delegating-agent predecessors, nor do they natively express cross-agent delegation as a graph edge. Audit-log standards such as those developed for healthcare and financial compliance record decisions but not the policy-versus-evidence distinction structurally, and they rely on out-of-band integrity protection rather than content-addressed self-verification.
Reputation and trust systems compute aggregate scores over historical behavior but do not preserve the underlying record graph in a form a later party can independently re-walk to recompute the score; they collapse history into a summary statistic. The lineage graph preserves the unsummarized record. The combination of mandatory canonical-field embedding, typed predecessor structure, content-addressed self-verification, signed records, the append-only correction-by-edge convention, and first-class cross-agent edges, all integrated into the agent schema as a single architectural primitive, is not anticipated by any single prior-art system known to the inventor.
Disclosure Scope
The disclosure of U.S. Application 19/452,651 covers the traceable semantic lineage graph mechanism as described, including the typed predecessor reference structure, the content-addressed identification scheme, the signature commitment per record, the append-only correction-by-edge convention, the cross-agent first-class edge convention, the genesis record convention with its associated deployment-manifest anchor, and the alternative embodiments enumerated above. The scope encompasses implementations in software, in hardware accelerators that perform predecessor verification in dedicated circuitry, and in mixed deployments that combine the two. The scope further encompasses use of the lineage graph as a provenance primitive for autonomous and semi-autonomous agents, as an audit primitive for governance-bearing decisions, and as an interoperability primitive that supplies a shared address algebra and verification convention across heterogeneous agent populations.
Implementations differing in the specific hash function, the specific signature scheme, the specific bounds on predecessor fan-in, or the specific verification depth applied remain within the disclosure scope provided that the typed predecessor structure, the content-addressed identification, the signed records, the append-only correction convention, and the cross-agent first-class edges are present together as architectural primitives integrated into the agent schema. Licensing inquiries concerning the disclosed mechanism are directed to the assignee of record.