Intergenerational Coherence Burden in Agent Lineages
by Nick Clark | Published March 27, 2026
Agent lineages accumulate cognitive disruption across generations. When a parent agent's training distribution, preference set, or apprenticeship trace is inherited by a descendant, any disruption present in the parent — biases, brittle heuristics, latent contradictions, memorized artifacts, governance scars — propagates into the descendant's cognitive substrate. Without lineage-aware accounting, this transmission compounds silently: a third- or fourth-generation agent may carry cumulative burden whose origin is several ancestors removed and whose contributing contexts no longer apply. Intergenerational coherence burden quantifies this accumulated load and exposes it to governance, enabling assessment, attribution, and intervention before descendant agents become coherent yet incapable.
Mechanism
The mechanism instruments three principal channels of intergenerational transmission and binds each to a lineage record that descends with the agent. The first channel is training-data inheritance: when a descendant model is trained on outputs, traces, or distillations produced by a parent model, the descendant inherits the parent's representational commitments, including any disruption encoded in those commitments. The second channel is preference inheritance: when a descendant inherits reward models, ranking signals, constitutional rules, or safety classifiers from a parent, the descendant inherits the parent's value gradients and any contradictions or scars baked into them. The third channel is apprenticeship inheritance: when a descendant is shaped by extended interaction with a parent agent acting as tutor, supervisor, or critic, the descendant inherits behavioural priors that may not be visible in any explicit dataset.
Each channel emits a transmission record at the time of inheritance. The record captures the parent identifier, the parent's burden vector at the moment of transmission, the inheritance modality, the proportion of parent influence in the descendant's training mixture, and a content-addressed reference to the inherited artifact. Records are appended to a lineage ledger that descends with the model. When a descendant is itself used as a parent, its ledger is carried forward into the new descendant and extended with the new transmission record. Taken together, the ledgers describe a directed acyclic graph rooted at progenitor models and extending through every derivative.
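A minimal sketch of a transmission record and a ledger that descends with the model may make the structure concrete. The class names, field names, and the `inherit_from` method are illustrative assumptions, not part of the disclosure; the content-addressed reference is assumed here to be a SHA-256 digest of the inherited artifact.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


def content_address(artifact: bytes) -> str:
    """Content-addressed reference: SHA-256 digest of the inherited artifact (assumed scheme)."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()


@dataclass
class TransmissionRecord:
    parent_id: str
    child_id: str
    parent_burden: Dict[str, float]  # parent's burden vector at transmission time
    modality: str                    # "training_data", "preference", or "apprenticeship"
    proportion: float                # share of parent influence in the training mixture
    artifact_ref: str                # content-addressed reference to the inherited artifact


@dataclass
class LineageLedger:
    model_id: str
    records: List[TransmissionRecord] = field(default_factory=list)

    def inherit_from(self, parent: "LineageLedger", parent_burden: Dict[str, float],
                     modality: str, proportion: float, artifact: bytes) -> TransmissionRecord:
        record = TransmissionRecord(parent.model_id, self.model_id, dict(parent_burden),
                                    modality, proportion, content_address(artifact))
        # Carry the parent's history forward, skipping records already present
        # (relevant when several parents share an ancestor), then add the new edge.
        for inherited in parent.records:
            if inherited not in self.records:
                self.records.append(inherited)
        self.records.append(record)
        return record


gen0 = LineageLedger("gen0")
gen1 = LineageLedger("gen1")
gen1.inherit_from(gen0, {"memorization_residue": 0.2}, "training_data", 0.6, b"distilled outputs")
gen2 = LineageLedger("gen2")
gen2.inherit_from(gen1, {"memorization_residue": 0.1}, "apprenticeship", 0.3, b"tutoring trace")
print([(r.parent_id, r.child_id) for r in gen2.records])
```

Each record names both endpoints of an edge, so the ancestor graph can be reconstructed from the ledger of any single descendant.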
Cumulative-burden assessment walks this graph. For a given descendant, the system aggregates burden contributions across all reachable ancestors, weighted by the inheritance proportion at each edge and decayed by the number of generations and the magnitude of intervening retraining. The result is a burden vector decomposed into interpretable components — memorization residue, preference contradiction density, apprenticeship-induced behavioural drift, and provenance opacity — each of which can be compared against a governance threshold. When any component exceeds its threshold, the descendant is flagged for lineage review before further deployment or further use as a parent.
The accounting is conservative: in the absence of evidence that a particular disruption has been resolved by intervening training, the system assumes the disruption persists. This pessimistic default prevents lineage laundering, in which a problematic ancestor is hidden behind a chain of nominally clean intermediate generations.
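The graph walk and its conservative default can be sketched as a memoised recursion. The recurrence used here is one plausible reading and is an assumption: a node's cumulative burden is its own measured burden plus each parent's cumulative burden scaled by inheritance proportion, generational decay, and one minus any evidenced retraining credit. The function name, the `intrinsic` map, and the edge tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

# An edge is (parent_id, inheritance_proportion, retraining_credit).
Edge = Tuple[str, float, float]


def cumulative_burden(model: str,
                      parents: Dict[str, List[Edge]],
                      intrinsic: Dict[str, Dict[str, float]],
                      decay: float = 0.8) -> Dict[str, float]:
    """Aggregate burden over all reachable ancestors of `model`."""
    memo: Dict[str, Dict[str, float]] = {}

    def walk(node: str) -> Dict[str, float]:
        if node in memo:
            return memo[node]
        total = dict(intrinsic.get(node, {}))
        for parent, proportion, retraining_credit in parents.get(node, []):
            # Conservative default: a credit of 0.0 means no evidence of
            # remediation, so the inherited disruption is assumed to persist.
            factor = proportion * decay * (1.0 - retraining_credit)
            for component, value in walk(parent).items():
                total[component] = total.get(component, 0.0) + factor * value
        memo[node] = total
        return total

    return walk(model)


parents = {
    "gen1": [("gen0", 0.5, 0.0)],   # no remediation evidence on this edge
    "gen2": [("gen1", 1.0, 0.5)],   # targeted retraining earned a 50% credit
}
intrinsic = {"gen0": {"memorization_residue": 1.0}}
result = cumulative_burden("gen2", parents, intrinsic, decay=0.8)
print(result)
```

Because credit must be supplied explicitly per edge, a chain of nominally clean intermediates does not erase an ancestor's contribution; it only attenuates it through proportion and decay.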
Operating Parameters
Operating parameters govern how transmission is recorded and how cumulative burden is computed. The inheritance-proportion parameter expresses the fraction of the descendant's effective training signal attributable to a given parent; for distillation it approximates the share of loss derived from parent outputs, and for apprenticeship it approximates the share of trajectories generated under parent supervision. The generational-decay parameter controls how strongly burden contributions from distant ancestors are attenuated; setting decay to unity disables attenuation and treats all ancestors equally, while smaller values emphasise recent ancestors. The retraining-credit parameter specifies how much burden a descendant can shed by undergoing targeted retraining against a curated corpus, expressed as a multiplicative reduction on inherited components.
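As an illustration of how these parameters interact, the sketch below validates their ranges and computes the weight given to an ancestor reached through a path of inheritance edges. The class name, the default values, and the choice to apply retraining credit once at the descendant are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BurdenParameters:
    generational_decay: float = 0.8  # 1.0 disables attenuation
    retraining_credit: float = 0.0   # multiplicative reduction on inherited burden

    def __post_init__(self) -> None:
        if not 0.0 < self.generational_decay <= 1.0:
            raise ValueError("generational_decay must be in (0, 1]")
        if not 0.0 <= self.retraining_credit <= 1.0:
            raise ValueError("retraining_credit must be in [0, 1]")

    def ancestor_weight(self, proportions: Sequence[float]) -> float:
        """Weight of an ancestor reached via edges with the given inheritance proportions."""
        weight = 1.0
        for proportion in proportions:
            if not 0.0 <= proportion <= 1.0:
                raise ValueError("inheritance proportion must be in [0, 1]")
            weight *= proportion * self.generational_decay
        return weight * (1.0 - self.retraining_credit)


no_decay = BurdenParameters(generational_decay=1.0)
half_decay = BurdenParameters(generational_decay=0.5)
credited = BurdenParameters(generational_decay=1.0, retraining_credit=0.5)
# A grandparent reached through two edges, each with proportion 0.5.
print(no_decay.ancestor_weight([0.5, 0.5]),
      half_decay.ancestor_weight([0.5, 0.5]),
      credited.ancestor_weight([0.5, 0.5]))
```

With decay at unity the grandparent's weight depends only on the inheritance proportions; lowering the decay shrinks it further with each generation.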
Threshold parameters are specified per burden component and per deployment class. A model destined for a high-stakes domain operates under tighter thresholds on memorization residue and preference contradiction than a model destined for exploratory research. Threshold breach triggers one of three responses configured by policy: block, in which the descendant cannot be deployed or used as a parent until remediation is performed; quarantine, in which the descendant can be deployed only in environments instrumented for additional monitoring; or annotate, in which the descendant is deployed but its outputs carry burden disclosures that downstream consumers can use for their own governance.
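Threshold enforcement of this kind might be sketched as a lookup keyed by deployment class. The threshold values, the class names `high_stakes` and `exploratory_research`, and the mapping from class to response are invented for illustration; a real policy would supply its own.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Response(Enum):
    BLOCK = "block"            # no deployment or use as a parent until remediated
    QUARANTINE = "quarantine"  # deploy only in additionally monitored environments
    ANNOTATE = "annotate"      # deploy, with burden disclosures attached to outputs


# Hypothetical per-component thresholds for two deployment classes.
THRESHOLDS: Dict[str, Dict[str, float]] = {
    "high_stakes": {"memorization_residue": 0.2, "preference_contradiction": 0.2},
    "exploratory_research": {"memorization_residue": 0.5, "preference_contradiction": 0.5},
}
# Hypothetical policy: which response a breach triggers in each class.
POLICY: Dict[str, Response] = {
    "high_stakes": Response.BLOCK,
    "exploratory_research": Response.ANNOTATE,
}


def evaluate(burden: Dict[str, float], deployment_class: str) -> Tuple[Optional[Response], List[str]]:
    """Return the configured response and the breached components, or (None, [])."""
    limits = THRESHOLDS[deployment_class]
    breaches = sorted(name for name, limit in limits.items() if burden.get(name, 0.0) > limit)
    return (POLICY[deployment_class] if breaches else None, breaches)


burden = {"memorization_residue": 0.3, "preference_contradiction": 0.1}
print(evaluate(burden, "high_stakes"))
print(evaluate(burden, "exploratory_research"))
```

The same burden vector blocks deployment in the tighter class and passes in the looser one, which is the per-class behaviour the thresholds are meant to produce.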
Ledger-retention parameters control how long transmission records are kept and at what fidelity. Recent generations are retained at full fidelity; older generations may be compressed into summary statistics provided that the compression preserves enough structure to support attribution. A non-repudiation parameter requires that ledger entries be signed by the training authority responsible for the inheritance event, enabling later audits to establish responsibility for transmitted disruption.
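One way to picture fidelity-tiered retention is a function that keeps recent records intact and folds older ones into a summary. The record layout, the `generations_back` field, and the choice of summary statistics (ancestor identifiers plus proportion-weighted burden totals) are assumptions; the disclosure does not fix a particular compression.

```python
from typing import Dict, List, Tuple


def compress_ledger(records: List[dict], keep_generations: int = 2) -> Tuple[List[dict], dict]:
    """Keep recent records at full fidelity; summarise the rest."""
    recent = [r for r in records if r["generations_back"] <= keep_generations]
    older = [r for r in records if r["generations_back"] > keep_generations]
    totals: Dict[str, float] = {}
    for record in older:
        for component, value in record["burden"].items():
            totals[component] = totals.get(component, 0.0) + value * record["proportion"]
    summary = {
        # Ancestor identifiers are retained so burden can still be attributed.
        "ancestor_ids": sorted({r["parent_id"] for r in older}),
        "record_count": len(older),
        "burden_totals": totals,
    }
    return recent, summary


records = [
    {"parent_id": "gen3", "generations_back": 1, "proportion": 1.0, "burden": {"memorization_residue": 0.1}},
    {"parent_id": "gen1", "generations_back": 3, "proportion": 0.5, "burden": {"memorization_residue": 0.5}},
    {"parent_id": "gen0", "generations_back": 4, "proportion": 0.5, "burden": {"provenance_opacity": 1.0}},
]
recent, summary = compress_ledger(records, keep_generations=2)
print(len(recent), summary)
```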
Alternative Embodiments
One embodiment treats lineage as a strict tree, in which each descendant has exactly one parent. This embodiment is simple and matches the common case of straightforward distillation, but it cannot represent ensemble descent, in which a descendant is trained from multiple parents simultaneously. A second embodiment treats lineage as a directed acyclic graph with weighted edges, accommodating ensemble descent and partial inheritance. A third embodiment extends the graph with synthetic-ancestor nodes representing curated corpora and human-feedback datasets, allowing burden contributions from non-model sources to be tracked alongside model-to-model transmission.
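The three lineage representations can be checked with a small validator, sketched below using the standard-library `graphlib` module (Python 3.9 or later). The function name, the `corpus:` prefix for synthetic-ancestor nodes, and the rule that a descendant's edge weights sum to at most one are assumptions for the example.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List

# Maps each descendant to its parents and the weight of each inheritance edge.
Lineage = Dict[str, Dict[str, float]]


def validate_lineage(parents: Lineage, strict_tree: bool = False) -> List[str]:
    """Check structural constraints and return nodes ordered ancestors-first."""
    for child, edges in parents.items():
        if strict_tree and len(edges) > 1:
            raise ValueError(f"{child} has {len(edges)} parents; strict tree allows one")
        if sum(edges.values()) > 1.0 + 1e-9:
            raise ValueError(f"inheritance proportions for {child} exceed 1.0")
    # static_order() raises graphlib.CycleError if the lineage is not acyclic.
    sorter = TopologicalSorter({child: set(edges) for child, edges in parents.items()})
    return list(sorter.static_order())


# Ensemble descent with a synthetic-ancestor node for a curated corpus.
ensemble: Lineage = {
    "student": {"teacher_a": 0.5, "teacher_b": 0.3, "corpus:curated-v1": 0.2},
}
order = validate_lineage(ensemble)
print(order)
```

The same lineage is rejected when `strict_tree=True`, which mirrors the limitation of the first embodiment described above.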
Embodiments differ in how they detect transmission. A passive embodiment relies on declarations from the training authority, which must record each inheritance event as it occurs. An active embodiment performs forensic analysis on candidate descendants, using activation-pattern similarity, output-distribution overlap, and watermark detection to infer ancestry even when declarations are absent or incomplete. A hybrid embodiment combines declared lineage with forensic verification, using the forensic signal to validate or contradict the declarations.
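The hybrid scheme amounts to reconciling two sets of candidate ancestors. In the sketch below the forensic similarity scores are taken as given inputs; how they are produced from activation patterns, output distributions, or watermarks is outside the example. The function name and the 0.8 cut-off are hypothetical.

```python
from typing import Dict, List, Set


def reconcile(declared: Set[str], forensic_scores: Dict[str, float],
              min_score: float = 0.8) -> Dict[str, List[str]]:
    """Compare declared parents with ancestors inferred from forensic scores."""
    inferred = {model for model, score in forensic_scores.items() if score >= min_score}
    return {
        "confirmed": sorted(declared & inferred),    # declaration validated
        "unverified": sorted(declared - inferred),   # declared, no forensic support
        "undeclared": sorted(inferred - declared),   # forensic signal contradicts the record
    }


report = reconcile(
    declared={"teacher_a", "teacher_b"},
    forensic_scores={"teacher_a": 0.93, "teacher_b": 0.41, "teacher_x": 0.88},
)
print(report)
```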
Alternative embodiments also vary in remediation strategy. A pruning embodiment removes problematic ancestors from the effective lineage by retraining the descendant against a counterfactual corpus designed to overwrite the inherited disruption. A grafting embodiment introduces a corrective parent whose contribution is calibrated to neutralise specific burden components. A retirement embodiment marks descendants whose burden cannot be remediated within budget and prevents their further use as parents, halting onward propagation while preserving the model for archival or research purposes.
Composition With Other Primitives
Intergenerational-burden accounting composes naturally with the provenance-tracing primitive: the lineage ledger is a specialisation of provenance for the model-generation domain, and shared infrastructure can serve both. It composes with the memorization-detection primitive by importing memorization signals from the parent's training-time monitors as a burden component on the descendant, so that memorization risk does not have to be rediscovered after each generation. It composes with the preference-stability primitive by importing preference-contradiction measurements as a burden component, ensuring that descendants of agents with unstable values inherit a flag that drives additional alignment work.
Composition with the disruption-disclosure primitive enables downstream consumers to receive a machine-readable burden manifest with each model release, supporting consumer-side governance without requiring access to the full lineage ledger. Composition with the apprenticeship-trace primitive enables fine-grained attribution of behavioural drift to specific tutoring episodes, supporting precise remediation rather than wholesale retraining.
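A machine-readable burden manifest could be as simple as a JSON document that reports each component against its threshold without exposing the ledger. The field names and the function below are illustrative assumptions rather than a defined manifest format.

```python
import json
from typing import Dict


def build_manifest(model_id: str, burden: Dict[str, float],
                   thresholds: Dict[str, float], lineage_verified: bool) -> str:
    """Serialise a consumer-facing summary of a model's burden vector."""
    components = {
        name: {
            "value": value,
            "threshold": thresholds.get(name),  # None when no threshold is configured
            "exceeds": name in thresholds and value > thresholds[name],
        }
        for name, value in burden.items()
    }
    manifest = {
        "model_id": model_id,
        "lineage_verified": lineage_verified,
        "components": components,
    }
    return json.dumps(manifest, sort_keys=True, indent=2)


manifest_json = build_manifest(
    "gen2",
    burden={"memorization_residue": 0.3, "provenance_opacity": 0.1},
    thresholds={"memorization_residue": 0.2},
    lineage_verified=True,
)
print(manifest_json)
```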
Distinction From Prior Art
Existing model cards and dataset documentation describe a model's training data and intended use but do not maintain a structured, machine-readable lineage that supports cumulative-burden computation across generations. Existing data-provenance systems track the origin of training examples but do not propagate disruption signals from parent models into descendants and do not support graph-walk aggregation of burden across multi-generation lineages. Existing distillation pipelines record the teacher model used for a given student but do not retain the teacher's own ancestry, so disruption in a great-grandparent cannot be attributed to a current model. The present approach differs in that it treats lineage as a first-class governance object, propagates structured burden vectors rather than free-text notes, and supports automated threshold enforcement against burden components computed from the full ancestor graph.
Implementation Considerations
Practical deployment of intergenerational-burden accounting requires care around five operational concerns. The first is signature infrastructure. Lineage records derive their evidentiary value from non-repudiation, which in turn depends on training authorities holding signing keys whose use is bound to formal training-event approvals. Implementations must therefore integrate with the organisation's key-management infrastructure and ensure that lineage signatures fail closed: if a key is unavailable or compromised, the system refuses to admit new transmission records rather than accepting unsigned ones. The second is storage scaling. A long-lived lineage may accumulate thousands of edges, and a federation of such lineages may collectively reach millions. Implementations must support content-addressed storage of inherited artifacts so that identical corpora are deduplicated across lineages, and must support summarisation of distant ancestors into compact statistical descriptors that preserve burden attribution while bounding storage growth.
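The fail-closed admission rule for signatures can be sketched as follows. Note the loud assumption: the example uses an HMAC from the Python standard library purely as a stand-in, and a shared-secret HMAC does not provide non-repudiation. A real deployment would use an asymmetric signature issued through the organisation's key-management infrastructure. The function and exception names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import List, Optional


class SigningKeyUnavailable(Exception):
    """Raised instead of admitting an unsigned transmission record."""


def _payload(record: dict) -> bytes:
    # Canonical serialisation so the same record always signs to the same value.
    return json.dumps(record, sort_keys=True).encode("utf-8")


def admit_record(ledger: List[dict], record: dict, key: Optional[bytes]) -> None:
    """Append a signed record, or refuse outright if no usable key exists."""
    if not key:
        # Fail closed: no key means no admission, never an unsigned entry.
        raise SigningKeyUnavailable("refusing to admit unsigned transmission record")
    signature = hmac.new(key, _payload(record), hashlib.sha256).hexdigest()
    ledger.append({"record": record, "signature": signature})


def verify_entry(entry: dict, key: bytes) -> bool:
    expected = hmac.new(key, _payload(entry["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, entry["signature"])


ledger: List[dict] = []
key = b"demo-key-for-illustration-only"
admit_record(ledger, {"parent_id": "gen0", "child_id": "gen1", "proportion": 0.6}, key)
print(verify_entry(ledger[0], key))
```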
The third concern is contestability. A burden score that blocks deployment imposes a cost on the training authority responsible for the descendant, and that authority must be able to challenge the score by submitting evidence of remediation. Implementations should expose a remediation protocol in which the authority presents a counterfactual corpus, a retraining trace, and an updated burden computation; the system either accepts the remediation and updates the descendant's burden vector or rejects the remediation with a structured reason. Contestability prevents the burden system from acting as an unchallengeable gate and grounds it in evidence that can be examined by parties with legitimate interest.
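The remediation protocol might be sketched as a function that takes the three pieces of evidence and returns either an updated burden vector or a structured rejection. The claim fields, the reason codes, and the acceptance rule are assumptions; in particular, the sketch trusts the submitted burden computation, whereas a real system would recompute or audit it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RemediationClaim:
    corpus_ref: str                   # reference to the counterfactual corpus
    retraining_trace_ref: str         # reference to the retraining trace
    updated_burden: Dict[str, float]  # the authority's recomputed components


def review_remediation(current: Dict[str, float], claim: RemediationClaim,
                       thresholds: Dict[str, float]) -> dict:
    """Accept or reject a remediation claim with a structured reason."""
    if not claim.corpus_ref or not claim.retraining_trace_ref:
        return {"accepted": False, "reason": "missing_evidence", "burden": current}
    proposed = {**current, **claim.updated_burden}
    still_exceeded = sorted(name for name, limit in thresholds.items()
                            if proposed.get(name, 0.0) > limit)
    if still_exceeded:
        return {"accepted": False, "reason": "threshold_still_exceeded",
                "components": still_exceeded, "burden": current}
    return {"accepted": True, "reason": None, "burden": proposed}


current = {"memorization_residue": 0.3, "provenance_opacity": 0.1}
thresholds = {"memorization_residue": 0.2, "provenance_opacity": 0.4}
claim = RemediationClaim("sha256:corpus", "sha256:trace", {"memorization_residue": 0.15})
decision = review_remediation(current, claim, thresholds)
print(decision)
```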
A fourth concern is the boundary between automated and human judgment. Burden vectors are decompositions over interpretable components, but the decision to deploy a flagged descendant in a high-stakes domain is not a purely numerical matter; it depends on context that the burden system cannot model. Implementations should treat threshold breaches as inputs to a human review workflow rather than as terminal verdicts. The workflow records the reviewer, the reasoning, and any conditions imposed on deployment, attaching this record to the lineage ledger so that subsequent generations inherit not only the burden but also the human judgments that have been rendered against it.
A fifth concern is the treatment of legacy lineages. Many models in production were trained without lineage instrumentation and cannot be retrofitted with declared transmission records. Implementations should support an unknown-ancestor mode in which the burden vector is initialised conservatively from forensic inference and the descendant carries an explicit flag indicating that its lineage is partially or wholly unverified. Downstream consumers can then apply policy that distinguishes verified-lineage descendants from unverified ones, encouraging migration toward instrumented training without forcing the abandonment of established models.
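An unknown-ancestor mode could initialise the burden vector as in the sketch below. The component list follows the components named earlier in this document; the function name, the status labels, and the rule that a component with no forensic estimate takes the ceiling value are assumptions chosen to keep the default pessimistic.

```python
from typing import Dict, Optional, Tuple

COMPONENTS = (
    "memorization_residue",
    "preference_contradiction",
    "behavioural_drift",
    "provenance_opacity",
)


def initial_burden(declared: Optional[Dict[str, float]],
                   forensic_estimate: Dict[str, float],
                   ceiling: float = 1.0) -> Tuple[Dict[str, float], str]:
    """Return a starting burden vector and a lineage-verification flag."""
    if declared is not None:
        return dict(declared), "verified"
    # No declared lineage: use forensic estimates where they exist and assume
    # the worst case for any component the forensics could not measure.
    burden = {name: forensic_estimate.get(name, ceiling) for name in COMPONENTS}
    return burden, "unverified"


legacy_burden, legacy_status = initial_burden(None, {"memorization_residue": 0.4})
print(legacy_status, legacy_burden)
```

Downstream policy can then key on the status flag, for example by applying tighter thresholds to unverified descendants.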
Disclosure Scope
The disclosure encompasses methods for recording intergenerational transmission across training-data, preference, and apprenticeship channels; methods for representing lineage as a directed acyclic graph with weighted, signed inheritance edges; methods for computing cumulative burden through graph-walk aggregation with generational decay and retraining credit; methods for enforcing burden thresholds through block, quarantine, and annotate responses; and methods for composing lineage accounting with the provenance-tracing, memorization-detection, preference-stability, disruption-disclosure, and apprenticeship-trace primitives. Embodiments addressing strict-tree, directed-acyclic-graph, and synthetic-ancestor lineage representations fall within scope, as do passive, active, and hybrid transmission detection schemes and pruning, grafting, and retirement remediation strategies.