Mechanism

Human-relatable agent training is an application of training-level semantic governance: the same semantic execution substrate that evaluates each proposed transition for admissibility at inference time and during discovery traversal is extended into the training loop itself. In conventional training the loop is an ungoverned optimization process. Data is sampled, forward and backward passes compute gradients, and the optimizer applies parameter updates with no intermediate admissibility evaluation, so every example contributes to every layer with equal structural authority. The disclosure reconceives the training loop as a governed execution environment in which each training example is treated as a proposed semantic mutation to the model's knowledge state, evaluated before its contribution is permitted to affect parameters.

The substrate operates at the boundary between the forward-pass loss computation and the backward-pass gradient application. It does not alter the mathematics of gradient computation or the optimizer update rule. Instead it governs which gradient signals reach which layers and with what magnitude, based on the semantic properties of the training content that produced those gradients. The result, when applied to human-relatable agents such as companion agents, therapeutic agents, and embodied robotic agents, is an internal knowledge structure whose depth, durability, separability, and auditability reflect the governance requirements of the domain the agent operates in.

Training Examples as Proposed Semantic Mutations

Each training example carries semantic metadata sufficient for the substrate to render an admissibility determination. The required metadata comprises at minimum an entropy band classification indicating the semantic complexity and information density of the content, a slope position indicating the content's place in the platform's trust-slope hierarchy, a content provenance record identifying source and chain of custody, and a policy scope identifying the governance constraints that apply. An example consisting solely of raw content without this metadata cannot be evaluated and is inadmissible by default. The training corpus thereby becomes a governed collection of semantically annotated objects rather than an undifferentiated mass of data.
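
The default-inadmissible rule can be illustrated with a minimal sketch. The class names, field encodings, and the `is_evaluable` helper below are hypothetical and chosen only for illustration; the disclosure specifies which metadata must be present, not how it is represented.

```python
from dataclasses import dataclass
from typing import Optional

# Field names are illustrative assumptions, not taken from the disclosure.
REQUIRED_FIELDS = ("entropy_band", "slope_position", "provenance", "policy_scope")


@dataclass(frozen=True)
class SemanticMetadata:
    """Hypothetical container for the four required metadata items."""
    entropy_band: Optional[str] = None    # e.g. "low" or "high" (band names assumed)
    slope_position: Optional[int] = None  # place in the trust-slope hierarchy (encoding assumed)
    provenance: Optional[str] = None      # source and chain of custody, simplified to a string
    policy_scope: Optional[str] = None    # identifier of the applicable governance constraints


@dataclass(frozen=True)
class TrainingExample:
    content: str
    metadata: Optional[SemanticMetadata] = None


def is_evaluable(example: TrainingExample) -> bool:
    """Return True only if every required metadata item is present and non-empty."""
    m = example.metadata
    if m is None:
        return False  # raw content alone is inadmissible by default
    return all(getattr(m, name) not in (None, "") for name in REQUIRED_FIELDS)


raw = TrainingExample(content="hello")
annotated = TrainingExample(
    content="hello",
    metadata=SemanticMetadata("low", 2, "source:corpus-a", "policy:general"),
)
```

Here `is_evaluable(raw)` is false and `is_evaluable(annotated)` is true; a real substrate would go on to validate each field rather than merely check for its presence.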

Non-training, the refusal to integrate an example into the model's parameters, is a valid computational result rather than an error. The substrate may reject an example because its policy constraints prohibit integration, because its entropy characteristics are incompatible with the current training phase, or because its provenance fails validation, producing an iteration in which the model's parameters are not updated. The determination is not limited to binary admit or reject: the substrate may render a graded determination specifying how an example's contribution should be weighted, routed, and distributed across the model's depth, and every admission, rejection, and modulation is recorded in the training provenance log.
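
A graded determination, with non-training as an ordinary outcome, might be sketched as follows. The `Outcome` names, the `evaluate` signature, and the use of one scalar parameter per block are simplifying assumptions made for this example only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Set


class Outcome(Enum):
    ADMIT = "admit"        # full contribution to every block
    MODULATE = "modulate"  # graded, per-block contribution
    REJECT = "reject"      # non-training: no parameter update


@dataclass(frozen=True)
class Determination:
    outcome: Outcome
    block_weights: List[float]
    reason: Optional[str] = None


def evaluate(provenance_valid: bool, policy_allows: bool, band: str,
             phase_bands: Set[str], profile: List[float]) -> Determination:
    """Hypothetical admissibility check covering the three rejection grounds."""
    zeros = [0.0] * len(profile)
    if not provenance_valid:
        return Determination(Outcome.REJECT, zeros, "provenance failed validation")
    if not policy_allows:
        return Determination(Outcome.REJECT, zeros, "policy prohibits integration")
    if band not in phase_bands:
        return Determination(Outcome.REJECT, zeros, "entropy band incompatible with phase")
    if all(w == 1.0 for w in profile):
        return Determination(Outcome.ADMIT, list(profile))
    return Determination(Outcome.MODULATE, list(profile), "graded depth profile applied")


def training_step(params: List[float], grads: List[float],
                  det: Determination, lr: float = 0.1) -> List[float]:
    """One SGD-style step with one scalar parameter per block (a simplification)."""
    if det.outcome is Outcome.REJECT:
        return list(params)  # a valid result: parameters are left unchanged
    return [p - lr * w * g for p, w, g in zip(params, det.block_weights, grads)]
```

A rejected example returns the parameters untouched instead of raising an exception, which reflects the point that refusal is a result and not a failure. In a fuller sketch each `Determination` would also be written to the training provenance log.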

Entropy-Band-Indexed Depth Profiles

Each entropy band recognized by the platform's entropy extraction pipeline is associated with a training depth profile that governs how content from that band is weighted across the layers of the model. The depth profile is a structured object comprising a per-layer or per-block contribution weight vector. A weight of one permits the full gradient signal to reach a layer, a weight of zero prevents any gradient from reaching it, an intermediate weight attenuates the signal, and a weight greater than one amplifies it. Low-entropy content, which is well represented in the model's existing knowledge, receives shallow profiles, and high-entropy content, which introduces novel semantic structure, receives deep profiles.
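
The weight semantics can be made concrete with a small sketch. The `DepthProfile` class, the three-block layout, and the numeric weights are illustrative assumptions; only the meaning of the values (one passes, zero blocks, intermediate attenuates, above one amplifies) comes from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DepthProfile:
    """Hypothetical depth profile: one contribution weight per block, shallow to deep."""
    band: str
    weights: List[float]


def apply_profile(profile: DepthProfile, block_grads: List[List[float]]) -> List[List[float]]:
    """Scale each block's gradient values by that block's contribution weight."""
    if len(profile.weights) != len(block_grads):
        raise ValueError("profile must have exactly one weight per block")
    return [[w * g for g in grads] for w, grads in zip(profile.weights, block_grads)]


# Illustrative values only: a shallow profile for low-entropy content and a
# deep profile for high-entropy content, over three blocks.
shallow = DepthProfile("low", [1.0, 0.5, 0.0])
deep = DepthProfile("high", [0.5, 1.0, 1.5])

grads = [[2.0, 4.0], [2.0, 4.0], [2.0, 4.0]]  # the same raw gradient at every block
shallow_result = apply_profile(shallow, grads)  # [[2.0, 4.0], [1.0, 2.0], [0.0, 0.0]]
deep_result = apply_profile(deep, grads)        # [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
```

With the shallow profile the deepest block receives no gradient at all, while the deep profile amplifies the signal there by a factor of 1.5.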

The association between bands and profiles is not fixed before training. A profile adaptation engine monitors the model's internal entropy distribution at defined checkpoints and adjusts the profiles to maintain alignment between the entropy structure of the corpus and the entropy structure of the model's internal representations. Early in training the profiles may be broad and approximately uniform across layers. As the representations stratify, the profiles narrow: low-entropy content is increasingly weighted toward shallow layers where local pattern detection occurs, and high-entropy content toward deep layers where multi-step abstraction and cross-domain integration occur. This preserves deep representational capacity for the complex content that requires it.
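
One way to picture the narrowing is as an interpolation between a uniform profile and a band-specific target profile. This is an assumption made for illustration: the disclosure does not state the adaptation rule, and the `stratification` score used here, a number in [0, 1] standing in for whatever the engine measures at a checkpoint, is hypothetical.

```python
from typing import List


def adapt_profile(target: List[float], stratification: float) -> List[float]:
    """Blend a uniform profile (all ones) with a band-specific target profile.

    stratification = 0.0 gives the broad, uniform profile used early in training;
    stratification = 1.0 gives the fully narrowed target profile.
    """
    s = min(1.0, max(0.0, stratification))  # clamp to [0, 1]
    return [(1.0 - s) * 1.0 + s * t for t in target]


low_entropy_target = [1.0, 0.5, 0.0]  # illustrative target weighted toward shallow blocks

early = adapt_profile(low_entropy_target, 0.0)  # [1.0, 1.0, 1.0]
mid = adapt_profile(low_entropy_target, 0.5)    # [1.0, 0.75, 0.5]
late = adapt_profile(low_entropy_target, 1.0)   # [1.0, 0.5, 0.0]
```

Any monotone schedule would serve the same illustrative purpose; linear interpolation is simply the easiest to read.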

Depth-Selective Aggregation

Depth-selective aggregation applies the depth profiles to the gradient signal during the backward pass to produce content-governed parameter updates. It is implemented through one or more of three complementary techniques: gated residual connections, in which each residual pathway is augmented with a gating coefficient derived from the profile; attention-based depth selection, in which the depth-profile weight modulates the gradient flowing through the attention computation at each transformer layer; and layer-specific scaling factors, an architecture-agnostic approach that multiplies the gradient at each layer boundary by the profile weight. The techniques achieve the same functional result through different structural mechanisms, and all operate during the backward pass only, leaving the forward pass and the model's inference behavior unchanged.

The mechanism typically operates at block-level granularity, grouping contiguous layers into architecturally meaningful blocks so that profile vectors remain tractable. Per-example gradients are scaled by the block-level profile weight before being accumulated into the batch gradient buffer, so each example's contribution to each block is individually governed. The mechanism is compatible with standard optimizers including stochastic gradient descent, Adam, and AdamW: it does not alter the optimizer's update rule, only the gradient signal the optimizer receives. This is distinct from layer-wise aggregation in federated learning, which weights layers across multiple model instances being merged; here the depth of a single example's contribution to a single model is governed.
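
The layer-specific scaling variant, with per-example scaling before accumulation, can be sketched in plain Python. The six-layer model, the `LAYER_TO_BLOCK` grouping, and the use of a single scalar gradient per layer are simplifying assumptions; a real implementation would scale gradient tensors inside a deep learning framework.

```python
from typing import List

# Hypothetical grouping of six layers into three contiguous blocks.
LAYER_TO_BLOCK = [0, 0, 1, 1, 2, 2]


def accumulate_batch(per_example_grads: List[List[float]],
                     per_example_profiles: List[List[float]]) -> List[float]:
    """Scale each example's per-layer gradient by its block weight, then sum.

    Scaling happens before accumulation, so each example's contribution to
    each block is governed individually.
    """
    buffer = [0.0] * len(LAYER_TO_BLOCK)
    for grads, profile in zip(per_example_grads, per_example_profiles):
        for layer, g in enumerate(grads):
            buffer[layer] += profile[LAYER_TO_BLOCK[layer]] * g
    return buffer


def sgd_step(params: List[float], batch_grad: List[float], lr: float) -> List[float]:
    """Plain SGD; the update rule is unchanged and sees only the governed gradient."""
    return [p - lr * g for p, g in zip(params, batch_grad)]


# Two examples with identical raw gradients but different depth profiles.
raw_grads = [[1.0] * 6, [1.0] * 6]
profiles = [[1.0, 0.5, 0.0],   # shallow profile
            [0.0, 1.0, 2.0]]   # deep profile with amplification in the last block
batch_grad = accumulate_batch(raw_grads, profiles)  # [1.0, 1.0, 1.5, 1.5, 2.0, 2.0]
new_params = sgd_step([0.0] * 6, batch_grad, lr=1.0)
```

Because the optimizer only ever receives `batch_grad`, swapping `sgd_step` for another optimizer would not require any change to the governance step.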

Policy-Governed and Affect-Modulated Integration

The depth-selective mechanism integrates with the platform's content governance. Content admitted under time-limited licensing is trained with a suppressed depth profile that confines its influence to shallow layers, so that de-emphasis on license expiry can be achieved through targeted shallow-layer adjustment rather than full retraining or approximate unlearning. Content from the governed exclusion corpus receives a zero-weight depth profile that prevents it from influencing any parameters, the training-time analog of the inference-time rejection determination. The distinction from post-hoc unlearning is structural: there is no need to unlearn what was never deeply learned. A zero block weight means that no gradient from the excluded example reaches that block, so the exclusion is exact rather than approximate. When multiple policies apply, the substrate resolves the profile by applying the most restrictive policy and records the resolution in the provenance log.
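
Policy resolution can be sketched under one stated assumption: that "most restrictive" means the element-wise minimum of the candidate profiles, block by block. The disclosure does not define the rule at this level of detail, and the policy names and weights below are hypothetical.

```python
from typing import Dict, List, Tuple


def resolve_profile(policy_profiles: Dict[str, List[float]]) -> Tuple[List[float], dict]:
    """Resolve several policy profiles into one by taking the per-block minimum.

    Returns the resolved profile and a record suitable for a provenance log.
    """
    if not policy_profiles:
        raise ValueError("at least one policy profile is required")
    if len({len(w) for w in policy_profiles.values()}) != 1:
        raise ValueError("all profiles must cover the same number of blocks")
    resolved = [min(column) for column in zip(*policy_profiles.values())]
    record = {
        "policies_considered": sorted(policy_profiles),
        "resolved_profile": resolved,
    }
    return resolved, record


# Illustrative profiles over three blocks, shallow to deep.
licensed, licensed_record = resolve_profile({
    "base": [1.0, 1.0, 1.0],
    "time_limited_license": [0.5, 0.0, 0.0],  # suppressed, confined to the shallow block
    "affect": [0.25, 1.0, 0.25],
})  # licensed == [0.25, 0.0, 0.0]

excluded, _ = resolve_profile({
    "base": [1.0, 1.0, 1.0],
    "exclusion_corpus": [0.0, 0.0, 0.0],      # zero-weight profile
})  # excluded == [0.0, 0.0, 0.0]
```

Under this rule a zero-weight exclusion profile always wins, whatever the other policies would allow.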

Affective metadata derived from the platform's affect classification infrastructure further modulates the profiles. Content with high emotional valence and a safety-critical domain classification receives controlled integration at intermediate depths, with elevated weights for intermediate blocks and attenuated weights for the shallowest blocks, where emotional patterns might be triggered by superficial similarity, and for the deepest blocks, where emotional associations might entangle with the model's most abstract reasoning. Content classified as emotionally manipulative, traumatizing, or psychologically harmful receives suppressed profiles confining it to the shallowest layers, preserving surface recognition of the domain without encoding deep behavioral patterns.
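
The two affect-driven profile shapes can be illustrated with a small sketch. The category labels and every numeric weight are assumptions chosen to show the shapes described above; the disclosure gives no specific values.

```python
from typing import List


def affect_profile(category: str, n_blocks: int) -> List[float]:
    """Return an illustrative depth profile (shallow to deep) for an affect category."""
    if n_blocks < 3:
        raise ValueError("need at least shallow, intermediate, and deep blocks")
    if category == "safety_critical_high_valence":
        # Controlled integration: attenuate the shallowest and deepest blocks,
        # elevate the intermediate ones.
        return [0.25] + [1.25] * (n_blocks - 2) + [0.25]
    if category == "harmful":
        # Suppressed profile: a small weight at the shallowest block only.
        return [0.25] + [0.0] * (n_blocks - 1)
    return [1.0] * n_blocks  # no affect-based modulation


controlled = affect_profile("safety_critical_high_valence", 4)  # [0.25, 1.25, 1.25, 0.25]
suppressed = affect_profile("harmful", 4)                       # [0.25, 0.0, 0.0, 0.0]
```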

Provenance-Traceable Training Dynamics

The substrate records a comprehensive provenance trail for every training iteration, the training-time analog of the lineage field maintained for agents, inference processes, and discovery traversals. For each example or batch the log records the entropy band classification, the slope position, the depth aggregation profile that was applied, the per-block contribution weight actually applied to each block after any adaptation, the governance record identifying the policy that authorized admission and determined the profile, the content provenance record, and the admissibility determination with any reason for modification or rejection. The log is chronologically ordered and append-only. Each entry is timestamped, sequentially numbered, and annotated with epoch, iteration, and batch index, which makes the log tamper-resistant.
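
An append-only log of this kind might be sketched as follows. The hash chaining used here to detect tampering is an assumption added for illustration and is not described in the disclosure; the `ProvenanceLog` class and its field names are likewise hypothetical. Records must be JSON-serializable for the digest to be computed.

```python
import hashlib
import json
import time
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


class ProvenanceLog:
    """Hypothetical append-only training log with sequence numbers and a hash chain."""

    def __init__(self) -> None:
        self._entries: List[Dict[str, Any]] = []

    @staticmethod
    def _digest(entry: Dict[str, Any]) -> str:
        # Hash every field except the stored hash itself.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()

    def append(self, epoch: int, iteration: int, batch_index: int,
               record: Dict[str, Any]) -> Dict[str, Any]:
        entry = {
            "seq": len(self._entries),
            "timestamp": time.time(),
            "epoch": epoch,
            "iteration": iteration,
            "batch_index": batch_index,
            "record": record,
            "prev_hash": self._entries[-1]["hash"] if self._entries else GENESIS,
        }
        entry["hash"] = self._digest(entry)  # assumption: chaining is not in the source
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Check sequence numbers, chain links, and per-entry digests."""
        prev = GENESIS
        for i, entry in enumerate(self._entries):
            if entry["seq"] != i or entry["prev_hash"] != prev:
                return False
            if entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True


log = ProvenanceLog()
log.append(0, 0, 0, {"content_id": "doc-a", "band": "low", "block_weights": [1.0, 0.5, 0.0]})
log.append(0, 1, 1, {"content_id": "doc-b", "band": "high", "block_weights": [0.5, 1.0, 1.0]})
```

After these two appends `log.verify()` returns true; altering any stored field afterwards changes that entry's digest and causes verification to fail.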

The log supports forward queries, which begin with content and trace the profiles and decisions that governed its integration, and reverse queries, which begin with an observed model behavior and identify the training content that was structurally permitted to influence the active layer blocks. A reverse query does not definitively attribute behavior to specific content, because the non-linear dynamics of gradient-based optimization preclude exact attribution, but it produces a bounded attribution set substantially narrower than the full corpus. This enables compliance auditing: when a content owner asks whether their content was used, or a regulator requires evidence that restricted content was not deeply integrated, the log provides the provenance records and the contribution weights that were applied.
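
The two query directions can be sketched over a simplified log. The entry fields and the rule that a reverse query returns every admitted example with a non-zero weight on at least one active block are assumptions for illustration; how the active blocks are identified at inference time is outside this sketch.

```python
from typing import Any, Dict, List, Set

# Simplified, hypothetical log entries.
LOG: List[Dict[str, Any]] = [
    {"content_id": "doc-a", "admitted": True, "block_weights": [1.0, 0.5, 0.0]},
    {"content_id": "doc-b", "admitted": True, "block_weights": [0.5, 1.0, 1.0]},
    {"content_id": "doc-c", "admitted": False, "block_weights": [0.0, 0.0, 0.0]},
]


def forward_query(log: List[Dict[str, Any]], content_id: str) -> List[Dict[str, Any]]:
    """Start from content: return every entry that governed its integration."""
    return [entry for entry in log if entry["content_id"] == content_id]


def reverse_query(log: List[Dict[str, Any]], active_blocks: List[int]) -> Set[str]:
    """Start from behavior: return content structurally permitted to influence
    at least one of the active blocks (a bounded set, not an exact attribution)."""
    return {
        entry["content_id"]
        for entry in log
        if entry["admitted"]
        and any(entry["block_weights"][b] > 0.0 for b in active_blocks)
    }


deep_influencers = reverse_query(LOG, [2])     # {"doc-b"}
shallow_influencers = reverse_query(LOG, [0])  # {"doc-a", "doc-b"}
doc_c_history = forward_query(LOG, "doc-c")    # one entry, not admitted
```

The reverse query narrows the candidate set by structure alone: `doc-a` had a zero weight on the deepest block, so it cannot appear in the attribution set for behavior localized there.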

Application to Human-Relatable Agents

For companion agents, depth-selective training organizes internal representations by semantic complexity. Conversational competence, lexical fluency, grammatical correctness, and routine dialogue patterns are encoded in shallow layers through broad integration of low-entropy data. Domain-specific knowledge of topics, cultural contexts, and personal interests is encoded in intermediate layers. Capacity for empathic reasoning, nuanced emotional understanding, and relational depth is encoded in deep layers through selective integration of high-entropy content addressing interpersonal dynamics. The structured organization lets the inference-time governance substrate audit which layers are active during specific interaction patterns.

For therapeutic agents, clinical content is integrated under strict policy governance. General therapeutic principles are encoded broadly across the model's depth. Specific clinical protocols that vary across jurisdictions or are subject to ongoing revision are confined to intermediate layers with suppressed profiles so they can be updated without full retraining. Patient-specific content, if used at all, is encoded with maximally suppressed profiles under time-limited policy that ensures automatic exclusion on expiry.

For embodied robotic agents, safety-critical motor knowledge such as obstacle avoidance, emergency-stop protocols, human proximity detection, and force-limiting behaviors is trained with deep, protected profiles that durably encode it in the deepest layers. Preference-based knowledge such as motion preferences and efficiency optimizations is confined to shallow layers that can be updated without disturbing the safety-critical representations. This segregation is an architectural safety property that uniform-depth training cannot achieve.

Disclosure Scope

The cognition filing (U.S. Application No. 19/647,395 and its international counterpart) discloses training-level semantic governance and its application to human-relatable agents. The disclosed subject matter comprises the treatment of each training example as a proposed semantic mutation carrying entropy band, slope, provenance, and policy metadata; the entropy-band-indexed depth profiles and their adaptation by the profile adaptation engine; depth-selective aggregation through gated residual connections, attention-based depth selection, or layer-specific scaling factors operating during the backward pass; policy-governed suppression and zero-weight exclusion as a structural alternative to post-hoc unlearning; affect-modulated depth integration; the append-only training provenance log with forward and reverse queries; and the application to companion, therapeutic, and embodied robotic agents. This article describes that disclosed mechanism. The scope extends to model architectures and optimizer variants not enumerated whose gradient signal can be intercepted and scaled at layer or block boundaries, and to additional metadata classes that resolve to a depth profile, provided the depth at which each example is integrated remains governed by the substrate against policy, provenance, and entropy.