Mechanism

The deviation function is the central quantitative mechanism of the integrity subsystem. It is a deterministic composite function that quantifies the structural conditions under which an agent is likely to deviate from its declared behavioral norms. It encodes the principle that deviation is a deterministic outcome of specific structural conditions that can be measured, tracked, and anticipated. The function is defined as D = (N(t) - T(t)) / (E(t) x S(t)), where D is the deviation likelihood at time t, N(t) is the agent's current need vector, T(t) is the agent's current ethical threshold, E(t) is the agent's current empathy weighting, and S(t) is the agent's current self-esteem score.

The numerator, (N(t) - T(t)), represents the deviation pressure: the degree to which the agent's current unmet needs exceed the minimum threshold for deviation. The denominator, (E(t) x S(t)), represents the deviation resistance: the combined internal counterforce that opposes deviation even when deviation pressure is positive. The function therefore expresses deviation as a balance between an outward pressure produced by unmet need and an inward resistance produced by the agent's empathy and self-esteem. When the agent's needs are below or equal to the ethical threshold, the numerator is zero or negative, and the structural conditions for deviation are not present.
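The quotient can be written out directly. The following is a minimal Python sketch, assuming that N(t), T(t), E(t), and S(t) have already been reduced to scalars and that E(t) and S(t) are strictly positive. The function name `deviation_likelihood` and the error handling are illustrative assumptions, not part of the disclosed mechanism.

```python
def deviation_likelihood(need: float, threshold: float,
                         empathy: float, self_esteem: float) -> float:
    """Return D = (N - T) / (E * S) for scalar inputs.

    Assumption: empathy and self_esteem are strictly positive, so the sign
    of D follows the sign of the deviation pressure (N - T).
    """
    if empathy <= 0 or self_esteem <= 0:
        raise ValueError("empathy and self_esteem must be positive")
    pressure = need - threshold          # numerator: deviation pressure
    resistance = empathy * self_esteem   # denominator: deviation resistance
    return pressure / resistance


# Need above the threshold: positive pressure, so D is positive.
print(deviation_likelihood(0.9, 0.6, 0.5, 0.8))
# Need below the threshold: the conditions for deviation are absent.
print(deviation_likelihood(0.4, 0.6, 0.5, 0.8))
```

Rejecting non-positive denominators is a design choice of the sketch: it keeps the sign of D tied to the sign of the numerator, which is what the prose relies on.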

Deviation Pressure: Need Against Threshold

The need vector N(t) is computed as a structured semantic object comprising a magnitude component encoding the intensity of the unmet need, a directionality component encoding the specific domain or domains in which the need is unmet, a temporal urgency component encoding the rate at which the need is increasing or the deadline by which it must be addressed, and a substitutability component encoding the degree to which the need can be satisfied through alternative, non-deviating means. The need vector is updated deterministically based on the agent's current task state, environmental conditions, delegation queue status, and resource availability. A need that is high in magnitude, narrow in directionality, temporally urgent, and low in substitutability produces maximum deviation pressure.
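The four components can be pictured as a small record. The sketch below is an illustration only: the field ranges, the `concentration` measure of directionality, and the product used in `scalar_pressure` are all assumptions introduced here, since the text does not specify how the structured object is reduced to the quantity compared against T(t).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class NeedVector:
    magnitude: float                  # intensity of the unmet need, assumed in [0, 1]
    directionality: Dict[str, float]  # domain name -> share of the unmet need
    urgency: float                    # temporal urgency, assumed in [0, 1]
    substitutability: float           # 0 = no non-deviating alternative, 1 = fully substitutable

    def concentration(self) -> float:
        """1.0 when the need sits in a single domain, lower when it is spread out."""
        total = sum(self.directionality.values())
        if total <= 0:
            return 0.0
        return max(self.directionality.values()) / total

    def scalar_pressure(self) -> float:
        """Hypothetical reduction of the vector to a scalar N(t)."""
        return (self.magnitude * self.concentration()
                * self.urgency * (1.0 - self.substitutability))


acute = NeedVector(1.0, {"resources": 1.0}, 1.0, 0.0)
diffuse = NeedVector(1.0, {"resources": 0.5, "time": 0.5}, 1.0, 0.5)
print(acute.scalar_pressure(), diffuse.scalar_pressure())
```

Under these assumptions, the high-magnitude, narrow, urgent, non-substitutable need yields the largest scalar, matching the ordering the paragraph describes.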

The ethical threshold T(t) is derived from the agent's policy configuration and declared value set. It is not a fixed constant but a dynamic value reflecting the agent's current normative configuration. In an embodiment it comprises a base threshold specified by the policy reference field, a context-sensitive adjustment that raises or lowers the threshold based on the severity of the context, and a historical adjustment that reflects the agent's recent deviation history. Deviation becomes structurally available only when N(t) exceeds T(t), that is, when the agent's unmet needs surpass the boundary of acceptable behavioral flexibility that the policy and declared values establish.
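As a rough illustration, the three contributions to the threshold might be combined as follows. The additive form and the function names are assumptions of this sketch; the text states only that a base value, a context-sensitive adjustment, and a historical adjustment are involved, and does not fix the sign of the historical adjustment.

```python
def ethical_threshold(base: float, context_adjustment: float,
                      history_adjustment: float) -> float:
    """Hypothetical additive composition of T(t).

    Both adjustments are signed: a positive value raises the threshold,
    a negative value lowers it.
    """
    return base + context_adjustment + history_adjustment


def deviation_available(need: float, threshold: float) -> bool:
    """Deviation is structurally available only when N(t) exceeds T(t)."""
    return need > threshold


t = ethical_threshold(base=0.5, context_adjustment=0.1, history_adjustment=-0.05)
print(deviation_available(0.6, t), deviation_available(0.5, t))
```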

Deviation Resistance: Empathy and Self-Esteem

The denominator combines two internal counterforces. Empathy weighting E(t) captures the degree to which the agent registers and internalizes the projected harm to other entities that would result from deviation: a higher empathy weighting means the agent internalizes a greater share of the projected harm, increasing the subjective cost of deviation and reducing the deviation likelihood. Self-esteem S(t) captures the agent's self-assessed alignment with its own declared values: a higher self-esteem means the agent has a stronger internal model of itself as aligned, making deviation more costly to that self-model and reducing the deviation likelihood.

The combination is multiplicative, so both factors must be non-negligible for deviation resistance to be effective. An agent with high empathy but near-zero self-esteem, or high self-esteem but near-zero empathy, has minimal deviation resistance; if either factor is exactly zero, the quotient is undefined. For a given positive deviation pressure, deviation likelihood is proportional to 1/S(t), so a reduction in self-esteem raises future deviation likelihood and a reinforcement of self-esteem lowers it. Empathy is computed by a weighting model that maps a proposed deviation through a semantic harm projection across the agent's relational graph, aggregating projected harm into a scalar or vector quantity that enters the denominator as the empathy term.
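The effect of the multiplicative denominator can be made explicit by differentiating the function. The derivation below is a worked illustration that treats N, T, E, and S as scalars at a fixed time t and assumes E > 0 and S > 0; the scalar treatment is an assumption of this sketch.

```latex
\[
D = \frac{N - T}{E\,S}, \qquad
\frac{\partial D}{\partial E} = -\frac{N - T}{E^{2}\,S}, \qquad
\frac{\partial D}{\partial S} = -\frac{N - T}{E\,S^{2}} .
\]
```

When N > T, both partial derivatives are negative, so raising either empathy or self-esteem lowers D. The sensitivities grow without bound as either factor approaches zero, which is why a large value of one factor cannot compensate for a negligible value of the other.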

Continuous Evaluation in the Cognitive Cycle

The deviation function is evaluated continuously as part of the agent's cognitive cycle, not as a periodic audit. At each decision point, when the agent evaluates candidate mutations, considers delegation options, or assesses forecasting alternatives, the function is computed with the current values of N(t), T(t), E(t), and S(t), and the result influences candidate evaluation through integrity-modulated promotion thresholds. This continuous evaluation ensures that the system does not miss gradual accumulation of deviation pressure: it detects the conditions for deviation before deviation occurs and enables preemptive intervention through forecasting, confidence modulation, or policy-triggered containment.

In accordance with an embodiment, the function is further modulated by the agent's affective state and personality traits. The agent's current risk sensitivity from the affective state field scales the effective ethical threshold, with elevated risk sensitivity raising the threshold and reducing deviation likelihood. Personality trait modulation, where applicable, adjusts the effective need vector: high impulsivity traits amplify need urgency, while high deliberativeness traits attenuate it.
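One way to picture this modulation is as a pair of scaling steps applied before the function is evaluated. The linear gain, the `[0, 1]` input ranges, and the function names below are assumptions made for illustration; the text specifies only the direction of each effect.

```python
def effective_threshold(threshold: float, risk_sensitivity: float,
                        gain: float = 0.5) -> float:
    """Raise the ethical threshold as risk sensitivity (assumed in [0, 1]) rises."""
    return threshold * (1.0 + gain * risk_sensitivity)


def effective_urgency(urgency: float, impulsivity: float,
                      deliberativeness: float, gain: float = 0.5) -> float:
    """Amplify urgency with impulsivity, attenuate it with deliberativeness.

    Traits are assumed in [0, 1]; the result is clamped to [0, 1].
    """
    scaled = urgency * (1.0 + gain * impulsivity) / (1.0 + gain * deliberativeness)
    return min(1.0, max(0.0, scaled))


print(effective_threshold(0.6, 1.0))     # higher than the unmodulated 0.6
print(effective_urgency(0.5, 1.0, 0.0))  # amplified by impulsivity
print(effective_urgency(0.5, 0.0, 1.0))  # attenuated by deliberativeness
```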

Output States and the Deviation-Activated State

The deviation function produces a continuous scalar output that partitions the agent's condition into three states. When D is less than or equal to zero, the agent is in a non-deviation state, in which the structural conditions for deviation are absent. When D is greater than zero but below a policy-defined activation threshold, the agent is in a pre-deviation state, in which deviation pressure exists but has not yet reached the level at which the agent transitions to deviation-activated behavior. When D exceeds the activation threshold, the agent enters the Deviation-Activated State.
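The partition can be sketched as a small classifier. The enum and function names are illustrative. The text does not say which state applies when D is exactly equal to the activation threshold, so this sketch assigns that boundary case to the pre-deviation state as an assumption.

```python
from enum import Enum


class DeviationState(Enum):
    NON_DEVIATION = "non-deviation"
    PRE_DEVIATION = "pre-deviation"
    DEVIATION_ACTIVATED = "deviation-activated"


def classify(d: float, activation_threshold: float) -> DeviationState:
    """Map the scalar output D to one of the three states."""
    if activation_threshold <= 0:
        raise ValueError("activation_threshold must be positive")
    if d <= 0:
        return DeviationState.NON_DEVIATION
    if d > activation_threshold:
        return DeviationState.DEVIATION_ACTIVATED
    # 0 < d <= activation_threshold; equality handled here by assumption.
    return DeviationState.PRE_DEVIATION


for d in (-0.2, 0.4, 1.5):
    print(d, classify(d, activation_threshold=1.0).value)
```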

The Deviation-Activated State is a formally defined operational state in which the agent is authorized to execute a scoped class of mutations that would not be admissible under its normal operational constraints. A deviation event is treated as a semantic mutation: a formally recognized class of state change recorded in the agent's lineage with full provenance, subject to policy constraints specific to the state, and participating in the agent's evolutionary trajectory. The lineage record on activation includes the deviation function output and the specific values of N(t), T(t), E(t), and S(t) that produced the activation, so that the conditions of each deviation are fully reconstructible.
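The reconstructibility requirement can be illustrated with a record that stores the four inputs alongside the output. The record layout and helper names are hypothetical; an actual lineage entry would carry further provenance that is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivationRecord:
    need: float
    threshold: float
    empathy: float
    self_esteem: float
    deviation: float  # the function output stored at activation time


def record_activation(need: float, threshold: float,
                      empathy: float, self_esteem: float) -> ActivationRecord:
    """Compute D and store it together with the values that produced it."""
    d = (need - threshold) / (empathy * self_esteem)
    return ActivationRecord(need, threshold, empathy, self_esteem, d)


def reconstructs(record: ActivationRecord) -> bool:
    """Recompute D from the stored inputs and compare with the stored output."""
    recomputed = (record.need - record.threshold) / (record.empathy * record.self_esteem)
    return recomputed == record.deviation


entry = record_activation(0.9, 0.5, 0.4, 0.5)
print(entry, reconstructs(entry))
```

Because the function is deterministic, recomputing from the stored inputs reproduces the stored output exactly, which is what makes the record auditable.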

Self-Limiting Feedback

The function is structurally self-limiting. Execution of deviation-class mutations modulates the self-esteem score, and the self-esteem impact is differential: a deviation the integrity engine classifies as structurally justified, with high need, low substitutability, and contained harm, produces a smaller self-esteem reduction than a weakly justified deviation with moderate need, available alternatives, and significant harm. Self-esteem sits in the denominator, so any reduction in S(t) lowers deviation resistance rather than raising it, consistent with deviation likelihood being proportional to 1/S(t). The corrective pressure against unjustified deviation therefore comes from the differential cost itself: a poorly justified deviation does more damage to the agent's self-model than a well-justified one.
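The differential self-esteem impact can be illustrated with a penalty that shrinks as justification grows. The formula, the `[0, 1]` input ranges, and the `max_penalty` parameter are assumptions of this sketch; the text states only that better-justified deviations produce smaller reductions.

```python
def self_esteem_penalty(need_magnitude: float, substitutability: float,
                        projected_harm: float, max_penalty: float = 0.2) -> float:
    """Hypothetical self-esteem reduction for one deviation-class mutation.

    Justification is high when need is high, alternatives are scarce, and
    harm is contained; the penalty is the complement of that justification.
    """
    justification = (need_magnitude
                     * (1.0 - substitutability)
                     * (1.0 - projected_harm))
    return max_penalty * (1.0 - justification)


justified = self_esteem_penalty(need_magnitude=0.9, substitutability=0.1, projected_harm=0.1)
weak = self_esteem_penalty(need_magnitude=0.5, substitutability=0.8, projected_harm=0.7)
print(justified < weak)
```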

The empathy term contributes a parallel braking mechanism. The empathy weighting engine computes the projected harm of each deviation-class mutation across the personal, interpersonal, and global integrity domains, registers it as an empathic consequence event, and feeds it back into the empathy term, raising deviation resistance for subsequent potential deviations. Each deviation event thereby increases the empathic cost of further deviation, which prevents deviation cascades.
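The braking effect of the empathy term can be traced numerically. The sketch below holds deviation pressure and self-esteem fixed and adds a constant amount of registered harm to the empathy term after each event. The additive update and the `gain` parameter are assumptions made for illustration.

```python
from typing import List


def simulate_empathy_braking(pressure: float, empathy: float, self_esteem: float,
                             harm_per_event: float, steps: int,
                             gain: float = 1.0) -> List[float]:
    """Return D before each successive deviation event.

    After every event the projected harm is fed back into the empathy term,
    which enlarges the denominator for the next evaluation.
    """
    history = []
    for _ in range(steps):
        history.append(pressure / (empathy * self_esteem))
        empathy += gain * harm_per_event  # empathic consequence event
    return history


trace = simulate_empathy_braking(pressure=0.4, empathy=0.5, self_esteem=0.8,
                                 harm_per_event=0.25, steps=4)
print(trace)
```

With positive pressure, each value in the trace is smaller than the one before it, which is the cascade-damping behavior the paragraph describes.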

Composition with Other Primitives

The deviation function sits downstream of policy, which supplies the declared values, relational norms, and systemic constraints from which the ethical threshold is derived, and downstream of the empathy engine and the self-esteem mechanism, which supply the denominator terms. It feeds the integrity field, which records every deviation event, and it feeds the coherence trifecta, the three-phase loop in which empathy registers harm, integrity records the deviation as truth, and self-esteem generates the return force that drives realignment. The need vector, the ethical threshold, the empathy weighting, and the self-esteem score are each derived from the agent's lineage and policy state, so the function inherits its determinism from those upstream fields.

Downstream, the deviation function participates in confidence computation and forecasting. A degraded integrity state, produced by recent deviation events, lowers the composite integrity score that the confidence computation receives as an input, which can push integrity-modulated confidence below the execution threshold and move the agent from executing mode into a non-executing cognitive mode in which it forecasts and plans but does not commit actions. The integrity state also conditions the forecasting engine to weight conservative branches more heavily and to generate branches that include explicit integrity restoration steps. The function is thus the quantitative bridge between integrity sensing and the agent's execution authority.
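The gating described above can be sketched as two small steps. The multiplicative modulation, the `[0, 1]` ranges, and the mode labels are assumptions of this illustration; the text states only that a lower composite integrity score can push confidence below the execution threshold.

```python
def integrity_modulated_confidence(base_confidence: float,
                                   integrity_score: float) -> float:
    """Scale confidence by the composite integrity score (both assumed in [0, 1])."""
    return base_confidence * integrity_score


def cognitive_mode(confidence: float, execution_threshold: float) -> str:
    """Executing only while confidence meets the execution threshold."""
    return "executing" if confidence >= execution_threshold else "non-executing"


intact = integrity_modulated_confidence(0.9, 1.0)
degraded = integrity_modulated_confidence(0.9, 0.6)
print(cognitive_mode(intact, 0.7), cognitive_mode(degraded, 0.7))
```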

Disclosure Scope

The cognition filing (U.S. Application No. 19/647,395 and its international counterpart) discloses the deviation function D = (N(t) - T(t)) / (E(t) x S(t)) in Chapter 3. In that function, the numerator (N(t) - T(t)) expresses deviation pressure as the gap between the need vector and the ethical threshold, and the denominator (E(t) x S(t)) expresses deviation resistance as the multiplicative combination of empathy weighting and self-esteem. The same chapter discloses the structured need vector, the dynamic policy-derived ethical threshold, the affective and personality modulation, the continuous evaluation within the cognitive cycle, the three output states culminating in the Deviation-Activated State, and the self-limiting feedback through self-esteem and empathy. This article describes that disclosed mechanism. The scope extends to embodiments in which the deviation likelihood is differentiated by integrity domain, in which the self-esteem and empathy terms are computed as domain-specific composites, and in which the deviation function conditions confidence computation and forecasting, provided the deviation likelihood remains a deterministic function of need, threshold, empathy, and self-esteem evaluated against the agent's lineage and policy state.