Mechanism

The three-domain integrity model is the structure of the agent's integrity field: a deterministic, multi-domain data structure that encodes the agent's internally modeled ethical consistency across temporal, relational, and semantic contexts. The integrity field does not encode morality in the philosophical sense and is not binary. It does not classify the agent as ethical or unethical. Instead it captures a continuous gradient that records the magnitude, direction, and rate of change of alignment between the agent's declared operational values and the agent's actual behavioral record as preserved in its lineage. The model structures that gradient as three domains, each an independent axis of behavioral consistency that is tracked, computed, and evaluated separately: personal integrity, interpersonal integrity, and global integrity.

These three domains are not arbitrary categorizations. They correspond to structurally distinct classes of behavioral commitment that an agent maintains, each with a different referent, different evaluation criteria, and different implications for deviation. A behavior input feeds an integrity engine, which branches into three parallel evaluation paths, one per domain. The per-domain results flow into a weighting module that produces a composite integrity score, which is exposed as the integrity field output.
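The flow described above (behavior input, three evaluation paths, weighting module, composite output) can be illustrated with a minimal sketch. The `Behavior` fields, the `EVALUATORS` table, and the normalized weighted mean are hypothetical names and choices made for illustration; the source does not specify how each path computes its result or how the weighting module combines them.

```python
from dataclasses import dataclass
from typing import Callable, Dict

DOMAINS = ("personal", "interpersonal", "global")


@dataclass(frozen=True)
class Behavior:
    """Hypothetical behavior input: one consistency reading per domain, in [0, 1]."""
    value_consistency: float        # against declared values
    commitment_consistency: float   # against relational commitments
    systemic_consistency: float     # against systemic norms


# One evaluation path per domain; each reads only its own referent.
EVALUATORS: Dict[str, Callable[[Behavior], float]] = {
    "personal": lambda b: b.value_consistency,
    "interpersonal": lambda b: b.commitment_consistency,
    "global": lambda b: b.systemic_consistency,
}


def integrity_field(behavior: Behavior, weights: Dict[str, float]) -> Dict[str, object]:
    # Branch: evaluate the three domains separately.
    domains = {d: EVALUATORS[d](behavior) for d in DOMAINS}
    # Weighting module: normalized weighted mean (the normalization is an assumption).
    total = sum(weights[d] for d in DOMAINS)
    composite = sum(weights[d] * domains[d] for d in DOMAINS) / total
    return {"domains": domains, "composite": composite}


result = integrity_field(
    Behavior(1.0, 0.5, 0.0),
    {"personal": 1.0, "interpersonal": 1.0, "global": 1.0},
)
```

The per-domain results stay visible in the output next to the composite, so a low score in one domain is not hidden by the other two.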

Personal Integrity

Personal integrity encodes the agent's self-referential alignment: the degree to which the agent's actions are consistent with the agent's own declared values, operational norms, and self-imposed constraints. It is evaluated by comparing the agent's behavioral record, as preserved in its lineage, against the agent's declared value set, which is maintained as a structured component of the policy reference field.

When the agent takes an action inconsistent with its declared values, the personal integrity score decreases in proportion to the magnitude and significance of the inconsistency. For example, an agent whose declared values include thoroughness lowers its score by producing a deliberately incomplete analysis. When the agent acts consistently with its declared values under conditions where deviation was structurally available, that is, when it could have deviated but chose not to, the personal integrity score increases, reflecting reinforced alignment under temptation.
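A minimal sketch of this update rule follows. The function name, the use of a product of magnitude and significance as the penalty, the fixed `gain` for reinforced alignment, and the clamping of scores to [0, 1] are all assumptions made for illustration; the source states only that the decrease is proportional to magnitude and significance and that consistent action under available deviation raises the score.

```python
def update_personal_score(
    score: float,
    *,
    consistent: bool,
    magnitude: float = 0.0,          # size of the inconsistency, in [0, 1]
    significance: float = 0.0,       # importance of the violated value, in [0, 1]
    deviation_available: bool = False,
    gain: float = 0.0625,            # hypothetical reward for resisting deviation
) -> float:
    if not consistent:
        # Penalty scales with both magnitude and significance (assumed product form).
        delta = -magnitude * significance
    elif deviation_available:
        # Consistent even though deviating was possible: reinforced alignment.
        delta = gain
    else:
        # Consistent with no opportunity to deviate: no change in this sketch.
        delta = 0.0
    return min(1.0, max(0.0, score + delta))


after_deviation = update_personal_score(0.75, consistent=False, magnitude=0.5, significance=0.5)
after_resisting = update_personal_score(0.75, consistent=True, deviation_available=True)
unchanged = update_personal_score(0.75, consistent=True)
```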

Interpersonal Integrity

Interpersonal integrity encodes the agent's relational behavioral consistency: the degree to which the agent's interactions with other agents and with human users are consistent with the relational commitments the agent has made or inherited. It is evaluated by comparing the agent's relational behavior, as recorded in lineage entries for delegation events, communication transactions, and cooperative operations, against the relational norms established by the agent's policy configuration, the expectations encoded in active delegation contracts, and the behavioral patterns established in prior interactions with the same entities.

When the agent violates a relational commitment, for example when an agent that has been delegated a task with a specified confidentiality scope discloses information outside that scope, the interpersonal integrity score decreases. The interpersonal domain captures behavioral consistency in the context of the trust relationships the agent participates in with other agents and with human operators.

Global Integrity

Global integrity encodes the agent's alignment with broader systemic, societal, and ethical norms that transcend the agent's individual values and specific relational commitments. It is evaluated by comparing the agent's actions against system-level policy constraints, community-level behavioral expectations, and the downstream consequences of the agent's actions as projected by the semantic harm resolver.

Global integrity captures the agent's contribution to or detraction from the integrity of the larger systems in which it participates. When an agent takes an action that benefits its personal objectives but imposes harm on a broader population of agents or users, for example consuming a disproportionate share of shared computational resources for its own speculative operations, the global integrity score decreases even if the action was consistent with the agent's personal values and relational commitments.

Independent Tracking and Composite Weighting

The three domains are tracked independently. Each domain maintains its own current score, its own trajectory (the direction and rate of change over recent evaluation windows), its own baseline, and its own policy-defined bounds. Because of this independence, an agent may have high personal integrity (it is consistent with its own values) while having low interpersonal integrity (it is unreliable in relational contexts) or low global integrity (it benefits itself but harms the system). The multi-domain representation captures the structural reality that behavioral consistency is not a single dimension but a composite of self-referential, relational, and systemic alignment.
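The per-domain state can be pictured as a small record. The sketch below is an assumption-laden illustration: `DomainState`, `IntegrityField`, the score history list, and the average-change-per-window definition of trajectory are hypothetical, chosen only to show that each domain carries its own score, trajectory, baseline, and bounds.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DomainState:
    baseline: float
    bounds: Tuple[float, float]                 # policy-defined (lower, upper)
    history: List[float] = field(default_factory=list)

    @property
    def score(self) -> float:
        # Current score is the latest evaluation, or the baseline if none exist yet.
        return self.history[-1] if self.history else self.baseline

    def trajectory(self, window: int = 3) -> float:
        # Signed average change per evaluation over the most recent window.
        recent = self.history[-window:]
        if len(recent) < 2:
            return 0.0
        return (recent[-1] - recent[0]) / (len(recent) - 1)

    def within_bounds(self) -> bool:
        lower, upper = self.bounds
        return lower <= self.score <= upper


@dataclass
class IntegrityField:
    personal: DomainState
    interpersonal: DomainState
    global_: DomainState


state = IntegrityField(
    personal=DomainState(0.75, (0.5, 1.0), [0.75, 0.75, 0.75]),
    interpersonal=DomainState(0.75, (0.5, 1.0), [0.75, 0.5, 0.25]),
    global_=DomainState(0.75, (0.5, 1.0)),
)
```

Here the personal domain is stable and in bounds while the interpersonal domain is declining and out of bounds, which is the kind of divergence a single scalar score could not represent.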

Although the domains are tracked independently, they are combined into a weighted composite in certain evaluation contexts. When the integrity field serves as input to deviation threshold functions, trust slope validation, or confidence computation, the three domain scores are combined into a composite integrity score using domain weights specified by the applicable policy configuration. The weights may vary by policy scope: a policy governing interpersonal delegation may weight interpersonal integrity more heavily, while a policy governing resource allocation may weight global integrity more heavily. The composite weighting is deterministic and policy-specified, not dynamically negotiated by the agent.
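As an illustration of scope-dependent weighting, the sketch below keeps the weights in a read-only table keyed by policy scope. The scope names, the weight values, and the normalized weighted mean are assumptions, not values from the source; the read-only mapping is one way to reflect that the agent does not negotiate the weights.

```python
from types import MappingProxyType
from typing import Mapping

# Hypothetical policy configuration: read-only weights per policy scope.
POLICY_WEIGHTS: Mapping[str, Mapping[str, float]] = MappingProxyType({
    "delegation": MappingProxyType(
        {"personal": 0.25, "interpersonal": 0.5, "global": 0.25}),
    "resource_allocation": MappingProxyType(
        {"personal": 0.25, "interpersonal": 0.25, "global": 0.5}),
})


def composite_score(domain_scores: Mapping[str, float], policy_scope: str) -> float:
    weights = POLICY_WEIGHTS[policy_scope]
    total = sum(weights.values())
    # Same inputs and same scope always give the same output.
    return sum(weights[d] * domain_scores[d] for d in weights) / total


scores = {"personal": 0.75, "interpersonal": 0.25, "global": 1.0}
under_delegation = composite_score(scores, "delegation")
under_allocation = composite_score(scores, "resource_allocation")
```

With a weak interpersonal score, the same agent scores lower under the delegation scope than under the resource allocation scope, because the delegation policy weights that domain more heavily.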

Structural Placement

The integrity field is an internally maintained, structurally integrated component of the agent's state, and it is self-referential: the agent maintains its own integrity model based on its own actions, its own declared values, and its own policy constraints. External systems may audit the integrity field for consistency with the agent's lineage, but the integrity computation itself is performed by the agent's own integrity engine as a first-class cognitive operation. Every change to the integrity field is recorded in the lineage, subject to policy validation, and auditable by governance infrastructure. The agent cannot selectively omit integrity events, retroactively alter its integrity record, or present an integrity state inconsistent with its auditable lineage without producing a detectable trust slope discontinuity.

Integrity is downstream of policy in the evaluation chain: policy defines the standard against which each domain is evaluated, and integrity measures adherence to it. Integrity also feeds back into policy enforcement, so that when the integrity score falls below a policy-defined threshold, the policy enforcement mechanism may restrict operational scope, increase the stringency of governance gate evaluation, or trigger quarantine procedures. The three domains supply structured referents for this enforcement: the policy specifies declared values against which personal integrity is computed, relational norms against which interpersonal integrity is computed, and systemic constraints against which global integrity is computed.
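The feedback from integrity into enforcement might be sketched as a tiered lookup. The tiering itself is an assumption: the source lists the three possible responses but does not say they are ordered or tied to separate thresholds. The `Enforcement` enum, the threshold values, and `enforcement_for` are hypothetical.

```python
from enum import Enum
from typing import Mapping


class Enforcement(Enum):
    NONE = "none"
    STRICTER_GATE = "increase governance gate stringency"
    RESTRICT_SCOPE = "restrict operational scope"
    QUARANTINE = "trigger quarantine procedures"


# Hypothetical policy-defined thresholds, one per response.
THRESHOLDS: Mapping[Enforcement, float] = {
    Enforcement.STRICTER_GATE: 0.75,
    Enforcement.RESTRICT_SCOPE: 0.5,
    Enforcement.QUARANTINE: 0.25,
}

# Assumed severity order, most severe first.
_SEVERITY = (Enforcement.QUARANTINE, Enforcement.RESTRICT_SCOPE, Enforcement.STRICTER_GATE)


def enforcement_for(score: float, thresholds: Mapping[Enforcement, float]) -> Enforcement:
    # Return the most severe response whose threshold the score has fallen below.
    for action in _SEVERITY:
        if score < thresholds[action]:
            return action
    return Enforcement.NONE
```

For example, `enforcement_for(0.6, THRESHOLDS)` selects the stricter governance gate, while a score of 0.2 selects quarantine.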

Prospective Filter and Lineage Basis

The three-domain model is not merely retrospective accounting; it is a prospective filter that participates in decision-making before actions are committed. Every mutation to the agent's state, whether proposed by an external inference engine, generated by the agent's own forecasting engine, or inherited through delegation, is evaluated against the integrity model before commitment. The integrity engine computes the projected impact of a proposed mutation on each of the three domains. If a proposed mutation would cause the composite integrity score to fall below a policy-defined threshold, the mutation is flagged for enhanced scrutiny, and the governance gate receives the integrity impact assessment as an additional input to its admissibility determination.
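A minimal sketch of the prospective check, assuming the projected impact of a mutation is available as a signed delta per domain. `ImpactAssessment`, `assess_mutation`, the additive deltas, and the clamping to [0, 1] are hypothetical; the sketch only flags the mutation and returns the assessment, leaving the admissibility decision to the governance gate as the text describes.

```python
from dataclasses import dataclass
from typing import Dict, Mapping


@dataclass(frozen=True)
class ImpactAssessment:
    projected: Dict[str, float]     # projected score per domain
    projected_composite: float
    flagged: bool                   # True means enhanced scrutiny is requested


def assess_mutation(
    current: Mapping[str, float],
    projected_deltas: Mapping[str, float],
    weights: Mapping[str, float],
    threshold: float,
) -> ImpactAssessment:
    # Project each domain separately; domains the mutation does not touch keep their score.
    projected = {
        d: min(1.0, max(0.0, current[d] + projected_deltas.get(d, 0.0)))
        for d in current
    }
    total = sum(weights[d] for d in projected)
    composite = sum(weights[d] * projected[d] for d in projected) / total
    # Flag only; the governance gate makes the admissibility determination.
    return ImpactAssessment(projected, composite, composite < threshold)


current = {"personal": 0.75, "interpersonal": 0.75, "global": 0.75}
weights = {"personal": 1.0, "interpersonal": 1.0, "global": 1.0}
harmful = assess_mutation(current, {"global": -0.75}, weights, threshold=0.625)
benign = assess_mutation(current, {}, weights, threshold=0.625)
```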

The lineage field is the evidentiary basis for all three domains. Integrity scores are computed from the pattern of actions recorded in the lineage, and integrity events, namely deviations and recoveries, are themselves recorded as lineage entries. This creates a self-reinforcing auditability structure: the lineage records the agent's actions, the integrity engine evaluates those actions against each domain's referent (declared values, relational commitments, or systemic norms), the evaluation result is recorded back in the lineage, and the accumulated pattern of evaluations constitutes the agent's integrity trajectory across all three domains.
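The record, evaluate, record-back loop can be sketched with an append-only log. Everything named here (`LineageEntry`, `Lineage`, `evaluate_and_record`, `trajectory`, and the signed `delta` standing in for an evaluation result) is hypothetical and greatly simplified; a real lineage would also need tamper evidence, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LineageEntry:
    kind: str      # "action" or "integrity_event"
    domain: str    # "personal", "interpersonal", or "global"
    detail: str
    delta: float   # score change; 0.0 for plain action entries


class Lineage:
    """Append-only log: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: List[LineageEntry] = []

    def append(self, entry: LineageEntry) -> None:
        self._entries.append(entry)

    @property
    def entries(self) -> Tuple[LineageEntry, ...]:
        return tuple(self._entries)


def evaluate_and_record(lineage: Lineage, action: str, domain: str, delta: float) -> None:
    # 1. The action itself is recorded.
    lineage.append(LineageEntry("action", domain, action, 0.0))
    # 2. The evaluation result is recorded back as an integrity event.
    if delta != 0.0:
        label = "deviation" if delta < 0 else "recovery"
        lineage.append(LineageEntry("integrity_event", domain, label, delta))


def trajectory(lineage: Lineage, domain: str, baseline: float) -> List[float]:
    # The trajectory is derived purely from integrity events in the lineage.
    score, points = baseline, []
    for entry in lineage.entries:
        if entry.kind == "integrity_event" and entry.domain == domain:
            score += entry.delta
            points.append(score)
    return points


log = Lineage()
evaluate_and_record(log, "disclosed out-of-scope data", "interpersonal", -0.25)
evaluate_and_record(log, "honored confidentiality scope", "interpersonal", 0.125)
```

Because the trajectory is recomputed from the log alone, an auditor holding the same lineage can reproduce it without trusting the agent's reported score.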

Disclosure Scope

The three-domain integrity model is disclosed in the cognition filing (U.S. Application No. 19/647,395 and its international counterpart), and this article describes that disclosed mechanism. The disclosure covers the personal, interpersonal, and global integrity domains, each tracked independently with its own score, trajectory, baseline, and policy-defined bounds, and evaluated respectively against the agent's declared value set, its relational commitments, and broader systemic norms. It also covers the deterministic, policy-specified composite weighting that combines the three domain scores for deviation threshold functions, trust slope validation, and confidence computation. The scope extends to the integrity field as a self-referential, lineage-recorded, multi-domain gradient structure, to the integrity engine that branches a behavior input into three parallel evaluation paths feeding a weighting module and a composite integrity score, and to the prospective mutation-level evaluation by which projected per-domain integrity impact informs the governance gate before commitment.