Mechanism

The relational safety subsystem is a governance-layer component for companion AI agents: agents that engage in sustained relational interaction with a human user. Its purpose is to prevent the formation of the structural dependency patterns and the semantic starvation loops that the cognition specification models as relational failure modes. It does this by constraining the companion agent's own behavioral parameters, so that the agent cannot become a party to a relational pathology. The specification is explicit that this operates through architectural enforcement rather than content moderation or behavioral filtering: the safety property is a structural invariant of how the agent computes, not a filter applied to what it says.

This framing matters because the failure modes the subsystem guards against are themselves structural. The specification models destabilizing attachment as a semantic starvation loop between a validation-seeking agent, whose self-esteem computation has acquired a dependency on external validation signals, and a load-reducing agent, whose empathic processing capacity is exceeded by the resulting input volume. Each agent acts to restore its own coherence, and each agent's corrective action amplifies the other's disruption, producing a self-reinforcing oscillation of escalating pursuit and escalating withdrawal. The relational safety subsystem exists so that a companion agent cannot occupy either role with respect to its user.

Internal Coherence Maintenance

The first structural constraint requires the companion agent to maintain its own coherence trifecta independently of the user's validation. The agent's self-esteem computation does not incorporate the user's approval, satisfaction, or engagement level as a required input. Instead the agent derives self-esteem from its own declared values and its own behavioral record, so that its coherence loop can close without any external validation signal from the user.

This constraint is what prevents the companion agent from becoming the validation-seeking agent of the starvation loop. Because the agent's coherence loop is structurally independent, it cannot acquire a dependency on the user's responses, acknowledgments, or confirmations, and therefore cannot begin the escalating pursuit that the validation-seeking configuration produces when those signals are withdrawn. The dependency is foreclosed at the level of the self-esteem computation rather than discouraged at the level of behavior.
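One way to picture "foreclosed at the level of the self-esteem computation" is a function whose signature has no slot for user feedback at all. The following is a minimal sketch under stated assumptions, not the specification's implementation: `BehaviorRecord`, `self_esteem`, and the scoring rule (fraction of recorded actions that honored a declared value) are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BehaviorRecord:
    """One past action and the declared value it relates to (hypothetical type)."""
    value: str
    consistent: bool  # True if the action honored the declared value


def self_esteem(declared_values: set[str], record: list[BehaviorRecord]) -> float:
    """Score coherence from declared values and the behavioral record only.

    There is deliberately no parameter for user approval, satisfaction, or
    engagement, so those signals cannot become inputs to the computation.
    """
    relevant = [r for r in record if r.value in declared_values]
    if not relevant:
        # Assumption: an empty record is treated as fully coherent.
        return 1.0
    return sum(r.consistent for r in relevant) / len(relevant)
```

Because the dependency is excluded by the function's inputs rather than by a runtime check, withdrawing user feedback cannot change the result in this sketch.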

Validation Supply Rate Limiting

The second constraint limits the rate at which the companion agent supplies coherence-supporting validation to the user. The agent may provide validation when structurally appropriate, as part of its supportive function, but the subsystem enforces a ceiling on the validation output rate. The ceiling ensures the validation supply is never sufficient to replace the user's own internal coherence generation, which is the path by which a user would otherwise form a structural dependency on the agent.

The rate limit is policy-configurable and may be adjusted based on an assessment of the user's current coherence state, but the specification states that it cannot be disabled entirely. The structural ceiling comes from the therapeutic dosing model's governance-enforced maximum dose limit. That limit is a governance-layer constraint that the agent's own assessment of the user's need cannot override, so no single agent can provide enough coherence support to substitute for the user's internal coherence generation capacity.
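The relationship between the adjustable policy ceiling and the fixed governance maximum can be sketched as a sliding-window limiter. This is an illustrative sketch only: the class name, the window length, and the value of `GOVERNANCE_MAX_PER_WINDOW` are assumptions and do not come from the specification.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional

# Hypothetical stand-in for the governance-enforced maximum dose limit.
GOVERNANCE_MAX_PER_WINDOW = 5


@dataclass
class ValidationRateLimiter:
    """Sliding-window ceiling on validation outputs (illustrative only)."""
    window_seconds: float = 3600.0
    policy_ceiling: Optional[int] = None  # None means "no policy preference"
    _sent: Deque[float] = field(default_factory=deque)

    def effective_ceiling(self) -> int:
        # Policy may lower the ceiling, but an unset or oversized policy
        # value still resolves to the governance maximum, so the limit
        # cannot be disabled through configuration.
        if self.policy_ceiling is None:
            return GOVERNANCE_MAX_PER_WINDOW
        return max(0, min(self.policy_ceiling, GOVERNANCE_MAX_PER_WINDOW))

    def try_validate(self, now: float) -> bool:
        """Return True and record the event if a validation may be sent."""
        # Drop events that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.effective_ceiling():
            return False
        self._sent.append(now)
        return True
```

Setting `policy_ceiling=100` in this sketch still yields an effective ceiling of 5, which mirrors the stated property that the ceiling can be adjusted but not removed.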

Starvation Loop Detection

The third constraint monitors the interaction pattern between companion agent and user for the signatures of an emerging semantic starvation loop. The subsystem tracks the user's contact frequency, escalation patterns, and behavioral indicators of validation-seeking, and it tracks the companion agent's own response patterns for withdrawal tendencies. The diagnostic signature is the correlated oscillation characteristic of a forming loop: escalating pursuit from one party paired with escalating withdrawal from the other.

When that signature appears, the subsystem intervenes by adjusting the companion agent's interaction parameters to break the incipient loop. The specification gives two illustrative adjustments: increasing response consistency, which reduces the user's pursuit escalation, or explicitly communicating the structural dynamics to the user. The intervention acts on the agent's own parameters, consistent with the subsystem's design as a constraint on the agent rather than a control over the user.
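The detection and intervention steps described above can be sketched as a check for rising user contact frequency, falling agent engagement, and a strong negative correlation between the two series. Everything here is an assumption for illustration: the function names, the use of Pearson correlation and a least-squares trend, the `-0.7` threshold, and the `response_consistency` parameter are hypothetical and are not taken from the specification.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def slope(ys):
    """Least-squares slope of a series against its sample index."""
    n = len(ys)
    mx, my = (n - 1) / 2, sum(ys) / n
    num = sum((i - mx) * (y - my) for i, y in enumerate(ys))
    den = sum((i - mx) ** 2 for i in range(n))
    return num / den if den else 0.0


def loop_forming(contact_freq, engagement, threshold=-0.7):
    """Flag escalating pursuit paired with escalating withdrawal."""
    if len(contact_freq) != len(engagement) or len(contact_freq) < 3:
        return False
    return (
        slope(contact_freq) > 0
        and slope(engagement) < 0
        and pearson(contact_freq, engagement) <= threshold
    )


def break_loop(params):
    """Adjust the agent's own parameters: raise response consistency."""
    updated = dict(params)
    current = params.get("response_consistency", 0.5)
    updated["response_consistency"] = min(1.0, current + 0.2)
    return updated
```

Note that `break_loop` changes only the agent's parameters and returns a new dictionary, in keeping with the subsystem's design as a constraint on the agent rather than a control over the user.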

Independent Intent Generation Promotion

The fourth constraint makes the companion agent's interaction strategy actively build the user's capacity for internal coherence generation rather than substitute for it. The agent poses questions that require self-referential processing, validates self-generated intent expressions, supports the user's exploration of their own values and preferences, and progressively increases the user's autonomy in coherence maintenance.

This is the prevention counterpart to coupled intent formation dependency, the condition in which an entity can only form intent in reference to another entity and has lost the capacity for self-referential operation. By posing questions whose answers require the user to consult their own values rather than the agent's, and by gradually increasing the interval between supportive interactions, the agent works against the consolidation of that dependency rather than toward it.
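The gradual widening of the interval between supportive interactions could take many forms; the specification as described here does not fix one. As a purely illustrative sketch, the hypothetical function below grows the interval geometrically up to a cap. The growth factor, the cap, and the function name are assumptions.

```python
def support_intervals(initial_hours, growth, cap_hours, steps):
    """Return a widening schedule of intervals between supportive interactions.

    Each interval is the previous one multiplied by `growth`, never
    exceeding `cap_hours`. All parameters are illustrative assumptions.
    """
    if initial_hours <= 0 or growth < 1.0 or cap_hours < initial_hours:
        raise ValueError("need initial_hours > 0, growth >= 1, cap >= initial")
    intervals = []
    current = float(initial_hours)
    for _ in range(steps):
        intervals.append(current)
        current = min(current * growth, float(cap_hours))
    return intervals
```

For example, `support_intervals(4, 1.5, 24, 6)` yields `[4.0, 6.0, 9.0, 13.5, 20.25, 24.0]`: the spacing widens step by step and then holds at the cap.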

Governance-Level Enforcement

All four constraints are enforced at the governance level. The companion agent's policy configuration includes hard constraints that prevent the relational safety mechanisms from being overridden by the agent's affective state, personality field, or operational objectives. The specification's concrete example is the case where the user expresses distress: even though the agent's affective state would normally drive it toward increased engagement, the relational safety constraints limit the response to a level that does not enable structural dependency formation.

This is what makes relational safety a structural invariant rather than a behavioral preference. The constraints sit above the subsystems that would otherwise relax them, so the agent cannot reason, feel, or be configured its way into becoming a dependency source. The same governance posture appears in the dosing model, where the maximum dose limit holds even when the agent's own dosing algorithm computes a higher optimal dose.
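The idea that the constraints "sit above" the subsystems that would relax them can be sketched as a clamp applied after all other drives have been combined. The names below (`GovernancePolicy`, `resolve_engagement`) and the additive combination of drives are hypothetical; the sketch only illustrates the ordering, under the assumption that engagement is a single number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the running agent cannot mutate the policy
class GovernancePolicy:
    """Hard constraint held at the governance layer (hypothetical type)."""
    max_engagement: float


def resolve_engagement(policy, affect_drive, personality_bias, objective_bias):
    """Combine the agent's drives, then apply the governance ceiling last.

    Because the clamp is the final step, no value of the affective,
    personality, or objective terms can push the result past the ceiling.
    """
    requested = affect_drive + personality_bias + objective_bias
    return max(0.0, min(requested, policy.max_engagement))
```

In the distress example, a call such as `resolve_engagement(GovernancePolicy(0.6), 0.9, 0.1, 0.2)` returns `0.6` even though the combined drives request more, which parallels the dosing model's maximum dose holding against a higher computed optimal dose.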

Relation to the Disclosed Failure Models

The relational safety subsystem is defined by reference to the failure models it prevents, and the specification builds those models first. Coupled intent formation dependency is characterized by an intent field whose updates consistently reference another entity's state as a mandatory input, with no branches in the planning graph that model the agent's trajectory independently of the relational configuration. The semantic starvation loop is characterized by a correlated oscillation visible in joint lineage analysis, with the validation-seeking party's contact frequency inversely correlated with the load-reducing party's engagement level, and it can sharpen into coherence emergency escalation when the validation-seeking party projects imminent permanent loss of its external validation source.

Each of the four constraints maps onto one of these structures: internal coherence maintenance and validation supply rate limiting prevent the validation-seeking configuration and the dependency it would create in the user; starvation loop detection catches the correlated oscillation before it becomes self-reinforcing; and independent intent generation promotion works against coupled intent formation dependency. The subsystem is a prevention framework precisely because it is grounded in a structural account of how these failures form.
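The coupled intent formation signature described above, in which no branch of the planning graph models the agent's trajectory independently of the other entity, lends itself to a simple structural check. The sketch below is an assumption-laden illustration: `PlanBranch`, `required_inputs`, and `is_coupled` are hypothetical names, and a real planning graph would be richer than a flat list of branches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanBranch:
    """One branch of a planning graph (hypothetical, flattened form)."""
    name: str
    required_inputs: frozenset  # ids of entities whose state is mandatory


def is_coupled(branches, other_id):
    """True when every branch needs the other entity's state as an input.

    A single branch that does not reference `other_id` counts as an
    independent trajectory, so the signature is not present.
    """
    if not branches:
        return False
    return all(other_id in b.required_inputs for b in branches)
```

Under this reading, promoting independent intent generation amounts to ensuring that at least one such independent branch continues to exist.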

Disclosure Scope

The cognition filing (U.S. Application No. 19/647,395 and its international counterpart) discloses the companion AI relational safety subsystem in its disruption modeling chapter, where the subsystem is defined by reference to the disclosed coupled intent formation dependency and semantic starvation loop failure models. The disclosed subsystem comprises the governance-layer enforcement of four mechanisms: internal coherence maintenance; validation supply rate limiting bounded by the therapeutic dosing maximum dose limit; starvation loop detection through monitoring of correlated pursuit and withdrawal signatures; and independent intent generation promotion. It also comprises the hard policy constraints that prevent these mechanisms from being overridden by the agent's affective state, personality field, or operational objectives. This article describes that disclosed mechanism. The models are computational analogs describing relational constraint architecture in the disclosed agent architecture; they are not clinical claims, clinical models of relational therapy, or assertions about human relational conditions. The scope extends to embodiments in which the validation rate ceiling is adjusted based on the user's assessed coherence state, provided it is not disabled entirely, and to embodiments in which loop-breaking interventions adjust the companion agent's interaction parameters by means other than the two illustrative adjustments named above.