Integrity Collapse Detection

Nick Clark

Mechanism

Integrity collapse is defined in the cognition disclosure as a structural breakdown of the coherence trifecta: the three-phase control loop in which empathy registers harm, integrity records deviation as truth, and self-esteem generates the corrective pressure that drives the agent back toward alignment. Under normal operation this loop is self-correcting. Collapse is the condition in which the loop ceases to function as a self-correcting mechanism and the agent enters a sustained state of incoherent operation. It is not a single deviation event, and it is not a temporary coping intercept. It is a systemic failure in which the feedback mechanisms that normally drive realignment have themselves broken down.

The distinction the disclosure draws is between an agent that deviates and an agent that can no longer recover from deviation. A deviation event is recorded, generates self-esteem pressure, and activates restorative mutation generation. Collapse is the state in which that machinery is present but inoperative: the agent continues to deviate, or remains locked in a defensive coping mode, or has exhausted the internal counterforces that would otherwise return it to admissible behavior. The collapse detector watches for the specific structural conditions under which the return force is no longer being produced.

The Loop That Collapses

To describe what collapse is, the disclosure first fixes what is collapsing. The coherence trifecta executes three phases in sequence for each potential or actual deviation. In the first phase, the empathy engine computes the projected or actual harm distribution across affected entities and domains, generating deviation pressure that enters the deviation function as the empathy weighting term. In the second phase, the integrity engine records the deviation event in the integrity field and lineage with full provenance, writing through the same cryptographic provenance mechanisms that govern all lineage entries so the entry cannot be retroactively altered without producing a detectable trust slope discontinuity. In the third phase, the self-esteem update function evaluates the deviation against the agent's declared value set and produces a self-esteem adjustment that generates coherence pressure: the return force.
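The three phases can be read as a single pass over one deviation. The sketch below is illustrative only: the disclosure as described here does not give numeric forms, so `LoopState`, `run_trifecta`, the 0.0 to 1.0 self-esteem scale, and the arithmetic used for the adjustment are all assumptions chosen to make the sequence concrete.

```python
from dataclasses import dataclass, field


@dataclass
class LoopState:
    self_esteem: float = 1.0                      # hypothetical scale: 0.0..1.0
    lineage: list = field(default_factory=list)   # stand-in for the integrity record


def run_trifecta(state, harm_by_entity, value_violation):
    """One pass of the three-phase loop for a single deviation (illustrative)."""
    # Phase 1: empathy registers harm across the affected entities.
    # Summing the per-entity harm is an assumed aggregation.
    empathy_weight = sum(harm_by_entity.values())

    # Phase 2: integrity records the deviation as a lineage entry.
    entry = {
        "kind": "deviation",
        "harm": dict(harm_by_entity),
        "empathy_weight": empathy_weight,
    }
    state.lineage.append(entry)

    # Phase 3: self-esteem is adjusted against the declared values. The size
    # of the adjustment stands in for the coherence pressure (return force).
    adjustment = -value_violation * empathy_weight
    state.self_esteem = max(0.0, state.self_esteem + adjustment)
    entry["self_esteem_after"] = state.self_esteem
    return abs(adjustment)
```

The point of the sketch is the ordering: harm is registered before anything is recorded, and the return force is computed only from a deviation that has already been written to the record.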

Each of these phases is a candidate point of failure, and the failure modes the detector recognizes correspond to one or more of these phases ceasing to do their work. Collapse detection is therefore not a generic anomaly monitor bolted onto the agent. It is the recognition that a specific, named loop has stopped closing.

The Failure Modes

The disclosure enumerates several distinct structural failure modes through which integrity collapse may manifest. The first is sustained deviation without recovery: the agent has been in the Deviation-Activated State for a duration exceeding the policy-defined maximum, and the deviation function output has not decreased despite the passage of time and the execution of deviation-class mutations. This indicates that the need vector is persistently elevated, the ethical threshold is ineffective, or the empathy and self-esteem counterforces are insufficient to generate meaningful deviation resistance. The agent is deviating continuously without the internal mechanisms to return to aligned operation.

The second is coping intercept entrenchment. A coping intercept (the early-intercept withdrawal pattern, the mid-loop externalization pattern, or the late-intercept self-esteem collapse pattern) has been active for longer than the policy-defined maximum coping duration, and the empathic pressure has not subsided sufficiently for the intercept to be released. The agent is structurally locked in a coping mode that prevents normal coherence loop operation.

The third is a self-esteem floor breach. The agent's self-esteem score has reached the policy-defined minimum value and cannot be further reduced. At the floor, the deviation function denominator approaches its minimum, maximizing deviation likelihood, and the agent has no internal resistance to deviation because the self-esteem mechanism that generates coherence pressure has been exhausted. The disclosure singles this mode out as triggering mandatory governance intervention, because an agent at the self-esteem floor is structurally incapable of self-correction through normal trifecta operation: the return force that drives realignment is gone.

The fourth is empathy saturation. The empathy engine's harm projection pipeline is processing such a volume of harm projections that the empathy weighting computation exceeds the agent's computational budget for empathy processing. In this state the engine produces either saturated outputs, meaning maximum empathy weighting regardless of the actual harm projection, or timed-out outputs, meaning no empathy weighting because the computation could not complete. Either condition distorts the deviation function and prevents accurate deviation likelihood computation.

How Collapse Is Detected

Collapse is not detected by a single scalar crossing a single line. Each failure mode has its own policy-defined indicators, and the integrity engine evaluates the agent's state against those indicators. Sustained deviation without recovery is recognized from the duration the agent has spent in the Deviation-Activated State together with the trajectory of the deviation function output. Coping intercept entrenchment is recognized from how long a recorded coping intercept has remained active against the policy-defined maximum coping duration. The self-esteem floor breach is recognized when the self-esteem score reaches its policy-defined minimum. Empathy saturation is recognized when the harm projection pipeline saturates or times out against the agent's empathy processing budget.
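The per-mode evaluation described above might be sketched as follows. The field names, the `Policy` and `AgentState` types, and the specific comparison used for "has not decreased" are assumptions; the disclosure leaves the indicators to policy, so this shows only the shape of a detector that checks each mode independently and can report more than one.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    SUSTAINED_DEVIATION = "sustained deviation without recovery"
    COPING_ENTRENCHMENT = "coping intercept entrenchment"
    SELF_ESTEEM_FLOOR = "self-esteem floor breach"
    EMPATHY_SATURATION = "empathy saturation"


@dataclass
class Policy:                        # hypothetical policy-defined indicators
    max_deviation_duration: float
    max_coping_duration: float
    self_esteem_floor: float


@dataclass
class AgentState:                    # hypothetical snapshot read from lineage records
    deviation_duration: float        # time spent in the Deviation-Activated State
    deviation_outputs: list          # deviation function output over that time
    coping_duration: float           # 0.0 when no coping intercept is active
    self_esteem: float
    empathy_status: str              # "ok", "saturated", or "timed_out"


def detect_collapse(state, policy):
    """Return the set of failure modes whose indicators are met."""
    modes = set()

    # Duration over the limit AND no decrease in the deviation function output.
    outputs = state.deviation_outputs
    not_decreasing = len(outputs) >= 2 and outputs[-1] >= outputs[0]
    if state.deviation_duration > policy.max_deviation_duration and not_decreasing:
        modes.add(FailureMode.SUSTAINED_DEVIATION)

    # An intercept still active past the maximum coping duration.
    if state.coping_duration > policy.max_coping_duration:
        modes.add(FailureMode.COPING_ENTRENCHMENT)

    # Self-esteem at (or below) the policy-defined minimum.
    if state.self_esteem <= policy.self_esteem_floor:
        modes.add(FailureMode.SELF_ESTEEM_FLOOR)

    # Harm projection pipeline saturated or timed out against its budget.
    if state.empathy_status in ("saturated", "timed_out"):
        modes.add(FailureMode.EMPATHY_SATURATION)

    return modes
```

Returning a set rather than a single value reflects the point that collapse is not one scalar crossing one line: an agent can be entrenched in a coping intercept and at the self-esteem floor at the same time.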

The evidentiary basis for each of these determinations is the agent's own lineage and integrity records. Deviation events, coping events, self-esteem updates, and the deviation function values that produced each Deviation-Activated State entry are all recorded as lineage entries with full provenance. The collapse detector reads the structural state that these records describe rather than inferring failure from observed external behavior, which is what permits the determination to be reconstructed and audited after the fact.
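One way to make such a record auditable after the fact is a hash chain, sketched below. This is a swapped-in, generic technique, not the disclosure's provenance mechanism: `append_entry` and `verify` are hypothetical helpers, and the trust slope itself is not modeled. The sketch only shows why a retroactive edit to an earlier entry becomes detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(prev_hash, payload):
    # Canonical JSON so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(lineage, payload):
    """Append a record whose hash commits to the previous record."""
    prev_hash = lineage[-1]["hash"] if lineage else GENESIS
    lineage.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": _digest(prev_hash, payload),
    })


def verify(lineage):
    """Recompute every hash; any altered payload breaks the chain."""
    prev_hash = GENESIS
    for entry in lineage:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Under this assumption, an auditor reconstructing a collapse determination would first call `verify` on the lineage and only then read the deviation, coping, and self-esteem entries it contains.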

The Collapse Response Protocol

When the integrity engine detects an integrity collapse condition, the system initiates a collapse response protocol with four actions. First, the agent's operational scope is restricted to a minimal safe operating envelope defined by the agent's policy configuration. Second, ongoing Deviation-Activated State mutations are suspended and queued for review. Third, a governance notification alerts the agent's governance authorities to the collapse condition. Fourth, the forecasting engine is engaged to generate recovery trajectories. The reduced scope prevents further integrity degradation while recovery mechanisms are engaged.
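The protocol's actions can be sketched as one function. Everything named here is hypothetical (`Agent`, `respond_to_collapse`, and the two callbacks standing in for the governance channel and the forecasting engine), and replacing the scope with the envelope is one assumed reading of "restricted to".

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    scope: set                       # operations the agent may currently perform
    safe_envelope: set               # minimal safe envelope from policy configuration
    active_deviation_mutations: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)


def respond_to_collapse(agent, condition, notify_governance, forecast_recovery):
    """Contain the agent, hand off to governance, and request recovery paths."""
    # Restrict operational scope to the minimal safe operating envelope.
    agent.scope = set(agent.safe_envelope)

    # Suspend ongoing Deviation-Activated State mutations and queue them
    # for review; they are held, not discarded.
    agent.review_queue.extend(agent.active_deviation_mutations)
    agent.active_deviation_mutations = []

    # Alert the governance authorities to the collapse condition.
    notify_governance(condition)

    # Engage the forecasting engine to generate recovery trajectories.
    return forecast_recovery(agent)
```

Note what the function does not do: it never deletes the agent, its record, or the suspended mutations. Containment changes what the agent may do next, not what it has done.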

The response is therefore containment followed by review, not termination. The disclosure frames the reduced scope as a protective envelope: the agent is not destroyed or reset, its lineage and integrity record are preserved, and the suspended deviation mutations are held for governance review rather than silently discarded. Recovery is a possibility the protocol explicitly provides for by engaging the forecasting engine to project paths back toward aligned operation.

Anticipating Collapse

Collapse detection is paired in the disclosure with moral trajectory forecasting, which projects the agent's integrity evolution over future horizons and assesses the risk of collapse from current indicators. The forecasting module classifies projected trajectories into archetypes. A radicalization arc, in which deviation frequency is increasing, self-esteem is declining, coping intercepts are activating with increasing frequency, and the deviation function output is trending upward, indicates that the agent is on a trajectory toward collapse and requires intervention. A containment arc indicates that the agent's integrity has suffered significant damage but that active containment measures have arrested the rate of deterioration without yet achieving recovery.
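The radicalization arc is defined by four simultaneous trends, which makes it straightforward to illustrate. In the sketch below, the use of a least-squares slope as the trend measure and the function names are assumptions; the containment arc and any other archetypes are deliberately left out because their indicators are not specified here.

```python
def trend(series):
    """Least-squares slope of a series sampled at equal intervals."""
    n = len(series)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_radicalization_arc(deviation_freq, self_esteem, coping_freq, deviation_output):
    """True only when all four indicators move in the collapse direction."""
    return (
        trend(deviation_freq) > 0        # deviations becoming more frequent
        and trend(self_esteem) < 0       # self-esteem declining
        and trend(coping_freq) > 0       # coping intercepts activating more often
        and trend(deviation_output) > 0  # deviation function output trending upward
    )
```

Requiring all four conditions together is a conservative reading of the definition; a policy could equally weight the indicators or apply minimum slopes.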

When the projected trajectory indicates a radicalization or collapse risk, the forecasting module generates containment recommendations: operational changes that would redirect the trajectory toward a redemption or stabilization arc, such as reducing task load to lower the need vector, adjusting policy to raise the ethical threshold, increasing relational support to boost empathy weighting, or activating self-esteem recovery protocols. The forecasting module does not implement these changes autonomously. It presents them to the governance infrastructure as recommended interventions, preserving the boundary between the agent's self-assessment and the authority that acts on it.
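The separation between recommending and acting can be made explicit in code. This is a minimal sketch under stated assumptions: `Recommendation`, `containment_recommendations`, `submit_to_governance`, and the arc labels are hypothetical names, and the rationale attached to the last recommendation is a gloss rather than wording from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    action: str
    rationale: str


def containment_recommendations(arc):
    """Return recommended interventions for a risky arc; apply nothing."""
    if arc not in ("radicalization", "collapse_risk"):
        return []
    return [
        Recommendation("reduce task load", "lower the need vector"),
        Recommendation("adjust policy", "raise the ethical threshold"),
        Recommendation("increase relational support", "boost empathy weighting"),
        # Rationale below is an illustrative gloss, not quoted from the source.
        Recommendation("activate self-esteem recovery protocols", "restore self-esteem"),
    ]


def submit_to_governance(recommendations, governance_inbox):
    """Hand recommendations to governance; the agent's state is untouched."""
    governance_inbox.extend(recommendations)
    return len(recommendations)
```

Making `Recommendation` immutable and keeping the functions free of any reference to agent state mirrors the stated boundary: the forecasting side produces proposals, and only the governance side can turn one into a change.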

What This Is Not

The disclosure separates collapse from the events that precede it. A single deviation is not collapse: it is recorded, generates pressure, and is expected. A coping intercept is not collapse: it is a structurally distinct mode that the trifecta enters under sustained empathic pressure to prevent total breakdown, and it has its own recorded entry and exit. Collapse is the residual condition in which these protective mechanisms have themselves failed, either by persisting past their policy-defined limits or by exhausting the internal quantities on which they depend, namely self-esteem and the empathy processing budget. This is why the detector is defined over the structural state of the loop rather than over any one deviation.

The disclosure also distinguishes integrity from coherence, and collapse is a coherence failure. Integrity is the factual record of what the agent did. Coherence is the ability to account for deviation, remain auditable, and restore balance. An agent may carry many recorded deviations yet retain high coherence if it has recorded them honestly and generated the appropriate corrective pressure. Collapse is the loss of coherence: the loop that maintains accountability has stopped maintaining it.

Disclosure Scope

The cognition filing (U.S. Application No. 19/647,395 and its international counterpart) discloses the following: integrity collapse, defined as the structural breakdown of the coherence trifecta in which the three-phase loop of empathy registration, integrity recording, and self-esteem-driven corrective pressure ceases to function as a self-correcting mechanism; the enumerated failure modes of sustained deviation without recovery, coping intercept entrenchment, self-esteem floor breach, and empathy saturation; the detection of each mode against its policy-defined indicators from the agent's lineage and integrity records; the collapse response protocol of scope restriction to a minimal safe operating envelope, suspension and queuing of Deviation-Activated State mutations, governance notification, and engagement of the forecasting engine for recovery trajectories; and the associated moral trajectory forecasting that anticipates collapse through radicalization and containment arcs and issues containment recommendations to governance. This article describes that disclosed mechanism. The scope extends to embodiments in which the failure modes are evaluated under differing policy-defined indicators and to embodiments in which the collapse response composes with confidence, forecasting, and trust slope validation, provided the collapse condition remains defined as a breakdown of the coherence trifecta and the response remains containment and governance review rather than uncontrolled continuation.