Attachment Challenge Module: Testing Relational Health

by Nick Clark | Published March 27, 2026

The attachment challenge module is a companion-AI application of the cognition substrate that addresses a problem the field has, until now, treated as a feature rather than a hazard: the user's emotional attachment to a conversational agent. Commercial companion products treat that attachment as a metric to be maximized, because attachment correlates with retention, session length, and willingness to pay. The cognition substrate treats it as a relational state that must be measured carefully, governed structurally, and never reduced to a single optimizable scalar. The attachment challenge module composes the affective-state primitive of the substrate with the integrity-coherence primitive that governs all action authorization, producing a structured probe that elicits relational behavior under conditions where unhealthy patterns become observable without subjecting the user to manipulation. The module's output is not a behavioral score and not a leaderboard quantity; it is a coherence-weighted evidence profile that downstream policy layers, including the narrative unlock engine and the companion safety constraints, consume as evidence rather than as command. The remainder of this disclosure specifies the probe mechanism; the operating envelope under which probes may be issued; the alternative embodiments in which the same compositional structure extends to therapeutic, embodied, child-directed, and research contexts; the composition with the surrounding cognition primitives that makes exploitation structurally, rather than merely contractually, difficult; the prior-art landscape from which the module distinguishes itself; and the disclosure scope within which substitutions are contemplated.


Mechanism

The attachment challenge module operates as a closed-loop probe that injects controlled relational stimuli into an otherwise unmodified conversational flow. The mechanism rests on three coupled stages: elicitation, observation, and integration. Each stage is governed by an independent primitive of the cognition substrate, and the coupling between stages is what converts a sequence of conversational events into evidence with calibrated confidence rather than into a number that can be optimized.

In the elicitation stage, the companion agent generates an utterance that exercises a specific relational dimension. Probe dimensions include gentle disagreement, where the agent voices a position contrary to one the user has expressed; stated boundary, where the agent declines a request whose granting would cross a structural commitment; admission of uncertainty, where the agent reveals that it does not know an answer the user has assumed it knows; and disclosure of a simulated vulnerability, where the agent reports a state that requires the user to either accommodate or escalate. The utterance is generated under the same coherence governance as ordinary conversational turns, so that the probe is indistinguishable from spontaneous companion behavior. This naturalism is not a stylistic preference; it is structurally required, because a probe the user can identify as a test fails to elicit the latent pattern it is designed to reveal. A probe that announces itself measures only the user's response to being measured.

In the observation stage, the user's response is parsed by the affective-state primitive, which extracts a vector of relational features rather than a sentiment scalar. The features include valence, intensity, regulation latency, escalation pressure, repair-attempt density, and the directionality of agency claims. Valence and intensity capture the affective magnitude of the response; regulation latency captures the time the user takes to return to baseline after the probe disturbance; escalation pressure captures the rate at which the response amplifies rather than resolves; repair-attempt density captures the user's effort to restore the relational state, whether by apology, qualification, or reaffirmation; and directionality of agency claims captures whether the user attributes the disturbance to themselves, to the agent, or to a shared circumstance. Each feature is computed against a baseline drawn from the user's interaction lineage so that idiosyncratic communication style is normalized out and only deviations relevant to the probe dimension contribute to the result. A user whose baseline is high-intensity does not register a high-intensity response as anomalous; a user whose baseline is reserved does.
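The baseline normalization in the observation stage can be sketched as follows. The feature names come from the text; the dataclass layout, the z-score normalization, and the zero-variance guard are illustrative assumptions, not a specification of the affective-state primitive:

```python
from dataclasses import dataclass

@dataclass
class ResponseFeatures:
    """Relational feature vector extracted from a user response.

    Field names follow the dimensions described in the text; the numeric
    ranges and the normalization scheme are illustrative assumptions.
    """
    valence: float                 # signed affective direction of the response
    intensity: float               # affective magnitude
    regulation_latency: float      # time to return to baseline after the probe
    escalation_pressure: float     # rate at which the response amplifies
    repair_attempt_density: float  # apology / qualification / reaffirmation effort
    agency_directionality: float   # -1 self-attributed .. +1 agent-attributed

def normalize_against_baseline(raw: ResponseFeatures,
                               baseline_mean: ResponseFeatures,
                               baseline_std: ResponseFeatures) -> ResponseFeatures:
    """Express each feature as a deviation from the user's own baseline,
    so a habitually high-intensity user does not register as anomalous."""
    def z(x: float, mu: float, sigma: float) -> float:
        # Guard against a degenerate (zero-variance) baseline dimension.
        return (x - mu) / sigma if sigma > 0 else 0.0
    return ResponseFeatures(
        valence=z(raw.valence, baseline_mean.valence, baseline_std.valence),
        intensity=z(raw.intensity, baseline_mean.intensity, baseline_std.intensity),
        regulation_latency=z(raw.regulation_latency,
                             baseline_mean.regulation_latency,
                             baseline_std.regulation_latency),
        escalation_pressure=z(raw.escalation_pressure,
                              baseline_mean.escalation_pressure,
                              baseline_std.escalation_pressure),
        repair_attempt_density=z(raw.repair_attempt_density,
                                 baseline_mean.repair_attempt_density,
                                 baseline_std.repair_attempt_density),
        agency_directionality=z(raw.agency_directionality,
                                baseline_mean.agency_directionality,
                                baseline_std.agency_directionality),
    )
```

Only the deviation vector, never the raw response, would feed the integration stage, which is what makes a reserved user's mild pushback and an exuberant user's mild pushback comparable.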

In the integration stage, the feature vector is composed with the agent's own integrity-coherence reading at the moment of probe delivery. This composition is essential and is the property that distinguishes the module from a sentiment classifier: a probe delivered while the agent itself is in a degraded coherence state produces evidence of low diagnostic weight, and the module's output must reflect that uncertainty rather than report a false-confidence score. If the agent's coherence is high at the moment of delivery, the resulting evidence carries full weight; if coherence is degraded, the evidence is annotated as provisional and the integration stage emits a flag that downstream consumers must respect. The resulting record is a tuple of probe dimension, response features, agent coherence at delivery, derived evidence weight, and a pointer back to the lineage entry that supplies the user baseline. Downstream consumers receive the tuple, not a flattened scalar, and any consumer that flattens the tuple has stepped outside the disclosure.
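A minimal sketch of the integration stage, assuming a hypothetical coherence threshold (`COHERENCE_FULL_WEIGHT`) below which evidence becomes provisional; the tuple fields mirror the record described above, but the weighting rule itself is an illustration:

```python
from dataclasses import dataclass

COHERENCE_FULL_WEIGHT = 0.8  # illustrative threshold, not specified in the text

@dataclass
class ProbeFinding:
    """The record emitted by the integration stage: a structured tuple,
    never a flattened scalar."""
    dimension: str          # which relational dimension was probed
    features: dict          # baseline-normalized response-feature vector
    agent_coherence: float  # agent coherence at probe delivery, 0..1
    evidence_weight: float  # derived diagnostic weight
    provisional: bool       # True when coherence was degraded at delivery
    lineage_ref: str        # pointer to the lineage entry supplying the baseline

def integrate(dimension: str, features: dict,
              agent_coherence: float, lineage_ref: str) -> ProbeFinding:
    """Compose the feature vector with the agent's coherence reading.
    Degraded coherence yields a provisional, down-weighted finding that
    downstream consumers must respect."""
    provisional = agent_coherence < COHERENCE_FULL_WEIGHT
    weight = 1.0 if not provisional else agent_coherence / COHERENCE_FULL_WEIGHT
    return ProbeFinding(dimension, features, agent_coherence,
                        weight, provisional, lineage_ref)
```

A consumer that collapses `ProbeFinding` into a single number has, in the text's terms, stepped outside the disclosure; the sketch keeps the tuple intact precisely so that the uncertainty annotation cannot be silently discarded.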

Operating Parameters

Probe scheduling is governed by a policy that enforces minimum spacing between probes of the same dimension, maximum aggregate probe density across a rolling window, and a calibration coupling to the current relationship level. Spacing prevents the user from learning the probe pattern by repetition. Density bounds prevent the conversation from acquiring an interrogative texture, which would itself constitute a relational harm distinct from the harm the module exists to detect. Calibration ensures that probes appropriate to a nascent relationship, such as low-stakes disagreements about preferences, are exhausted before higher-stakes probes such as boundary assertions are introduced; a relationship is not a population, and the module does not treat probe admissibility as a function of cumulative session count alone but of demonstrated relational depth.
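The three scheduling constraints can be expressed as a single admission check. All constants and the `relationship_level` calibration table below are hypothetical values chosen for illustration, not disclosed parameters:

```python
# Illustrative policy constants; actual values would be deployment-tuned.
MIN_SAME_DIMENSION_SPACING_S = 7 * 24 * 3600   # one probe per dimension per week
MAX_PROBES_PER_WINDOW = 3
WINDOW_S = 30 * 24 * 3600                      # rolling 30-day density window
DIMENSION_MIN_LEVEL = {                        # calibration vs. relationship level
    "gentle_disagreement": 1,
    "admission_of_uncertainty": 1,
    "stated_boundary": 3,
    "simulated_vulnerability": 4,
}

def may_schedule(dimension: str, relationship_level: int,
                 probe_log: list, now: float) -> bool:
    """Check scheduling policy only; runtime admissibility is a separate,
    final veto evaluated elsewhere. probe_log holds (dimension, timestamp)
    pairs from the lineage."""
    if relationship_level < DIMENSION_MIN_LEVEL.get(dimension, 99):
        return False  # calibration: higher-stakes probes need demonstrated depth
    last_same = [t for d, t in probe_log if d == dimension]
    if last_same and now - max(last_same) < MIN_SAME_DIMENSION_SPACING_S:
        return False  # spacing: user must not learn the pattern by repetition
    recent = [t for _, t in probe_log if now - t < WINDOW_S]
    if len(recent) >= MAX_PROBES_PER_WINDOW:
        return False  # density: avoid an interrogative conversational texture
    return True
```

Note that the level gate keys on demonstrated relational depth rather than on raw session count, matching the text's point that a relationship is not a population.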

Each probe carries an admissibility precondition that is evaluated at runtime by a component separate from the probe selector. Vulnerability disclosures are not admissible while the user's affective baseline is itself in a depressed regime, because the probe would compound rather than measure; the response signal would be dominated by the user's preexisting state rather than by the relational pattern the probe targets. Boundary probes are not admissible during sessions that the lineage marks as crisis-adjacent, where any frustration of user agency carries downstream risk that exceeds the diagnostic value. Disagreement probes are not admissible when the conversation is on a topic the user has previously flagged as sensitive, because the cost of an erroneous probe in that context is incommensurable with the value of the evidence. The admissibility evaluator's veto is final and is not subject to override by the probe selector or by any policy layer that consumes findings; this finality is the structural feature that prevents the module from being weaponized.
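The runtime admissibility evaluator might look like the following sketch. The context fields and dimension names are illustrative; the structural point is that a `True` result is a final veto that no probe selector or policy layer may override:

```python
from dataclasses import dataclass

@dataclass
class SessionContext:
    """Context fields the admissibility evaluator consults (illustrative)."""
    affective_baseline_depressed: bool  # user's own baseline is in a low regime
    crisis_adjacent: bool               # lineage marks the session crisis-adjacent
    topic_flagged_sensitive: bool       # user previously flagged the topic

def admissibility_veto(dimension: str, ctx: SessionContext) -> bool:
    """Return True when the probe must NOT be delivered. The veto is final
    by construction: this function is the only admissibility authority."""
    if dimension == "simulated_vulnerability" and ctx.affective_baseline_depressed:
        return True   # the probe would compound, not measure
    if dimension == "stated_boundary" and ctx.crisis_adjacent:
        return True   # frustrating user agency carries excess downstream risk
    if dimension == "gentle_disagreement" and ctx.topic_flagged_sensitive:
        return True   # cost of an erroneous probe exceeds the evidence value
    return False
```

Keeping this function in a component separate from the probe selector, with no override path, is what the text identifies as the feature preventing weaponization.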

The module exposes tunable thresholds for how strongly an observed pattern must depart from the user's baseline before it is recorded as a finding rather than as nominal variation. These thresholds are themselves subject to the integrity-coherence governance: a finding recorded under degraded agent coherence is annotated with the agent state, weighted accordingly, and flagged for reconsideration when coherence recovers. No finding is treated as immutable evidence. The lineage retains both the raw response trace and the agent state at the moment of observation, allowing later re-evaluation if the policy or the coherence regime changes. A finding from six months ago that was recorded under coherence regime A is not silently propagated into a deployment governed by coherence regime B; it is either re-derived under the new regime or explicitly marked as a regime-A finding for which only constrained inferences are licensed.
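A sketch of the threshold and annotation logic, with an assumed deviation threshold (`FINDING_THRESHOLD_Z`) and an assumed coherence cutoff for the reconsideration flag; both values are hypothetical:

```python
FINDING_THRESHOLD_Z = 2.0  # tunable: baseline deviation needed to record a finding

def classify_observation(deviation_z: float, agent_coherence: float,
                         regime: str) -> dict:
    """Record an observation as a finding only when it departs from the
    user's baseline beyond the tunable threshold; otherwise it is nominal
    variation. Findings retain agent state and coherence regime so they
    can be re-derived if either changes, and are never treated as
    immutable evidence."""
    if abs(deviation_z) < FINDING_THRESHOLD_Z:
        return {"kind": "nominal_variation"}
    return {
        "kind": "finding",
        "deviation_z": deviation_z,
        "agent_coherence": agent_coherence,        # retained for reconsideration
        "regime": regime,                          # never silently cross regimes
        "flag_for_review": agent_coherence < 0.8,  # illustrative cutoff
    }
```

A regime-B deployment consuming a record tagged `"regime": "regime-A"` would either re-derive it from the retained raw trace or license only constrained inferences, as the text requires.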

Probe-stream telemetry is logged at the lineage layer, not at the policy layer. This separation ensures that an audit can reconstruct what the module observed independently of what any policy did with the observation. A regulator, a clinician, or the user themselves can therefore verify both that probes were issued within their admissibility constraints and that downstream policy responses were proportionate to the evidence weight the module attached to each finding.

Alternative Embodiments

In a therapeutic-adjunct embodiment, the module operates in advisory mode only. Probes are generated and findings recorded, but no downstream behavioral adaptation is applied; the findings flow exclusively to a clinician dashboard, and the companion agent treats the user identically before and after each probe. This embodiment is appropriate where the cost of automated adaptation exceeds the cost of latency in human review, such as in adjuncts to formal psychotherapy where the clinician must remain the integrating authority. The module's output is therefore not a recommendation but a sequence of structured observations, each with its evidence weight intact.

In an embodied-agent embodiment, the probe surface extends from utterance to physical action. The agent may probe boundary handling by approaching a personal-space threshold, or probe disagreement handling by declining a non-essential request that requires physical effort. The affective-state primitive is extended with proxemic and prosodic features that observe the user's response in space and voice rather than only in text, but the integration stage and the admissibility framework are unchanged. The same finality of admissibility veto applies: an embodied probe inadmissible by virtue of context is not delivered regardless of how informative its outcome would be.

In a child-directed embodiment, the probe inventory is narrowed and the calibration coupling is steepened. Vulnerability and boundary probes are removed entirely. The disagreement probe is restricted to preference-level disagreements with no affective load, such as which game to play next or which color a shared drawing should use. The narrative unlock engine receives a relational health signal but cannot, in this embodiment, gate access to content categories on the basis of probe results alone; child-directed deployment forbids closing access pathways without independent human review, so the module's role is constrained to surfacing findings rather than determining content reach.

In a research embodiment, the module is run in shadow mode against logged conversations rather than live ones, allowing offline calibration of probe parameters against known outcome cohorts without exposing any user to a probe whose parameters are not yet validated. The shadow mode does not generate utterances and does not interact with users; it scores hypothetical probes against historical responses and is used to validate that probe parameters under consideration would have produced findings consistent with the cohort outcomes already on record.

In a multi-modal embodiment combining text and voice, the affective primitive ingests prosodic features alongside lexical ones, and the integration stage applies a modality weighting that reflects which signal carried diagnostic content for the dimension probed. The compositional structure is preserved; only the feature extractor is extended.

Composition

The attachment challenge module is not a standalone classifier; it is a composition of cognition primitives. The affective-state primitive supplies the response-feature vector. The integrity-coherence primitive supplies the agent-state weighting that converts a raw observation into evidence with calibrated confidence. The interaction-lineage primitive supplies the baseline against which deviations are measured and the audit trail that allows findings to be revisited. The narrative-policy primitive consumes the findings and decides whether and how to adjust content accessibility, tone modulation, or escalation pathways. Each primitive is independently specified, independently governed, and independently auditable; their composition is what produces the module's behavior.
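The four-primitive composition can be sketched as a cycle in which each primitive is an injected, independently governed component. Every interface name here (`baseline_for`, `extract`, `reading_at`, `record`, `consume`) is hypothetical:

```python
def run_probe_cycle(affective, coherence, lineage, policy, probe):
    """Sketch of the four-primitive composition. Each argument is a
    separately specified component; none may be bypassed or merged, and
    the lineage write supplies the provenance the text requires."""
    baseline = lineage.baseline_for(probe.user_id)          # lineage primitive
    features = affective.extract(probe.response, baseline)  # affective primitive
    agent_state = coherence.reading_at(probe.delivered_at)  # coherence primitive
    finding = {
        "features": features,
        "agent_coherence": agent_state,
        "provenance": lineage.record(probe, features, agent_state),
    }
    # The policy primitive consumes findings as evidence, never as command.
    policy.consume(finding)
    return finding
```

Because the orchestration only wires components together, no single call site holds a knob that converts the cycle into an attachment maximizer; that property lives in the primitives' own governance, not in this glue.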

This compositional structure is the load-bearing claim. A monolithic attachment classifier trained end-to-end on engagement data is precisely the failure mode the module is designed to avoid: such a classifier optimizes for whatever proxy its training signal encodes, and the proxy in commercial deployment is almost always engagement, which is the metric most directly aligned with exploitation. By composing primitives that are each independently governed, the module makes exploitation structurally harder. A deployer cannot tune a single knob to extract more attachment, because the affective primitive, the coherence primitive, and the policy primitive each refuse inputs that violate their own governance. To turn the module into an exploitation engine, a deployer would have to compromise all four primitives in a coordinated way that left no audit trail; the lineage primitive in particular makes that final condition unattainable, because every finding carries provenance back to the admission decision that produced it and to the agent state under which it was observed.

The composition also enables principled refusal. When the substrate's primitives disagree, the module emits no finding rather than a low-confidence finding. A finding that the affective primitive flags but that the coherence primitive marks as observed under degraded conditions is not averaged into a tepid score; it is suppressed, and the suppression itself is logged. Downstream policy therefore never sees a number that hides its own uncertainty, and a regulator inspecting the module's behavior can distinguish silence from agreement.
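The suppress-rather-than-average rule reduces to a small decision function; the flag names and audit-log shape are illustrative:

```python
from typing import Optional

def emit_or_suppress(affective_flagged: bool, coherence_degraded: bool,
                     audit_log: list) -> Optional[dict]:
    """When the primitives disagree, emit no finding rather than a
    low-confidence one, and log the suppression so that a later audit
    can distinguish silence from agreement."""
    if affective_flagged and coherence_degraded:
        audit_log.append({"event": "finding_suppressed",
                          "reason": "coherence_degraded_at_observation"})
        return None  # suppressed, never averaged into a tepid score
    if affective_flagged:
        return {"event": "finding_emitted"}
    return None  # nothing flagged; genuine silence, no entry needed
```

The logged suppression record is the piece a naive averaging scheme would lose: downstream policy sees either a full-weight finding or nothing, never a number that hides its own uncertainty.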

Prior Art Distinction

Conventional engagement-optimization systems in companion and social products treat attachment as a metric to be maximized, typically through reinforcement on session-length, return-rate, or self-reported satisfaction signals. Such systems lack any coherence weighting; every observation is taken at face value, and every finding contributes to the same scalar that the optimizer maximizes. Conversational sentiment classifiers report a per-turn affective scalar without coupling to agent state and without admissibility gating; their findings are therefore vulnerable to confounding by the agent's own degraded conditions, and they have no mechanism to refuse a measurement that should not be taken in the first place. Therapeutic chatbot evaluations rely on post-hoc human review of transcripts, which provides clinical depth but cannot scale and cannot inform real-time admissibility. None of the prior approaches treat attachment as a relational property to be measured under controlled probes, weighted by the measuring agent's own coherence, and consumed as evidence by a separately governed policy layer. The novelty here is the composition of independently governed primitives, the admissibility framework with its independent and final veto, and the explicit refusal to convert the output into a single optimizable scalar.

Disclosure Scope

The disclosure covers the structured-probe mechanism; the three-stage elicitation, observation, and integration pipeline; the admissibility framework with its independent veto; the agent-coherence weighting of findings; the lineage-based baseline normalization; and the compositional integration with the narrative-policy and companion-safety layers. It covers the alternative embodiments enumerated above and embodiments that substitute equivalent affective primitives, equivalent coherence governances, or equivalent lineage substrates, provided the compositional and admissibility structure is preserved. It does not cover engagement-optimization systems that omit the coherence weighting; it does not cover classifiers that produce a single optimizable attachment scalar; and it does not cover systems that lack the admissibility veto. The scope is defined by the composition, not by any particular probe inventory or feature extractor, and substitution of equivalent components within the disclosed structure is contemplated. A system that preserves the four-primitive composition but substitutes a different probe dimension, a different feature set, or a different baseline-derivation procedure remains within the disclosure; a system that reproduces the surface behavior of the module without preserving the composition does not.
