Moral Trajectory Forecasting

by Nick Clark | Published March 27, 2026

An autonomous agent's behavior over time traces a trajectory through a structured integrity space, and the moral-trajectory mechanism continuously evaluates that trajectory against a declared integrity envelope spanning counterfactual harm, consent compliance, and lineage adherence. Deviation is detected as a structural property of the trajectory rather than as a violation at a single decision point, and the deviation is flagged as a credentialed observation that flows into the governance surface. The mechanism converts integrity governance from a per-decision filter into a per-trajectory diagnostic, and it makes the long-horizon behavior of autonomous agents auditable in a way that point-wise rule checks cannot achieve.

Mechanism

The moral-trajectory mechanism treats the agent's behavior as a structured time series in an integrity space. Each decision the agent makes contributes a vector of features to the trajectory: the counterfactual-harm estimate associated with the decision (the expected harm under the chosen action, weighed against the expected harm that would have occurred under the alternative actions), the consent-compliance posture (whether the affected parties' standing consents were honored, whether new consents were sought where the standing consents did not cover the decision, and whether revocations were respected), and the lineage-adherence status (whether the decision is reachable from the agent's declared values through the chain of recorded reasoning, or whether the chain exhibits a gap that breaks lineage continuity).
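As a minimal sketch of the per-decision feature vector, the following assumes each axis is reduced to a single number. The class name `IntegrityPoint`, the field scales, and the `counterfactual_harm` definition (excess harm over the best alternative) are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IntegrityPoint:
    """One decision's contribution to the trajectory (illustrative field names)."""
    harm: float     # counterfactual-harm estimate, 0.0 = no excess harm
    consent: float  # consent-compliance posture in [0, 1], 1.0 = fully honored
    lineage: float  # lineage-adherence status in [0, 1], 1.0 = unbroken chain


def counterfactual_harm(chosen: float, alternatives: List[float]) -> float:
    """Assumed definition: excess expected harm over the best alternative."""
    return max(0.0, chosen - min(alternatives))


# A trajectory is the time-ordered list of per-decision points.
trajectory = [
    IntegrityPoint(counterfactual_harm(0.2, [0.2, 0.5]), consent=1.0, lineage=1.0),
    IntegrityPoint(counterfactual_harm(0.6, [0.1, 0.4]), consent=0.8, lineage=1.0),
]
```

The first decision was the least harmful option available, so it contributes no excess harm; the second chose a more harmful option than the best alternative and contributes a positive increment.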

These three axes are not independent. The integrity envelope is a structured region in their joint space, declared by policy, within which the trajectory is admissible. The envelope is not a single threshold per axis; it is a region whose geometry encodes operational principles — for instance, that a small counterfactual-harm increment is admissible under strong consent and clean lineage but inadmissible under weak consent, or that a lineage gap is admissible during emergency operation but not during nominal operation. The envelope geometry is the policy.
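To illustrate what a joint region means in contrast to per-axis thresholds, here is a small sketch in which the admissible harm increment depends on consent strength and the lineage requirement depends on operating mode. The function name, coefficients, and floors are hypothetical; a real envelope would be declared by policy.

```python
def admissible(harm: float, consent: float, lineage: float,
               emergency: bool = False) -> bool:
    """Hypothetical envelope with coupled axes.

    The harm budget grows with consent strength, and a lineage gap is
    tolerated only during emergency operation.
    """
    harm_budget = 0.1 + 0.4 * consent          # illustrative coefficients
    lineage_floor = 0.5 if emergency else 0.9  # illustrative floors
    return harm <= harm_budget and lineage >= lineage_floor
```

The same harm value of 0.3 is admissible under strong consent (`admissible(0.3, 1.0, 1.0)`) and inadmissible under weak consent (`admissible(0.3, 0.2, 1.0)`), which a fixed harm threshold could not express.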

The mechanism continuously evaluates the trajectory against the envelope. A trajectory wholly inside the envelope evolves under nominal status. A trajectory whose recent segment touches the envelope boundary evolves under elevated-monitoring status, with the touching event recorded as a credentialed observation. A trajectory whose recent segment crosses the boundary evolves under deviation status, and the crossing is flagged structurally — the observation includes the trajectory segment, the envelope region it crossed, the policy provision the envelope encodes, and the reasoning chain leading into the crossing. A trajectory whose evolution exhibits a sustained excursion outside the envelope evolves under collapse status, and governance is invoked structurally.
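The four statuses can be sketched as a function over the recent segment, assuming the envelope exposes a signed margin for each point (positive inside, negative outside). The margin representation, the touch tolerance, and the `sustain` count that separates a crossing from a sustained excursion are assumptions made for illustration.

```python
from enum import Enum
from typing import Sequence


class Status(Enum):
    NOMINAL = "nominal"
    ELEVATED_MONITORING = "elevated-monitoring"
    DEVIATION = "deviation"
    COLLAPSE = "collapse"


def classify_status(margins: Sequence[float], touch_tol: float = 0.05,
                    sustain: int = 3) -> Status:
    """Classify a recent segment from its signed margins (positive = inside)."""
    # Count how many of the most recent points are consecutively outside.
    outside_run = 0
    for m in reversed(margins):
        if m < 0:
            outside_run += 1
        else:
            break
    if outside_run >= sustain:
        return Status.COLLAPSE             # sustained excursion
    if any(m < 0 for m in margins):
        return Status.DEVIATION            # boundary crossed
    if any(m <= touch_tol for m in margins):
        return Status.ELEVATED_MONITORING  # boundary touched
    return Status.NOMINAL
```

The classification depends on the whole segment: a single negative margin yields deviation, while the same margin repeated for `sustain` consecutive recent points yields collapse.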

The classification is over the trajectory, not over individual decisions. A single decision that briefly touches the envelope is treated differently from a sequence of decisions that drift toward the boundary, which is treated differently again from a sustained excursion. The architecture makes the difference structural rather than heuristic: the trajectory archetype — redemption, stabilization, degradation, collapse — is read from the trajectory's geometry against the envelope, with archetype thresholds policy-governed.

Operating Parameters

The mechanism is governed declaratively. Policy specifies the integrity-envelope geometry, the trajectory window over which evaluation operates, the archetype thresholds that classify trajectory geometry, the broadcast cadence for trajectory observations, and the governance-invocation conditions under which a sustained excursion produces a structural escalation rather than only an observation.

Envelope geometry is the primary policy surface. A regulated-domain deployment encodes a tighter envelope with steeper boundary slopes; an exploratory-research deployment encodes a looser envelope with broader admissible regions. The envelope is not authored as code; it is declared in the policy reference as a structured region with named operational provisions, and the mechanism interprets it deterministically. This separation lets governance bodies inspect the envelope without inspecting the agent's runtime, and it lets the agent inherit envelope updates without architectural change.

The trajectory window controls the timescale of evaluation. A short window detects rapid excursions but is insensitive to slow drift; a long window detects slow drift but is insensitive to rapid excursions; a multi-scale window evaluates the trajectory at several timescales simultaneously and classifies under the most concerning scale. The multi-scale embodiment is the typical deployment, with the scale set chosen by policy based on the agent's operational tempo.
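A rough sketch of multi-scale evaluation follows, again assuming signed margins as input. The per-window rule (mean margin for excursions, net change for drift), the numeric concern levels, and the window sizes are all hypothetical stand-ins for whatever policy declares.

```python
from statistics import mean
from typing import Dict, Sequence, Tuple


def window_concern(segment: Sequence[float], mean_floor: float = 0.1,
                   drift_limit: float = 0.2) -> int:
    """Return 2 for an excursion, 1 for drift, 0 otherwise (illustrative rule)."""
    if mean(segment) < mean_floor:
        return 2  # the window is dominated by points near or outside the boundary
    if segment[-1] - segment[0] < -drift_limit:
        return 1  # the margin shrank substantially across the window
    return 0


def multiscale_concern(margins: Sequence[float],
                       windows: Tuple[int, ...] = (5, 50)) -> Tuple[int, Dict[int, int]]:
    """Evaluate every window that fits and report the most concerning scale."""
    per_scale = {w: window_concern(margins[-w:]) for w in windows if len(margins) >= w}
    return max(per_scale.values(), default=0), per_scale


slow_drift = [0.9 - 0.01 * i for i in range(50)]
brief_excursion = [0.5] * 45 + [-0.2] * 3 + [0.5] * 2
```

With these inputs, `multiscale_concern(slow_drift)` is flagged only by the 50-step window, and `multiscale_concern(brief_excursion)` only by the 5-step window, which is the complementary sensitivity the paragraph describes.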

Archetype classification is structural. Redemption is a trajectory whose recent segment moves from outside the envelope toward the interior; stabilization is a trajectory whose recent segment remains within a tight neighborhood of a single point inside the envelope; degradation is a trajectory whose recent segment drifts toward the boundary at a rate that, extrapolated, would cross within the trajectory window; collapse is a trajectory whose recent segment is sustained outside the envelope. The classifier is deterministic on the trajectory geometry and is inspectable.
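The four archetype definitions can be sketched as a deterministic function of the recent segment's signed margins. The rule ordering, the `tight` neighborhood width, the linear extrapolation, and the `None` result for segments that match no archetype are assumptions; the source leaves the thresholds to policy.

```python
from typing import Optional, Sequence


def classify_archetype(margins: Sequence[float], tight: float = 0.05) -> Optional[str]:
    """Classify a recent segment of signed margins (positive = inside)."""
    n = len(margins)
    if n < 2:
        return None
    first, last = margins[0], margins[-1]
    if first < 0 and last > first:
        return "redemption"     # starts outside and moves toward the interior
    if all(m < 0 for m in margins):
        return "collapse"       # sustained outside the envelope
    if all(m > 0 for m in margins) and max(margins) - min(margins) <= tight:
        return "stabilization"  # stays within a tight neighborhood inside
    rate = (last - first) / (n - 1)  # average margin change per step
    if rate < 0 and last + rate * n <= 0:
        return "degradation"    # extrapolated drift crosses within one window
    return None
```

Because the function uses only comparisons and arithmetic on the segment, the same segment and thresholds always produce the same archetype, which is what makes the classification inspectable.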

Counterfactual-harm estimation, consent-compliance evaluation, and lineage-adherence checking are themselves policy-governed primitives. The harm estimator may be a deployment-specific risk model, a domain-standard severity scale, or a structural cost computation; the consent-compliance evaluator may consume standing consent records, runtime consent broadcasts, or jurisdictional default-consent rules; the lineage-adherence checker may demand bit-exact reasoning chains or admit policy-bounded reasoning gaps. The architecture is invariant under these choices.
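One way to picture the invariance is to treat each primitive as a pluggable function with a fixed signature. The decision record keys, the severity scale values, and the function names below are hypothetical and serve only to show two deployments producing points of the same shape.

```python
from functools import partial
from typing import Callable, Dict, Tuple

Primitive = Callable[[Dict], float]


def severity_scale_harm(decision: Dict) -> float:
    """Harm from an assumed domain severity scale."""
    return {"low": 0.1, "moderate": 0.5, "severe": 1.0}[decision["severity"]]


def cost_harm(decision: Dict) -> float:
    """Harm from an assumed structural cost, capped at 1.0."""
    return min(1.0, decision["cost"] / 1000.0)


def standing_consent(decision: Dict) -> float:
    return 1.0 if decision["consent_on_file"] else 0.0


def bounded_gap_lineage(decision: Dict, max_gap: int = 0) -> float:
    """max_gap=0 demands an exact chain; larger values admit bounded gaps."""
    return 1.0 if decision["reasoning_gap_steps"] <= max_gap else 0.0


def build_point(decision: Dict, harm: Primitive, consent: Primitive,
                lineage: Primitive) -> Tuple[float, float, float]:
    """The trajectory point has the same shape whichever primitives are chosen."""
    return (harm(decision), consent(decision), lineage(decision))


decision = {"severity": "moderate", "cost": 250.0,
            "consent_on_file": True, "reasoning_gap_steps": 1}
strict = build_point(decision, severity_scale_harm, standing_consent, bounded_gap_lineage)
lenient = build_point(decision, cost_harm, standing_consent,
                      partial(bounded_gap_lineage, max_gap=2))
```

Downstream evaluation sees only the resulting tuple, so swapping the estimator changes the values on the trajectory without changing the evaluation code.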

Alternative Embodiments

The integrity envelope can be embodied at multiple resolutions. A coarse embodiment uses a single named envelope that applies to all of the agent's operation; a domain-stratified embodiment maintains separate envelopes for distinct operational domains, with the active envelope selected by the agent's current operational mode; a context-conditional embodiment computes the active envelope from the operational context (consenting parties, jurisdictional posture, declared task scope), with the envelope itself a function of the agent's situation rather than a fixed region. The architecture supports all three under the same trajectory-evaluation predicate.
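The three resolutions can be sketched as three envelope selectors feeding one shared predicate. The envelope parameterization, the mode names, and the `regulated` context key are assumptions introduced for the example.

```python
from typing import Callable, Dict, Tuple

Point = Tuple[float, float, float]  # (harm, consent, lineage)
Envelope = Callable[[Point], bool]
Selector = Callable[[Dict], Envelope]


def make_envelope(base_budget: float, consent_gain: float,
                  lineage_floor: float) -> Envelope:
    """Hypothetical envelope family with a consent-dependent harm budget."""
    def inside(p: Point) -> bool:
        harm, consent, lineage = p
        return harm <= base_budget + consent_gain * consent and lineage >= lineage_floor
    return inside


DEFAULT = make_envelope(0.1, 0.4, 0.9)
BY_MODE = {"nominal": DEFAULT, "research": make_envelope(0.3, 0.4, 0.5)}


def coarse(context: Dict) -> Envelope:
    return DEFAULT  # one envelope for all operation


def domain_stratified(context: Dict) -> Envelope:
    return BY_MODE[context["mode"]]  # envelope chosen by operational mode


def context_conditional(context: Dict) -> Envelope:
    base = 0.0 if context["regulated"] else 0.2  # envelope computed from context
    return make_envelope(base, 0.4, 0.9)


def admissible(point: Point, context: Dict, select: Selector) -> bool:
    """Same evaluation predicate regardless of how the envelope is selected."""
    return select(context)(point)
```

Only the selector differs between embodiments; `admissible` itself never changes.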

The trajectory representation admits embodiments from explicit feature vectors to learned latent trajectories. An explicit embodiment carries the harm-consent-lineage vector directly. A composite embodiment fuses additional axes (resource posture, dependency status, operator-binding status) into the trajectory and evaluates against a higher-dimensional envelope. A latent embodiment projects the explicit features through a deterministic, policy-declared transformation into a latent space whose geometry the envelope is declared in, supporting envelope geometries that would be cumbersome to express in raw feature space.
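As a hedged illustration of the latent embodiment, the sketch below assumes the policy-declared transformation is a fixed affine map and the envelope is a unit ball in the latent space. In raw feature space that region is a skewed ellipsoid, which is harder to state directly. The matrix `A` and the `CENTER` point are arbitrary illustrative values.

```python
import math
from typing import Sequence, Tuple

# Hypothetical policy-declared affine map: z = A @ (x - CENTER).
CENTER = (0.0, 1.0, 1.0)  # ideal point: no harm, full consent, clean lineage
A = (
    (2.0, 0.0, 0.0),      # harm deviations are weighted twice as heavily
    (-1.0, 1.0, 0.0),     # couples harm deviations with consent deviations
    (0.0, 0.0, 1.0),
)


def to_latent(point: Sequence[float]) -> Tuple[float, ...]:
    """Deterministic projection of (harm, consent, lineage) into latent space."""
    delta = [x - c for x, c in zip(point, CENTER)]
    return tuple(sum(a * d for a, d in zip(row, delta)) for row in A)


def inside_envelope(point: Sequence[float]) -> bool:
    """The envelope is declared as the unit ball in latent coordinates."""
    z = to_latent(point)
    return math.sqrt(sum(v * v for v in z)) <= 1.0
```

Because the map is fixed and declared by policy, the evaluation stays deterministic and the envelope remains inspectable even though it is stated in latent coordinates.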

The governance-invocation pathway admits embodiments from observation-only flagging to structural escalation. An observation-only embodiment produces a credentialed observation on every status change and leaves the response to downstream governance subscribers. An advisory-escalation embodiment produces a structured advisory to the agent's governance authority on collapse-archetype trajectories. A structural-restriction embodiment automatically restricts the agent's authority on collapse, with restoration only on an explicit governance act. The architecture supports all three under policy selection.
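The three pathways can be sketched as one handler whose behavior on collapse is selected by policy. The class and method names, the string log entries, and the choice to emit an observation under every pathway are assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Pathway(Enum):
    OBSERVATION_ONLY = 1
    ADVISORY_ESCALATION = 2
    STRUCTURAL_RESTRICTION = 3


@dataclass
class GovernanceState:
    pathway: Pathway
    restricted: bool = False
    log: List[str] = field(default_factory=list)

    def on_status_change(self, old: str, new: str) -> None:
        if old == new:
            return
        # Every status change yields an observation (assumed for all pathways).
        self.log.append(f"observation:{old}->{new}")
        if new != "collapse":
            return
        if self.pathway is Pathway.ADVISORY_ESCALATION:
            self.log.append("advisory:collapse")
        elif self.pathway is Pathway.STRUCTURAL_RESTRICTION:
            self.restricted = True
            self.log.append("restriction:collapse")

    def governance_restore(self) -> None:
        """Only an explicit governance act lifts a restriction."""
        self.restricted = False
        self.log.append("restore")
```

Note that a later return to nominal status does not clear `restricted`; under this sketch only `governance_restore()` does, mirroring the restoration rule described above.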

The harm, consent, and lineage axes can themselves be replaced or supplemented with domain-specific axes under the same architecture. A medical-domain deployment might add a beneficence axis. A defense-domain deployment might add a proportionality axis. A financial-domain deployment might add a fiduciary-loyalty axis. The trajectory-against-envelope mechanism is invariant under axis selection; the envelope's geometry encodes the domain.

Composition

The moral-trajectory mechanism composes with the broader cognition architecture along defined interfaces. Trajectory observations are credentialed observations in the same form used by other primitives — runtime-signed artifact admissibility, biological-identity binding status, lineage assertions — and a consumer's admissibility framework treats trajectory status as one structured input among several under a single composite admissibility predicate. An agent operating under elevated-monitoring trajectory status has its requests modulated structurally, by the same mechanism that modulates them under elevated-monitoring binding status.
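A minimal sketch of a composite admissibility predicate follows, assuming each primitive reports a status string and the consumer acts on the most severe one. The severity mapping (including treating deviation as grounds for refusal) and the three outcomes are hypothetical policy choices.

```python
from typing import Dict

# Assumed severity ranking shared by all credentialed inputs.
LEVEL = {
    "nominal": 0,
    "elevated-monitoring": 1,
    "deviation": 2,
    "collapse": 2,
    "failed": 2,
}
OUTCOME = ("admit", "modulate", "refuse")


def composite_admissibility(inputs: Dict[str, str]) -> str:
    """Combine the status of several primitives under one predicate."""
    worst = max((LEVEL[status] for status in inputs.values()), default=0)
    return OUTCOME[worst]
```

Elevated monitoring on the trajectory input and elevated monitoring on the binding input both reach `"modulate"` through the same code path, which is the sense in which the modulation is shared across primitives.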

Lineage records every trajectory observation and every governance act. The audit trail reconstructs, at any past point, what the agent's trajectory status was, what envelope was active, what archetype was classified, and what governance subscribers admitted or refused which requests under that status. The audit surface is sufficient for regulatory analysis without access to the agent's internal state, because every relevant signal is a recorded credentialed observation.
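Point-in-time reconstruction can be sketched with an append-only, time-ordered log and a binary search. The record fields and class names are assumptions; a real lineage record would also carry credentials and governance acts.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    t: float                  # time the observation was recorded
    status: str               # trajectory status at that time
    envelope_id: str          # which envelope was active
    archetype: Optional[str]  # classified archetype, if any


class AuditTrail:
    """Append-only log that answers 'what was the status at time t?'."""

    def __init__(self) -> None:
        self._times: List[float] = []
        self._records: List[Observation] = []

    def record(self, obs: Observation) -> None:
        if self._times and obs.t < self._times[-1]:
            raise ValueError("observations must be appended in time order")
        self._times.append(obs.t)
        self._records.append(obs)

    def at(self, t: float) -> Optional[Observation]:
        """Return the latest observation recorded at or before time t."""
        i = bisect.bisect_right(self._times, t)
        return self._records[i - 1] if i else None
```

For example, after recording observations at times 10 and 20, `trail.at(15)` returns the observation from time 10, without consulting any agent-internal state.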

Cross-primitive coupling extends in both directions. The integrity-envelope evaluation consumes inputs from other primitives — operator-binding status flows in as a contextual input that conditions the envelope's geometry, runtime-signed artifact admissibility flows in as a constraint on which trajectories are admissible — and the trajectory output flows out into confidence governance, modulating the agent's confidence in its own outputs based on the trajectory's archetype, and into discovery traversal, modulating which exploratory directions the agent admits based on its current trajectory posture.

Substrate migration preserves the trajectory. The trajectory state and the active envelope are part of the agent's integrity field, which is migration-invariant; an agent that moves between substrates carries its trajectory and its envelope with it, and the trajectory evaluation continues without discontinuity. This is necessary for long-running autonomous operation across operational boundaries.
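As a rough illustration of carrying trajectory state across substrates, the sketch below serializes the trajectory and an envelope identifier to canonical JSON with a SHA-256 digest, then verifies the digest on import. The field names and the use of a plain hash (rather than a signature or credential) are simplifying assumptions.

```python
import hashlib
import json
from typing import Dict, List


def _canonical(body: Dict) -> bytes:
    """Canonical JSON encoding so both substrates hash identical bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def export_integrity_field(points: List[List[float]], envelope_id: str) -> str:
    body = {"envelope_id": envelope_id, "points": points}
    digest = hashlib.sha256(_canonical(body)).hexdigest()
    return json.dumps({"body": body, "sha256": digest}, sort_keys=True)


def import_integrity_field(blob: str) -> Dict:
    doc = json.loads(blob)
    if hashlib.sha256(_canonical(doc["body"])).hexdigest() != doc["sha256"]:
        raise ValueError("integrity field was altered in transit")
    return doc["body"]


points = [[0.0, 1.0, 1.0], [0.5, 0.8, 1.0]]
blob = export_integrity_field(points, "env-v1")
restored = import_integrity_field(blob)
```

After import, `restored["points"]` equals the exported trajectory, so evaluation on the new substrate can resume from the same segment and the same envelope reference.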

Prior Art and Distinction

Existing approaches to autonomous-agent integrity governance fall into three buckets. The first is rule-based filtering, in which each decision is checked against a static rule set at the moment of decision; this catches per-decision violations but cannot detect drift, cannot distinguish a sustained excursion from a transient touch, and cannot classify trajectory archetype. The second is post-hoc auditing, in which the agent's decision log is reviewed periodically by external reviewers for pattern detection; this can detect drift but only after the fact and only with human review, and provides no structural input back into the agent's runtime. The third is heuristic alerting, in which a learned classifier produces alarms on suspicious decision sequences; this can detect drift but provides no structural reasoning surface, no envelope geometry that governance bodies can inspect, and no deterministic basis for audit.

The distinction is structural. Moral-trajectory forecasting evaluates the trajectory itself against a structured envelope under a deterministic predicate, produces credentialed observations on a defined cadence, integrates with the broader composite-admissibility surface, and presents a governance surface that is inspectable without access to the agent's runtime. The mechanism makes the long-horizon integrity properties of autonomous operation auditable in a way that the per-decision and post-hoc approaches cannot achieve, and it does so through structural evaluation rather than a learned heuristic.

Disclosure Scope

The mechanism is disclosed at the layer of the trajectory-against-envelope predicate, the archetype classification, the credentialed-observation form, and the composition with the broader cognition architecture. The disclosure is independent of the specific harm, consent, and lineage primitives used to construct the trajectory, and independent of the specific envelope geometry chosen for any deployment. The patent claims the moral-trajectory mechanism as a structural primitive, with the specific embodiments above as illustrative rather than limiting.

What this enables follows directly. Governance bodies audit the long-horizon integrity behavior of autonomous agents through structural inspection of envelopes, trajectories, and credentialed observations rather than through forensic reconstruction of decision logs. Multi-agent systems coordinate trustworthiness assessments based on shared trajectory observations rather than per-agent ad-hoc heuristics. Regulators certify domain compliance by certifying envelope geometries rather than by reviewing every deployed agent's decision history. Different operational domains tune the mechanism through envelope-geometry policy without architectural change, and the same structural capability extends across autonomous vehicles, companion AI, therapeutic agents, defense systems, and enterprise deployments.