Agent Behavior Under Constraints

Nick Clark

by Nick Clark | Published March 27, 2026

The agent's behavior is structurally constrained to its demonstrated capability envelope. Actions outside the envelope are not merely refused at runtime; they cannot be planned in the first place. The planner consumes a typed capability vector and is structurally incapable of emitting a plan whose constituent actions exceed that vector. This is a stronger guarantee than runtime refusal: a plan that would have been refused never enters the candidate set, never consumes downstream resources, and never appears as a near-miss in the agent's behavioral record.

Mechanism

Constrained behavior begins with the capability vector. The agent maintains a typed, versioned record of every action primitive it has demonstrated, where demonstration is defined as a successful end-to-end execution under representative conditions, recorded in the agent's lineage with full provenance. Each entry in the capability vector specifies the action's input schema, output schema, preconditions, postconditions, observed latency distribution, observed failure modes, and the demonstration record from which the entry was derived. An action that has not been demonstrated is not in the vector. An action that was demonstrated only under narrower conditions than the current task requires is recorded with the narrower conditions, and the planner sees the narrower conditions, not an extrapolation.
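As a minimal sketch of what one entry in such a vector might look like, the following Python dataclass carries the fields listed above. The field names, the string-based schemas, and the covers helper are assumptions made for illustration; the source does not prescribe a concrete representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityEntry:
    """One demonstrated action primitive (all field names are illustrative)."""
    action: str
    version: int
    input_schema: str
    output_schema: str
    preconditions: frozenset
    postconditions: frozenset
    latency_ms: tuple                   # observed latency samples
    failure_modes: tuple                # observed failure modes
    demonstration_id: str               # link back to the demonstration record
    demonstrated_conditions: frozenset  # conditions the demonstration ran under


def covers(entry, required_conditions):
    """True only if the task's conditions fall inside the demonstrated ones.

    There is no extrapolation: a condition that was never demonstrated
    is treated as not covered.
    """
    return frozenset(required_conditions) <= entry.demonstrated_conditions


# Hypothetical entry: resizing was demonstrated only for PNG files up to 10 MB.
entry = CapabilityEntry(
    action="resize_image",
    version=1,
    input_schema="ImageRef",
    output_schema="ImageRef",
    preconditions=frozenset({"image_exists"}),
    postconditions=frozenset({"image_resized"}),
    latency_ms=(110.0, 140.0, 125.0),
    failure_modes=("out_of_memory",),
    demonstration_id="demo-0001",
    demonstrated_conditions=frozenset({"format:png", "size<=10MB"}),
)

png_ok = covers(entry, {"format:png"})
tiff_ok = covers(entry, {"format:tiff"})
```

Here png_ok is true and tiff_ok is false: the planner would see the narrower PNG-only conditions rather than a generalization to other formats.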

The planner is built around the capability vector as its action alphabet. When the planner enumerates candidate plans, the enumeration draws only from the entries in the vector. The planner does not first generate plans freely and then filter; it does not hold a separate library of theoretical actions that are then checked against the vector. The vector is the alphabet, and the planner's grammar admits only sequences of vector entries whose schema-level preconditions and postconditions chain coherently. An action that is absent from the vector is unrepresentable in the planner's output space. This is the key structural distinction from runtime refusal: a refusal-based system can in principle plan any action and then decline to execute the disallowed ones, whereas a structurally constrained planner cannot form the plan in the first place.
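The idea of the vector as alphabet can be sketched with a small forward search. This is one possible illustration, not the disclosed planner: the Action type, the set-of-facts state model, and the breadth-first strategy are all assumptions. The point it shows is that the only loop producing actions iterates over the alphabet, so there is no code path that could construct an action from outside it.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A vector entry reduced to what chaining needs (illustrative)."""
    name: str
    pre: frozenset   # facts that must hold before the action
    post: frozenset  # facts that hold after the action


def plan(alphabet, start, goal, max_depth=6):
    """Breadth-first search whose only source of actions is the alphabet."""
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, ())])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return list(steps)
        if len(steps) >= max_depth:
            continue
        for action in alphabet:          # enumeration draws only from the vector
            if action.pre <= state:      # preconditions chain from the current state
                nxt = state | action.post
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + (action.name,)))
    return None                          # goal not reachable from the alphabet


alphabet = [
    Action("fetch", frozenset(), frozenset({"raw"})),
    Action("parse", frozenset({"raw"}), frozenset({"parsed"})),
]

reachable = plan(alphabet, set(), {"parsed"})
unreachable = plan(alphabet, set(), {"deployed"})
```

In this sketch reachable is the two-step plan fetch then parse, while unreachable is None because no entry in the alphabet produces the fact "deployed".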

The planner's output is therefore guaranteed by construction to lie within the capability envelope. The guarantee is not statistical, not heuristic, and not dependent on the planner's training. It is a property of the planner's action alphabet, which is fixed by the typed capability vector the planner takes as input. Even if the underlying model that scores plan candidates would assign high probability to an out-of-capability action, that action is not reachable from the alphabet, so the scoring is never invoked on it. How the model would treat out-of-capability actions is therefore irrelevant to how the agent behaves, because the agent's behavior is determined by the alphabet, not by the model.
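The independence from the scoring model can be illustrated with a toy selector. The preference table, the action names, and the choose function below are hypothetical; the sketch only shows that a scorer which strongly prefers an out-of-capability action is never asked about it when selection iterates over the alphabet.

```python
# Hypothetical model preferences: the model "likes" an action the agent
# has never demonstrated more than anything in the alphabet.
model_preference = {"delete_database": 0.99, "write_row": 0.6, "read_row": 0.4}

scored = []  # records every action the scorer was actually invoked on


def score(action):
    scored.append(action)
    return model_preference[action]


def choose(alphabet, score_fn):
    """Pick the best-scoring action; candidates come only from the alphabet."""
    return max(alphabet, key=score_fn)


alphabet = ("read_row", "write_row")  # demonstrated actions only
chosen = choose(alphabet, score)
```

After this runs, chosen is "write_row" and scored contains only the two alphabet members, so the model's high score for "delete_database" had no opportunity to influence the result.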

Operating Parameters

Demonstration is the gating event for capability vector entry. A demonstration record consists of the input that was supplied, the output that was produced, the preconditions that held, the postconditions that were observed, the wall-clock and resource cost, and a hash of the agent state at the time of demonstration. Demonstrations are not synthetic; they are not produced by self-simulation in which the agent imagines executing the action. They are produced by actual execution against the actual substrate. The distinction matters because the failure modes of self-simulation are exactly the failure modes that capability awareness exists to prevent: the agent imagines it can do something, the imagining is internally consistent, and the imagining is wrong.
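A demonstration record of this kind could be captured roughly as follows. This is a sketch under assumptions: the demonstrate helper, the use of named predicate checks for preconditions and postconditions, and SHA-256 over a JSON dump of the agent state are illustrative choices, and resource cost beyond wall-clock time is omitted for brevity.

```python
import hashlib
import json
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class DemonstrationRecord:
    action: str
    payload: str                    # the input that was supplied
    output: str                     # the output that was produced
    preconditions_held: tuple
    postconditions_observed: tuple
    wall_clock_s: float
    state_hash: str                 # hash of agent state at demonstration time


def demonstrate(action, fn, payload, agent_state, pre_checks, post_checks):
    """Run fn for real and record the evidence; raise if any check fails."""
    failed = [name for name, check in pre_checks.items() if not check(payload)]
    if failed:
        raise ValueError(f"preconditions not met: {failed}")
    state_hash = hashlib.sha256(
        json.dumps(agent_state, sort_keys=True).encode("utf-8")
    ).hexdigest()
    started = time.perf_counter()
    output = fn(payload)            # actual execution, not self-simulation
    elapsed = time.perf_counter() - started
    failed = [name for name, check in post_checks.items() if not check(output)]
    if failed:
        raise ValueError(f"postconditions not observed: {failed}")
    return DemonstrationRecord(
        action, payload, output,
        tuple(pre_checks), tuple(post_checks),
        elapsed, state_hash,
    )


# Hypothetical demonstration of a trivial "uppercase" action.
record = demonstrate(
    "uppercase", str.upper, "abc", {"agent_version": 1},
    pre_checks={"non_empty": lambda s: len(s) > 0},
    post_checks={"is_upper": str.isupper},
)
```

A failed precondition or postcondition raises instead of returning a record, which reflects the requirement that only successful end-to-end executions count as demonstrations.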

Capability vector entries are versioned and time-stamped. Each entry has a validity window that ends at an explicit policy expiration or when contradicting evidence appears, whichever comes first. If the substrate underneath the action changes, for example because a tool is upgraded or a downstream service is reconfigured, the entry's validity is suspended until a fresh demonstration is recorded. Suspended entries are not available to the planner. This prevents the agent from continuing to plan against a capability that was once demonstrated but is no longer current.
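The suspension behavior can be sketched as a small state machine over entries. The Entry fields, the tool and version strings, and the three helper functions are assumptions for illustration; in particular, keying substrate changes by a tool name and version is one of several plausible ways to detect drift.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    action: str
    tool: str            # substrate the demonstration depended on
    tool_version: str    # substrate version at demonstration time
    expires_at: float    # policy expiration, in arbitrary time units
    suspended: bool = False


def on_substrate_change(entries, tool, new_version):
    """Suspend every entry demonstrated against an older version of the tool."""
    for e in entries:
        if e.tool == tool and e.tool_version != new_version:
            e.suspended = True


def on_fresh_demonstration(entry, tool_version, expires_at):
    """A new demonstration restores the entry with a new validity window."""
    entry.tool_version = tool_version
    entry.expires_at = expires_at
    entry.suspended = False


def planner_view(entries, now):
    """Only unsuspended, unexpired entries are visible to the planner."""
    return [e.action for e in entries if not e.suspended and now < e.expires_at]


entries = [
    Entry("render_pdf", "renderer", "2.1", expires_at=1000.0),
    Entry("send_email", "mailer", "5.0", expires_at=500.0),
]

before = planner_view(entries, now=100.0)
after_expiry = planner_view(entries, now=600.0)

on_substrate_change(entries, "renderer", "3.0")      # the renderer is upgraded
after_upgrade = planner_view(entries, now=100.0)

on_fresh_demonstration(entries[0], "3.0", expires_at=2000.0)
after_redemo = planner_view(entries, now=100.0)
```

In this run, render_pdf disappears from the planner's view as soon as the renderer version changes and returns only after a fresh demonstration against the new version.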

The planner exposes a structured failure mode for tasks that cannot be planned within the capability envelope. When the goal cannot be reached using only vector entries, the planner does not silently produce a partial plan or a plan that omits steps. It returns a structured infeasibility record that names the missing capability, the closest demonstrated capability, and the gap between them. The infeasibility record is itself a first-class output, consumable by upstream components that may decide to request a demonstration, decompose the goal, or escalate.
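One way to picture the infeasibility record is shown below. The Infeasible type, the dictionary-based vector, and especially the closeness metric (the number of required conditions an entry has already demonstrated) are assumptions; the source does not say how the closest capability is chosen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Infeasible:
    missing: str             # the capability the goal needs
    closest: Optional[str]   # nearest demonstrated capability, if any
    gap: frozenset           # required conditions not yet demonstrated


def check(vector, action, required):
    """Return None if the capability is demonstrated, else a structured record.

    vector maps action name -> frozenset of demonstrated conditions.
    """
    demonstrated = vector.get(action)
    if demonstrated is not None and required <= demonstrated:
        return None
    # Illustrative closeness metric: overlap with the required conditions.
    closest = max(vector, key=lambda name: len(required & vector[name]), default=None)
    gap = required - vector[closest] if closest is not None else required
    return Infeasible(missing=action, closest=closest, gap=frozenset(gap))


vector = {
    "export_csv": frozenset({"rows<=1e6", "utf8"}),
    "export_json": frozenset({"ascii"}),
}

feasible = check(vector, "export_csv", frozenset({"rows<=1e6", "utf8"}))
infeasible = check(vector, "export_csv", frozenset({"rows<=1e9", "utf8"}))
```

Here feasible is None, while infeasible names export_csv as both the missing and the closest capability, with a gap of {"rows<=1e9"}. An upstream component could use that gap to request a demonstration at the larger row count, decompose the goal, or escalate.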

Alternative Embodiments

In one embodiment, the capability vector is implemented as a typed enum whose members are generated at build time from the demonstration record. The planner's action type is the enum, and the type system of the host language enforces, statically, that no out-of-capability action can appear in a plan. This embodiment is appropriate for agents whose capability envelope changes slowly and whose deployment cycle accommodates rebuilds.
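A rough Python sketch of the build-time enum follows. The render_enum_source generator and the action names are hypothetical, and exec stands in for a build step that would normally write the generated source to a module. Python itself checks enum membership only at runtime; the static guarantee described above would come from an external type checker such as mypy, or from a statically typed host language.

```python
def render_enum_source(actions):
    """Generate source for an Enum whose members are the demonstrated actions.

    Assumes each action name is a valid Python identifier.
    """
    lines = ["from enum import Enum", "", "class Capability(Enum):"]
    members = [f"    {name.upper()} = {name!r}" for name in sorted(actions)]
    lines.extend(members or ["    pass"])
    return "\n".join(lines) + "\n"


# Build step (simulated): generate the enum from the demonstration record.
source = render_enum_source(["write_row", "read_row"])
namespace = {}
exec(source, namespace)            # stands in for writing and importing a module
Capability = namespace["Capability"]

# A plan is typed as a list of Capability members, nothing else.
plan = [Capability.READ_ROW, Capability.WRITE_ROW]

try:
    Capability("drop_table")       # not demonstrated, so not a member
    constructible = True
except ValueError:
    constructible = False
```

Because a new demonstration changes the generated source, extending the envelope requires a rebuild, which matches the slow-changing deployments this embodiment targets.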

In a second embodiment, the capability vector is dynamic and is consulted at planning time through a typed handle. The planner is parameterized by the handle, and the handle returns only currently valid entries. This embodiment is appropriate for agents whose capability envelope changes during operation, for example because demonstrations are added or suspended in response to runtime events.
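The dynamic embodiment might be sketched with a handle that the planner re-reads at every planning call. The CapabilityHandle protocol, the InMemoryHandle class, and plan_step are illustrative names, not part of the source.

```python
from typing import Callable, FrozenSet, Optional, Protocol


class CapabilityHandle(Protocol):
    def current(self) -> FrozenSet[str]:
        """Return only the entries that are valid right now."""
        ...


class InMemoryHandle:
    """Toy handle; a real one would consult the demonstration record."""

    def __init__(self) -> None:
        self._valid: set = set()

    def add(self, action: str) -> None:
        self._valid.add(action)

    def suspend(self, action: str) -> None:
        self._valid.discard(action)

    def current(self) -> FrozenSet[str]:
        return frozenset(self._valid)


def plan_step(handle: CapabilityHandle, score: Callable[[str], float]) -> Optional[str]:
    """Choose the next action from whatever the handle exposes at call time."""
    alphabet = sorted(handle.current())
    return max(alphabet, key=score, default=None)


handle = InMemoryHandle()
handle.add("summarize")
handle.add("translate")
score = {"summarize": 0.4, "translate": 0.9}.__getitem__

first = plan_step(handle, score)
handle.suspend("translate")        # runtime event invalidates the entry
second = plan_step(handle, score)
```

The first call selects translate; after the suspension the same planner, unchanged, selects summarize because translate is no longer in the alphabet it receives.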

In a third embodiment, the capability vector is hierarchical. Composite capabilities are defined as ordered sequences of primitive capabilities with declared input and output schemas, and the planner can use composite entries as single steps. The composite entries are themselves demonstrated end-to-end before they may be used; the demonstration is not inherited from the primitives. This prevents the failure mode in which a sequence of individually demonstrated steps fails when chained because of an unmodeled interaction.
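The admission rule for composites can be illustrated with a deliberately mismatched pair of primitives. Everything here is hypothetical: the primitives, the round-trip check used as the end-to-end demonstration, and the admit_composite helper. The sketch shows a chain of individually working steps being rejected because the exporter and importer disagree on the delimiter.

```python
# Hypothetical primitives, each of which works on its own inputs.
primitives = {
    "export": lambda rows: ",".join(rows),          # writes comma-separated text
    "import_semicolon": lambda text: text.split(";"),
    "import_comma": lambda text: text.split(","),
}


def run_end_to_end(steps):
    """Actually execute the chain and check that the data round-trips."""
    data = ["a", "b"]
    for step in steps:
        data = primitives[step](data)
    return data == ["a", "b"]


def admit_composite(alphabet, name, steps):
    """Admit a composite only after its own end-to-end demonstration succeeds."""
    if not all(step in alphabet for step in steps):
        return False                 # every step must itself be demonstrated
    if not run_end_to_end(steps):
        return False                 # demonstration is not inherited from primitives
    alphabet.add(name)
    return True


alphabet = set(primitives)

bad = admit_composite(alphabet, "round_trip_v1", ("export", "import_semicolon"))
good = admit_composite(alphabet, "round_trip_v2", ("export", "import_comma"))
```

Only round_trip_v2 enters the alphabet. The first composite fails its end-to-end run even though both of its steps are demonstrated primitives.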

In a fourth embodiment, the capability vector is partitioned by context. Different contexts, for example different tenants, environments, or trust levels, see different vectors. The planner is invoked with an explicit context identifier, and the vector consulted is the vector valid in that context. This embodiment supports multi-tenant deployments in which capability is not a global property of the agent but a relational property between agent and context.
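A context-partitioned lookup could be as simple as the sketch below. The tenant and environment names, the tuple-shaped context identifier, and the choice to give an unknown context an empty alphabet are assumptions made for illustration.

```python
def alphabet_for(vectors, context):
    """Return the vector valid in the given context.

    Assumption: a context with no recorded demonstrations gets an empty
    alphabet rather than falling back to some global default.
    """
    return vectors.get(context, frozenset())


def candidate_actions(vectors, context):
    """The planner is always invoked with an explicit context identifier."""
    return sorted(alphabet_for(vectors, context))


# Hypothetical vectors keyed by (tenant, environment).
vectors = {
    ("acme", "staging"): frozenset({"read_report", "delete_report"}),
    ("acme", "prod"): frozenset({"read_report"}),
}

staging = candidate_actions(vectors, ("acme", "staging"))
prod = candidate_actions(vectors, ("acme", "prod"))
unknown = candidate_actions(vectors, ("globex", "prod"))
```

The same agent can plan delete_report in staging but not in production, which reflects capability as a relation between agent and context rather than a global property.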

Composition

Constrained behavior composes with the agent's broader capability-awareness surface. The native computation of the capability vector, in which the vector is maintained at the agent rather than by an external scoring service, ensures that the planner's alphabet is grounded in the agent's own demonstration record rather than in a third-party assessment that the agent cannot verify. The lineage and provenance machinery records every planning episode, including the alphabet that was in effect at the time, so that an auditor reconstructing the agent's behavior can verify which actions were available and which were not.
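To illustrate how a planning episode might be recorded with its alphabet, the sketch below appends one entry per episode and lets an auditor ask whether an action was available at that time. The record layout, the SHA-256 digest over the entry, and both function names are assumptions; the source does not describe the lineage format.

```python
import hashlib
import json


def record_episode(log, goal, alphabet, plan):
    """Append one planning episode, including the alphabet in effect."""
    entry = {
        "goal": goal,
        "alphabet": sorted(alphabet),
        "plan": list(plan) if plan is not None else None,
    }
    # Digest over the entry contents so later tampering is detectable.
    entry["digest"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode("utf-8")
    ).hexdigest()
    log.append(entry)
    return entry


def was_available(log, episode_index, action):
    """Auditor's question: could the planner have used this action then?"""
    return action in log[episode_index]["alphabet"]


log = []
record_episode(log, "publish_report", {"fetch", "render"}, ["fetch", "render"])
record_episode(log, "publish_report", {"fetch"}, None)   # render suspended
```

An auditor reading this log can see that the second episode produced no plan because render was absent from the alphabet, not because the planner declined to use it.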

Constrained behavior also composes with the inference-control surface. The semantic admissibility gate that governs each inference transition is a separate enforcement layer that operates on individual transitions; constrained behavior operates at the planning layer above it. The two layers are complementary: the planner cannot emit out-of-capability plans, and the gate cannot admit out-of-policy transitions, so the agent's actual behavior lies within the intersection of the two constraints.

Prior-Art Distinctions

Conventional agent architectures treat capability as a runtime check. The planner emits a plan freely, the executor attempts each step, and steps that exceed the agent's actual capability fail at execution and trigger refusal, retry, or fallback. This pattern has three characteristic weaknesses. First, the failed step has already consumed resources by the time the failure is detected. Second, the plan as a whole may have been structured around the assumption that the failed step would succeed, and the failure cascades. Third, the agent's behavioral record contains the attempt, which inflates apparent risk and complicates audit. Constrained behavior avoids all three by ensuring the action never enters the plan.

Conventional safety frameworks also treat refusal as the safety boundary. An action is generated, classified, and refused. Constrained behavior moves the boundary upstream of generation: the action is not generated, because the alphabet does not contain it. This is structurally stronger than refusal because it does not depend on the classifier being correct; the classifier is not invoked.
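The contrast between the two boundaries can be made concrete with a toy comparison. Both pipelines below are hypothetical simplifications: the refusal pipeline generates from an unrestricted universe and classifies each candidate, while the constrained pipeline generates only from the alphabet and has no classifier to consult.

```python
def refusal_pipeline(universe, allowed, classifier_calls):
    """Generate freely, then classify and refuse; every attempt is recorded."""
    record = []
    for action in universe:
        classifier_calls.append(action)      # safety depends on this check
        verdict = "accepted" if action in allowed else "refused"
        record.append((action, verdict))
    return record


def constrained_pipeline(alphabet):
    """Generate only from the alphabet; there is nothing to refuse."""
    return [(action, "accepted") for action in alphabet]


universe = ["read_row", "write_row", "drop_table"]
alphabet = ["read_row", "write_row"]

classifier_calls = []
refusal_record = refusal_pipeline(universe, set(alphabet), classifier_calls)
constrained_record = constrained_pipeline(alphabet)
```

The refusal record contains a refused drop_table attempt and three classifier invocations. The constrained record contains only the two feasible actions and involves no classifier, so its correctness does not rest on one.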

Failure Modes Prevented

Constrained behavior is targeted at a specific cluster of failure modes that recur in conventional agent architectures. The first is the optimistic-plan failure, in which the planner assembles a plan that includes a step the agent cannot actually perform, the plan as a whole is committed, and the failure of the unsupported step occurs partway through execution. By that point the agent has typically taken irrevocable actions on prior steps, and the rollback is either impossible or expensive. The mechanism prevents this by ensuring that the unsupported step never enters the plan, so the optimistic commitment never occurs.

The second is the near-miss-pollution failure, in which the agent generates many candidate plans containing out-of-capability actions, the actions are filtered late in the pipeline, and the filter rejects them. The behavioral record of the agent then contains many near-miss attempts, which complicates audit and inflates apparent risk metrics that are computed over the candidate set. The mechanism prevents this by structurally excluding out-of-capability actions from the candidate set, so the audit and the risk metrics see only feasible plans.

The third is the substrate-drift failure, in which a capability that was once demonstrated remains in the alphabet after the underlying substrate has changed in a way that invalidates the demonstration. The agent continues to plan as if the capability were current, and the failure surfaces at execution. The validity-window machinery prevents this by suspending entries whose underlying substrate has changed, removing them from the alphabet until a fresh demonstration is recorded.

The fourth is the composite-extrapolation failure, in which a sequence of individually demonstrated steps is assumed to compose into a working sequence even though the composition has not been demonstrated. The hierarchical embodiment prevents this by requiring composite entries to be demonstrated end-to-end before they are admitted to the alphabet, so the planner does not extrapolate from primitives to composites without evidence.

Disclosure Scope

The mechanism described herein is a component of the cognition patent's capability-awareness surface. The disclosure covers the typed capability vector grounded in demonstration records, the planner's use of the vector as its action alphabet, the structural impossibility of emitting out-of-capability plans, the validity-window machinery that suspends entries when the substrate changes, the structured infeasibility record returned when a goal cannot be planned, and the hierarchical, dynamic, context-partitioned, and statically enumerated embodiments. The disclosure is intended to cover any system in which an agent's planner is structurally restricted to a demonstrated action alphabet such that out-of-capability actions cannot be planned, regardless of the planner's underlying scoring or generation method.