Confidence-Driven Inquiry Mode

by Nick Clark | Published March 27, 2026

When the agent's computed confidence value falls below a configured sufficiency threshold, the cognitive runtime does not produce a best-guess output. It transitions deterministically into a structured inquiry mode that solicits clarification, gathers targeted evidence, generates and tests hypotheses, and re-evaluates confidence before resuming substantive action. Inquiry is bounded, auditable, and policy-governed; abstention is treated as a first-class behavior rather than a failure path.


Mechanism

The inquiry-mode mechanism is defined in Chapter 5 of the cognition patent as a structural component of the agent's cognitive architecture. It is triggered by a deterministic comparison of a recomputed confidence scalar against a sufficiency threshold drawn from the active policy reference. When confidence is insufficient, the runtime suspends the substantive action that would otherwise have been taken and enters a structured inquiry state. Within that state, the agent executes a bounded sequence of operations: information ingestion, hypothesis generation, hypothesis ranking, evidence solicitation, and confidence re-evaluation.
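The gating rule and the bounded operation sequence can be sketched as follows. This is a minimal illustration, not the disclosed implementation; the names `Mode`, `evaluate_mode`, and `INQUIRY_OPERATIONS` are assumptions introduced here.

```python
# Hypothetical sketch of the deterministic inquiry-mode gate described above.
from enum import Enum, auto

class Mode(Enum):
    SUBSTANTIVE = auto()
    INQUIRY = auto()

# The bounded sequence executed inside the inquiry state, in order.
INQUIRY_OPERATIONS = (
    "information_ingestion",
    "hypothesis_generation",
    "hypothesis_ranking",
    "evidence_solicitation",
    "confidence_reevaluation",
)

def evaluate_mode(confidence: float, sufficiency_threshold: float) -> Mode:
    """Deterministic comparison: insufficient confidence suspends the
    substantive action and enters the structured inquiry state."""
    if confidence < sufficiency_threshold:
        return Mode.INQUIRY
    return Mode.SUBSTANTIVE
```

Because the comparison is a pure function of the confidence scalar and the policy threshold, the same inputs always produce the same mode decision.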

Information ingestion is the first operation. The runtime identifies the canonical fields whose insufficiency or contradiction caused confidence to fall, and it constructs targeted retrieval queries against those fields. Retrieval is restricted by policy to authorized sources; the inquiry mode never expands the agent's authority beyond what its certified skills allow. Each ingested datum is recorded in lineage with its source, timestamp, and the field it is intended to populate.
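A dict-based lineage record is one way to realize the ingestion bookkeeping described above; the record layout and the example field and source names are illustrative assumptions, not part of the disclosure.

```python
# Illustrative sketch: each ingested datum is recorded with its source,
# timestamp, and the canonical field it is intended to populate.
from datetime import datetime, timezone

def ingest(field: str, source: str, value, lineage: list) -> dict:
    """Append a lineage record for one ingested datum and return it."""
    datum = {
        "field": field,
        "source": source,
        "value": value,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    lineage.append(datum)
    return datum

lineage: list = []
# Hypothetical example: populating an underdetermined canonical field.
ingest("shipment.destination", "orders_db", "Rotterdam", lineage)
```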

Hypothesis generation produces a small, bounded set of candidate interpretations of the underdetermined situation. Each hypothesis is structured: it consists of a proposed assignment to the contested fields, an explicit set of assumptions, and a predicted observation that would distinguish it from competing hypotheses. The number of hypotheses generated is capped by policy so that inquiry cost remains bounded.
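The three-part hypothesis structure maps naturally onto a small record type. A minimal sketch, assuming a dataclass representation; all names are illustrative.

```python
# Hypothetical representation of the structured hypothesis described above.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    assignment: dict            # proposed values for the contested fields
    assumptions: tuple          # explicit assumptions the hypothesis rests on
    predicted_observation: str  # observation that would distinguish it

def generate_hypotheses(candidates: list, max_hypotheses: int) -> list:
    """Policy-bounded generation: keep at most `max_hypotheses` candidates."""
    return candidates[:max_hypotheses]
```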

Evidence solicitation is the externally visible face of inquiry. The runtime composes a clarification request, a sensor query, or a tool invocation whose answer would maximally discriminate among the live hypotheses. When the operator is human, this surfaces as a structured question rather than a free-form prompt. When the operator is upstream machinery, it surfaces as a structured query against a defined interface. Confidence re-evaluation closes the loop: incoming evidence is merged into the candidate state, the confidence scalar is recomputed, and the runtime either returns to substantive action, issues a further inquiry, or abstains.
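The close of the loop — merge evidence, recompute, then act, ask again, or abstain — can be sketched as a single step function. This is a hedged illustration; `recompute` stands in for the deterministic confidence function, and the string outcomes are labels introduced here.

```python
# Sketch of one confidence re-evaluation cycle, under assumed names.
def step(state: dict, evidence: dict, threshold: float,
         budget_left: int, recompute) -> str:
    state.update(evidence)            # merge evidence into candidate state
    confidence = recompute(state)     # deterministic recomputation
    if confidence >= threshold:
        return "substantive"          # resume substantive action
    if budget_left > 0:
        return "inquire"              # issue a further inquiry
    return "abstain"                  # budget exhausted: terminal abstention
```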

Abstention is treated as a deterministic terminal state. If the inquiry budget is exhausted without confidence rising above threshold, the agent records the unresolved underdetermination, declines to act, and emits a structured abstention record. The abstention is not a crash and not a silent timeout; it is a first-class output with the same lineage status as a substantive answer.

The transition between substantive mode and inquiry mode is itself a recorded lineage event. The recorded transition contains the pre-transition confidence value, the threshold that triggered the transition, the canonical fields whose insufficiency drove the transition, and the policy version under which the transition was evaluated. This record is what permits an auditor to ask not only "what did the agent answer" but "why did the agent ask this question instead of answering" and to verify that the question-versus-answer choice was governed by the same deterministic rule applied uniformly across the agent's history.
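The transition record carries exactly the four elements the text names. A minimal sketch; the record format itself is an assumption, not the disclosed schema.

```python
# Illustrative lineage record for the substantive-to-inquiry transition.
def transition_record(confidence: float, threshold: float,
                      insufficient_fields, policy_version: str) -> dict:
    return {
        "event": "substantive_to_inquiry",
        "pre_transition_confidence": confidence,   # value that fell short
        "threshold": threshold,                    # rule that triggered it
        "insufficient_fields": list(insufficient_fields),
        "policy_version": policy_version,          # policy in force
    }
```

An auditor replaying lineage can check each recorded transition against the same deterministic rule: the recorded confidence must be below the recorded threshold.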

Hypothesis ranking, although bounded in count, is not arbitrary in order. Each generated hypothesis carries a prior probability derived from the agent's existing canonical state and a likelihood factor derived from the observations available at the time of generation. The combined score determines which hypothesis is treated as the working interpretation if the inquiry budget is exhausted before discrimination is complete. The working interpretation is never silently adopted as truth; it is emitted as an explicit lower-confidence answer accompanied by the dispositive question that was not answered.
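The ranking rule — prior times likelihood, highest combined score becomes the working interpretation — can be sketched directly. The tuple layout and the example numbers are illustrative assumptions.

```python
# Sketch of combined-score ranking: score = prior * likelihood.
def rank(hypotheses: list) -> list:
    """hypotheses: list of (name, prior, likelihood) tuples,
    returned in descending order of combined score."""
    return sorted(hypotheses, key=lambda h: h[1] * h[2], reverse=True)

# Hypothetical example: 0.6 * 0.5 = 0.30 beats 0.3 * 0.9 = 0.27,
# so "occupied" would be the working interpretation on budget exhaustion.
working = rank([("occupied", 0.6, 0.5), ("occluded", 0.3, 0.9)])[0]
```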

Operating Parameters

Operating parameters governing inquiry mode are declarative and policy-resident. The sufficiency threshold is a scalar in a normalized confidence space, configurable per skill, per domain, and per risk tier. Hysteresis margins prevent the agent from oscillating between substantive and inquiry states near the threshold boundary; entering inquiry uses one threshold, exiting uses a slightly higher one.
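The hysteresis rule can be sketched as a two-threshold transition function: entry uses the base threshold, exit requires a slightly higher one. The margin value is an illustrative assumption.

```python
# Minimal sketch of hysteresis around the sufficiency threshold.
def next_mode(mode: str, confidence: float,
              enter_threshold: float, exit_margin: float = 0.05) -> str:
    """Enter inquiry below `enter_threshold`; exit only at or above
    `enter_threshold + exit_margin`, preventing oscillation."""
    exit_threshold = enter_threshold + exit_margin
    if mode == "substantive" and confidence < enter_threshold:
        return "inquiry"
    if mode == "inquiry" and confidence >= exit_threshold:
        return "substantive"
    return mode
```

A confidence value of, say, 0.71 against a 0.70 threshold would keep an agent already in inquiry mode in inquiry, rather than letting it flicker across the boundary.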

The inquiry budget bounds total cost: a maximum hypothesis count, a maximum evidence-solicitation count, a wall-clock deadline, and a cumulative cost ceiling for retrieval and tool invocations. When any budget element is exhausted, the runtime is required to terminate inquiry and emit either a substantive answer (if confidence has risen) or an abstention record. Budgets are tunable per deployment without code change.
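One way to realize the four-element budget is as a declarative mapping checked on every cycle; the values below are arbitrary examples, and the key names are assumptions introduced here.

```python
# Illustrative declarative inquiry budget, tunable without code change.
BUDGET = {
    "hypotheses": 3,          # maximum hypothesis count
    "solicitations": 5,       # maximum evidence-solicitation count
    "elapsed_seconds": 30.0,  # wall-clock deadline
    "cost": 10.0,             # cumulative retrieval/tool cost ceiling
}

def budget_exhausted(usage: dict, budget: dict = BUDGET) -> bool:
    """Inquiry must terminate when any single bound is reached."""
    return any(usage.get(key, 0) >= limit for key, limit in budget.items())
```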

Discrimination policy governs which hypothesis-distinguishing question is asked first. The default policy selects the question whose expected information gain, normalized by elicitation cost, is greatest. Alternative policies can prefer the cheapest discriminating question, the least intrusive question to the operator, or a question whose answer is independently verifiable. The policy is selectable per deployment and recorded in lineage so that auditors can reconstruct why a particular clarification was issued.
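The default policy — greatest expected information gain per unit elicitation cost — reduces to a ratio maximization. A hedged sketch; the question texts and numbers are hypothetical.

```python
# Sketch of the default discrimination policy described above.
def select_question(questions: list) -> tuple:
    """questions: list of (text, expected_gain_bits, elicitation_cost)
    tuples; returns the one maximizing gain normalized by cost."""
    return max(questions, key=lambda q: q[1] / q[2])

# Hypothetical example: 0.5/1.0 = 0.50 beats 0.9/3.0 = 0.30, so the
# cheaper sensor re-sweep is issued before the operator is asked.
chosen = select_question([
    ("Is the lane occluded?", 0.9, 3.0),
    ("Re-sweep the sensor?", 0.5, 1.0),
])
```

Swapping the key function (cheapest cost, least intrusive, independently verifiable) yields the alternative policies the text names.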

Confidence recomputation cadence is a configurable parameter. In low-latency deployments the scalar is recomputed each time a new datum is admitted to candidate state; in batch deployments it is recomputed at the end of each evidence-gathering cycle. Recomputation is deterministic: identical inputs produce identical confidence values, ensuring that inquiry behavior is reproducible from lineage alone.
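Determinism here just means the confidence function is pure in the admitted data. A toy sketch under that assumption; the fraction-of-required-fields scoring and the field names are illustrative, not the disclosed computation.

```python
# Toy deterministic confidence scalar: a pure function of candidate state,
# so identical inputs always yield identical values and inquiry behavior
# can be replayed from lineage alone.
def recompute_confidence(fields: dict) -> float:
    """Illustrative scoring: fraction of required fields populated."""
    required = ("origin", "destination", "weight")
    present = sum(1 for f in required if fields.get(f) is not None)
    return present / len(required)
```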

Alternative Embodiments

In a clinical decision-support embodiment, inquiry mode is triggered when the confidence on a differential-diagnosis ranking falls below a regulator-approved threshold. The agent ingests structured patient history fields, generates ranked candidate diagnoses, and emits a clarification request to the clinician identifying the single most informative additional finding. Abstention surfaces as an explicit "insufficient information for recommendation" record rather than a low-confidence guess.

In an autonomous-vehicle embodiment, inquiry mode is triggered when sensor fusion confidence on a lane-occupancy classification drops below the threshold required for a planned maneuver. The agent solicits additional sensor sweeps, narrows hypotheses to {occupied, unoccupied, occluded}, and either resumes the maneuver after confidence is restored or transitions to a conservative fallback (slowing, yielding, or requesting handoff) without ever executing the original maneuver under insufficient confidence.

In a customer-support agent embodiment, inquiry mode surfaces as a structured clarifying question rather than a hallucinated answer. The agent identifies which user-intent field is underdetermined, generates two or three candidate interpretations, and asks the user which one matches their need. The interaction is logged and reusable as training signal because each clarification is structured and tied to a specific field.

In an enterprise-tooling embodiment, inquiry mode is implemented as a multi-stage retrieval-augmented loop. The first inquiry tier is internal: query authorized data stores. The second is external: prompt the human operator. Tier ordering is policy-controlled so that inexpensive automated inquiry is exhausted before any human is interrupted.

Composition With Other Mechanisms

Inquiry mode composes with the integrity-feedback loop: an inquiry that surfaces an integrity violation in a candidate proposal raises the threshold for subsequent recomputation cycles, making the runtime progressively more conservative within a session. It composes with the skill-gating mechanism: an inquiry cannot invoke a skill the agent is not certified for, so the runtime never escalates its own authority during inquiry.

Inquiry mode also composes with lineage and provenance: every datum admitted during inquiry carries its source, and every hypothesis generated carries the lineage of the inputs that produced it. The downstream substantive action is therefore traceable not only to its inputs but to the inquiry chain that admitted those inputs in the first place. This composition is what makes the mechanism auditable end-to-end rather than only at its terminal output.

Inquiry mode further composes with multi-agent orchestration. When one agent enters inquiry, downstream agents that were waiting on its output do not receive a degraded best-guess; they receive a structured pending-inquiry signal that allows them to either pause their own processing or branch to an alternative information path. This propagation is itself bounded and recorded, so that the entire orchestration's response to a single underdetermined field is reconstructible from lineage.

Distinction From Prior Art

Prior systems address insufficient confidence in three principal ways, each of which is distinct from the claimed mechanism. Sampling-based abstention frameworks reject low-confidence outputs but do not gather additional evidence; they do not enter a structured inquiry state with bounded operations. Active-learning loops solicit additional labels but operate at training time rather than at inference time and do not gate substantive action on the result.

Conversational clarification systems ask follow-up questions but do so heuristically, without a deterministic confidence scalar, without an inquiry budget, and without lineage of which fields were underdetermined. The claimed mechanism differs by being deterministic, policy-governed, lineage-recorded, and bounded; abstention is a first-class output rather than a fallback; and inquiry is structurally integrated with the rest of the agent's cognitive architecture rather than bolted on as a dialog manager.

Disclosure Scope

The disclosure encompasses any cognitive runtime in which a recomputed confidence scalar gates a transition into a structured inquiry state, where the inquiry state is bounded by policy-resident parameters, executes a deterministic sequence of information ingestion and hypothesis generation operations, and either restores substantive action upon confidence recovery or emits a first-class abstention record on budget exhaustion. The scope is independent of the modality of the underlying agent — text, multimodal, robotic, or embedded — and independent of the substrate on which the confidence scalar is computed.

The mechanism may be deployed as a software library, a runtime service, an embedded component, or a regulated certified module. In every embodiment the structural elements are the same: a confidence scalar, a sufficiency threshold, an inquiry budget, a hypothesis-generation operation, an evidence-solicitation operation, a recomputation operation, and an abstention terminal state. Implementations that omit any of these elements are outside the claimed scope; implementations that include them are within scope regardless of programming language, deployment topology, or domain of application.

The disclosure also extends to embodiments in which inquiry mode is invoked as a service from agents that do not themselves implement the confidence scalar — a thin-client agent may delegate confidence evaluation to a remote service, receive back either a substantive answer or a structured inquiry directive, and surface the resulting clarification to its operator. Distribution of the mechanism across processes or hosts does not remove an embodiment from scope so long as the structural elements remain present and the lineage chain remains reconstructible across the distribution boundary.
