Autonomous Agent Fleet Health Through Coherence Diagnostics

by Nick Clark | Published March 27, 2026

Modern fleets are not homogeneous. A long-haul motor carrier runs human drivers governed by FMCSA Part 395 hours-of-service rules alongside SAE J3016 Level 4 autonomous tractors subject to crash reporting under the NHTSA Standing General Order. A maritime operator pairs STCW-credentialed bridge officers with autonomous-collision-avoidance subsystems. An airline blends IATA Operations Manual SOPs with AI-mediated dispatch advisories. Each of these is a mixed fleet, and each is regulated under standards that assume an individual operator is the unit of accountability. Fleet coherence diagnostics, grounded in disruption modeling, treat the mixed fleet itself as the unit of safety - detecting cross-agent coordination decay, semantic starvation between humans and AI, and collective phase shifts before they become reportable incidents.

Regulatory Framework

The regulatory architecture surrounding mixed fleets is fragmented along modality lines but converging on a shared expectation: the operator must be able to demonstrate, in real time and after the fact, that the fleet as a whole is operating safely. SAE J3016 establishes the taxonomy of driving automation levels and is the foundation that NHTSA, FMCSA, and state regulators reference. The NHTSA Standing General Order on Crash Reporting requires manufacturers and operators of SAE Level 2 and above systems to report crashes within strict deadlines, and the agency's AV TEST Initiative collects voluntary operational data that increasingly informs enforcement priorities.

For commercial road fleets, FMCSA hours-of-service regulations under 49 CFR Part 395 govern human driver duty cycles, and the Electronic Logging Device mandate produces structured data that any mixed-fleet diagnostic must reconcile with autonomous-segment telemetry. ISO 21448 (Safety of the Intended Functionality) is the authoritative standard for hazards arising from performance limitations of the intended function, including foreseeable misuse - precisely the regime in which mixed-fleet coherence problems live. ISO 26262 governs functional safety for road vehicles, ISO/SAE 21434 governs cybersecurity engineering, and the two together set the safety-case expectations that underwrite fleet-level claims.

In the maritime domain, the IMO STCW Convention establishes the human-side competency baseline, and emerging MASS (Maritime Autonomous Surface Ships) guidance is extending similar reasoning to autonomous segments. In aviation, IATA Operations Manual frameworks codify the SOPs with which mixed human-AI dispatch and flight-management workflows must remain consistent. Across all of these, the operator is being asked, increasingly explicitly, to demonstrate that the fleet behaves coherently as a system, not merely that each operator or subsystem passed its own test.

Architectural Requirement

The architectural requirement is fleet-level cognitive instrumentation. The fleet must be modeled as a coupled system whose coherence is observable in inter-agent communication, decision consistency across mixed human and AI segments, and the alignment between collective intent and collective behavior. This is not the same as aggregating individual telemetry. Aggregation produces averages; coherence diagnostics produce phase-state indicators - whether the fleet is in a flexible, adaptively coordinating regime or in a rigid, narrowly executing regime - and detect the transitions between them.

The instrumentation must span heterogeneous agents. A driver-hours feed from an ELD, a perception-confidence stream from an autonomous tractor, a dispatcher's voice-channel cadence, and a routing-AI's plan-revision frequency must all enter the same coherence model. The model must treat humans and AI as cognitive participants whose contributions to fleet coherence are commensurable even when their internal mechanisms are not. ISO 21448 hazards - the SOTIF residual risks - emerge most often at exactly these interfaces, and a diagnostic that does not reach across the interface cannot detect them.
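One way to make such heterogeneous signals commensurable is to express each as a deviation from its own baseline before it enters the model. The following is a minimal sketch under that assumption; the `Signal` type, the agent-class labels, and the z-score choice are illustrative and are not taken from the AQ primitive itself.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Signal:
    agent_id: str
    agent_class: str  # hypothetical labels, e.g. "human_driver", "l4_tractor"
    name: str         # hypothetical signal name, e.g. "plan_revisions_per_hour"
    value: float


def zscore(value, baseline):
    """Express a raw reading as standard deviations from its own baseline."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return 0.0  # a flat baseline carries no scale; treat as no deviation
    return (value - mu) / sigma


def normalize(signals, baselines):
    """Map every signal onto a unitless scale, keyed by (agent, signal name).

    `baselines` maps a signal name to its historical readings.
    """
    return {
        (s.agent_id, s.name): zscore(s.value, baselines[s.name])
        for s in signals
    }
```

After normalization, an ELD duty-hours reading and a perception-confidence reading are both unitless deviations, which is one plausible reading of "commensurable" in the paragraph above.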

The architecture must also produce evidence that an investigator or regulator can read. NHTSA SGO crash reports, FMCSA audits, USCG casualty investigations, and IATA safety reviews all share a structural need: the operator must show what the fleet was doing in the minutes and hours leading up to an event. A fleet-coherence record that captures phase-state, semantic-starvation indicators, and group-coherence trajectories provides the temporal context that individual-operator logs cannot.

Why Procedural Compliance Fails

The dominant industry response to mixed-fleet risk is procedural: a Safety Management System document, a SOTIF analysis filed at program start, a set of SOPs distributed to drivers and dispatchers, an annual functional-safety audit. Each of these is necessary, and none is sufficient. The procedural compliance regime assumes that risks visible at design time and risks visible at the individual-operator level cover the relevant hazard space. Fleet coherence problems live in the residual.

Consider a long-haul carrier with two hundred autonomous tractors and four hundred human drivers sharing the same dispatch-planning AI. Each driver is HOS-compliant. Each autonomous tractor is operating within its ODD. Each dispatch decision is within SOP. Yet over a three-week period, the dispatch AI's plan-revision frequency rises, drivers begin to override AI-suggested routes more often, and the autonomous tractors increasingly decline route-handover requests in marginal weather. No individual metric crosses a threshold. The fleet is undergoing a coherence phase shift toward rigid, conservative, narrowly executed behavior - and the SOTIF residual risk is climbing. The procedural artifacts cannot see this. Only a fleet-level coherence diagnostic can.
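The pattern in this scenario, a sustained drift that never crosses a per-metric threshold, can be illustrated with a one-sided CUSUM detector. This is a generic change-detection technique substituted here for illustration; the source does not say which detector the diagnostic uses, and the override-rate numbers and parameter values below are invented.

```python
def cusum_upper(values, target, slack, limit):
    """Return the index of the first sample at which the one-sided CUSUM
    statistic exceeds `limit`, or None if it never does.

    target: expected level of the metric under normal operation
    slack:  deviation per sample that is tolerated without accumulating
    limit:  accumulated excess that triggers the alarm
    """
    s = 0.0
    for i, x in enumerate(values):
        # Accumulate only the excess above target + slack; never go negative.
        s = max(0.0, s + (x - target - slack))
        if s > limit:
            return i
    return None


# Hypothetical daily driver-override rate (percent). A static per-metric
# threshold of 20 would never fire on this series.
override_rate = [10, 10, 11, 11, 12, 12, 13, 13, 14, 14]
alarm_day = cusum_upper(override_rate, target=10, slack=0.5, limit=5)
```

The detector accumulates small excesses that a static threshold discards, so it raises an alarm partway through the series even though every individual reading looks acceptable.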

The same pattern recurs across modalities. A maritime operator's STCW-credentialed officers and autonomous-collision-avoidance subsystems can each be individually compliant while the bridge-AI semantic channel quietly starves of context as software revisions narrow what the AI reports. An airline's dispatcher and route-optimization AI can each be SOP-conformant while their joint plan-revision cadence drifts into channel-locked promotion, where the human increasingly defers to the AI on decisions the SOP allocates to the human. ISO 21448 explicitly contemplates these as foreseeable hazards. Procedural compliance does not produce the evidence that they are not occurring.

The NHTSA Standing General Order makes this gap operationally concrete. After an incident, the operator must produce a temporal account of fleet behavior. An operator whose monitoring is purely individual cannot answer fleet-level questions: was the fleet in a degraded coordination regime? Were specific subgroups exhibiting containment collapse? Was the dispatch-AI semantic channel saturated or starved? The investigation proceeds without the evidence that would explain the incident, and the operator's safety case is weaker than it needs to be.

What the AQ Primitive Provides

The AQ disruption-modeling primitive instruments mixed fleets with a coherence diagnostic that operates continuously across heterogeneous agents. The fleet is modeled as a coupled cognitive system with a measurable promotion-containment state, where promoted regimes correspond to flexible, adaptive coordination and contained regimes correspond to rigid, narrowly executed behavior. Phase-shift detection identifies transitions between regimes in time to intervene before residual SOTIF risk materializes.

Semantic-starvation diagnostics measure the information density flowing across human-AI and AI-AI interfaces. A driver who is no longer receiving meaningful context from the dispatch AI, or an autonomous tractor whose perception-confidence reporting has narrowed because of a software revision, registers as a starvation signal. Group-coherence tracking measures the alignment of intent and behavior across the fleet, surfacing the pattern in which individuals remain compliant while the collective drifts. The five-axis diagnostic framework - attention fragmentation, containment collapse, channel-locked promotion, semantic starvation, group-coherence decline - provides actionable alert categories that map cleanly to the operational interventions a fleet manager can take.
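As a rough illustration of measuring information density on an interface, the sketch below uses Shannon entropy over the tokens in a message stream as a proxy and flags starvation when it falls well below a baseline. The entropy proxy, the `min_ratio` parameter, and the enum values are assumptions made for this example; only the five category names come from the text above.

```python
import math
from collections import Counter
from enum import Enum


class AlertCategory(Enum):
    ATTENTION_FRAGMENTATION = "attention_fragmentation"
    CONTAINMENT_COLLAPSE = "containment_collapse"
    CHANNEL_LOCKED_PROMOTION = "channel_locked_promotion"
    SEMANTIC_STARVATION = "semantic_starvation"
    GROUP_COHERENCE_DECLINE = "group_coherence_decline"


def shannon_entropy(tokens):
    """Entropy in bits of the empirical token distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def starvation_alert(baseline_tokens, current_tokens, min_ratio=0.5):
    """Flag starvation when current entropy falls below a fraction of baseline."""
    base = shannon_entropy(baseline_tokens)
    if base == 0:
        return None  # no usable baseline to compare against
    if shannon_entropy(current_tokens) / base < min_ratio:
        return AlertCategory.SEMANTIC_STARVATION
    return None
```

A channel that once carried varied context and now repeats a single status code would show a sharp entropy drop under this proxy, which matches the narrowed-reporting case described above.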

The primitive emits a structured fleet-coherence record at a configurable cadence, with each record signed and time-anchored. The record schema is designed to be ingestible by NHTSA SGO reporting workflows, FMCSA audit systems, USCG casualty investigation processes, and IATA safety-review pipelines. The operator does not have to invent a regulator-facing artifact after an incident; the artifact has been accumulating throughout normal operations.
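A signed, time-anchored record could look roughly like the sketch below. The field names are hypothetical, and HMAC-SHA256 over canonical JSON stands in for whatever signing scheme the primitive actually uses, which the source does not specify; a production deployment might well prefer asymmetric signatures and an external timestamping service.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def sign_record(body, key):
    """HMAC-SHA256 over a canonical (sorted-key, compact) JSON encoding."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def make_record(fleet_id, phase_state, starvation_index, group_coherence, key, now=None):
    """Build one fleet-coherence record; all field names are illustrative."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    body = {
        "fleet_id": fleet_id,
        "timestamp": timestamp,
        "phase_state": phase_state,
        "starvation_index": starvation_index,
        "group_coherence": group_coherence,
    }
    return {**body, "signature": sign_record(body, key)}


def verify_record(record, key):
    """Recompute the signature over everything except the signature field."""
    body = {k: v for k, v in record.items() if k != "signature"}
    return hmac.compare_digest(sign_record(body, key), record.get("signature", ""))
```

Canonical encoding matters here: without sorted keys and fixed separators, two semantically identical records could serialize differently and fail verification.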

Compliance Mapping

The AQ fleet-coherence diagnostic maps to the relevant standards as follows. SAE J3016 level designations are encoded as agent-class metadata so that coherence diagnostics distinguish Level 2, Level 4, and human-only segments. NHTSA Standing General Order obligations are supported by the persistent fleet-coherence record, which provides the temporal context required for crash narratives. NHTSA AV TEST data submissions are supported by structured exports of phase-state and group-coherence trajectories.
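Encoding the automation level as agent-class metadata might be sketched as follows. The `AgentClass` type and the convention of using `None` for human-only segments are assumptions for illustration; the only fact relied on is that SAE J3016 defines levels 0 through 5.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentClass:
    agent_id: str
    sae_level: Optional[int]  # None marks a human-only segment (assumed convention)

    def __post_init__(self):
        if self.sae_level is not None and not 0 <= self.sae_level <= 5:
            raise ValueError("SAE J3016 defines levels 0 through 5")

    @property
    def segment(self) -> str:
        if self.sae_level is None:
            return "human_only"
        return f"sae_level_{self.sae_level}"


def group_by_segment(agents):
    """Partition agent ids by segment so diagnostics can be run per segment."""
    groups = defaultdict(list)
    for agent in agents:
        groups[agent.segment].append(agent.agent_id)
    return dict(groups)
```

Keeping the level on the agent record, rather than inferring it from telemetry, lets the same coherence computation be sliced by segment without changing the model.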

FMCSA Part 395 HOS data is ingested as a first-class signal in the human-segment cognitive model, allowing the coherence diagnostic to correlate driver fatigue trajectories with fleet-level coordination indicators. ISO 21448 SOTIF residual-risk arguments are strengthened by the diagnostic's explicit instrumentation of human-AI interface hazards and the documented detection of coherence-degrading conditions. ISO 26262 functional safety cases are extended at the fleet level by the addition of coherence-state evidence to existing item-level safety arguments. ISO/SAE 21434 cybersecurity-engineering reviews are supported by the diagnostic's anomaly-detection signal, which can surface compromise-induced behavior before signature-based systems do.

USCG STCW-aligned bridge operations gain a vessel-level coherence record that complements individual watchstanding logs. IATA Operations Manual SOPs are evaluated against actual fleet behavior through the channel-locked-promotion and semantic-starvation indicators, which detect the conditions under which SOP allocations of authority quietly drift in practice. Across all modalities, the diagnostic produces a single coherent evidence stream that an investigator, auditor, or regulator can read against the standard most relevant to the modality.

Adoption Pathway

Adoption proceeds in three phases. Phase one is signal onboarding: the operator inventories the data sources that already exist - ELD feeds, autonomy telemetry, dispatcher voice and text channels, AI-system plan logs - and connects them to the AQ ingestion layer. The agent-class taxonomy is configured to reflect the actual mix of human and autonomous segments. The first phase ends with a baseline coherence record and an initial calibration of phase-state thresholds.

Phase two is regulatory-artifact integration: the fleet-coherence record schema is mapped to the operator's NHTSA SGO reporting pipeline, FMCSA audit systems, and modality-appropriate equivalents. The mapping is designed so that an investigator's request for the temporal context of an event is answered with a single export, not with a forensic reconstruction across disparate systems. The operator's safety management system documentation is updated to reference the diagnostic as the operational mechanism by which ISO 21448 SOTIF residual risks are continuously assessed.

Phase three is operational integration: phase-shift and starvation alerts are routed to the operations center, and intervention playbooks - dispatch-policy adjustments, ODD constraints, crew-communication protocols, hours-of-service margin restoration, autonomy-handover suspension - are written against the diagnostic's alert categories. Tabletop exercises validate that the playbooks produce the intended fleet-coherence response, and the post-exercise records are added to the operator's regulator-facing evidence base. From this point forward, the fleet operates with cognitive instrumentation that is commensurate with its mixed-agent reality, and its safety case is grounded in continuously generated evidence rather than periodic procedural artifacts. The operator can answer, at any moment and for any subset of the fleet, what the coherence state is, how it is trending, and which interventions are available.
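Writing playbooks against alert categories can be as simple as a lookup table that the operations center owns. In the sketch below, the category names and the interventions are those listed in this article, but the pairing between them is an invented example, not a recommended or vendor-defined mapping.

```python
# Illustrative pairing only: each operator would define its own mapping.
PLAYBOOKS = {
    "attention_fragmentation": ["hours-of-service margin restoration"],
    "containment_collapse": ["ODD constraints"],
    "channel_locked_promotion": ["dispatch-policy adjustments"],
    "semantic_starvation": ["crew-communication protocols"],
    "group_coherence_decline": ["autonomy-handover suspension",
                                "dispatch-policy adjustments"],
}


def route_alert(category, playbooks=PLAYBOOKS):
    """Return the interventions registered for an alert category.

    Raises ValueError for an unknown category so that an unmapped alert
    fails loudly instead of being silently dropped.
    """
    try:
        return list(playbooks[category])
    except KeyError:
        raise ValueError(f"no playbook for alert category {category!r}") from None
```

Failing loudly on an unmapped category is a deliberate choice in this sketch: a tabletop exercise should reveal a missing playbook before an incident does.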