Curriculum-Integrated Depth Scheduling
by Nick Clark | Published March 27, 2026
Curriculum-integrated depth scheduling coordinates training depth profiles with the curriculum engine's progression stages, applying depth-selective gradient gating per skill stage. Skills certified at one depth tier carry forward to the next tier as composite admissibility evidence, so that progression through the curriculum is governed not merely by step counts or loss thresholds but by demonstrated competence at each authorized depth. The result is a training regime in which the system architecture itself encodes pedagogical ordering, and in which advanced skills are admitted to deep integration only after the foundations on which they depend have been certified at the same or shallower depth.
Mechanism
The curriculum engine defines a sequence of training stages, each associated with a skill manifest and a corresponding depth profile. The depth profile specifies, for each skill in the manifest, the maximum integration depth at which gradient updates are permitted to flow during the stage in question. Depth, in this architecture, corresponds to the layer index at which gradient signals attributable to a given skill are gated: shallow gating restricts gradient flow to the upper layers, while deep gating admits gradient flow into the representational substrate. The gating is selective per skill rather than global per batch, so that within a single training step the foundational skill manifest may receive deep gradient flow while an advanced skill manifest receives only shallow gradient flow on the same batch.
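The per-skill gating described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: each layer's gradient is reduced to a single float, layers are indexed from 0 (input side) to num_layers - 1 (output side), and a skill's depth is counted as the number of layers, measured down from the output, that may receive its gradient. The names layer_gates, gate_gradients, and depth_profile are hypothetical.

```python
from typing import Dict, List


def layer_gates(num_layers: int, max_depth: int) -> List[float]:
    """Return a 0/1 gate per layer (index 0 = input side).

    max_depth is the number of layers, counted down from the output,
    that are allowed to receive gradient for a given skill.
    """
    if not 0 <= max_depth <= num_layers:
        raise ValueError("max_depth must lie in [0, num_layers]")
    first_open = num_layers - max_depth
    return [1.0 if i >= first_open else 0.0 for i in range(num_layers)]


def gate_gradients(
    per_skill_grads: Dict[str, List[float]],
    depth_profile: Dict[str, int],
) -> List[float]:
    """Combine per-skill gradients after applying each skill's own gate.

    per_skill_grads maps a skill name to one gradient value per layer
    (a stand-in for a real gradient tensor). The gate is applied per
    skill, so two skills in the same step can reach different depths.
    """
    if not per_skill_grads:
        return []
    num_layers = len(next(iter(per_skill_grads.values())))
    total = [0.0] * num_layers
    for skill, grads in per_skill_grads.items():
        gates = layer_gates(num_layers, depth_profile[skill])
        for i, (grad, gate) in enumerate(zip(grads, gates)):
            total[i] += grad * gate
    return total
```

With four layers, a foundational skill at depth 4 and an advanced skill at depth 1, only the top layer receives the advanced skill's gradient, while every layer receives the foundational skill's gradient.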
Stage transition is governed by certification. The model must demonstrate competence at the current stage's skill manifest, evaluated through the same evidence-based gating mechanisms applied to skill admissibility in the deployed agent architecture, before the depth profile of the next stage is activated. Certification is depth-tagged: a skill certified at depth tier N is recorded with a credential indicating the depth at which competence was demonstrated. When the next stage admits that skill to a deeper tier, the prior certification carries forward as composite admissibility evidence, contributing to the certification budget required for the deeper tier without requiring re-certification at the shallower tier.
The carry-forward mechanism is not merely bookkeeping. The depth-tagged certification record participates in the gating predicate of the next stage: the predicate evaluates whether the skill's prior shallow-tier certification, combined with the new stage's deep-tier evidence, satisfies a composite admissibility threshold. Skills that fail certification at a tier do not carry forward. In that case the curriculum engine holds the stage open until certification is achieved, escalates to remedial sub-curricula, or marks the skill as deferred for a later stage in which alternative foundational evidence has accumulated.
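A minimal sketch of a depth-tagged credential, a composite admissibility predicate, and the failure handling may help fix ideas. Several things here are assumptions rather than part of the disclosure: the additive form of the predicate, the scalar evidence scores, the escalation order (hold, then remediate, then defer), and the attempt limits. All names (Credential, composite_admissible, next_action, StageAction) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StageAction(Enum):
    ADVANCE = "advance"
    HOLD = "hold"
    REMEDIATE = "remediate"
    DEFER = "defer"


@dataclass(frozen=True)
class Credential:
    skill: str
    depth_tier: int  # tier at which competence was demonstrated
    score: float     # evidence accumulated at that tier


def composite_admissible(
    prior: Optional[Credential],
    new_evidence: float,
    target_tier: int,
    threshold: float,
    carry_weight: float = 1.0,
) -> bool:
    """Assumed additive predicate: weighted prior credential plus new evidence."""
    carried = 0.0
    # Only a credential earned at a shallower tier carries forward.
    if prior is not None and prior.depth_tier < target_tier:
        carried = carry_weight * prior.score
    return carried + new_evidence >= threshold


def next_action(
    admissible: bool,
    attempts: int,
    max_holds: int = 3,
    max_remedial: int = 2,
) -> StageAction:
    """Assumed escalation order: hold the stage, then remediate, then defer."""
    if admissible:
        return StageAction.ADVANCE
    if attempts < max_holds:
        return StageAction.HOLD
    if attempts < max_holds + max_remedial:
        return StageAction.REMEDIATE
    return StageAction.DEFER
```

A skill with no prior credential (for example, because it failed the shallower tier) contributes nothing to the predicate, which mirrors the rule that failed certifications do not carry forward.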
Operating Parameters
The principal operating parameters of curriculum-integrated depth scheduling include the depth profile granularity, the certification threshold per tier, the carry-forward weight applied to prior-tier credentials, the stage hold policy on certification failure, and the gradient gating implementation. Depth profile granularity is typically expressed as a per-skill maximum depth index drawn from a discrete set of tiers (for example, upper-layers-only, mid-network, full-depth), though continuous gating with a soft mask is also contemplated. Certification thresholds are stage-specific and may be calibrated either against fixed evaluation suites or against dynamically generated probes drawn from the curriculum manifest.
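As an illustration of the continuous alternative to discrete tiers, the sketch below computes a soft per-layer mask. It assumes a logistic gate over the layer index, with index 0 on the input side; the cutoff and temperature parameters and the function names are hypothetical and are not specified in the disclosure.

```python
import math
from typing import List


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_gates(num_layers: int, cutoff: float, temperature: float) -> List[float]:
    """Return a gate in (0, 1) per layer (index 0 = input side).

    Layers well above the cutoff index are almost fully open, layers
    well below it are almost fully closed, and the temperature sets
    how sharp the transition between them is.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return [_sigmoid((i - cutoff) / temperature) for i in range(num_layers)]
```

Lowering the cutoff admits gradient into deeper layers. As the temperature approaches zero the mask approaches a hard cutoff at the chosen layer index, which corresponds to a discrete tier boundary.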
Beyond the principal parameters, deployments configure secondary parameters that govern certification suite composition, probe generation policy, evaluation batch sizing, and the credential expiry policy that determines whether older certifications retain admissibility weight as training progresses or whether such credentials must be refreshed against current model state. Probe generation may draw from a fixed evaluation suite, from a dynamically generated probe distribution conditioned on the curriculum manifest, or from adversarial probes designed to test the boundary of the depth tier's certification envelope. Each policy choice trades certification cost against certification fidelity, and deployments select among them according to the operational stakes of the model under training.
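One way a credential expiry policy could be expressed is sketched below. The exponential half-life decay, the refresh floor, and the function names are assumptions made for illustration only; the disclosure does not fix a particular decay law.

```python
def credential_weight(
    certified_step: int,
    current_step: int,
    half_life_steps: float,
) -> float:
    """Admissibility weight of a credential, halving every half_life_steps."""
    if half_life_steps <= 0:
        raise ValueError("half_life_steps must be positive")
    age = max(0, current_step - certified_step)
    return 0.5 ** (age / half_life_steps)


def needs_refresh(
    certified_step: int,
    current_step: int,
    half_life_steps: float,
    floor: float = 0.25,
) -> bool:
    """True once the decayed weight has dropped below the refresh floor."""
    return credential_weight(certified_step, current_step, half_life_steps) < floor
```

A policy in which credentials never expire corresponds to a very large half-life, while a short half-life forces frequent re-certification against current model state.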
Carry-forward weighting governs how strongly a prior shallow-tier certification offsets the certification budget required at a deeper tier. A weight of unity treats prior certification as fully sufficient for the foundational portion of the deeper-tier requirement; a weight below unity requires partial re-demonstration; a weight above unity (which the architecture supports for skills in which depth tiers are highly correlated) permits deeper-tier admission on the strength of shallower-tier evidence alone, subject to ratification by composite predicates. Stage hold policies determine whether the curriculum waits for certification, advances with provisional credentials, or branches to remedial paths. Gradient gating is implemented through skill-tagged loss masking, through depth-indexed gradient stop operations, or through router-level admission controls in mixture architectures.
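The three weighting regimes can be made concrete with a small worked formula. The additive budget split below is an assumption introduced for illustration and is not stated in the disclosure: the deeper-tier budget is taken to consist of a foundational portion B_f and a depth-specific portion B_d, the prior credential is taken to cover B_f in full, and R denotes the evidence still to be demonstrated at the deeper tier.

```latex
% Assumed notation (hypothetical, for illustration only):
%   B_f : foundational portion of the deeper-tier certification budget
%   B_d : depth-specific portion of the deeper-tier certification budget
%   w   : carry-forward weight applied to the prior-tier credential
%   R   : evidence remaining to be demonstrated at the deeper tier
\[
  R(w) = \max\bigl(0,\; B_f + B_d - w\,B_f\bigr)
\]
% Worked values with B_f = 10 and B_d = 4:
\[
  R(1) = 4, \qquad R(0.8) = 6, \qquad R(1.5) = \max(0,\,-1) = 0
\]
```

Under these assumptions, a weight of unity leaves only the depth-specific portion to be shown, a weight of 0.8 adds two units of foundational re-demonstration, and a weight of 1.5 leaves nothing further to demonstrate, so admission would rest on ratification by the composite predicate alone.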
Alternative Embodiments
One embodiment expresses the depth profile as a continuous mask rather than a discrete tier assignment, with the mask updated at stage transitions through a smoothed schedule rather than a step change. A second embodiment binds depth scheduling to data sampling rather than gradient gating, restricting the data on which a skill's representations are updated rather than the layers through which the gradient flows; this embodiment is structurally weaker but simpler to implement in conventional training stacks. A third embodiment integrates depth scheduling with parameter-efficient adapters, assigning each curriculum stage a distinct adapter set whose insertion depth is governed by the stage's depth profile.
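The smoothed update in the first embodiment might look like the following sketch, which interpolates a mask cutoff between its old and new values over a ramp of training steps. The smoothstep curve, the notion of a scalar cutoff (the layer index around which the mask opens, with index 0 on the input side), and the parameter names are assumptions chosen for illustration.

```python
def smoothstep(x: float) -> float:
    """Cubic ease-in/ease-out curve on [0, 1], clamped outside that range."""
    x = min(1.0, max(0.0, x))
    return x * x * (3.0 - 2.0 * x)


def scheduled_cutoff(
    step: int,
    start_step: int,
    ramp_steps: int,
    old_cutoff: float,
    new_cutoff: float,
) -> float:
    """Mask cutoff at a given step, moving smoothly from old to new.

    Before start_step the old cutoff applies; after start_step + ramp_steps
    the new cutoff applies; in between the value is interpolated.
    """
    if ramp_steps <= 0:
        # Degenerate ramp: fall back to a step change at start_step.
        return new_cutoff if step >= start_step else old_cutoff
    progress = smoothstep((step - start_step) / ramp_steps)
    return old_cutoff + (new_cutoff - old_cutoff) * progress
```

Setting ramp_steps to zero recovers the step change used with discrete tier assignments, so the same function can serve both embodiments.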
A further embodiment treats the curriculum as a directed acyclic graph rather than a linear sequence, with depth profiles defined per node and certification carry-forward following graph edges. This embodiment supports curricula in which multiple foundational skills feed into a single advanced skill, and in which the composite admissibility predicate aggregates carry-forward credentials from multiple parent nodes. A still further embodiment binds depth scheduling to deployment-time governance, so that a model trained under curriculum-depth governance carries certification credentials into the deployed agent's skill admissibility checks, allowing deployment-time gating to inherit the training-time depth tier assignments.
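For the graph-structured embodiment, the sketch below walks a curriculum DAG in topological order and aggregates carry-forward credentials from all parents of a node. It uses the standard-library graphlib module (Python 3.9+). The summed aggregation, the rule that an uncertified parent blocks its children, and the names certify_dag, parents, evidence, and thresholds are assumptions, not details taken from the disclosure.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def certify_dag(
    parents: Dict[str, List[str]],
    evidence: Dict[str, float],
    thresholds: Dict[str, float],
    carry_weight: float = 1.0,
) -> Dict[str, float]:
    """Return the credential score of every skill that certifies.

    parents maps each skill to the skills it depends on. A skill is
    evaluated only after all of its parents, and only if all of them
    hold credentials; its predicate adds the weighted sum of parent
    credentials to the skill's own new evidence.
    """
    credentials: Dict[str, float] = {}
    for skill in TopologicalSorter(parents).static_order():
        parent_list = parents.get(skill, [])
        # A failed parent does not carry forward, so the child is blocked.
        if any(p not in credentials for p in parent_list):
            continue
        carried = carry_weight * sum(credentials[p] for p in parent_list)
        own = evidence.get(skill, 0.0)
        if carried + own >= thresholds[skill]:
            # Store only the skill's own evidence so credit does not compound.
            credentials[skill] = own
    return credentials
```

In a curriculum where an advanced skill has two foundational parents, the advanced skill is evaluated only when both parents hold credentials, and a failure in either parent leaves it uncertified until that parent is remediated.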
Composition
Curriculum-integrated depth scheduling composes with the broader training-governance architecture in which evidence-based skill gating, depth-selective integration, and credentialed admissibility are unified. The same evidence schemas used to gate skill invocation in the deployed agent are reused to evaluate certification at curriculum stage boundaries. The same depth indexing used to mark skill credentials in deployment is reused to mark the depth tier at which a skill was certified during training. The same composite admissibility predicates used to combine multiple credentials at deployment time are reused at curriculum stage transitions to combine carry-forward credentials with new-tier evidence.
This composition reduces the architectural surface that must be specified separately for training and deployment, and produces a training-deployment continuum in which the credentials accumulated during training are directly usable at deployment without translation. Composition with the model's evaluation infrastructure permits stage transitions to be triggered automatically when certification thresholds are met, rather than at fixed step counts, supporting a curriculum that paces itself to the model's actual learning trajectory.
A further compositional benefit arises in the interaction with the disruption-modeling subsystem. Skills that fail certification during training generate diagnostic records of the same form used at deployment to flag operational disruption, enabling the same restoration protocol library to be applied during training as during deployment. Failed certification at depth tier N may invoke a remedial protocol that re-establishes shallower-tier foundations before re-attempting the deeper-tier certification, with the remedial protocol's dosing parameters governed by the same therapeutic-dosing primitive used in deployment recovery. This unified treatment of training-time certification failure and deployment-time disruption produces a single governance surface across the model lifecycle, reducing the verification burden on the engineering organization and ensuring that learning-phase failures and operational-phase disruptions receive consistent corrective treatment.
Composition with the broader curriculum manifest also supports cross-skill credential sharing. Where two skills share a foundational substrate, certification of the substrate during one skill's stage carries forward to the other skill's stage, reducing redundant certification work and producing a credential graph whose density reflects the actual representational overlap of the model's skill set. This credential sharing is governed by the same composite admissibility predicate that governs depth-tier carry-forward, so that the architecture admits cross-skill carry-forward only where the predicate confirms representational compatibility.
Prior-Art Distinction
Conventional curriculum learning approaches order training data by difficulty but do not gate gradient flow by depth, do not produce depth-tagged certification credentials, and do not propagate carry-forward admissibility evidence between stages. Layer-wise training and progressive growing approaches modify the depth at which training occurs but do so for the entire model in a single global schedule rather than per-skill. Continual learning approaches address catastrophic forgetting through replay or regularization but do not encode skill-specific depth governance or composite admissibility. The combination disclosed here, in which depth-selective gradient gating is coordinated per skill with the curriculum engine, in which certifications are depth-tagged, and in which carry-forward credentials participate in composite admissibility predicates at stage transitions, is not present in the prior art.
Disclosure Scope
This article forms part of the disclosure of the Cognition Patent. The disclosure encompasses the per-skill depth-selective gradient gating mechanism, the depth-tagged certification record, the composite admissibility predicate at stage transitions, the carry-forward weighting parameter, and the alternative embodiments described above. The disclosure scope extends to all training architectures in which a curriculum engine governs depth-selective gradient flow per skill and in which certification credentials carry forward across depth tiers under a composite admissibility predicate, regardless of the specific gating implementation, certification suite, or curriculum topology employed.