Snowflake Cortex AI

Nick Clark

Snowflake Cortex AI

by Nick Clark | Published April 25, 2026 | PDF

Snowflake Cortex is the AI capability layer integrated directly into the Snowflake Data Cloud, exposing managed LLM functions (COMPLETE, SUMMARIZE, EXTRACT_ANSWER, TRANSLATE), Cortex Search for retrieval, Cortex Analyst for natural-language-to-SQL, and Cortex Agents over data already governed by Snowflake's RBAC, masking, and row-access policies. Its core architectural choice is to run inference inside the customer's Snowflake account boundary so prompts and outputs never traverse third-party APIs. What Cortex does not provide — and structurally cannot retrofit within the Snowflake execution model — is pre-execution policy resolution with capability-gated inference and deterministic non-execution as a first-class outcome class. This article positions Snowflake Cortex against the AQ inference-control primitive.

1. Vendor and Product Reality

Snowflake Inc., publicly traded since 2020, operates the Data Cloud as a multi-cloud (AWS, Azure, GCP) managed data platform built on a separation of compute and storage and a shared metadata layer. Cortex is the AI-native extension to that platform: a set of SQL-callable LLM functions backed by hosted models (the Snowflake Arctic family, Mistral, Meta Llama, Reka, plus partner models accessed through Cortex Connectors), document understanding and embedding services, vector storage as a native data type, Cortex Search as a managed retrieval index, Cortex Analyst for governed text-to-SQL on a defined semantic model, and Cortex Agents as the orchestration surface for multi-step tool use over Snowflake data.

The customer base is the Snowflake enterprise base — financial services, healthcare, retail, technology — plus a growing set of regulated customers who want LLM capability without exfiltrating data to external APIs. The architectural strength is governance-by-proximity: Cortex inference runs against tables already protected by Snowflake's row-access policies, dynamic data masking, tag-based policies, and account-level network policies, so sensitive data does not have to move to be modeled. Cortex Guard adds output filtering for unsafe content; the Trust Center adds posture monitoring; the Cortex Search and Cortex Analyst products express a consistent semantic model so that natural-language queries map deterministically to governed SQL.

Within its scope, Cortex is the cleanest commercial implementation of "bring AI to the data" for regulated enterprises that have already standardized on Snowflake. It avoids the data-exfiltration objection that has slowed enterprise LLM adoption, and it inherits Snowflake's audit logs, account boundaries, and customer-managed-key options. The platform is rigorous within its operating model.

2. The Architectural Gap

The structural property Cortex's architecture does not exhibit is pre-execution policy resolution with capability-gated inference and deterministic non-execution as a defined outcome class. Snowflake's policy enforcement is data-side: row-access policies and masking apply when a query touches a table, and Cortex inherits this because the LLM call runs inside the same SQL execution context. But the inference itself is not gated on a pre-execution policy resolution that determines, before the model loads or runs, what capabilities the prompt-and-context combination is permitted to invoke and which outcome class — full execution, restricted execution, deterministic non-execution — applies.

This matters because the LLM call is the moment policy actually needs to bind. Once the model has consumed the prompt and is generating, the architecture has already lost the opportunity to refuse on a structural ground rather than a content-filter ground. Cortex Guard filters outputs after generation; row-access policies filter inputs at the SQL boundary; account-level controls turn a model on or off. Between those layers is the pre-execution capability-resolution gap: nothing in the architecture takes the prompt, the resolved context, the requesting principal's authority, the data classification, the model's capability profile, and the policy state, and produces a deterministic decision that this specific inference is permitted, restricted to a capability subset, or must not execute at all.

Snowflake cannot patch this from inside the Cortex architecture because the platform's design centers on "let SQL call the model" as the integration point. Capability-gated inference requires an architectural layer above the SQL boundary that resolves policy across the prompt-context-principal-data-model tuple before SQL dispatch, and it requires deterministic non-execution to be a first-class outcome that Snowflake's SQL surface does not currently express. Adding more output filters or pre-prompts does not produce structural inference control; it adds heuristic guards around an architecture that still treats inference as an unconditional model invocation.

3. What the AQ Inference-Control Primitive Provides

The Adaptive Query inference-control primitive specifies three structural elements: pre-execution policy resolution, capability-gated inference, and deterministic non-execution as an outcome class. Pre-execution policy resolution means that before any model is loaded, before any prompt is dispatched, the requesting principal's credential, the resolved context's classification, the policy state for the principal-context combination, and the candidate model's capability profile are composed into a structured policy resolution. This resolution is a decision artifact in its own right and is recorded as a credentialed observation that re-enters the governance chain.

Capability-gated inference means the model is invoked under a capability set derived from the resolution, not under its full as-trained capability surface. The capability set restricts what the inference can produce: which tools it may call, which retrieval sources it may consult, which output classes it may emit, which downstream actuators it may trigger. Capability gating is structural rather than prompt-engineered: a model running under a capability set cannot exceed that set even if the prompt or in-context instructions request it, because the gate sits in the runtime above the model rather than inside the prompt.

Deterministic non-execution is the third structural element: the policy resolution can produce an outcome of "do not execute" that is observably distinct from "execute and return refusal." Deterministic non-execution emits a credentialed outcome record stating that for this principal, this context, this model, and this policy state, no inference was performed; the lineage records the resolution and the non-execution as a first-class outcome. This is what makes inference-control auditable: a regulator or operator can verify that prohibited inferences did not occur, not merely that prohibited outputs were filtered. The primitive composes with the governance-chain umbrella so the resolution, the gating, and the non-execution events are property-one observations under a published authority taxonomy. The inventive step disclosed under provisional 64/049,409 is the structural triad of pre-execution resolution, capability-gating above the model, and deterministic non-execution as an outcome class.

4. Composition Pathway

Cortex integrates with AQ as the model-execution layer running under an inference-control substrate. What stays at Snowflake: the data platform, the row-access policies and masking, the hosted models and infrastructure, Cortex Search and Cortex Analyst as the developer-facing surfaces, the SQL integration, and the customer commercial relationship. Snowflake's investment in proximate-to-data inference and in the multi-model catalog remains its differentiated layer.

What moves to AQ: pre-execution policy resolution, capability-gating, and deterministic non-execution recording. Integration points are well-defined. A SQL call to a Cortex function (or a Cortex Agent step) emits an inference-intent event to the AQ resolution gate before the model is dispatched; the gate composes principal credential, resolved context classification (already available from Snowflake's policy engine), candidate model capability profile, and policy state, and emits a resolution outcome — full execution under specified capability set, restricted execution, or deterministic non-execution. Snowflake dispatches the model only under the capability set the gate emits, and the output is admitted to the calling SQL session only after capability-conformance verification. Non-execution outcomes return a credentialed non-execution result rather than a model-generated refusal, and the lineage records both the resolution and the outcome.

The new commercial surface is governed inference for regulated industries — healthcare PHI, financial MNPI, defense classified-adjacent data — where customers cannot accept the architectural risk of "model decides to refuse" and need "model not invoked" as a verifiable outcome. The substrate also enables coalition and cross-jurisdiction inference where capability sets must vary by jurisdiction.

5. Commercial and Licensing Implication

The fitting arrangement is an embedded substrate license priced on credentialed-inference volume: Snowflake embeds the AQ inference-control primitive into Cortex and sub-licenses chain participation to enterprise and regulated customers as a governed-inference tier of the Cortex subscription. The commercial fit is natural because Cortex pricing is already capability-and-volume based.

What Snowflake gains: a structural answer to the regulatory pressure converging on AI inference (EU AI Act high-risk classification, sectoral AI rules in healthcare and finance, sovereign-AI requirements), a defensible architectural floor against Databricks Mosaic AI and the hyperscaler-native AI offerings, and a path into customer segments that today refuse to enable LLM functions because output-filter-only governance is not auditable enough. What the customer gains: verifiable non-execution as an outcome class, capability gating that survives prompt injection because the gate is above the model, and lineage that supports cross-jurisdiction inference governance. The honest framing — Cortex is excellent at bringing inference to data; AQ gives that inference the pre-execution policy substrate it needs to be regulator-defensible.