Dify Made LLM Application Development Visual. The Applications Have No Agent Schema.

Nick Clark

by Nick Clark | Published March 28, 2026

Dify is an open-source LLM application development platform that combines visual workflow orchestration, retrieval-augmented generation pipelines, multi-model routing across commercial and self-hosted models, and an agent node type that can plan, call tools, and iterate. The drag-and-drop canvas makes LLM application development accessible to product teams that would otherwise need to assemble the same pieces from scratch in code. The visual approach is the right entry point. But what Dify produces, when a user finishes wiring nodes together and clicks publish, is an application configuration: a graph of nodes, edges, prompts, and tool bindings. It is not a structurally defined agent. The applications and agents Dify ships have no canonical schema that defines what an agent is, what fields it must carry, or how its memory, identity, and governance are validated. This article positions Dify against the Adaptive Query (AQ) agent-schema primitive.

1. Vendor and Product Reality

Dify, originated by LangGenius and operating under an open-source model with a hosted commercial tier, is one of the most widely adopted visual LLM application development platforms in the post-ChatGPT generation of tooling. The flagship product combines a drag-and-drop workflow canvas, a chat-application builder, a knowledge-base management interface for retrieval-augmented generation, and an agent node type that supports plan-and-act loops with tool invocations. The platform routes across the major commercial model providers (OpenAI, Anthropic, Google, Mistral) and self-hosted endpoints (Ollama, vLLM, OpenAI-compatible local servers), exposing model selection as a per-node concern rather than as a platform-wide commitment. Authentication, team workspaces, and per-application observability traces are first-class features.

The architectural shape is straightforward. The application designer assembles a directed graph of nodes — LLM calls, retrieval steps, conditional branches, code blocks, tool invocations, agent loops — connected by typed edges that carry variables. The agent node, when used, encapsulates a plan-and-act loop with a tool list, a model selection, and a prompt template that defines the agent's behavior within the loop. At publish time, the graph is serialized as a Dify-internal configuration, deployed to the runtime, and exposed through HTTP and WebSocket endpoints, an embeddable chat widget, and a public API. Each user request walks the graph, threads variables along edges, invokes the model and any tools, and returns a response. Conversation history is retained across turns within an application's session model. Observability traces capture node-level latencies, model token counts, retrieval hits, and tool call results.
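The request path described above can be made concrete with a small sketch. This is an illustration only, assuming a simplified graph with no conditional branches; the `Node`, `Graph`, and `walk` names are hypothetical and are not Dify's internal classes, and the retrieval and LLM steps are stand-in functions that call no model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; not Dify's internal representation.
Variables = Dict[str, str]


@dataclass
class Node:
    node_id: str
    run: Callable[[Variables], Variables]  # reads variables, returns new ones


@dataclass
class Graph:
    nodes: Dict[str, Node]
    edges: Dict[str, List[str]] = field(default_factory=dict)  # node id -> downstream ids
    entry: str = "start"


def walk(graph: Graph, request: Variables) -> Variables:
    """Visit nodes from the entry point, threading variables along edges."""
    variables = dict(request)
    pending = [graph.entry]
    visited = set()
    while pending:
        node_id = pending.pop(0)
        if node_id in visited:
            continue
        visited.add(node_id)
        variables.update(graph.nodes[node_id].run(variables))
        pending.extend(graph.edges.get(node_id, []))
    return variables


# Stand-ins for a retrieval step and an LLM call; no model is invoked.
def retrieve(v: Variables) -> Variables:
    return {"context": f"docs about {v['query']}"}


def answer(v: Variables) -> Variables:
    return {"response": f"Answer using {v['context']}"}


graph = Graph(
    nodes={
        "start": Node("start", lambda v: {}),
        "retrieve": Node("retrieve", retrieve),
        "llm": Node("llm", answer),
    },
    edges={"start": ["retrieve"], "retrieve": ["llm"]},
)
result = walk(graph, {"query": "refund policy"})
```

Nothing in this structure names an agent: the graph runs and returns variables, which is the point the next section develops.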

Dify's strengths are real. The visual orchestration genuinely lowers the cost of building useful LLM systems, and the open-source license has made the platform widely usable across enterprise, startup, and independent-developer contexts. The knowledge-base ingestion is solid, with chunking and embedding flows that handle the majority of practical RAG cases. The agent node provides a workable plan-and-act primitive without requiring the developer to assemble it from scratch. The model-routing abstraction is pragmatic and lets organizations migrate between providers without rewriting their application logic. Within its scope — visual application construction over LLM primitives — Dify is competently engineered and broadly useful.

2. The Architectural Gap

The structural property Dify's architecture does not exhibit is a canonical agent schema underneath the visual builder. In Dify's model, an application is a graph; an agent, when present, is a node within that graph. The graph is executable: the runtime walks it for each request and produces a response. None of that requires the existence of an object that is the agent, in the sense of a typed entity with required fields, declared memory, identity that persists across deployments, governance constraints, and a verifiable lineage. The platform produces something that runs; it does not produce something that is, by construction, an agent.

A workflow graph is not an agent object. A canonical agent schema defines the agent as a typed object with required fields: an identity that distinguishes it from other agents and persists across deployments, a memory region whose shape is declared and validated, a governance specification that constrains which mutations and tool invocations are permissible, a capability set, an execution state cursor, and a lineage record. A workflow graph in Dify is consistent with many possible agents and does not by itself satisfy any of them. Two functionally similar Dify applications can have arbitrarily different shapes; there is no structural assertion the builder enforces beyond what the runtime needs to execute.

Visual configuration is validated; agent structure is not. Dify's editor does extensive configuration validation: variables are connected, nodes have required parameters, prompt templates reference valid inputs, tool schemas are well-formed. This is execution-readiness validation. It is not structural validation against an agent schema. The editor does not, and cannot, validate that the resulting application carries an identity field, that its memory is typed, that governance constraints are present, or that the agent's interface is compatible with cognition substrates other than Dify itself. Interoperability with anything outside Dify therefore depends on bespoke export logic per integration target, because there is no canonical object that the export could simply serialize.
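The difference between the two kinds of validation can be sketched as follows. The configuration layout, the `REQUIRED_PARAMS` table, and the field names in `AGENT_FIELDS` are assumptions made for illustration; they follow the fields named in this article, not any Dify file format.

```python
from typing import Dict, List

# Hypothetical field names, taken from the schema fields this article lists.
AGENT_FIELDS = ["identity", "memory", "governance", "capabilities", "cursor", "lineage"]

# Hypothetical required parameters per node type.
REQUIRED_PARAMS = {"llm": ["model", "prompt"], "tool": ["tool_name"]}


def execution_errors(config: Dict) -> List[str]:
    """Execution-readiness checks: can a runtime walk this graph?"""
    errors = []
    node_ids = {node["id"] for node in config["nodes"]}
    for node in config["nodes"]:
        for param in REQUIRED_PARAMS.get(node["type"], []):
            if param not in node.get("params", {}):
                errors.append(f"{node['id']}: missing parameter '{param}'")
    for src, dst in config["edges"]:
        if src not in node_ids or dst not in node_ids:
            errors.append(f"edge {src}->{dst}: unknown node")
    return errors


def missing_agent_fields(config: Dict) -> List[str]:
    """Structural check: which agent-schema fields does the artifact lack?"""
    return [name for name in AGENT_FIELDS if name not in config]


config = {
    "nodes": [
        {"id": "ask", "type": "llm",
         "params": {"model": "example-model", "prompt": "Hi {{name}}"}},
        {"id": "lookup", "type": "tool", "params": {"tool_name": "search"}},
    ],
    "edges": [("ask", "lookup")],
}

ready = execution_errors(config)        # no execution-readiness errors
absent = missing_agent_fields(config)   # every agent field is absent
```

The same configuration is fully executable and structurally empty as an agent; the two checks answer different questions.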

Workflow state is not lineage. Dify maintains state during execution: variables flow along edges, intermediate outputs are available to downstream nodes, and conversation history is retained across turns. This is execution state, observable by the runtime and logged in the platform's traces. It is not lineage in the sense an agent schema requires: a record of which governance policy authorized each mutation, which version of the agent's capability set was in effect, which inputs are admissible under which trust assumptions, and how a state transition is reconciled with the agent's prior committed memory. Lineage is a property of a governed object that records why each transition was permitted, not just that it occurred. Dify's logs answer the second question well; they do not answer the first, because there is no governance object whose decisions could be recorded.

Identity does not survive the platform. A Dify "application" has an internal ID within the platform's database, but that ID is a Dify primary key. It is not an agent identity in the schema sense — a stable, cryptographically attestable identifier under which the agent's memory, lineage, and governance bindings are held independently of the platform that hosts execution. Migrating an agent from Dify to another runtime, or running it across two runtimes simultaneously, is not a defined operation because there is no canonical object to migrate. The platform produces application configurations, not agents.
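A minimal sketch of the distinction, assuming the agent identity is a fingerprint of key material held by the agent's owner. The `agent_identity` and `deploy` helpers and the `HostedApp` record are hypothetical, the key bytes are a placeholder, and proving control of the key would require a signature scheme that is not shown here.

```python
import hashlib
import uuid
from dataclasses import dataclass


def agent_identity(public_key: bytes) -> str:
    """Derive a stable identifier from key material the owner controls.

    Assumption: a truncated SHA-256 fingerprint of the public key serves
    as the identity. Attestation (signing with the private key) is omitted.
    """
    return "agent:" + hashlib.sha256(public_key).hexdigest()[:32]


@dataclass
class HostedApp:
    platform_row_id: str  # assigned by the hosting platform's database
    agent_id: str         # derived from the key, independent of the host


def deploy(public_key: bytes) -> HostedApp:
    """Simulate one platform registering the same agent."""
    return HostedApp(
        platform_row_id=str(uuid.uuid4()),
        agent_id=agent_identity(public_key),
    )


key = b"placeholder-public-key-bytes"  # not a real key
on_host_a = deploy(key)
on_host_b = deploy(key)
```

The platform row IDs differ between the two hosts while the derived agent identifier is the same on both, which is the property a primary key cannot provide.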

3. What the AQ Agent-Schema Primitive Provides

The Adaptive Query agent-schema primitive specifies that an agent be a typed object with required fields, validated against a published schema, and structurally portable across cognition substrates. The schema defines, at minimum: an identity field (a stable, attestable identifier independent of any hosting runtime); a declared memory layout (typed regions for short-term context, episodic memory, semantic memory, and any application-specific stores, each with declared shapes and access policies); a governance specification (the policy under which mutations and tool invocations are admitted); a capability set (the tools, knowledge bases, and external integrations the agent is authorized to use); an execution cursor (the current point in any in-flight task or conversation); and a lineage record (the auditable history of mutations admitted under the governance specification).
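One way to picture the six fields is as a typed object whose constructor refuses to build an incomplete agent. The class and attribute names below are illustrative assumptions; the article does not publish a concrete schema, and a real one could equally be expressed in another typed-object language.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass(frozen=True)
class MemoryRegion:
    name: str           # e.g. "episodic" or "semantic"
    shape: str          # declared type of the records the region holds
    access_policy: str  # who may read or write the region


@dataclass
class Agent:
    identity: str                  # stable identifier, independent of the host
    memory: List[MemoryRegion]     # declared memory layout
    governance: Dict[str, str]     # policy under which mutations are admitted
    capabilities: List[str]        # tools and knowledge bases the agent may use
    cursor: str                    # current point in any in-flight task
    lineage: List[Dict[str, str]] = field(default_factory=list)  # admitted mutations


agent = Agent(
    identity="agent:example",
    memory=[MemoryRegion("episodic", "ConversationTurn", "owner-read-write")],
    governance={"policy": "example-policy", "version": "1"},
    capabilities=["search", "knowledge_base:faq"],
    cursor="idle",
)
field_names = [f.name for f in fields(Agent)]

# Omitting required fields fails at construction time, before anything runs.
try:
    Agent(identity="agent:incomplete")
    incomplete_rejected = False
except TypeError:
    incomplete_rejected = True
```

Here the shape itself is the contract: an object missing its memory layout or governance specification cannot be constructed, regardless of whether any workflow has been executed.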

The primitive makes the agent a first-class object whose shape is the contract. Runtime behavior is one valid interpretation of that contract among many. A schema-conformant agent built in Dify is loadable by any cognition substrate that understands the schema; evaluable by any governance layer that consumes the schema's policy fields; and migratable across hosts without losing identity or lineage. The visual builder remains the entry point. The agent schema is what makes the artifact the builder produces a real object rather than a platform-specific configuration.

Three properties are load-bearing. First, structural validation at publish time: the platform refuses to publish an "agent" that does not satisfy the schema, exactly as a strict type system refuses to compile a program that does not satisfy its types. Validation is at the schema level, not the execution level, so structural completeness is enforceable independent of whether the agent has been run. Second, declared memory and governance: the agent's memory layout and governance policy are explicit fields in the object, not implicit consequences of the workflow graph. They can be inspected, audited, and reasoned about by tools that have never executed the agent. Third, lineage as a typed property: every admitted mutation is recorded against the agent's lineage field with the governance decision that admitted it, the policy version in effect, and the antecedent state. Lineage is structural, not platform-specific.
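The third property, lineage as a typed record, can be sketched as a hash-linked list of admitted mutations. This is an assumption-laden illustration: the entry layout is hypothetical, the antecedent state is represented by a digest of the previous entry, and signatures are omitted, so altering the most recent entry would go undetected without a signed head.

```python
import hashlib
import json
from typing import Dict, List


def entry_digest(entry: Dict[str, str]) -> str:
    """Digest of one lineage entry, computed over a canonical JSON encoding."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def admit(lineage: List[Dict[str, str]], mutation: str,
          decision: str, policy_version: str) -> None:
    """Record an admitted mutation with the decision that admitted it."""
    antecedent = entry_digest(lineage[-1]) if lineage else "genesis"
    lineage.append({
        "mutation": mutation,
        "decision": decision,
        "policy_version": policy_version,
        "antecedent": antecedent,  # links this entry to the prior state
    })


def verify(lineage: List[Dict[str, str]]) -> bool:
    """Check that every entry points at the digest of its predecessor."""
    expected = "genesis"
    for entry in lineage:
        if entry["antecedent"] != expected:
            return False
        expected = entry_digest(entry)
    return True


lineage: List[Dict[str, str]] = []
admit(lineage, "memory.write:order_status", "allowed by rule R1", "1")
admit(lineage, "tool.call:refund", "allowed by rule R4", "2")
intact = verify(lineage)

# Rewriting an earlier decision breaks the link held by the next entry.
lineage[0]["decision"] = "allowed by rule R9"
tampered_ok = verify(lineage)
```

Each entry carries why the transition was permitted and under which policy version, and the chain can be checked by a tool that has never executed the agent.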

The primitive is technology-neutral. The schema may be expressed in JSON Schema, Protocol Buffers, CBOR, or any equivalent typed-object language. The lineage chain may be signed with Ed25519, ECDSA, or any other signature scheme. The hosting runtime may be Dify, LangChain, LlamaIndex, a custom Python service, or a serverless endpoint. What is required is the structural condition that an agent be a typed object validated against a published schema, with the schema's required fields present and the lineage chain verifiable. The inventive step is the canonical, cognition-compatible agent schema as a structural condition for portable, governed, auditable agents, disclosed independently of any particular platform that chooses to adopt it.

4. Composition Pathway

Dify composes with the AQ agent-schema primitive without changing the visual builder's surface. What stays at Dify: the drag-and-drop canvas, the node library, the model routing, the knowledge-base ingestion, the team workspaces, the observability traces, and the entire UX that has made Dify accessible to product teams without ML engineering staff. The visual interface continues to be the primary way users assemble agents. Workflow graphs continue to be the way logic is expressed.

What is added beneath Dify is a schema-conformance layer that lifts the platform's published artifacts from "application configurations" to "schema-validated agents." The integration point is the publish step. When a user clicks publish on a Dify application that is intended to be an agent, the platform runs schema-conformance validation in addition to its existing configuration validation: does the application carry an identity field, a declared memory layout, a governance specification, a capability set, an execution cursor, and a lineage record? If any required field is missing, the publish step blocks with a clear remediation path. The user fills in the missing fields through the same visual interface — adding an identity binding, declaring a memory region, attaching a governance policy from a library — and republishes.
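The publish-time gate described above might look like the following sketch. The `publish` function, the `PublishResult` type, and the remediation wording are hypothetical; they illustrate the block-and-remediate behavior, not an existing Dify API.

```python
from typing import Dict, List, NamedTuple

# Hypothetical remediation hints, keyed by the schema field they address.
REMEDIATION = {
    "identity": "add an identity binding",
    "memory": "declare at least one memory region",
    "governance": "attach a governance policy from the library",
    "capabilities": "declare a capability set",
    "cursor": "declare an execution cursor",
    "lineage": "initialise a lineage record",
}


class PublishResult(NamedTuple):
    published: bool
    remediation: List[str]


def publish(app: Dict) -> PublishResult:
    """Block publication until every required schema field is present."""
    missing = [name for name in REMEDIATION if name not in app]
    if missing:
        return PublishResult(False, [f"{name}: {REMEDIATION[name]}" for name in missing])
    return PublishResult(True, [])


app = {
    "identity": "agent:example",
    "memory": [{"name": "episodic"}],
    "capabilities": ["search"],
    "cursor": "idle",
    "lineage": [],
}
first_attempt = publish(app)  # blocked: no governance specification

# The user attaches a policy through the builder and republishes.
app["governance"] = {"policy": "example-policy", "version": "1"}
second_attempt = publish(app)
```

The first attempt returns the single missing field with its remediation hint; the second succeeds once the governance policy is attached.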

On successful publish, the platform emits two artifacts: the existing Dify application configuration (which the Dify runtime consumes as it does today) and a schema-conformant agent object (which any cognition substrate that understands the schema can consume). The two artifacts are kept in sync by Dify automatically; the user does not maintain them separately. The runtime continues to be Dify's, by default, but the schema artifact opens the door to running the same agent on a different substrate, evaluating it against an external governance layer, or migrating it to a customer-hosted runtime when contractual or jurisdictional constraints require it.
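The article does not say how the two artifacts are kept in sync. One possible approach, sketched here purely as an assumption, is to derive both from a single source and embed a digest of the application configuration in the agent object so that drift is detectable. The `emit` and `in_sync` helpers and the `config_digest` field are hypothetical.

```python
import hashlib
import json
from typing import Dict, Tuple


def digest(obj: Dict) -> str:
    """SHA-256 over a canonical JSON encoding of the object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def emit(graph: Dict, agent_fields: Dict) -> Tuple[Dict, Dict]:
    """Produce both artifacts from one source so they are not authored apart."""
    app_config = {"graph": graph}
    agent_object = dict(agent_fields, config_digest=digest(app_config))
    return app_config, agent_object


def in_sync(app_config: Dict, agent_object: Dict) -> bool:
    """True when the agent object still describes this exact configuration."""
    return agent_object["config_digest"] == digest(app_config)


graph = {"nodes": ["start", "llm"], "edges": [["start", "llm"]]}
agent_fields = {"identity": "agent:example", "capabilities": ["search"]}

app_config, agent_object = emit(graph, agent_fields)
synced = in_sync(app_config, agent_object)

# Editing the configuration without republishing is detected as drift.
app_config["graph"]["nodes"].append("tool")
drifted = in_sync(app_config, agent_object)
```

Under this design the user never edits the agent object directly; republishing regenerates both artifacts and restores the match.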

The composition is non-disruptive in the steady state. Existing Dify applications that are not intended to be agents — chat applications, simple workflows, RAG-only pipelines — continue to publish as today, without schema validation. Only applications that opt into the agent classification trigger schema conformance. The visual builder gains a small set of new nodes and field types (identity binding, memory declaration, governance attachment, capability set) that surface the schema's requirements as user-visible constructs. The user experience is "the visual builder now knows what an agent is," not "the platform has been re-architected." Dify's existing user base benefits; users who do not need agent semantics are unaffected.

5. Commercial and Licensing Implication

The fitting commercial arrangement is an embedded substrate license under which Dify incorporates the AQ agent-schema primitive into its publish pipeline as a first-class agent classification. Pricing aligns with how organizations actually consume agent governance: per-published-agent or per-governed-mutation, with the schema-conformance layer as a Dify Enterprise-tier feature for teams that need portable, auditable agents. The open-source Dify offering can include the schema-conformance validator without the lineage and governance enforcement infrastructure, which is the natural commercial wedge.

What Dify gains is threefold. First, a structural answer to the critique that Dify applications are platform-specific configurations, which emerging enterprise buyers (regulated industries, government, large enterprises with multi-vendor cognition strategies) increasingly raise. Second, a defensible position against competing builders such as LangChain Studio, n8n's AI workflows, and Microsoft Copilot Studio, achieved by elevating the architectural floor from workflow configuration to schema-validated agent. Third, a forward-compatible posture against EU AI Act, NIST AI RMF, and SEC AI-disclosure regimes that are converging on per-agent identity, governance, and lineage requirements. What the customer gains: agents that are portable across runtimes, evaluable by external governance layers, and auditable independent of the platform that hosted execution. The schema artifact belongs to the customer, not to Dify's database, so the customer's agent fleet outlives any particular vendor relationship. Paradoxically, that makes Dify stickier, because the visual builder, the model routing, and the knowledge-base ingestion become the differentiated route to producing those agents.

Honest framing: the AQ primitive does not replace Dify. It gives Dify the agent-classification layer that the market is beginning to demand and that the visual workflow builder, by itself, does not deliver. Dify's visual builder, paired with a canonical agent schema underneath, would produce agents that are at once accessible to non-developers and interoperable with the broader cognition ecosystem the schema is designed to serve. Visual construction and structural definition become distinct concerns, addressed by distinct mechanisms, and combined into a coherent posture in which what the user builds is, by construction, a real agent rather than a platform-specific workflow.