OpenAI Assistants API Provides Agent Tooling. It Does Not Define Agent Structure.

by Nick Clark | Published March 28, 2026

The OpenAI Assistants API is the most widely adopted commercial agent runtime in production. Its v2 surface area, generally available since late 2024 and now feature-frozen in favor of the Responses API as OpenAI consolidates its agent stack, exposes a coherent set of primitives: an Assistant object holding a model, system instructions, and a tool list; Thread objects holding ordered message histories; Run objects driving an Assistant against a Thread with streaming token output; and built-in tools including Code Interpreter, File Search (formerly Retrieval), and developer-supplied Function Calling. For the engineer who needs an agent in production this week, the Assistants API is a reasonable default. For the question of whether an Assistant is a structurally well-defined object whose definition can be reasoned about independently of OpenAI's runtime, the answer is no. The structural gap is between configurable agent tooling whose rules and identity live server-side in OpenAI's organization and project IAM, and a canonical agent schema that ships rules and identity with the agent object itself.


Vendor and product reality

An OpenAI Assistant is created via POST /v1/assistants with a JSON body specifying the model name, an instruction string, a tools array (entries of type code_interpreter, file_search, or function with an attached schema), an optional tool_resources object pointing at vector stores or file IDs, and metadata fields. The Assistant is persisted in OpenAI's storage tier, scoped to the creating organization and project. Threads are independent objects; a Run binds an Assistant to a Thread and produces streamed deltas under the v2 streaming protocol, with intermediate requires_action states when developer-supplied function tools must be resolved client-side. Code Interpreter executes Python in an OpenAI-hosted sandbox with file I/O against the Thread's attached files. File Search runs OpenAI-managed retrieval against a vector store the developer populates. Authentication is by API key at the organization or project scope, with role-based controls administered through OpenAI's dashboard.
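The request body described above can be sketched concretely. Field names follow the v2 API surface named in this section; the concrete values (model name, vector store ID, function schema) are illustrative placeholders, not real resources.

```python
import json

# A sketch of the JSON body for POST /v1/assistants. Only "model" is required;
# everything else is optional configuration.
assistant_body = {
    "model": "gpt-4o",
    "instructions": "You are a support copilot. Answer from the attached docs.",
    "tools": [
        {"type": "code_interpreter"},
        {"type": "file_search"},
        {
            "type": "function",
            "function": {  # developer-supplied tool, resolved client-side
                "name": "lookup_order",  # hypothetical function for illustration
                "description": "Fetch an order by ID from our own backend.",
                "parameters": {
                    "type": "object",
                    "properties": {"order_id": {"type": "string"}},
                    "required": ["order_id"],
                },
            },
        },
    ],
    "tool_resources": {
        "file_search": {"vector_store_ids": ["vs_placeholder"]}  # hypothetical ID
    },
    "metadata": {"team": "support"},
}

print(json.dumps(assistant_body, indent=2))
```

When a Run hits a developer-supplied function tool, it pauses in the `requires_action` state until the client submits the tool's output, which is the loop the rest of this section refers to.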

The platform works. Latency is acceptable for interactive use, the streaming protocol is well-documented, and the tool-resolution loop is sound. Production deployments at scale exist across customer support, internal copilots, and document workflows. None of this is the subject of the gap analysis. The gap concerns where the rules that govern an Assistant's behavior reside, and what definition of an agent is implied by the API surface.

The architectural gap

The Assistants API exposes configuration, not schema. Two Assistants in the same project may differ in model, instruction text, tool list, attached vector stores, and metadata, and both are equally valid Assistant objects. There is no required field beyond model. There is no notion of a memory field that must be present, a governance field that must be evaluated before activation, a lineage field linking the Assistant to its predecessors, or an identity field that is content-derived rather than server-assigned. The Assistant ID is an opaque OpenAI-issued string (asst_...) whose meaning is resolution against OpenAI's storage; export the same configuration JSON to a different OpenAI organization and a different ID is issued, with no structural relationship to the original.
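The identity contrast can be made concrete. A server-issued `asst_...` ID is opaque and organization-scoped; a content-derived identity is a pure function of the configuration and survives export. This is a sketch, not an existing OpenAI feature; the `agent:` prefix and hash truncation are assumptions of the example.

```python
import hashlib
import json

def content_id(config: dict) -> str:
    # Canonical serialization: sorted keys, fixed separators, so the same
    # configuration always hashes to the same identity regardless of origin.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return "agent:" + hashlib.sha256(canonical.encode()).hexdigest()[:16]

config = {
    "model": "gpt-4o",
    "instructions": "Answer billing questions.",
    "tools": [{"type": "file_search"}],
}

# Re-creating this configuration in two different OpenAI organizations yields
# two unrelated server IDs (asst_...); the content-derived identity is the
# same in both, because it depends only on the configuration itself.
assert content_id(config) == content_id(json.loads(json.dumps(config)))
print(content_id(config))
```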

The instruction string is the closest analogue to a rule, but it is unstructured prose passed to the model as a system prompt. It is not a typed governance field with declared semantics, and its enforcement is whatever the underlying model chooses to honor at inference time. A rule that says "do not produce output if the user is outside jurisdiction X" is a string the model may or may not comply with; it is not a precondition the runtime evaluates and refuses to execute against. Identity, similarly, is an OpenAI-side fact: the Assistant exists because OpenAI's database row exists, and OpenAI's IAM determines who may invoke it. Move the configuration outside OpenAI and there is no Assistant; the object is constituted by the platform, not carried by the configuration.
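The jurisdiction example can be sketched both ways: as an advisory instruction string the model may or may not honor, and as a typed precondition a runtime evaluates before any model call. The predicate shape and field names here are assumptions of the sketch, not an existing API.

```python
# Advisory form: a sentence in the system prompt. Enforcement is whatever
# the model chooses to do with it at inference time.
advisory_rule = "Do not produce output if the user is outside jurisdiction X."

# Structural form: a typed activation predicate over the request context,
# evaluated by the runtime, which refuses dispatch on failure.
governance = {
    "activation": [
        {"field": "user_jurisdiction", "op": "in", "value": ["X"]},
    ]
}

def may_dispatch(governance: dict, ctx: dict) -> bool:
    for rule in governance["activation"]:
        if rule["op"] == "in":
            ok = ctx.get(rule["field"]) in rule["value"]
        else:
            ok = False  # unknown operators fail closed
        if not ok:
            return False  # runtime refuses; the model is never invoked
    return True

assert may_dispatch(governance, {"user_jurisdiction": "X"})
assert not may_dispatch(governance, {"user_jurisdiction": "Y"})
```

The difference is not the rule's content but its enforcement point: the structural form is checked before the model runs, so a violation is a refused execution rather than a hoped-for refusal in generated text.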

Threads compound this. A Thread is an ordered message log, persisted by OpenAI, attached to vector stores and files at the project scope. Memory in this model is conversational history, not a typed, lineage-bearing, governance-constrained field. There is no mechanism by which a Thread declares the policies under which its contents may be read, mutated, or exported. There is no per-message provenance that survives a serialization round-trip outside the OpenAI environment. The Thread is a useful runtime convenience; it is not a structural memory field of an agent.
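What "per-message provenance that survives a serialization round-trip" would look like can be sketched: each memory entry carries its origin and the policy under which it may be read or exported, all inside the object itself. The field names are illustrative, not an existing OpenAI structure.

```python
import json

# A memory entry that carries its own provenance and access policy, rather
# than relying on a hosting tier to know where it came from.
entry = {
    "content": "Customer prefers email over phone.",
    "provenance": {
        "source_run": "run_placeholder",  # hypothetical originating run
        "recorded_at": "2026-03-01T12:00:00Z",
    },
    "access_policy": {"read": ["support-agents"], "export": "deny"},
}

# Serialize out of any particular environment and back in; nothing about
# the record depends on the platform that held it.
round_tripped = json.loads(json.dumps(entry))
assert round_tripped == entry
```

A Thread message, by contrast, loses its project-scoped context the moment it leaves OpenAI's storage: the attachments, scoping, and access rules are properties of the tenant, not of the message.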

What an agent schema provides

A canonical agent schema specifies a small, fixed set of typed fields that together constitute an agent: an identity field whose value is derived from the content of the other fields and survives transport; a memory field with declared type, lineage, and access policy; a governance field expressing activation, mutation, and termination rules in a form the runtime evaluates rather than the model interprets; a capabilities field enumerating tools and their permissioning; an execution-state field describing the agent's current lifecycle position; and a lineage field linking the agent to its predecessors and to the operations that produced it. These fields are not optional. An object lacking any of them is not an agent under the schema; it is configuration that some runtime may choose to execute.
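The "not optional" claim can be expressed as a validation step: an object missing any of the six fields is rejected as mere configuration. The field names follow the paragraph above; representing the check as a key-set test on a dict is an assumption of this sketch.

```python
# The six fields that together constitute an agent under the schema.
REQUIRED_FIELDS = {
    "identity", "memory", "governance",
    "capabilities", "execution_state", "lineage",
}

def is_agent(obj: dict) -> bool:
    # An object is an agent only if every required field is present;
    # anything less is configuration some runtime may choose to execute.
    return REQUIRED_FIELDS.issubset(obj)

full = {field: {} for field in REQUIRED_FIELDS}
assert is_agent(full)

# An Assistants API body, by contrast, fails validation: it is configuration.
config_only = {"model": "gpt-4o", "instructions": "You are a support copilot."}
assert not is_agent(config_only)
```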

The point of the schema is not to add fields to OpenAI's Assistant. It is to relocate the locus of definition. Under the schema, the agent is the object whose serialized form contains those fields, and any conformant runtime can host it. Rules travel with the agent. Identity is content-derived and portable. Two systems exchanging an agent see the same object.

Composition pathway

The Assistants API is not displaced by the schema; it is wrapped. A schema-conformant agent serializes to a document that includes, among its capability entries, the OpenAI Assistant configuration needed to instantiate the agent's tool-using model loop. At runtime, a schema-aware host materializes the OpenAI Assistant on demand via the Assistants API, drives Runs against a Thread it manages itself, and treats the OpenAI side as a stateless inference back-end whose persisted Assistant and Thread objects can be regenerated from the schema document. Memory does not live in the Thread; it lives in the schema's memory field, and is projected into the Thread only as the Run requires. Governance is evaluated by the host before each Run is dispatched, refusing dispatch if activation predicates fail. Identity is the schema document's content hash, not the OpenAI-issued Assistant ID.
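The wrapping described above can be sketched with a stub standing in for the Assistants API (real use would call the OpenAI client over the network). The host class, field names, and hashing scheme are assumptions of this sketch, not a shipped product.

```python
import hashlib
import json

class StubOpenAIBackend:
    """Stands in for the Assistants API: a stateless, regenerable back-end."""

    def materialize_assistant(self, capability_config: dict) -> str:
        # In real use: POST /v1/assistants with the capability config.
        return "asst_stub"  # server-issued, organization-scoped ID

    def run(self, assistant_id: str, messages: list) -> str:
        # In real use: create a Thread, create a Run, stream deltas.
        return f"reply to: {messages[-1]}"

class SchemaHost:
    def __init__(self, agent_doc: dict, backend):
        self.doc, self.backend = agent_doc, backend

    def identity(self) -> str:
        # Identity is the schema document's content hash, not the asst_ ID.
        canonical = json.dumps(self.doc, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()[:16]

    def dispatch(self, user_message: str) -> str:
        # 1. Governance is evaluated by the host before any model call.
        if not self.doc["governance"].get("active", False):
            raise PermissionError("activation predicate failed; run not dispatched")
        # 2. The OpenAI Assistant is regenerable from the capabilities field.
        asst = self.backend.materialize_assistant(self.doc["capabilities"]["openai"])
        # 3. Memory lives in the schema document and is projected into the
        #    conversation only as the run requires.
        projected = self.doc["memory"]["entries"] + [user_message]
        return self.backend.run(asst, projected)

doc = {
    "governance": {"active": True},
    "capabilities": {"openai": {"model": "gpt-4o", "tools": [{"type": "file_search"}]}},
    "memory": {"entries": ["Customer prefers email."]},
    "lineage": [],
    "execution_state": "idle",
}
host = SchemaHost(doc, StubOpenAIBackend())
print(host.dispatch("Where is my order?"))
```

Substituting a non-OpenAI back-end means replacing `StubOpenAIBackend` and the `capabilities` entry it reads; the agent document, its identity, and its governance are untouched.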

The result is that customers continue to benefit from OpenAI's model quality, Code Interpreter, and File Search, while their agents become objects they own structurally rather than configurations they hold inside an OpenAI tenant. Migration to a non-OpenAI back-end (Anthropic, Google, an open-weights deployment) becomes a substitution at the capabilities-field level rather than a re-platforming.

Commercial and licensing

The Assistants API is a commercial OpenAI service billed on token consumption, tool-call surcharges (Code Interpreter session minutes, File Search storage and queries), and storage of files and vector stores. There is no source license; the runtime is proprietary. Customer data handling is governed by OpenAI's enterprise terms, with project-scoped API keys providing the access boundary. The agent schema layer described here is implemented entirely on the customer side and does not require any contractual change with OpenAI; OpenAI is invoked exactly as it is today, through the public API. The schema and its tooling are licensable independently. Customers running on the Assistants API today can adopt the schema incrementally, agent by agent, without disturbing existing deployments, and retain the option to substitute back-ends without renegotiating with any single vendor.

Invented by Nick Clark. Founding Investors: Anonymous, Devin Wilkie.