AutoGen Enabled Multi-Agent Conversations. The Agents Have No Structural Definition.

Nick Clark

by Nick Clark | Published March 27, 2026

Microsoft Research released AutoGen in late 2023 as an open-source Python framework for orchestrating conversations among multiple LLM-driven agents, with optional human-in-the-loop participation and pluggable code execution. Distributed under the MIT license and developed in the open on GitHub, AutoGen quickly became one of the reference implementations in a space that also includes LangGraph, CrewAI, OpenAI's Assistants API, and a growing field of in-house frameworks at large software vendors. Its conversation primitives (AssistantAgent, UserProxyAgent, GroupChat, and the more recent typed actor model in AutoGen 0.4) gave researchers and product teams a practical substrate for building systems where agents propose, critique, and refine each other's work.

What AutoGen does not provide, and what no comparable framework provides today, is a canonical schema that binds an agent's identity, governance policy, capability envelope, memory, execution state, and lineage to the agent object itself. An AutoGen agent is, structurally, a Python instance with a name, a system message, an LLM configuration, and a set of registered tools. The rules that ought to govern its behavior live in code, in the prompts that construct it, in the runtime that orchestrates it, and in the conventions of the team that deploys it. They are not bound to the agent. This article examines that structural gap and how Adaptive Query's agent-schema primitive composes above AutoGen to close it without displacing the orchestration model that has made the framework useful.

Vendor and product reality

AutoGen is a Microsoft Research project rather than a Microsoft product, and that distinction matters operationally. The codebase is hosted under the microsoft GitHub organization and accepts external contributions; the maintainer group includes researchers from Microsoft, Penn State, and the broader academic community. The MIT license places no restrictions on commercial use, derivative works, or redistribution. AutoGen Studio, a graphical interface for assembling agent teams, ships alongside the core library. AutoGen 0.4 introduced a redesigned actor-based runtime that addresses several of the threading and scaling limitations of the original design while preserving the conversation-oriented programming model that adopters have built around.

The competitive surface is crowded. LangGraph (from LangChain) emphasizes graph-structured agent workflows; CrewAI emphasizes role-based crews with hierarchical delegation; OpenAI's Assistants and Responses APIs provide hosted equivalents with built-in tool use and threading; Anthropic's Agent SDK provides a typed harness for tool-using agents; numerous in-house frameworks at companies running production agent systems implement variations on the same themes. Across all of these, agents are defined by some combination of a system prompt, an LLM configuration, a tool registry, and orchestration metadata. The agent object is, in every case, a software construct whose governance properties are external to its definition. This is not a deficiency unique to AutoGen; it is the state of the field. AutoGen is examined here because its open license, its visibility, and its actor-based runtime make it a particularly clear instance — and a particularly tractable composition target.

Architectural gap: rules and identity not bound to the agent

Consider what an AutoGen agent actually is. AssistantAgent takes a name, a system message, an LLM config, and optional tool registrations. UserProxyAgent adds human-input handling and code execution. GroupChat composes a set of agents with a speaker selection function and termination conditions. The agent's behavior at runtime is the concatenation of its system message, its tools, the conversation history it has accumulated, and the LLM's response sampling. There is no typed field on the agent that declares its governance scope. There is no typed field that bounds its capability envelope. There is no typed field that anchors its identity to a verifiable root. There is no lineage attached to the agent's mutations of memory or policy. The system message is prose, the tools are Python callables, the memory is whatever the conversation buffer happens to contain.

The operational consequences are familiar to anyone who has tried to put a multi-agent system into production. An agent whose trust posture has degraded — because its tool use has produced incorrect results, because its outputs have been rejected by downstream validators, because its integrity has drifted as the system message was edited across versions — continues to participate in group chats because the orchestration layer has no field to consult. An agent whose capability envelope was meant to exclude database writes accidentally gains that capability when a tool is registered without scope checks because there is no scope to check against. Two agents with the same name and the same system message are, by the framework's own model, functionally identical; identity is a string. Audit is reconstruction: who did what, under which version of the system message, with which tools available, against which memory state, is recoverable only if the operating team independently captured all of those moving parts at the time.
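The two failure modes described above can be made concrete with a minimal sketch. PlainAgent below is a hypothetical stand-in for an agent object that carries only a name, a system message, an LLM configuration, and a tool registry; it is not AutoGen's class, and the names register_tool and write_row are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


# Hypothetical stand-in for a framework agent object. This is NOT AutoGen's
# AssistantAgent; it only mirrors the four ingredients the article lists.
@dataclass
class PlainAgent:
    name: str
    system_message: str
    llm_config: Dict[str, object] = field(default_factory=dict)
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)

    def register_tool(self, tool_name: str, fn: Callable[..., object]) -> None:
        # There is no capability envelope on the agent, so there is
        # nothing to check the registration against.
        self.tools[tool_name] = fn


def write_row(table: str, row: dict) -> str:
    """Illustrative tool that would mutate a database in a real system."""
    return f"wrote {row} to {table}"


reviewer_a = PlainAgent("reviewer", "You review code.")
reviewer_b = PlainAgent("reviewer", "You review code.")

# Identity is a string: the two definitions are indistinguishable.
same_definition = reviewer_a == reviewer_b

# A reviewer that was never meant to write to a database gains that ability.
reviewer_a.register_tool("db_write", write_row)
gained_write = "db_write" in reviewer_a.tools
```

Nothing in this object records which version of the system message was in force or who authorized the new tool, which is why audit becomes reconstruction.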

The gap is not that AutoGen orchestrates poorly. Its conversation patterns, its code execution sandbox, its group chat abstractions, and the new actor runtime are well-considered solutions to the orchestration problem. The gap is that the things being orchestrated have no canonical structure. Governance, identity, capability, memory, and lineage are concerns the framework leaves to the integrator, and integrators handle them inconsistently because there is no shared schema to handle them against.

What the agent-schema primitive provides

Adaptive Query's agent-schema primitive defines a canonical typed structure that every agent carries as part of its definition. Six fields anchor the schema: identity (a verifiable root binding the agent to its governance scope and operator); memory (a typed reference to the agent's persistent state, with versioning and access policy); governance (the policy under which the agent operates, including trust posture, allowed scopes, and consensus requirements for mutations); capabilities (the typed envelope of tools, scopes, and resources the agent may invoke); execution state (the agent's current activity, including outstanding obligations and timing constraints); and lineage (the chain of mutations to identity, memory, governance, and capabilities, cryptographically bound to authorizing parties).
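As a rough illustration of the six fields, the sketch below models them as Python dataclasses. The article does not publish the schema's concrete layout, so every class name, attribute name, and type here (Identity, MemoryRef, trust_posture, mutation_quorum, and so on) is an assumption chosen to mirror the prose, not the primitive's actual definition.

```python
from dataclasses import dataclass, field, fields
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Identity:
    agent_id: str
    operator: str
    governance_scope: str
    root_fingerprint: str  # assumed placeholder for the verifiable root


@dataclass(frozen=True)
class MemoryRef:
    uri: str            # typed reference to persistent state, not the state itself
    version: int
    access_policy: str


@dataclass
class Governance:
    trust_posture: float              # assumed to be a score in [0, 1]
    allowed_scopes: Tuple[str, ...]
    mutation_quorum: int              # assumed form of "consensus requirements"


@dataclass(frozen=True)
class Capabilities:
    tools: FrozenSet[str]
    scopes: FrozenSet[str]
    resources: FrozenSet[str] = frozenset()


@dataclass
class ExecutionState:
    activity: str = "idle"
    obligations: List[str] = field(default_factory=list)
    deadline_seconds: Optional[float] = None


@dataclass(frozen=True)
class LineageEvent:
    mutated_field: str
    summary: str
    authorized_by: str
    signature: str  # assumed placeholder for the cryptographic binding


@dataclass
class AgentSchema:
    identity: Identity
    memory: MemoryRef
    governance: Governance
    capabilities: Capabilities
    execution_state: ExecutionState = field(default_factory=ExecutionState)
    lineage: List[LineageEvent] = field(default_factory=list)


agent = AgentSchema(
    identity=Identity("agent-001", "operator:acme", "scope:review", "fp:demo"),
    memory=MemoryRef("mem://agent-001", version=1, access_policy="owner-only"),
    governance=Governance(0.9, ("scope:review",), mutation_quorum=2),
    capabilities=Capabilities(frozenset({"search"}), frozenset({"read"})),
)
field_names = [f.name for f in fields(AgentSchema)]
```

Identity, capabilities, and lineage events are frozen in this sketch so that changing them requires constructing a new value, which is one simple way to force such changes through an explicit, recordable step.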

These fields are not annotations. They are the agent. An agent without a verifiable identity field cannot participate in a governed system; an agent whose capability envelope does not authorize a tool cannot invoke that tool, regardless of what the orchestrator requests; an agent whose governance posture has dropped below the scope's threshold is structurally excluded from contributing until it is restored or replaced. Two agents with identical system messages are distinct because their identity fields are distinct, their lineage chains are distinct, and their governance postures evolve independently.
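The enforcement rule in this paragraph can be sketched as a single gate that runs before any tool call. The names GovernedAgent, invoke_tool, and SCOPE_THRESHOLD, and the idea of trust posture as a number compared against a threshold, are assumptions for illustration; the article does not specify how posture is represented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


class CapabilityDenied(PermissionError):
    """The agent's envelope does not authorize the requested tool."""


class GovernanceExcluded(PermissionError):
    """The agent's posture is below the scope's threshold."""


@dataclass(frozen=True)
class GovernedAgent:
    agent_id: str
    system_message: str
    allowed_tools: FrozenSet[str]
    trust_posture: float


SCOPE_THRESHOLD = 0.6  # hypothetical threshold for the governing scope

REGISTRY: Dict[str, Callable[..., object]] = {
    "db_read": lambda key: f"value for {key}",
    "db_write": lambda row: "written",
}


def invoke_tool(agent: GovernedAgent, tool_name: str,
                registry: Dict[str, Callable[..., object]], *args: object) -> object:
    # Governance gates participation before anything else is considered.
    if agent.trust_posture < SCOPE_THRESHOLD:
        raise GovernanceExcluded(f"{agent.agent_id} is below the scope threshold")
    # The envelope, not the global registry, decides what may be invoked.
    if tool_name not in agent.allowed_tools:
        raise CapabilityDenied(f"{agent.agent_id} may not invoke {tool_name}")
    return registry[tool_name](*args)


reader = GovernedAgent("agent-001", "You review code.", frozenset({"db_read"}), 0.9)
twin = GovernedAgent("agent-002", "You review code.", frozenset({"db_read"}), 0.9)
degraded = GovernedAgent("agent-003", "You review code.", frozenset({"db_read"}), 0.2)

read_result = invoke_tool(reader, "db_read", REGISTRY, "order-7")
distinct_despite_same_prompt = reader != twin
```

Here the orchestrator can request db_write for reader as often as it likes; the call raises CapabilityDenied because the check is made against the agent's own envelope.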

The schema is platform-independent. It is implemented as a typed contract that can be enforced by any runtime that understands it, and it is verifiable by any party that can validate its signatures. An agent's fields are not held in a single framework's process memory; they are held against a governance anchor, the same kind of root that governs scopes in the broader Adaptive Query primitive set. This makes agents portable: an agent defined under the schema can be moved between AutoGen, LangGraph, a custom runtime, or a hosted service, and its governance properties travel with it.
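Portability of this kind depends on the agent's fields being serializable and verifiable outside the runtime that created them. The sketch below shows the shape of that round trip using only the standard library. It uses HMAC-SHA256 with a shared key as a stand-in for the signature scheme, which is an assumption: a scheme that any party can verify would normally use public-key signatures so that verifiers never hold the signing key. The function names and record layout are likewise hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(record: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte form to sign.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, anchor_key: bytes) -> str:
    # HMAC is a stand-in here; it is not a public-key signature.
    return hmac.new(anchor_key, canonical_bytes(record), hashlib.sha256).hexdigest()


def export_agent(record: dict, anchor_key: bytes) -> str:
    """Serialize the schema fields together with the anchor's signature."""
    return json.dumps({"record": record, "signature": sign_record(record, anchor_key)})


def import_agent(payload: str, anchor_key: bytes) -> dict:
    """Accept the agent in another runtime only if the signature checks out."""
    envelope = json.loads(payload)
    expected = sign_record(envelope["record"], anchor_key)
    if not hmac.compare_digest(expected, envelope["signature"]):
        raise ValueError("signature does not match the governance anchor")
    return envelope["record"]


ANCHOR_KEY = b"demo-anchor-key"  # illustrative only
record = {
    "identity": {"agent_id": "agent-001", "operator": "operator:acme"},
    "governance": {"trust_posture": 0.9, "allowed_scopes": ["read"]},
    "capabilities": {"tools": ["search"]},
}

payload = export_agent(record, ANCHOR_KEY)   # leaves runtime A
restored = import_agent(payload, ANCHOR_KEY)  # arrives in runtime B
```

If the payload is altered in transit, for example by adding a tool to the capability list, import_agent rejects it, so the receiving runtime never sees an envelope the anchor did not sign.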

Composition pathway with AutoGen

AutoGen's actor-based 0.4 runtime is well-positioned for composition with the agent-schema primitive because the actor model already separates agent behavior from message routing. Composition takes the form of a typed agent base class that carries the canonical schema fields and overrides the relevant lifecycle hooks. AssistantAgent, UserProxyAgent, and custom agents inherit from this base. At construction, the agent is bound to a governance anchor that supplies its identity, signs its initial governance posture, and authorizes its capability envelope. At each turn, the runtime consults the schema before dispatching: capability checks gate tool invocations, governance checks gate participation, lineage records each mutation.
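A minimal sketch of that base-class pattern follows. It deliberately does not import AutoGen: SchemaBoundAgent, on_turn, set_posture, and the Lineage class are hypothetical names, and a real integration would attach the same checks to whatever message-handling hooks the runtime exposes. The lineage is modeled as a SHA-256 hash chain, which makes tampering detectable but is a simplification of binding each event cryptographically to its authorizing party.

```python
import hashlib
import json
from typing import Callable, Dict, FrozenSet, List, Optional


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class Lineage:
    """Append-only log in which each entry commits to the previous one."""

    def __init__(self) -> None:
        self.entries: List[dict] = []

    def record(self, event: str, authorized_by: str, detail: str) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        body = {"event": event, "authorized_by": authorized_by,
                "detail": detail, "prev": prev}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        prev = "genesis"
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


class SchemaBoundAgent:
    """Hypothetical base class; concrete agents would inherit from it."""

    def __init__(self, name: str, anchor: str, allowed_tools: FrozenSet[str],
                 trust_posture: float, threshold: float) -> None:
        self.name = name
        self.allowed_tools = allowed_tools
        self.trust_posture = trust_posture
        self.threshold = threshold
        self.lineage = Lineage()
        # Construction binds the agent to its anchor and records the envelope.
        self.lineage.record("bind", anchor,
                            f"tools={sorted(allowed_tools)} posture={trust_posture}")

    def set_posture(self, value: float, authorized_by: str) -> None:
        self.lineage.record("governance_mutation", authorized_by,
                            f"posture {self.trust_posture} -> {value}")
        self.trust_posture = value

    def on_turn(self, tool_name: str, tools: Dict[str, Callable[..., object]],
                *args: object) -> Optional[object]:
        # Governance gates participation; capability gates the tool call.
        if self.trust_posture < self.threshold:
            self.lineage.record("turn_skipped", "runtime", "posture below threshold")
            return None
        if tool_name not in self.allowed_tools:
            self.lineage.record("tool_denied", "runtime", tool_name)
            return None
        result = tools[tool_name](*args)
        self.lineage.record("tool_call", self.name, tool_name)
        return result


tools = {"search": lambda q: f"results for {q}", "db_write": lambda row: "written"}
agent = SchemaBoundAgent("researcher", "anchor:demo", frozenset({"search"}), 0.9, 0.6)
first = agent.on_turn("search", tools, "actor model")
second = agent.on_turn("db_write", tools, {"id": 1})
agent.set_posture(0.3, "operator:alice")
third = agent.on_turn("search", tools, "anything")
events = [e["event"] for e in agent.lineage.entries]
```

Because every decision passes through on_turn, the lineage ends up holding the bind event, the permitted call, the denied call, the posture change, and the skipped turn in order, without the operating team having to capture them separately.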

For GroupChat and team scenarios, the speaker selection function is augmented with a governance filter: agents whose posture has dropped below the scope's threshold are skipped without ceremony, and their absence is recorded in the lineage so that audit can reconstruct why a particular turn went to a particular participant. Tool registration becomes scoped: a tool is registered against a capability declaration, and an agent's ability to invoke it is checked against its envelope rather than against a global registry. Memory is bound to a typed reference rather than carried in the conversation buffer, so that two agents sharing memory do so under explicit policy and so that memory mutations are themselves governed events.
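The governance filter on speaker selection might look like the following sketch. Round-robin ordering is an assumption made to keep the example short, as are the names Participant and select_speaker; the point is only that a skipped agent leaves a lineage entry explaining the skip.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Participant:
    name: str
    trust_posture: float


def select_speaker(participants: List[Participant], last_index: int,
                   threshold: float, lineage: List[dict]) -> Optional[int]:
    """Return the index of the next eligible speaker, or None if there is none.

    Walks the group in round-robin order starting after last_index and
    records every skip so that audit can explain who spoke and why.
    """
    count = len(participants)
    for offset in range(1, count + 1):
        index = (last_index + offset) % count
        candidate = participants[index]
        if candidate.trust_posture >= threshold:
            lineage.append({"event": "selected", "agent": candidate.name})
            return index
        lineage.append({
            "event": "skipped",
            "agent": candidate.name,
            "reason": f"posture {candidate.trust_posture} < {threshold}",
        })
    return None


group = [
    Participant("planner", 0.9),
    Participant("coder", 0.4),   # degraded: below the threshold
    Participant("critic", 0.8),
]
lineage: List[dict] = []
next_index = select_speaker(group, last_index=0, threshold=0.6, lineage=lineage)
```

After the planner speaks, the turn passes over the coder and goes to the critic, and the lineage shows both the skip and its reason.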

Human-in-the-loop participation, which AutoGen handles gracefully through UserProxyAgent, gains structural force under the schema. The human's identity, governance role, and capability envelope are first-class fields, and a human override of an agent decision is a recorded lineage event with an authorizing party rather than a side-channel input. AutoGen Studio's visual assembly remains useful: it composes typed agents whose runtime properties are now derived from the schema rather than from the studio's local configuration.
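One way to picture a human override as a lineage event rather than a side-channel input is sketched below. HumanParticipant, Decision, apply_override, and the "override" capability string are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class HumanParticipant:
    human_id: str
    governance_role: str
    capabilities: FrozenSet[str]


@dataclass
class Decision:
    agent_id: str
    action: str
    status: str = "proposed"


def apply_override(human: HumanParticipant, decision: Decision,
                   new_action: str, lineage: List[dict]) -> bool:
    """Apply a human override only if the human's envelope authorizes it."""
    if "override" not in human.capabilities:
        lineage.append({
            "event": "override_rejected",
            "requested_by": human.human_id,
            "agent": decision.agent_id,
        })
        return False
    lineage.append({
        "event": "override",
        "authorized_by": human.human_id,
        "role": human.governance_role,
        "agent": decision.agent_id,
        "from": decision.action,
        "to": new_action,
    })
    decision.action = new_action
    decision.status = "overridden"
    return True


lead = HumanParticipant("human:dana", "release-manager", frozenset({"override"}))
observer = HumanParticipant("human:lee", "observer", frozenset())
decision = Decision("agent-001", "deploy to production")
lineage: List[dict] = []

rejected = apply_override(observer, decision, "cancel deploy", lineage)
accepted = apply_override(lead, decision, "deploy to staging", lineage)
```

Both the rejected attempt and the accepted override are recorded, so a later audit can see who changed the decision and under which role.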

Commercial and licensing considerations

AutoGen's MIT license permits the composition pattern described here without friction, and Microsoft Research's stewardship is explicitly oriented toward enabling external runtimes and integrations rather than capturing them. Organizations evaluating multi-agent frameworks frequently weigh AutoGen against LangGraph, CrewAI, and hosted alternatives; the agent-schema primitive is framework-agnostic and reduces the cost of switching, because an agent defined under the schema is portable across runtimes. Teams that have already standardized on AutoGen gain governance, identity, capability binding, and lineage without rewriting their orchestration. Teams evaluating multiple frameworks gain a layer that lets them defer or revisit the runtime choice without rebuilding the agents.

The commercial picture for multi-agent systems is moving quickly: hosted assistant APIs, in-house frameworks at large vendors, and a steady output of open-source alternatives all compete for adoption on the orchestration axis. None of them, at the time of writing, treats the agent itself as a typed, governed, portable object. The gap that AutoGen leaves, and that the field as a whole leaves, is closed not by choosing one framework over another but by giving the agents that any of these frameworks orchestrate a canonical schema that binds identity, governance, capability, memory, execution state, and lineage as structural fields. AutoGen orchestrates conversations well. The agent-schema primitive defines, structurally, what is being orchestrated.