Fly Machines Made Micro-VMs Fast. The VMs Still Need External Orchestration.

by Nick Clark | Published March 28, 2026

Fly Machines is the second-generation execution primitive that powers Fly.io's v2 Apps platform. Built on Firecracker microVMs, scheduled by the Fly control plane, fronted by an anycast Layer 4 proxy, and backed by NVMe-attached Fly Volumes, Machines deliver hardware-isolated workloads with cold-start times measured in hundreds of milliseconds and resume-from-suspend times measured in tens of milliseconds. They are, by any honest engineering benchmark, one of the fastest publicly available VM substrates in production. Yet a Fly Machine remains an externally orchestrated object: its lifecycle, identity, network reachability, and policy envelope are all decided and enforced by the Fly control plane rather than by the Machine itself. The structural gap addressed by memory-resident execution is the gap between a fast micro-VM that the platform starts and stops, and a self-governing execution object that wakes, sleeps, migrates, and dies according to rules carried in its own state.


Vendor and product reality

Fly.io ships Machines as the canonical compute primitive of its v2 platform. A Machine is a Firecracker microVM created via the Machines API (POST /v1/apps/{app}/machines), provisioned with a specified image, CPU and memory shape, optional Fly Volume mounts, region pin, and a configuration block describing services, health checks, restart policy, and auto-stop behavior. Machines start in roughly 300 ms from a cold image pull and resume from suspended state in well under 100 ms; this is the genuine technical achievement that distinguishes Fly from container-on-VM platforms such as ECS or Cloud Run. The Fly proxy terminates Layer 4 traffic at every edge POP, performs anycast routing, and uses Fly's Replay mechanism to forward connections that landed in the wrong region toward the region where the appropriate Machine resides. Fly Volumes provide local NVMe persistence pinned to a host, with snapshot and fork primitives for rapid copy-on-write provisioning. The control plane (flyd, running on each worker) reconciles desired state, drives auto-start on incoming requests, drives auto-stop on idleness, and triggers migration when a host is drained.
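As a rough illustration of the provisioning call described above, the following Python sketch assembles (but does not send) a create-Machine request. The host name, the bearer-token header, and the field names inside the body are assumptions about the public Machines API and should be verified against Fly's current reference; the app name, image, region, and volume ID are hypothetical placeholders.

```python
import json
import urllib.request

API_BASE = "https://api.machines.dev"  # assumed public Machines API host


def build_create_machine_request(app: str, token: str) -> urllib.request.Request:
    """Assemble a POST /v1/apps/{app}/machines request without sending it."""
    body = {
        "region": "ams",  # region pin (placeholder)
        "config": {
            "image": "registry.fly.io/example-app:latest",  # placeholder image
            # CPU and memory shape (field names assumed)
            "guest": {"cpu_kind": "shared", "cpus": 1, "memory_mb": 256},
            # optional Fly Volume mount (placeholder volume ID)
            "mounts": [{"volume": "vol_placeholder", "path": "/data"}],
            "restart": {"policy": "on-failure"},
            "services": [
                {
                    "protocol": "tcp",
                    "internal_port": 8080,
                    "autostart": True,   # proxy may start the Machine on traffic
                    "autostop": "stop",  # proxy may stop the Machine when idle
                    "ports": [{"port": 443, "handlers": ["tls", "http"]}],
                }
            ],
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/v1/apps/{app}/machines",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_machine_request("example-app", "FLY_API_TOKEN_PLACEHOLDER")
```

Every field in this body describes resources, networking, or lifecycle behavior that the platform enforces from outside the Machine; none of it is evaluated by the workload itself.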

None of the foregoing is in dispute. Fly Machines is mature, well-engineered infrastructure, used in production by latency-sensitive consumer applications, regional inference workloads, and globally distributed databases. The analysis that follows is not a critique of Fly's quality. It is an analysis of where the Fly Machine object sits on a spectrum from externally orchestrated workload to self-governing execution object, and what that location implies for workloads that need governance to travel with the executing entity.

The architectural gap

A Fly Machine's lifecycle is owned by the Fly control plane. The Machines API is the authoritative interface: external callers issue start, stop, suspend, destroy, and update commands, and flyd reconciles the worker host accordingly. The Machine itself is a passive recipient of these state transitions. It does not evaluate whether it should be running. It does not consult any rule embedded in its own image or memory to decide that an inbound request falls outside its operating envelope and should be rejected at the lifecycle level. Auto-stop is a useful efficiency feature, but the predicate is request idleness measured by the Fly proxy, not a semantic predicate evaluated by the workload against its own governance state. Auto-start fires when traffic arrives at the proxy; the Machine has no opportunity to refuse activation on the basis that conditions inside its governed memory are not yet satisfied.
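The shape of an externally owned lifecycle can be sketched in a few lines. This is a deliberately simplified model, not Fly's actual proxy or flyd logic: the names `ObservedMachine` and `external_decision` and the 300-second idle timeout are hypothetical. The point is that every input to the decision is observed from outside the Machine.

```python
from dataclasses import dataclass


@dataclass
class ObservedMachine:
    state: str              # "started" or "stopped", as seen by the orchestrator
    idle_seconds: float     # time since the proxy last saw a request
    pending_requests: int   # requests currently waiting at the proxy


def external_decision(m: ObservedMachine, idle_timeout: float = 300.0) -> str:
    """Return the transition an external orchestrator would impose.

    Nothing here consults the workload's own state: the only inputs are
    traffic and idleness as measured outside the Machine.
    """
    if m.state == "stopped" and m.pending_requests > 0:
        return "start"  # auto-start: traffic arrived at the proxy
    if (
        m.state == "started"
        and m.pending_requests == 0
        and m.idle_seconds >= idle_timeout
    ):
        return "stop"   # auto-stop: idleness exceeded the threshold
    return "none"
```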

Identity is similarly external. A Machine is identified by the Fly-issued machine ID, attached to a Fly app and organization, and its network identity is whatever the Fly proxy and WireGuard mesh assign. There is no canonical, content-addressed identity bound to the workload's own definition that survives migration to a different cluster, a different vendor, or a snapshot/restore cycle outside Fly's control. Policy is likewise external: TLS termination, rate limits, region pinning, and access lists are configured on the Fly side. A Machine that is exfiltrated as a raw rootfs and booted on bare Firecracker elsewhere does not carry its operating rules with it; the rules lived in the Fly control plane, not in the Machine object.
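A content-addressed identity of the kind described as missing here could, under simple assumptions, be derived by hashing a canonical serialization of the workload's definition. The sketch below is illustrative only: the function name `content_identity` and the definition fields are hypothetical, and canonical JSON with SHA-256 is one reasonable choice among several.

```python
import hashlib
import json


def content_identity(definition: dict) -> str:
    """Derive an identity from the definition itself, not from a platform ID."""
    # Sorted keys and fixed separators make the serialization deterministic,
    # so the same definition hashes to the same identity on any host.
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical workload definition; key order does not affect the result.
definition_a = {"image_digest": "placeholder", "governance_version": 3}
definition_b = {"governance_version": 3, "image_digest": "placeholder"}
same_identity = content_identity(definition_a) == content_identity(definition_b)
```

Because the identity depends only on content, it would be unchanged by a move to a different host, cluster, or vendor, whereas a platform-issued machine ID would not be.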

The consequence for workloads whose correctness depends on governance is that Fly Machines is the wrong abstraction layer at which to enforce that governance. A workload that must remain dormant unless its lineage chain is intact, or must refuse activation when its memory commitments are stale, cannot encode that as a Fly configuration field. The configuration vocabulary is about resources, networking, and lifecycle hooks, not about semantic conditions over the workload's own state.
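To make the two example conditions concrete, the sketch below models a lineage chain as a simple hash chain and a memory commitment as a timestamp with a maximum age. Both models are assumptions made for illustration; the text does not specify how lineage or commitments are represented, and all names here are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LineageEntry:
    parent_digest: str  # digest of the previous entry, "" for the root
    payload: str        # description of the change this entry records

    def digest(self) -> str:
        data = (self.parent_digest + "\n" + self.payload).encode("utf-8")
        return hashlib.sha256(data).hexdigest()


def lineage_intact(chain: List[LineageEntry]) -> bool:
    """True only if every entry points at the digest of its predecessor."""
    if not chain:
        return False  # an object with no lineage is treated as not intact
    expected_parent = ""
    for entry in chain:
        if entry.parent_digest != expected_parent:
            return False
        expected_parent = entry.digest()
    return True


def commitments_fresh(committed_at: float, max_age_s: float, now: float) -> bool:
    """True if the last memory commitment is no older than max_age_s seconds."""
    return (now - committed_at) <= max_age_s


def may_activate(chain: List[LineageEntry], committed_at: float,
                 max_age_s: float, now: float) -> bool:
    """Semantic activation predicate over the workload's own state."""
    return lineage_intact(chain) and commitments_fresh(committed_at, max_age_s, now)
```

A predicate like `may_activate` has no natural home in a configuration vocabulary about resources and networking, which is the gap this section describes.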

What memory-resident execution provides

A memory-resident execution object inverts the locus of control. The execution object carries, as part of its serialized form, a typed memory field, a governance field describing the rules under which it may activate or remain active, an identity field that is content-derived and survives transport, and a lifecycle predicate that the object itself evaluates whenever a transition is proposed. Wake is no longer a unilateral decision by an external scheduler. It is a proposal that the object accepts or rejects based on its own evaluation. Dormancy is not idleness measured externally; it is a state the object enters because its governance rules indicate that no further work is warranted. Resumption is not a control-plane reconcile loop; it is a semantic event triggered by an arriving message, a temporal condition, or a memory mutation that the object has declared itself sensitive to.
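One way to picture such an object is as a small data structure whose wake transition is a proposal it can refuse. The sketch below is a minimal model under stated assumptions: the class and field names, the dictionary-based governance rules, and the `pending_work` key are all hypothetical and stand in for whatever typed representation a real runtime would use.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class WakeProposal:
    trigger: str  # e.g. "message", "timer", or "memory-mutation"
    region: str   # where the external scheduler proposes to run the object


@dataclass
class ExecutionObject:
    identity: str               # content-derived, survives transport
    memory: Dict[str, Any]      # typed memory field (simplified to a dict)
    governance: Dict[str, Any]  # rules under which activation is allowed
    state: str = "dormant"

    def propose_wake(self, proposal: WakeProposal) -> bool:
        """Lifecycle predicate: the object accepts or rejects the proposal."""
        ok = (
            proposal.trigger in self.governance.get("wake_triggers", [])
            and proposal.region in self.governance.get("allowed_regions", [])
        )
        if ok:
            self.state = "active"
        return ok

    def settle(self) -> str:
        """Enter dormancy when the object's own state shows no work remains."""
        if not self.memory.get("pending_work"):
            self.state = "dormant"
        return self.state
```

In this model a scheduler can still propose a wake at any time, but a proposal from a region or trigger the governance field does not list leaves the object dormant.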

This changes where control resides in the system. Where Fly Machines provides a fast and well-isolated vehicle for workloads, memory-resident execution provides a passenger that knows where it is going and refuses to be driven elsewhere. The two abstractions are orthogonal: the vehicle does not need to understand the passenger's rules, and the passenger does not need to manage the vehicle's worker pool, host draining, or anycast routing.

Composition pathway

The composition is straightforward and, importantly, non-destructive to either party. Fly Machines remains the host substrate. A memory-resident execution object is materialized as a Machine image plus an attached governance bundle. On boot, the in-Machine runtime reads the bundle, hydrates the typed memory, evaluates the activation predicate, and either proceeds to serve traffic or transitions to dormancy and signals the Fly proxy to stop the Machine. Auto-stop continues to function for cost efficiency, but is augmented by predicate-driven dormancy that fires before idleness would. Auto-start traffic events are funneled through the runtime's accept/reject gate before the workload reaches the application port. Migration via Fly's host-drain mechanism is preserved; the governance bundle travels with the rootfs and Volume snapshot, so the post-migration object reconstructs identity from content rather than from the Fly machine ID.
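The boot sequence described above can be sketched as a single function. This is an illustration under assumptions, not a definitive runtime: the bundle is modeled as JSON with `memory` and `governance` keys, the example predicate compares a hypothetical `epoch` against a hypothetical `min_epoch`, and `serve` and `go_dormant` are caller-supplied callbacks standing in for starting the application and asking the platform to stop the Machine.

```python
import json
from typing import Callable, Dict


def epoch_predicate(memory: Dict, governance: Dict) -> bool:
    """Hypothetical activation rule: memory must be at least as new as required."""
    return memory.get("epoch", -1) >= governance.get("min_epoch", 0)


def boot(
    bundle_json: str,
    predicate: Callable[[Dict, Dict], bool],
    serve: Callable[[Dict], None],
    go_dormant: Callable[[str], None],
) -> str:
    bundle = json.loads(bundle_json)             # 1. read the governance bundle
    memory = dict(bundle.get("memory", {}))      # 2. hydrate the typed memory
    governance = dict(bundle.get("governance", {}))
    if predicate(memory, governance):            # 3. evaluate the activation predicate
        serve(memory)                            # 4a. proceed to serve traffic
        return "serving"
    go_dormant("activation predicate not satisfied")  # 4b. transition to dormancy
    return "dormant"
```

Keeping the predicate and the two outcomes as parameters leaves the substrate-specific parts (how traffic is served, how a stop is signaled) outside the governance logic, which matches the separation between vehicle and passenger.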

Fly's Replay header gains a second purpose. Where today it routes a connection to the region holding the appropriate Volume, in the composed system it can also route to the region holding the appropriate governance scope, with the in-Machine runtime accepting only those replays whose target identity matches the object resident on the host. This makes the Fly proxy a transport-layer accelerant for governance-aware routing without requiring the proxy itself to understand governance.
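A governance-aware replay decision might look like the following sketch. It assumes that a replay is requested by returning a `fly-replay` response header of the form `region=<code>`, which should be checked against Fly's documentation. The scope directory mapping identities to regions, the way the target identity reaches the runtime, and the status codes chosen are all hypothetical.

```python
from typing import Dict, Tuple


def route(
    target_identity: str,
    resident_identity: str,
    scope_regions: Dict[str, str],
) -> Tuple[int, Dict[str, str]]:
    """Decide whether to accept a request locally or ask the proxy to replay it.

    Returns an HTTP status and response headers. The proxy never inspects
    governance; it only acts on the replay header the runtime emits.
    """
    if target_identity == resident_identity:
        return 200, {}  # the addressed object lives here: accept
    region = scope_regions.get(target_identity)
    if region is None:
        return 404, {}  # unknown identity: no governance scope to route to
    # Status code is an arbitrary choice for this sketch; the header carries
    # the routing instruction.
    return 409, {"fly-replay": f"region={region}"}


# Hypothetical scope directory: which region holds which object's scope.
scopes = {"sha256:aaa": "ams", "sha256:bbb": "sjc"}
decision = route("sha256:bbb", "sha256:aaa", scopes)
```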

Commercial and licensing

Fly Machines is a commercial service of Fly.io, billed on a per-second basis for active VM time, with separate billing for Volumes, egress, and Anycast IPs. There is no source license at the platform level; the Machines API and flyd control plane are proprietary. Firecracker, the underlying VMM, is licensed under Apache 2.0 and maintained independently by AWS. The composition described above does not require any change to Fly's commercial terms or to Firecracker. The memory-resident execution layer is implemented inside the customer's Machine image and bundle, and is licensable independently of the Fly substrate. Customers deploying memory-resident execution on Fly retain their existing Fly contract; the additional layer is compatible with both Fly's hobby and organization plans, and with self-hosted Firecracker for customers who eventually move off Fly. No exclusivity, source disclosure, or platform integration on Fly's part is required for the composition to function.

Invented by Nick Clark. Founding investors: Anonymous, Devin Wilkie.