Sign in Start free

AXME

Durable execution: the pattern that makes long-running operations reliable

A complete guide to durable execution — what it is, when you need it, how it differs from polling and webhooks, and how to implement it for AI agent workflows and backend services.

Start free Read docs

Durable execution means an operation can pause for hours or days — waiting for humans, tools, or other agents — and resume exactly where it left off after crashes or deploys.

What durable execution is (and why polling/webhooks aren't it)

Polling is a while-loop tax. Webhooks are a 200-line tax per integration. Durable execution means the agent process can exit at step 47 — Cloud holds state until the human, API, or peer agent is ready.

In production, this shows up when a prototype works in a notebook but breaks the first time a deploy restarts mid-run, a manager takes a day to approve, or a second agent needs the same state. The failure mode is almost never the model — it is missing lifecycle infrastructure.

AXME models work as durable intents: submit once, wait in a known state, resume with audit. That lets you keep LangGraph, CrewAI, OpenAI Agents, or your own stack while shipping the same patterns operations and compliance teams expect.

When you evaluate build-vs-buy, ask three questions: does state survive process restarts, can humans approve without a bespoke webhook stack, and is audit intent-level rather than log archaeology? Teams that answer yes ship faster through incidents because one ID ties model output, tools, approvals, and retries.

The patterns below are framework-agnostic. Wire AXME at boundaries — after a graph node, before a cross-service call, or when Mesh policy must enforce spend and tool scope — rather than rewriting agent logic you already trust.

The anatomy of a durable operation: submit → wait → complete

Submit an intent, enter a waiting state (human, tool, agent, or time), then complete or fail with full audit. No busy loops in your agent process.

In production, this shows up when a prototype works in a notebook but breaks the first time a deploy restarts mid-run, a manager takes a day to approve, or a second agent needs the same state. The failure mode is almost never the model — it is missing lifecycle infrastructure.

AXME models work as durable intents: submit once, wait in a known state, resume with audit. That lets you keep LangGraph, CrewAI, OpenAI Agents, or your own stack while shipping the same patterns operations and compliance teams expect.

When you evaluate build-vs-buy, ask three questions: does state survive process restarts, can humans approve without a bespoke webhook stack, and is audit intent-level rather than log archaeology? Teams that answer yes ship faster through incidents because one ID ties model output, tools, approvals, and retries.

The patterns below are framework-agnostic. Wire AXME at boundaries — after a graph node, before a cross-service call, or when Mesh policy must enforce spend and tool scope — rather than rewriting agent logic you already trust.

State persistence: how operations survive crashes

Intent state lives in AXME Cloud — not in process memory. Deploys, OOM kills, and serverless cold starts do not lose in-flight work.

In production, this shows up when a prototype works in a notebook but breaks the first time a deploy restarts mid-run, a manager takes a day to approve, or a second agent needs the same state. The failure mode is almost never the model — it is missing lifecycle infrastructure.

AXME models work as durable intents: submit once, wait in a known state, resume with audit. That lets you keep LangGraph, CrewAI, OpenAI Agents, or your own stack while shipping the same patterns operations and compliance teams expect.

When you evaluate build-vs-buy, ask three questions: does state survive process restarts, can humans approve without a bespoke webhook stack, and is audit intent-level rather than log archaeology? Teams that answer yes ship faster through incidents because one ID ties model output, tools, approvals, and retries.

The patterns below are framework-agnostic. Wire AXME at boundaries — after a graph node, before a cross-service call, or when Mesh policy must enforce spend and tool scope — rather than rewriting agent logic you already trust.

5 delivery modes: which to use when

Stream for live UIs, poll for mobile, push for webhooks, inbox for human tasks, and direct SDK for backend services — one protocol, multiple bindings.

In production, this shows up when a prototype works in a notebook but breaks the first time a deploy restarts mid-run, a manager takes a day to approve, or a second agent needs the same state. The failure mode is almost never the model — it is missing lifecycle infrastructure.

AXME models work as durable intents: submit once, wait in a known state, resume with audit. That lets you keep LangGraph, CrewAI, OpenAI Agents, or your own stack while shipping the same patterns operations and compliance teams expect.

When you evaluate build-vs-buy, ask three questions: does state survive process restarts, can humans approve without a bespoke webhook stack, and is audit intent-level rather than log archaeology? Teams that answer yes ship faster through incidents because one ID ties model output, tools, approvals, and retries.

The patterns below are framework-agnostic. Wire AXME at boundaries — after a graph node, before a cross-service call, or when Mesh policy must enforce spend and tool scope — rather than rewriting agent logic you already trust.

Durable execution vs deterministic replay (Temporal vs AXME)

Temporal replays deterministic workflow code — powerful for microservices, blocking for LLM steps. AXME tracks intent-level state without forcing determinism in agent code.

In production, this shows up when a prototype works in a notebook but breaks the first time a deploy restarts mid-run, a manager takes a day to approve, or a second agent needs the same state. The failure mode is almost never the model — it is missing lifecycle infrastructure.

AXME models work as durable intents: submit once, wait in a known state, resume with audit. That lets you keep LangGraph, CrewAI, OpenAI Agents, or your own stack while shipping the same patterns operations and compliance teams expect.

When you evaluate build-vs-buy, ask three questions: does state survive process restarts, can humans approve without a bespoke webhook stack, and is audit intent-level rather than log archaeology? Teams that answer yes ship faster through incidents because one ID ties model output, tools, approvals, and retries.

The patterns below are framework-agnostic. Wire AXME at boundaries — after a graph node, before a cross-service call, or when Mesh policy must enforce spend and tool scope — rather than rewriting agent logic you already trust.

When you don't need durable execution

Synchronous request/response and fire-and-forget jobs may not need intents. Adopt durability when humans, external APIs, or multi-step agent flows enter the picture.

In production, this shows up when a prototype works in a notebook but breaks the first time a deploy restarts mid-run, a manager takes a day to approve, or a second agent needs the same state. The failure mode is almost never the model — it is missing lifecycle infrastructure.

AXME models work as durable intents: submit once, wait in a known state, resume with audit. That lets you keep LangGraph, CrewAI, OpenAI Agents, or your own stack while shipping the same patterns operations and compliance teams expect.

When you evaluate build-vs-buy, ask three questions: does state survive process restarts, can humans approve without a bespoke webhook stack, and is audit intent-level rather than log archaeology? Teams that answer yes ship faster through incidents because one ID ties model output, tools, approvals, and retries.

The patterns below are framework-agnostic. Wire AXME at boundaries — after a graph node, before a cross-service call, or when Mesh policy must enforce spend and tool scope — rather than rewriting agent logic you already trust.

Implementation patterns

Wrap framework runs (LangGraph, CrewAI, OpenAI Agents) with AXME adapters. Use wait_for_human at approval nodes and wait_for_tool for long external calls.

In production, this shows up when a prototype works in a notebook but breaks the first time a deploy restarts mid-run, a manager takes a day to approve, or a second agent needs the same state. The failure mode is almost never the model — it is missing lifecycle infrastructure.

AXME models work as durable intents: submit once, wait in a known state, resume with audit. That lets you keep LangGraph, CrewAI, OpenAI Agents, or your own stack while shipping the same patterns operations and compliance teams expect.

When you evaluate build-vs-buy, ask three questions: does state survive process restarts, can humans approve without a bespoke webhook stack, and is audit intent-level rather than log archaeology? Teams that answer yes ship faster through incidents because one ID ties model output, tools, approvals, and retries.

The patterns below are framework-agnostic. Wire AXME at boundaries — after a graph node, before a cross-service call, or when Mesh policy must enforce spend and tool scope — rather than rewriting agent logic you already trust.

Polling vs durable intent

Polling

while not done:
    status = api.poll(job_id)
    sleep(30)

AXME intent

intent = axme.submit(job)
await intent.wait_for_tool(api="crm")

Frequently asked questions

Is durable execution the same as idempotent webhooks?: Idempotency prevents duplicate side effects on retry. Durable execution also models long waits, human tasks, and cross-agent state without keeping a worker process alive.
When should I still use Temporal?: Temporal excels at deterministic microservice workflows. AXME fits agentic steps, LLM calls, and human gates where replaying identical code paths is impractical.
Do I have to replace LangGraph, CrewAI, or Temporal?: No. AXME complements orchestration frameworks and workflow engines. You keep agent graphs and workers; intents add durability, HITL, audit, and fleet controls where those tools stop.
How is this different from observability alone?: Dashboards show symptoms after the fact. Intents carry lifecycle state, waiting semantics, and policy enforcement so you can pause, approve, cap spend, or kill one agent without redeploying the fleet.

Related reading

Deeper dives from the AXME blog.

Further reading

Durable execution product Waiting states Delivery modes vs Temporal Replace polling use case

Ship your first durable agent — in under 10 minutes.

Free tier. No credit card. Self-host or hosted — your choice.

Start free now Read the docs