Retries & Backoff

Configure automatic retries and backoff strategies

Stepflow provides built-in support for automatic retries with exponential backoff. When a workflow or a step fails, Stepflow can automatically retry the execution, skipping any steps that have already completed successfully.

Workflow Retries

You can configure retries at the workflow level. When the workflow handler throws, Stepflow re-invokes the entire handler, up to the configured number of retries.

const myWorkflow = createWorkflow(
  {
    id: "my-workflow",
    retries: 3, // Number of retry attempts
    retryDelay: "1s", // Initial delay before first retry
  },
  async (ctx) => {
    // ...
  },
);

Exponential Backoff

Stepflow uses an exponential backoff strategy for retries. The delay before each retry is calculated as follows, where attempt is the retry number, starting at 1:

delay = initialDelay * 2^(attempt - 1)

For example, with retryDelay: "1s" and retries: 3:

  • Retry 1: 1s delay
  • Retry 2: 2s delay
  • Retry 3: 4s delay
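
The formula above can be checked with a few lines of TypeScript. This is a minimal sketch: backoffDelaysMs is a hypothetical helper, not part of Stepflow, and it takes the initial delay in milliseconds rather than as a duration string.

```typescript
// Hypothetical helper (not a Stepflow export): lists the delay before each retry.
function backoffDelaysMs(initialDelayMs: number, retries: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= retries; attempt++) {
    // delay = initialDelay * 2^(attempt - 1)
    delays.push(initialDelayMs * 2 ** (attempt - 1));
  }
  return delays;
}

// retryDelay "1s" with 3 retries
const oneSecondDelays = backoffDelaysMs(1000, 3); // [1000, 2000, 4000]

// Sum of all delays if every retry is used
const totalWaitMs = oneSecondDelays.reduce((sum, ms) => sum + ms, 0); // 7000
```

Because the delay doubles each time, the total wait grows quickly with the retry count, which is worth keeping in mind when choosing a value for retries.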

Step Retries

You can also configure retries for individual steps. This is useful for steps that interact with flaky external APIs.

await ctx.step.run(
  "fetch-api",
  async () => {
    return await flakyAPI();
  },
  {
    retries: 5,
    retryDelay: "500ms",
  },
);

Note: Step-level retries are currently handled within the worker's execution loop.
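
To make the retry semantics concrete, here is a rough sketch of a retry loop with exponential backoff. It is not Stepflow's implementation. The names runWithRetries, RetryOptions, and Sleep are hypothetical, delays are given in milliseconds, and the sketch assumes that retries counts the attempts made after the initial execution.

```typescript
type RetryOptions = { retries: number; retryDelayMs: number };

// Injected timer so the sketch can be exercised without real waiting.
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical retry loop: one initial execution plus up to `retries` retries.
async function runWithRetries<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    if (attempt > 0) {
      // Retry n waits retryDelayMs * 2^(n - 1) before executing.
      await sleep(options.retryDelayMs * 2 ** (attempt - 1));
    }
    try {
      return await fn();
    } catch (error) {
      lastError = error; // Remember the failure and try again if retries remain.
    }
  }
  throw lastError; // All executions failed; surface the last error.
}
```

Under these assumptions, a call such as runWithRetries(flakyAPI, { retries: 5, retryDelayMs: 500 }) would execute flakyAPI at most six times.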

Checkpointing

The most powerful feature of Stepflow's retry system is checkpointing. When a workflow retries, it doesn't start from the beginning. Instead, it resumes execution, and any ctx.step.run() calls that previously completed successfully will return their cached results immediately without re-executing.

await ctx.step.run("step-1", async () => {
  // If this succeeds, it will NEVER run again for this Run ID
});

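throw new Error("Fail!"); // Stands in for a transient failure; the workflow will retry

await ctx.step.run("step-2", async () => {
  // Reached only on an attempt where the failure above no longer occurs;
  // step-1 returns its cached result on that attempt instead of re-running
});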
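
The checkpointing behavior can be modeled with a small in-memory toy. This is only an illustration of the idea, not how Stepflow stores results. ToyRun and runToyWorkflow are hypothetical names, backoff delays are left out for brevity, and step results are kept in a Map keyed by step ID.

```typescript
// Toy model of a single run: completed step results are cached by step ID.
class ToyRun {
  private readonly results = new Map<string, unknown>();

  async step<T>(id: string, fn: () => Promise<T>): Promise<T> {
    if (this.results.has(id)) {
      // Step already completed in this run: return the cached result.
      return this.results.get(id) as T;
    }
    const value = await fn();
    this.results.set(id, value); // Checkpoint only after the step succeeds.
    return value;
  }
}

// Re-invokes the handler on failure, sharing one ToyRun across all attempts.
// Resolves with the number of retries that were needed.
async function runToyWorkflow(
  handler: (run: ToyRun, attempt: number) => Promise<void>,
  retries: number,
): Promise<number> {
  const run = new ToyRun();
  for (let attempt = 0; ; attempt++) {
    try {
      await handler(run, attempt);
      return attempt;
    } catch (error) {
      if (attempt >= retries) throw error; // Out of retries: give up.
    }
  }
}
```

If a handler runs step-1, throws on its first attempt only, and then runs step-2, this toy executes each step body exactly once: the second attempt gets step-1 from the cache and continues on to step-2.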

Handling Non-Retryable Errors

Sometimes you want a workflow to stop immediately without retrying. There are two ways to do this: catch the error and return instead of re-throwing it, so the handler finishes without triggering a retry, or set retries to 0 on the step that must not be retried.

try {
  await doSomething();
} catch (error) {
  if (error instanceof FatalError) {
    ctx.log.error("Fatal error, stopping workflow");
    return { status: "failed", error: error.message };
  }
  throw error; // Re-throw to trigger automatic retry
}
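
The pattern above can be tried outside a workflow. In the sketch below, FatalError is assumed to be an error class defined by your application (the source does not say where it comes from), and guard and Outcome are hypothetical names used only for illustration.

```typescript
// Assumed application-defined error class marking failures that must not be retried.
class FatalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FatalError";
  }
}

type Outcome = { status: "ok" } | { status: "failed"; error: string };

// Hypothetical wrapper: swallows fatal errors, lets everything else propagate.
async function guard(task: () => Promise<void>): Promise<Outcome> {
  try {
    await task();
    return { status: "ok" };
  } catch (error) {
    if (error instanceof FatalError) {
      // Returning normally means nothing is thrown, so no retry is triggered.
      return { status: "failed", error: error.message };
    }
    throw error; // Any other error propagates and would trigger a retry.
  }
}
```

Keeping the fatal/transient decision in one place like this makes it easier to see which failures are allowed to consume retries.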

Configuration Options

Option       Type       Default   Description
retries      number     0         Maximum number of retry attempts.
retryDelay   Duration   1s        Initial delay before the first retry.
