Error Classification & Retry

Starfish's SyncManager retries on conflict errors (409) automatically. This page covers retry strategies for all other transient errors — rate limits, server failures, and network outages — at the HTTP layer.

Prerequisites: StarfishClient, Offline & Connectivity

Error Classification

Every error from a sync operation falls into one of these categories:

Category | Cause | Retryable | SDK handling
--- | --- | --- | ---
Network | fetch throws (TypeError, AbortError) | Yes | None (client throws)
401 Unauthorized | Expired or invalid auth token | Once (after refresh) | None (throws StarfishHttpError)
409 Conflict | Hash mismatch on push | Yes | SyncManager retries automatically
429 Rate Limited | Too many requests | Yes (with backoff) | Opt-in: cache fallback + background retry (see below)
5xx Server Error | Server bug or overload | Yes (with backoff) | Opt-in: cache fallback + background retry (see below)
Other 4xx | Bad request, forbidden, etc. | No | None (throws StarfishHttpError)

Classifier function

import { StarfishHttpError } from "@drakkar.software/starfish-client"

type ErrorCategory = "network" | "auth" | "conflict" | "rate-limited" | "server" | "client" | "unknown"

function classifyError(err: unknown): ErrorCategory {
  if (err instanceof StarfishHttpError) {
    if (err.status === 401) return "auth"
    if (err.status === 409) return "conflict"
    if (err.status === 429) return "rate-limited"
    if (err.status >= 500) return "server"
    return "client"
  }
  // fetch rejects with TypeError on network failure
  if (err instanceof TypeError) return "network"
  // Aborted or timed-out requests surface as an error named "AbortError", not a TypeError
  if (err instanceof Error && err.name === "AbortError") return "network"
  return "unknown"
}
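
The Retryable column of the table can also be expressed as a lookup that callers consult after classifying an error. The following is a minimal sketch: `RetryPolicy` and `retryPolicyFor` are hypothetical names that are not part of the SDK, the `maxAttempts` values are illustrative, and `ErrorCategory` is repeated so the snippet stands alone.

```typescript
type ErrorCategory =
  | "network" | "auth" | "conflict" | "rate-limited" | "server" | "client" | "unknown"

// Hypothetical policy shape, not an SDK type.
interface RetryPolicy {
  retryable: boolean
  /** Wait with exponential backoff before retrying. */
  backoff: boolean
  /** Illustrative upper bound on retries; 0 means "do not retry". */
  maxAttempts: number
}

function retryPolicyFor(category: ErrorCategory): RetryPolicy {
  switch (category) {
    case "network":
      // The table only says "Yes"; backing off here mirrors the retry wrapper on this page.
      return { retryable: true, backoff: true, maxAttempts: 3 }
    case "auth":
      // Retry once, and only after credentials were refreshed.
      return { retryable: true, backoff: false, maxAttempts: 1 }
    case "conflict":
      // Retryable, but SyncManager already does it; avoid stacking a second layer.
      return { retryable: true, backoff: false, maxAttempts: 0 }
    case "rate-limited":
    case "server":
      return { retryable: true, backoff: true, maxAttempts: 3 }
    case "client":
    case "unknown":
      return { retryable: false, backoff: false, maxAttempts: 0 }
  }
}
```

Keeping the policy in one place means UI code and retry code agree on which failures are worth another attempt.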

Stale-While-Revalidate

For offline-first apps with a pull cache, you can make transient server failures (429, 5xx) serve the last-synced snapshot immediately instead of throwing, and retry silently in the background.

Configure cacheFallbackStatuses on the client:

import { StarfishClient } from "@drakkar.software/starfish-client"

const client = new StarfishClient({
  baseUrl: "https://api.example.com/v1",
  capProvider,
  cache: myPullCache, // required: no cache means nothing to serve
  cacheFallbackStatuses: [429, 500, 502, 503, 504],
  onRevalidated: (path, result) => {
    // Fresh snapshot available: signal the app to re-pull
    reportReachability(true)
  },
})

How it works:

  1. A structured pull() receives a 429 or 5xx response.
  2. If a cached snapshot exists for that document, pull() returns it immediately tagged stale (pullWasFromCache(result) === true).
  3. A background revalidation loop starts — honoring any Retry-After header — and retries up to 5 times.
  4. When the server returns a 2xx response, the fresh snapshot is written through to the cache and onRevalidated fires.

Important constraints:

  • cache must be configured — when no snapshot exists, the error propagates as before (nothing to serve).
  • Do not include 403 or 404 in cacheFallbackStatuses — they are genuine server answers (access denied / no document yet), not transient failures. Serving a stale snapshot for those would mask deletions and auth failures.
  • Applies to structured (non-append) pulls only. Append-log pulls own their own warm-start persistence.
  • The background loop is deduplicated per document: many concurrent 429 pulls on the same path spawn exactly one loop.
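
The per-document deduplication in the last bullet can be pictured as a map of in-flight loops keyed by path. This is a simplified model under stated assumptions, not the SDK's actual implementation: `startRevalidation`, `fetchFresh`, `Attempt`, and `sleep` are hypothetical names, the delay reported by each failed attempt stands in for the Retry-After handling, and the cache write-through is omitted.

```typescript
// Hypothetical outcome of one revalidation attempt.
type Attempt = { ok: true; snapshot: string } | { ok: false; retryAfterMs: number }

const inFlight = new Map<string, Promise<void>>()

/** Returns true if a new loop was started, false if one is already running for `path`. */
function startRevalidation(
  path: string,
  fetchFresh: (path: string) => Promise<Attempt>,
  onRevalidated: (path: string, snapshot: string) => void,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
  maxAttempts = 5,
): boolean {
  if (inFlight.has(path)) return false // deduplicate: one loop per document

  const loop = (async () => {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      let result: Attempt
      try {
        result = await fetchFresh(path)
      } catch {
        // Treat a thrown network error like a failed attempt with a default wait.
        result = { ok: false, retryAfterMs: 1_000 }
      }
      if (result.ok) {
        onRevalidated(path, result.snapshot)
        return
      }
      await sleep(result.retryAfterMs)
    }
  })().finally(() => inFlight.delete(path)) // allow a later pull to start a new loop

  inFlight.set(path, loop)
  return true
}
```

Calling `startRevalidation` twice for the same path while the first loop is running returns `false` the second time, so a burst of failing pulls produces a single stream of background requests.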

parseRetryAfterMs

A helper exported from @drakkar.software/starfish-client/fetch that parses a Retry-After header value into milliseconds. Used internally by both createRetryFetch and the stale-while-revalidate background loop:

import { parseRetryAfterMs } from "@drakkar.software/starfish-client/fetch"

const delay = parseRetryAfterMs(
  response.headers.get("Retry-After"),
  { fallbackMs: 1_000, maxMs: 30_000 },
)
// "30"          → 30_000 ms
// "Thu, 01 ..." → delta from now in ms (floored to 0)
// null / ""     → fallbackMs (1_000)
// "garbage"     → fallbackMs (1_000)
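
For readers curious how such a parser can behave, here is a standalone sketch that reproduces the cases listed above. It is not the SDK's source: `parseRetryAfterSketch` is a hypothetical name, the injectable `now` parameter exists only to make the date case deterministic, and clamping the result to `maxMs` is an assumption about what that option does.

```typescript
interface ParseOptions {
  fallbackMs: number
  maxMs: number
}

function parseRetryAfterSketch(
  value: string | null,
  { fallbackMs, maxMs }: ParseOptions,
  now: number = Date.now(),
): number {
  const trimmed = value?.trim() ?? ""
  if (trimmed === "") return fallbackMs

  // Form 1: delay-seconds, a non-negative integer.
  if (/^\d+$/.test(trimmed)) {
    return Math.min(Number(trimmed) * 1000, maxMs)
  }

  // Form 2: an HTTP-date. Convert to a delta from `now`, floored at 0.
  const at = Date.parse(trimmed)
  if (Number.isNaN(at)) return fallbackMs
  return Math.min(Math.max(at - now, 0), maxMs)
}
```

Checking the integer form before calling `Date.parse` matters, because date parsers in some engines accept bare numbers and would otherwise turn a seconds value into a year.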

Retry Fetch Wrapper

Inject retry logic via StarfishClientOptions.fetch. This wraps the native fetch with automatic retry for transient errors:

import { parseRetryAfterMs } from "@drakkar.software/starfish-client/fetch"

interface RetryOptions {
  /** Max retry attempts (default: 3) */
  maxRetries?: number
  /** Initial delay in ms (default: 500) */
  initialDelayMs?: number
  /** Max delay in ms (default: 10000) */
  maxDelayMs?: number
}

function createRetryFetch(options: RetryOptions = {}): typeof globalThis.fetch {
  const { maxRetries = 3, initialDelayMs = 500, maxDelayMs = 10_000 } = options

  // Exponential backoff: initialDelayMs, then 2x, 4x, ... capped at maxDelayMs.
  const backoffMs = (attempt: number) =>
    Math.min(initialDelayMs * Math.pow(2, attempt), maxDelayMs)
  const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms))

  return async (input, init) => {
    let attempt = 0

    while (true) {
      try {
        const response = await globalThis.fetch(input, init)

        if (response.status === 429 || response.status >= 500) {
          if (attempt >= maxRetries) return response

          // Retry-After can be delay-seconds or an HTTP-date, so a bare parseInt
          // is not enough. Fall back to the backoff delay when it is absent or invalid.
          const delay = parseRetryAfterMs(response.headers.get("Retry-After"), {
            fallbackMs: backoffMs(attempt),
            maxMs: maxDelayMs,
          })

          await sleep(delay + Math.random() * 100) // small jitter
          attempt++
          continue
        }

        return response
      } catch (err) {
        // A request cancelled by the caller would fail again immediately; do not retry it.
        if (init?.signal?.aborted) throw err
        // Network error (offline, DNS failure, etc.)
        if (attempt >= maxRetries) throw err

        await sleep(backoffMs(attempt))
        attempt++
      }
    }
  }
}

Usage:

import { StarfishClient } from "@drakkar.software/starfish-client"

const client = new StarfishClient({
  baseUrl: "https://api.example.com/v1",
  capProvider: { getCap: async () => ({ cap, devEdPrivHex }) },
  fetch: createRetryFetch({ maxRetries: 3 }),
})

This is complementary to SyncManager's conflict retry. SyncManager handles 409 conflicts with its own backoff and merge logic. The retry fetch handles 429/5xx/network errors before they reach SyncManager.
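
To make the backoff arithmetic concrete, the helper below lists the base delay that `Math.min(initialDelayMs * 2 ** attempt, maxDelayMs)` yields for each attempt, before jitter and before any Retry-After override. `backoffSchedule` is a hypothetical helper written for illustration only.

```typescript
function backoffSchedule(
  attempts: number,
  initialDelayMs = 500,
  maxDelayMs = 10_000,
): number[] {
  // attempt 0 waits initialDelayMs, each later attempt doubles, capped at maxDelayMs
  return Array.from({ length: attempts }, (_, attempt) =>
    Math.min(initialDelayMs * 2 ** attempt, maxDelayMs),
  )
}

console.log(backoffSchedule(6)) // [500, 1000, 2000, 4000, 8000, 10000]
```

With the defaults (`maxRetries: 3`), a request that keeps failing waits roughly 500 + 1000 + 2000 = 3500 ms in total before the last response or error is surfaced to the caller.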

Circuit Breaker

After repeated failures, stop retrying to avoid wasting resources and battery. The circuit breaker has three states:

┌──────────┐   N failures   ┌─────────┐    cooldown    ┌────────────┐
│  CLOSED  │ ──────────────►│  OPEN   │ ──────────────►│ HALF-OPEN  │
│ (normal) │                │ (block) │                │ (test one) │
└──────────┘                └─────────┘                └────────────┘
     ▲                                                       │
     │                        success                        │
     └───────────────────────────────────────────────────────┘
                       failure → back to OPEN

class CircuitBreaker {
  private failures = 0
  private state: "closed" | "open" | "half-open" = "closed"
  private nextAttemptAt = 0

  constructor(
    private readonly threshold: number = 5,
    private readonly cooldownMs: number = 30_000,
  ) {}

  isOpen(): boolean {
    // Once the cooldown has elapsed, let a trial request through.
    if (this.state === "open" && Date.now() >= this.nextAttemptAt) {
      this.state = "half-open"
    }
    return this.state === "open"
  }

  recordSuccess() {
    this.failures = 0
    this.state = "closed"
  }

  recordFailure() {
    this.failures++
    if (this.failures >= this.threshold) {
      this.state = "open"
      this.nextAttemptAt = Date.now() + this.cooldownMs
    }
  }
}
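
Because the class above reads `Date.now()` directly, its transitions are awkward to exercise in a test. One option, sketched here as an assumption rather than anything the SDK prescribes, is to inject the clock. `TestableBreaker` is a hypothetical variant with the same logic, followed by a walk through all three states using a fake clock.

```typescript
type BreakerState = "closed" | "open" | "half-open"

class TestableBreaker {
  private failures = 0
  private state: BreakerState = "closed"
  private nextAttemptAt = 0
  private readonly threshold: number
  private readonly cooldownMs: number
  private readonly now: () => number

  constructor(threshold = 5, cooldownMs = 30_000, now: () => number = () => Date.now()) {
    this.threshold = threshold
    this.cooldownMs = cooldownMs
    this.now = now
  }

  isOpen(): boolean {
    if (this.state === "open" && this.now() >= this.nextAttemptAt) {
      this.state = "half-open"
    }
    return this.state === "open"
  }

  recordSuccess(): void {
    this.failures = 0
    this.state = "closed"
  }

  recordFailure(): void {
    this.failures++
    if (this.failures >= this.threshold) {
      this.state = "open"
      this.nextAttemptAt = this.now() + this.cooldownMs
    }
  }
}

let fakeNow = 0
const breaker = new TestableBreaker(2, 1_000, () => fakeNow)

breaker.recordFailure()
breaker.recordFailure()                 // threshold reached: CLOSED → OPEN
const blocked = breaker.isOpen()        // true

fakeNow = 1_000                         // cooldown elapsed: OPEN → HALF-OPEN
const afterCooldown = breaker.isOpen()  // false, a trial request may pass

breaker.recordFailure()                 // trial failed: back to OPEN until t = 2000
const reopened = breaker.isOpen()       // true

fakeNow = 2_000
breaker.isOpen()                        // HALF-OPEN again
breaker.recordSuccess()                 // trial succeeded: HALF-OPEN → CLOSED
const closedAgain = !breaker.isOpen()   // true
```

Note that a failure in the half-open state reopens the circuit immediately, because the failure counter is only reset by `recordSuccess()`.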

Integrating with fetch

Wrap the retry fetch with circuit breaker protection:

function createResilientFetch(
  retryOptions?: RetryOptions,
  breakerOptions?: { threshold?: number; cooldownMs?: number },
): typeof globalThis.fetch {
  const retryFetch = createRetryFetch(retryOptions)
  const breaker = new CircuitBreaker(
    breakerOptions?.threshold,
    breakerOptions?.cooldownMs,
  )

  return async (input, init) => {
    if (breaker.isOpen()) {
      throw new Error("Circuit breaker is open — sync paused after repeated failures")
    }

    try {
      const response = await retryFetch(input, init)

      // A 409 means the server is reachable and answering, so it counts as healthy.
      if (response.ok || response.status === 409) {
        breaker.recordSuccess()
      } else if (response.status >= 500) {
        breaker.recordFailure()
      }

      return response
    } catch (err) {
      breaker.recordFailure()
      throw err
    }
  }
}

Auth Token Refresh on 401

When a sync request returns 401, refresh the token and retry once:

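function createAuthRefreshFetch(
  refreshToken: () => Promise<void>,
): typeof globalThis.fetch {
  // Shared in-flight refresh, so concurrent 401s trigger a single refresh call.
  let refreshing: Promise<void> | null = null

  return async (input, init) => {
    const response = await globalThis.fetch(input, init)
    if (response.status !== 401) return response

    if (!refreshing) {
      refreshing = refreshToken().finally(() => {
        refreshing = null
      })
    }
    await refreshing

    // Retry once. Note: auth headers in `init` were already set by
    // StarfishClient before this wrapper was called. See caveat below.
    return globalThis.fetch(input, init)
  }
}

The shared refreshing promise prevents multiple concurrent refresh calls when several requests fail at once: every request that received a 401 waits for the same refresh and then retries once. A plain boolean guard would also avoid duplicate refreshes, but the requests that arrive while a refresh is in progress would get their 401 back without a retry.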

Important: this wrapper retries the raw fetch call, but StarfishClient applies auth headers before calling fetch, so the retried request carries the same headers as the failed one. To get a fresh cap on the retry, the refresh must update the state that the CapProvider reads (e.g., the stored credentials), and the retry must rebuild the request with fresh headers. A simpler approach is to handle expiry at the CapProvider level: when a cap is near its exp, refresh before returning it:

const client = new StarfishClient({
  baseUrl: "https://api.example.com/v1",
  capProvider: {
    getCap: async () => {
      if (capNearExpiry(currentCreds)) {
        // Re-pair via QR/relay or re-bootstrap, replacing `currentCreds`.
        currentCreds = await refreshCredentials()
      }
      return { cap: currentCreds.capCert, devEdPrivHex: currentCreds.device.edPriv }
    },
  },
})

CapProvider.getCap() is called for every authenticated request and naturally integrates with the cap-cert lifecycle (TTL, rotation, pairing).
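
The `capNearExpiry` helper is not defined on this page. A minimal sketch follows, assuming the cap certificate exposes its expiry as a Unix timestamp in seconds under `capCert.exp` and that a 60-second safety margin is acceptable. Both the field location and the margin are assumptions; adapt them to the credential shape your pairing flow actually produces.

```typescript
// Hypothetical credential shape: only the expiry matters for this check.
interface CredsWithExpiry {
  capCert: { exp: number } // assumed: seconds since the Unix epoch
}

function capNearExpiry(
  creds: CredsWithExpiry,
  marginMs = 60_000,
  nowMs: number = Date.now(),
): boolean {
  const remainingMs = creds.capCert.exp * 1000 - nowMs
  // True when the cap has expired or will expire within the margin.
  return remainingMs <= marginMs
}
```

The margin should comfortably exceed the slowest expected request plus any clock skew between device and server, so that a cap judged fresh in `getCap()` is still valid when the server checks it.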

Combining Strategies

Compose the layers into a single fetch pipeline:

Request → Auth refresh (401) → Circuit breaker → Retry (429/5xx/network) → globalThis.fetch

const client = new StarfishClient({
  baseUrl: "https://api.example.com/v1",
  capProvider: {
    getCap: async () => {
      if (capNearExpiry(currentCreds)) {
        currentCreds = await refreshCredentials()
      }
      return { cap: currentCreds.capCert, devEdPrivHex: currentCreds.device.edPriv }
    },
  },
  fetch: createResilientFetch(
    { maxRetries: 3, initialDelayMs: 500 },
    { threshold: 5, cooldownMs: 30_000 },
  ),
})

Integration with Sync Status

Extend the deriveSyncStatus function from Offline & Connectivity to surface error categories:

import type { StarfishState } from "@drakkar.software/starfish-client/zustand"

type SyncStatusValue = "synced" | "pending" | "syncing" | "error" | "offline"

function deriveSyncStatus(state: StarfishState): {
  status: SyncStatusValue
  message: string
} {
  if (!state.online) return { status: "offline", message: "No connection" }
  if (state.error) {
    // Classify the error message for better UX
    if (state.error.includes("429") || state.error.includes("rate"))
      return { status: "error", message: "Server busy — retrying soon" }
    if (state.error.includes("401") || state.error.includes("auth"))
      return { status: "error", message: "Session expired — please sign in" }
    if (state.error.includes("Circuit breaker"))
      return { status: "error", message: "Sync paused — will retry shortly" }
    return { status: "error", message: "Sync failed" }
  }
  if (state.syncing) return { status: "syncing", message: "Saving..." }
  if (state.dirty) return { status: "pending", message: "Unsaved changes" }
  return { status: "synced", message: "All changes saved" }
}

Next Steps