Error Classification & Retry
Starfish's SyncManager retries on conflict errors (409) automatically. This page covers retry strategies for all other transient errors — rate limits, server failures, and network outages — at the HTTP layer.
Prerequisites: StarfishClient, Offline & Connectivity
Error Classification
Every error from a sync operation falls into one of these categories:
| Category | Cause | Retryable | SDK handling |
|---|---|---|---|
| Network | fetch throws (TypeError, AbortError) | Yes | None — client throws |
| 401 Unauthorized | Expired or invalid auth token | Once (after refresh) | None — throws StarfishHttpError |
| 409 Conflict | Hash mismatch on push | Yes | SyncManager retries automatically |
| 429 Rate Limited | Too many requests | Yes (with backoff) | Opt-in: cache fallback + background retry (see below) |
| 5xx Server Error | Server bug or overload | Yes (with backoff) | Opt-in: cache fallback + background retry (see below) |
| Other 4xx | Bad request, forbidden, etc. | No | None — throws StarfishHttpError |
Classifier function
import { StarfishHttpError } from "@drakkar.software/starfish-client"
type ErrorCategory = "network" | "auth" | "conflict" | "rate-limited" | "server" | "client" | "unknown"
function classifyError(err: unknown): ErrorCategory {
if (err instanceof StarfishHttpError) {
if (err.status === 401) return "auth"
if (err.status === 409) return "conflict"
if (err.status === 429) return "rate-limited"
if (err.status >= 500) return "server"
return "client"
}
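if (err instanceof TypeError) return "network"
// fetch rejects with an AbortError (a DOMException, not a TypeError) on abort or timeout
if (err instanceof Error && err.name === "AbortError") return "network"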
return "unknown"
}
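The table's "Retryable" column can be turned into a small decision helper. The following is a minimal sketch, not part of the SDK: `RETRY_POLICY`, `RetryPolicy`, and `shouldRetry` are hypothetical names, and treating `"unknown"` as non-retryable is an assumption the table does not cover.

```typescript
type ErrorCategory =
  | "network" | "auth" | "conflict" | "rate-limited" | "server" | "client" | "unknown"

// Hypothetical policy names mirroring the "Retryable" column of the table.
type RetryPolicy = "retry" | "refresh-then-retry-once" | "sync-manager" | "give-up"

const RETRY_POLICY: Record<ErrorCategory, RetryPolicy> = {
  network: "retry",
  auth: "refresh-then-retry-once",
  conflict: "sync-manager", // SyncManager already retries 409s itself
  "rate-limited": "retry",
  server: "retry",
  client: "give-up",
  unknown: "give-up", // assumption: unclassified errors are not retried
}

// attempt is zero-based: 0 means the first failure has just happened.
function shouldRetry(category: ErrorCategory, attempt: number, maxRetries = 3): boolean {
  const policy = RETRY_POLICY[category]
  if (policy === "retry") return attempt < maxRetries
  if (policy === "refresh-then-retry-once") return attempt === 0
  return false
}
```

A caller would typically combine it with the classifier, for example `shouldRetry(classifyError(err), attempt)`.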
Stale-While-Revalidate
For offline-first apps with a pull cache, pull() can return the last-synced snapshot immediately on a transient server failure (429, 5xx) instead of throwing, then retry silently in the background.
Configure cacheFallbackStatuses on the client:
import { StarfishClient } from "@drakkar.software/starfish-client"
const client = new StarfishClient({
baseUrl: "https://api.example.com/v1",
capProvider,
cache: myPullCache, // required — no cache means nothing to serve
cacheFallbackStatuses: [429, 500, 502, 503, 504],
onRevalidated: (path, result) => {
// Fresh snapshot available — signal the app to re-pull
reportReachability(true)
},
})
How it works:
- A structured pull() receives a 429 or 5xx response.
- If a cached snapshot exists for that document, pull() returns it immediately, tagged stale (pullWasFromCache(result) === true).
- A background revalidation loop starts, honoring any Retry-After header, and retries up to 5 times.
- When the server returns a 2xx response, the fresh snapshot is written through to the cache and onRevalidated fires.
Important constraints:
- cache must be configured. When no snapshot exists, the error propagates as before (nothing to serve).
- Do not include 403 or 404 in cacheFallbackStatuses. They are genuine server answers (access denied / no document yet), not transient failures. Serving a stale snapshot for those would mask deletions and auth failures.
- Applies to structured (non-append) pulls only. Append-log pulls own their own warm-start persistence.
- The background loop is deduplicated per document: many concurrent 429 pulls on the same path spawn exactly one loop.
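The flow above can be sketched in isolation. This is an illustrative model only, not the SDK's implementation: `createStalePull`, `FetchOutcome`, `fetchDoc`, and the injectable `sleep` are hypothetical names, the cache is a plain in-memory `Map`, and the 1-second initial delay with doubling is an assumption.

```typescript
type FetchOutcome<T> =
  | { ok: true; data: T }
  | { ok: false; status: number; retryAfterMs?: number }

interface StalePullOptions<T> {
  fetchDoc: (path: string) => Promise<FetchOutcome<T>>
  fallbackStatuses: number[]
  onRevalidated: (path: string, data: T) => void
  maxAttempts?: number
  sleep?: (ms: number) => Promise<void>
}

function createStalePull<T>(options: StalePullOptions<T>) {
  const { fetchDoc, fallbackStatuses, onRevalidated, maxAttempts = 5 } = options
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)))
  const cache = new Map<string, T>()
  const loops = new Map<string, Promise<void>>() // at most one loop per path

  async function revalidate(path: string, firstDelayMs: number): Promise<void> {
    let delayMs = firstDelayMs
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      await sleep(delayMs)
      // A thrown network error counts as one failed attempt.
      const outcome = await fetchDoc(path).catch(() => null)
      if (outcome === null) {
        delayMs *= 2
        continue
      }
      if (outcome.ok) {
        cache.set(path, outcome.data) // write-through
        onRevalidated(path, outcome.data)
        return
      }
      // Prefer the server's hint, otherwise double the previous delay.
      delayMs = outcome.retryAfterMs ?? delayMs * 2
    }
  }

  async function pull(path: string): Promise<{ data: T; fromCache: boolean }> {
    const outcome = await fetchDoc(path)
    if (outcome.ok) {
      cache.set(path, outcome.data)
      return { data: outcome.data, fromCache: false }
    }
    const cached = cache.get(path)
    // No fallback for this status, or nothing cached: propagate the error.
    if (!fallbackStatuses.includes(outcome.status) || cached === undefined) {
      throw new Error(`HTTP ${outcome.status}`)
    }
    // Deduplicate: concurrent failing pulls share one background loop.
    if (!loops.has(path)) {
      const loop = revalidate(path, outcome.retryAfterMs ?? 1_000).finally(() => {
        loops.delete(path)
      })
      loops.set(path, loop)
    }
    return { data: cached, fromCache: true }
  }

  return { pull }
}
```

Keeping the loop registry keyed by path is what makes the deduplication cheap: a second failing pull only needs a map lookup before returning the stale snapshot.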
parseRetryAfterMs
A helper exported from @drakkar.software/starfish-client/fetch that parses a Retry-After header value into milliseconds. It is used internally by both createRetryFetch and the stale-while-revalidate background loop:
import { parseRetryAfterMs } from "@drakkar.software/starfish-client/fetch"
const delay = parseRetryAfterMs(
response.headers.get("Retry-After"),
{ fallbackMs: 1_000, maxMs: 30_000 },
)
// "30" → 30_000 ms
// "Thu, 01 ..." → delta from now in ms (floored to 0)
// null / "" → fallbackMs (1_000)
// "garbage" → fallbackMs (1_000)
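To make the behavior in those comments concrete, here is a standalone sketch of how such a parser could work. It is not the library's source: `parseRetryAfterSketch` and the injectable `nowMs` parameter are hypothetical, and the sketch only assumes the two header forms HTTP defines (delta-seconds and an HTTP-date).

```typescript
interface RetryAfterOptions {
  fallbackMs: number
  maxMs: number
}

// Hypothetical stand-in for illustration; not the SDK's implementation.
function parseRetryAfterSketch(
  value: string | null,
  { fallbackMs, maxMs }: RetryAfterOptions,
  nowMs: number = Date.now(),
): number {
  const trimmed = value?.trim() ?? ""
  if (trimmed === "") return fallbackMs

  // Form 1: delta-seconds, a non-negative integer such as "30".
  if (/^\d+$/.test(trimmed)) {
    return Math.min(Number(trimmed) * 1000, maxMs)
  }

  // Form 2: an HTTP-date such as "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed)
  if (Number.isNaN(dateMs)) return fallbackMs

  // Dates in the past clamp to 0; far-future dates clamp to maxMs.
  return Math.min(Math.max(dateMs - nowMs, 0), maxMs)
}
```

Passing the current time as a parameter keeps the date branch deterministic in tests.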
Retry Fetch Wrapper
Inject retry logic via StarfishClientOptions.fetch. This wraps the native fetch with automatic retry for transient errors:
interface RetryOptions {
/** Max retry attempts (default: 3) */
maxRetries?: number
/** Initial delay in ms (default: 500) */
initialDelayMs?: number
/** Max delay in ms (default: 10000) */
maxDelayMs?: number
}
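function createRetryFetch(options: RetryOptions = {}): typeof globalThis.fetch {
  const { maxRetries = 3, initialDelayMs = 500, maxDelayMs = 10_000 } = options
  // Exponential backoff for the given attempt, capped at maxDelayMs
  const backoff = (attempt: number) =>
    Math.min(initialDelayMs * Math.pow(2, attempt), maxDelayMs)
  const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms))
  return async (input, init) => {
    let attempt = 0
    while (true) {
      try {
        const response = await globalThis.fetch(input, init)
        if (response.status === 429 || response.status >= 500) {
          if (attempt >= maxRetries) return response
          // Retry-After can be delta-seconds or an HTTP-date. Only the numeric
          // form is handled here; anything else falls back to backoff.
          const seconds = Number.parseInt(response.headers.get("Retry-After") ?? "", 10)
          const delay = Number.isFinite(seconds) && seconds >= 0
            ? Math.min(seconds * 1000, maxDelayMs)
            : backoff(attempt)
          // Small random jitter so clients do not retry in lockstep
          await sleep(delay + Math.random() * 100)
          attempt++
          continue
        }
        return response
      } catch (err) {
        // The caller cancelled the request: retrying would fail immediately
        if (init?.signal?.aborted) throw err
        // Network error (offline, DNS failure, etc.)
        if (attempt >= maxRetries) throw err
        await sleep(backoff(attempt))
        attempt++
      }
    }
  }
}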
Usage:
import { StarfishClient } from "@drakkar.software/starfish-client"
const client = new StarfishClient({
baseUrl: "https://api.example.com/v1",
capProvider: { getCap: async () => ({ cap, devEdPrivHex }) },
fetch: createRetryFetch({ maxRetries: 3 }),
})
This is complementary to SyncManager's conflict retry. SyncManager handles 409 conflicts with its own backoff and merge logic. The retry fetch handles 429/5xx/network errors before they reach SyncManager.
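To see what the wrapper's backoff formula, min(initialDelayMs * 2^attempt, maxDelayMs), means in wall-clock terms, the following sketch lists the delay before each retry (jitter excluded). `backoffSchedule` is a hypothetical helper written for this illustration.

```typescript
// Delay in ms before each retry, using the same formula as the retry wrapper.
function backoffSchedule(
  maxRetries: number,
  initialDelayMs: number,
  maxDelayMs: number,
): number[] {
  const delays: number[] = []
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(initialDelayMs * Math.pow(2, attempt), maxDelayMs))
  }
  return delays
}

const defaults = backoffSchedule(3, 500, 10_000)
// [500, 1000, 2000]: about 3.5 s of waiting in the worst case

const longer = backoffSchedule(6, 500, 10_000)
// [500, 1000, 2000, 4000, 8000, 10000]: the sixth delay hits the cap
```

With the defaults, a request that keeps failing gives up after roughly 3.5 seconds plus the time spent in the four fetch calls themselves, which is worth keeping in mind when choosing UI timeouts.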
Circuit Breaker
After repeated failures, stop retrying to avoid wasting resources and battery. The circuit breaker has three states:
┌──────────┐   N failures   ┌─────────┐   cooldown   ┌────────────┐
│  CLOSED  │ ─────────────► │  OPEN   │ ───────────► │ HALF-OPEN  │
│ (normal) │                │ (block) │              │ (test one) │
└──────────┘                └─────────┘              └────────────┘
     ▲                                                     │
     │                       success                       │
     └─────────────────────────────────────────────────────┘
                      failure → back to OPEN
class CircuitBreaker {
private failures = 0
private state: "closed" | "open" | "half-open" = "closed"
private nextAttemptAt = 0
constructor(
private readonly threshold: number = 5,
private readonly cooldownMs: number = 30_000,
) {}
isOpen(): boolean {
if (this.state === "open" && Date.now() >= this.nextAttemptAt) {
this.state = "half-open"
}
return this.state === "open"
}
recordSuccess() {
this.failures = 0
this.state = "closed"
}
recordFailure() {
this.failures++
if (this.failures >= this.threshold) {
this.state = "open"
this.nextAttemptAt = Date.now() + this.cooldownMs
}
}
}
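The class above lets every request through once it reaches half-open, whereas the diagram describes testing a single one. A possible variant that admits exactly one probe is sketched below. `ProbingCircuitBreaker`, `tryAcquire`, `getState`, and the injectable `now` clock are hypothetical names introduced for this illustration; the clock exists only so the state machine can be exercised without waiting for real time.

```typescript
type BreakerState = "closed" | "open" | "half-open"

class ProbingCircuitBreaker {
  private failures = 0
  private state: BreakerState = "closed"
  private nextAttemptAt = 0
  private probeInFlight = false
  private readonly threshold: number
  private readonly cooldownMs: number
  private readonly now: () => number

  constructor(threshold = 5, cooldownMs = 30_000, now: () => number = () => Date.now()) {
    this.threshold = threshold
    this.cooldownMs = cooldownMs
    this.now = now
  }

  /** Returns true if the caller may send a request right now. */
  tryAcquire(): boolean {
    if (this.state === "open" && this.now() >= this.nextAttemptAt) {
      this.state = "half-open"
      this.probeInFlight = false
    }
    if (this.state === "closed") return true
    if (this.state === "open") return false
    // half-open: admit one probe and block the rest until it reports back.
    if (this.probeInFlight) return false
    this.probeInFlight = true
    return true
  }

  recordSuccess(): void {
    this.failures = 0
    this.state = "closed"
    this.probeInFlight = false
  }

  recordFailure(): void {
    this.failures++
    // A failed probe reopens immediately; otherwise wait for the threshold.
    if (this.state === "half-open" || this.failures >= this.threshold) {
      this.state = "open"
      this.nextAttemptAt = this.now() + this.cooldownMs
      this.probeInFlight = false
    }
  }

  getState(): BreakerState {
    return this.state
  }
}
```

The trade-off is that callers must always report back with recordSuccess() or recordFailure() after an admitted probe, otherwise the breaker stays half-open and blocks everything else.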
Integrating with fetch
Wrap the retry fetch with circuit breaker protection:
function createResilientFetch(
retryOptions?: RetryOptions,
breakerOptions?: { threshold?: number; cooldownMs?: number },
): typeof globalThis.fetch {
const retryFetch = createRetryFetch(retryOptions)
const breaker = new CircuitBreaker(
breakerOptions?.threshold,
breakerOptions?.cooldownMs,
)
return async (input, init) => {
if (breaker.isOpen()) {
throw new Error("Circuit breaker is open — sync paused after repeated failures")
}
try {
const response = await retryFetch(input, init)
if (response.ok || response.status === 409) {
breaker.recordSuccess()
} else if (response.status >= 500) {
breaker.recordFailure()
}
return response
} catch (err) {
breaker.recordFailure()
throw err
}
}
}
Auth Token Refresh on 401
When a sync request returns 401, refresh the token and retry once:
function createAuthRefreshFetch(
refreshToken: () => Promise<void>,
): typeof globalThis.fetch {
let isRefreshing = false
return async (input, init) => {
const response = await globalThis.fetch(input, init)
if (response.status === 401 && !isRefreshing) {
isRefreshing = true
try {
await refreshToken()
} finally {
isRefreshing = false
}
// Retry once — note: auth headers in `init` were already set by
// StarfishClient before this wrapper was called. See caveat below.
return globalThis.fetch(input, init)
}
return response
}
}
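The isRefreshing guard prevents multiple concurrent refresh calls when several requests fail at once. Requests that receive a 401 while a refresh is already in flight skip the retry and get the 401 response back as-is.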
Important: this wrapper retries the raw fetch call, but StarfishClient applies auth headers before calling fetch, so the retry resends the original headers. To get a fresh cap on the retry, the refresh must update the state that the CapProvider reads (e.g., the stored credentials), and the retry must rebuild the request with fresh headers. A simpler approach is to avoid the 401 at the CapProvider level: when a cap is near its exp, refresh it before returning:
const client = new StarfishClient({
baseUrl: "https://api.example.com/v1",
capProvider: {
getCap: async () => {
if (capNearExpiry(currentCreds)) {
// Re-pair via QR/relay or re-bootstrap, replacing `currentCreds`.
currentCreds = await refreshCredentials()
}
return { cap: currentCreds.capCert, devEdPrivHex: currentCreds.device.edPriv }
},
},
})
CapProvider.getCap() is called for every authenticated request and naturally integrates with the cap-cert lifecycle (TTL, rotation, pairing).
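If a fetch-level wrapper is still preferred, the boolean isRefreshing guard in createAuthRefreshFetch can be swapped for a shared in-flight promise, so that every request hitting a 401 during a refresh waits for it and then retries. The sketch below is a variation under stated assumptions, not SDK code: `createSingleFlightRefreshFetch` is a hypothetical name, and it is written generically over the request and response types so it runs standalone. The caveat about stale auth headers on the retried request still applies.

```typescript
// Generic over request/response so the sketch does not depend on DOM types.
function createSingleFlightRefreshFetch<Req, Res extends { status: number }>(
  baseFetch: (req: Req) => Promise<Res>,
  refreshToken: () => Promise<void>,
): (req: Req) => Promise<Res> {
  // Shared by all callers: at most one refresh runs at a time.
  let refreshInFlight: Promise<void> | null = null

  return async (req) => {
    const response = await baseFetch(req)
    if (response.status !== 401) return response

    if (refreshInFlight === null) {
      refreshInFlight = refreshToken().finally(() => {
        refreshInFlight = null
      })
    }
    // Every 401 waits on the same refresh, then retries exactly once.
    await refreshInFlight
    return baseFetch(req)
  }
}
```

If the refresh itself rejects, every waiting request rejects with the same error, which keeps the failure visible instead of silently returning a 401.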
Combining Strategies
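Compose the layers into a single fetch pipeline. In the example below, credential refresh lives in the CapProvider rather than in a fetch wrapper, so it runs before the request reaches the custom fetch:
Request → CapProvider (refresh near expiry) → Circuit breaker → Retry (429/5xx/network) → globalThis.fetch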
const client = new StarfishClient({
baseUrl: "https://api.example.com/v1",
capProvider: {
getCap: async () => {
if (capNearExpiry(currentCreds)) {
currentCreds = await refreshCredentials()
}
return { cap: currentCreds.capCert, devEdPrivHex: currentCreds.device.edPriv }
},
},
fetch: createResilientFetch(
{ maxRetries: 3, initialDelayMs: 500 },
{ threshold: 5, cooldownMs: 30_000 },
),
})
Integration with Sync Status
Extend the deriveSyncStatus function from Offline & Connectivity to surface error categories:
import type { StarfishState } from "@drakkar.software/starfish-client/zustand"
type SyncStatusValue = "synced" | "pending" | "syncing" | "error" | "offline"
function deriveSyncStatus(state: StarfishState): {
status: SyncStatusValue
message: string
} {
if (!state.online) return { status: "offline", message: "No connection" }
if (state.error) {
// Classify the error message for better UX
if (state.error.includes("429") || state.error.includes("rate"))
return { status: "error", message: "Server busy — retrying soon" }
if (state.error.includes("401") || state.error.includes("auth"))
return { status: "error", message: "Session expired — please sign in" }
if (state.error.includes("Circuit breaker"))
return { status: "error", message: "Sync paused — will retry shortly" }
return { status: "error", message: "Sync failed" }
}
if (state.syncing) return { status: "syncing", message: "Saving..." }
if (state.dirty) return { status: "pending", message: "Unsaved changes" }
return { status: "synced", message: "All changes saved" }
}
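Substring checks on the error message are fragile (for example, "rate" also matches unrelated words). If the application can store the classified category next to the error, the message lookup becomes a plain table. The following is a sketch under that assumption: `CATEGORY_MESSAGES` and `statusForCategory` are hypothetical, and nothing here implies that StarfishState exposes a category field.

```typescript
type ErrorCategory =
  | "network" | "auth" | "conflict" | "rate-limited" | "server" | "client" | "unknown"
type SyncStatusValue = "synced" | "pending" | "syncing" | "error" | "offline"

// Hypothetical lookup: one user-facing message per error category.
const CATEGORY_MESSAGES: Record<ErrorCategory, string> = {
  network: "Connection problem, retrying",
  auth: "Session expired, please sign in",
  conflict: "Resolving conflicting changes",
  "rate-limited": "Server busy, retrying soon",
  server: "Server error, retrying soon",
  client: "Sync failed",
  unknown: "Sync failed",
}

// category is null when the last sync attempt did not fail.
function statusForCategory(
  category: ErrorCategory | null,
  online: boolean,
): { status: SyncStatusValue; message: string } {
  if (!online) return { status: "offline", message: "No connection" }
  if (category !== null) {
    return { status: "error", message: CATEGORY_MESSAGES[category] }
  }
  return { status: "synced", message: "All changes saved" }
}
```

Because the record is keyed by the full ErrorCategory union, the compiler flags any category that is added later without a message.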
Next Steps
- StarfishClient — custom fetch injection point
- Offline & Connectivity — sync status indicators
- Logging & Observability — logging errors and retries