
Pattern: Idempotent API

Quick facts

  • Category: Backend & Distributed Systems
  • Maturity: Adopt
  • Typical team size: 1-2 engineers per API surface
  • Typical timeline to MVP: 1-3 weeks (per API; shared infrastructure is reusable)
  • Last reviewed: 2026-05-19 by Architecture Team

1. Context

Use this pattern when:

  • A state-changing API operation has consequences that are unacceptable to duplicate — debiting an account twice, issuing two cards, opening two accounts for the same request
  • The network path between client and server is unreliable and the client will retry on timeout or connection failure
  • The server processes operations asynchronously and the client cannot distinguish "request lost in transit" from "request received, processing"
  • An upstream system (card scheme, payment rail, correspondent bank) retries operations without being able to confirm whether the original was processed

Do NOT use this pattern when:

  • The operation is naturally idempotent already — GET, PUT with full resource replacement, and DELETE are idempotent by HTTP semantics; this pattern is for POST and non-idempotent PATCH operations
  • Duplicate execution is harmless — a log entry, a read-only analytics event, a cache warm-up; the overhead of idempotency key storage is not justified
  • The client is a browser form submission with no retry logic — browser POST-redirect-GET is the correct pattern for those cases

Relationship to other patterns

The Idempotent API pattern is not an architectural style in the same sense as Microservices or Event-Driven Architecture. It is a mandatory safety layer applied to any state-changing API. It appears inside every other pattern in this category:

| Where idempotency appears | Why |
|---|---|
| Saga Pattern | Every saga step (Activity) must be idempotent; the orchestrator retries timed-out steps |
| CQRS with CDC-Driven Read Models | Projection upserts must be idempotent; Kafka guarantees at-least-once delivery |
| Canonical Model + Pluggable Adapters | Outbound adapter dispatch to payment rails must be idempotent; rails retry on network timeout |
| Event-Driven Architecture | Every Kafka consumer must be idempotent; messages can be re-delivered after rebalance or replay |
| Workflow Orchestration | Activity implementations called by Temporal must be idempotent; Temporal retries on failure |

2. Problem it solves

A corporate treasury system sends a payment instruction to the bank's H2H API. The bank processes the payment and debits the account. The bank's response is lost in transit — the corporate's network times out before receiving it. The corporate system cannot tell whether the payment was processed or not, so it retries. Without idempotency, the bank processes the same instruction twice, debiting the account for double the amount. The corporate system has no way to detect this until the account statement arrives. The Idempotent API pattern prevents this by having the client supply a unique key with every request; the server uses that key to detect and short-circuit duplicate submissions, returning the original response without re-executing the operation.

3. Solution overview

Request lifecycle

flowchart TD
    Client([Client]) -->|POST /payments\nIdempotency-Key: uuid-1234| API[API Server]

    API --> Check{Key exists\nin store?}

    Check -->|Yes — not in-flight| Return[Return cached\nresponse]
    Check -->|Yes — in-flight| Wait[Return 409 Conflict\nor wait + return]
    Check -->|No| Lock[Acquire distributed lock\non key]

    Lock --> Process[Process request\nwrite to DB]
    Process --> Store[Store key + response\n+ request fingerprint]
    Store --> Release[Release lock]
    Release --> Respond[Return response\nto client]

    Return --> Client
    Wait --> Client
    Respond --> Client
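
The lifecycle above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the `IdempotencyMiddleware` and `Response` names are hypothetical, and an in-process lock plus two dictionaries stand in for the Redis key store and the distributed lock. It assumes the "return 409" option for in-flight keys and leaves 5xx responses uncached, as the key lifecycle states require.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Response:
    status: int
    body: dict


@dataclass
class IdempotencyMiddleware:
    """In-memory stand-in for the Redis-backed flow; all names are hypothetical."""
    handler: Callable[[dict], Response]
    completed: dict = field(default_factory=dict)   # key -> cached Response
    in_flight: set = field(default_factory=set)     # keys currently locked
    _guard: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def handle(self, key: str, request: dict) -> Response:
        # Check and lock in one atomic step (the role Redis SET NX plays).
        with self._guard:
            if key in self.completed:
                return self.completed[key]          # hit: replay cached response
            if key in self.in_flight:
                return Response(409, {"error": "request in flight"})
            self.in_flight.add(key)                 # miss: acquire the lock

        try:
            response = self.handler(request)        # process request, write to DB
            if response.status < 500:               # 5xx is never cached
                with self._guard:
                    self.completed[key] = response  # store key + response
            return response
        finally:
            with self._guard:
                self.in_flight.discard(key)         # release the lock
```

A retry with the same key returns the stored `Response` object without invoking `handler` again. A retry after a 5xx re-executes the handler because nothing was stored. Request fingerprinting is omitted here to keep the flow readable.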

Container view (C4 Level 2)

flowchart TB
    subgraph Client["Client"]
        ClientApp[Client Application\ngenerates UUID v4 per operation]
    end

    subgraph API["API Layer"]
        Gateway[API Gateway\nor middleware]
        IdempMiddleware[Idempotency\nMiddleware]
        Handler[Request Handler\nbusiness logic]
    end

    subgraph Store["Idempotency Store"]
        Redis[(Redis\nkey TTL store)]
        LockStore[Redis SETNX\ndistributed lock]
    end

    subgraph DB["System of Record"]
        MainDB[(Primary DB\ntransaction + idempotency log)]
    end

    ClientApp -->|Idempotency-Key header| Gateway
    Gateway --> IdempMiddleware
    IdempMiddleware -->|check key| Redis
    Redis -->|miss| LockStore
    LockStore -->|lock acquired| Handler
    Handler -->|write result| MainDB
    Handler -->|store key + response| Redis
    LockStore -->|lock released| IdempMiddleware
    Redis -->|hit — return cached| IdempMiddleware
    IdempMiddleware --> Gateway

Key lifecycle states

stateDiagram-v2
    [*] --> Missing: first request with this key
    Missing --> InFlight: lock acquired, processing started
    InFlight --> Completed: processing finished, response cached
    InFlight --> Failed: processing failed (5xx)
    Completed --> Completed: subsequent requests return cached response
    Failed --> Missing: 5xx responses are NOT cached — client may retry
    Completed --> [*]: TTL expires

4. Technology stack

| Layer | Primary choice | Alternatives | Notes |
|---|---|---|---|
| Idempotency key storage | Redis (with TTL) | PostgreSQL idempotency_keys table, DynamoDB | Redis for sub-millisecond key lookup and native TTL expiry; PostgreSQL for financial-grade durability where key loss is unacceptable (store the key in the same transaction as the operation itself, which is the only way to guarantee atomicity) |
| Distributed lock (in-flight protection) | Redis SET NX PX (atomic set-if-not-exists with TTL) | Redlock (multi-node Redis), database row lock | The lock prevents two concurrent requests with the same key from both proceeding; it must expire automatically to handle crashed workers |
| Key header standard | Idempotency-Key (IETF draft) | X-Idempotency-Key (pre-standard convention) | Use Idempotency-Key: this is the emerging IETF standard, already used by Stripe, Adyen, and others; include it in your OpenAPI spec |
| Key format | UUID v4 (client-generated) | UUID v7 (time-ordered), ULID | Client-generated UUIDs give clients full control and require no server round-trip before the operation; v4 is universally supported; v7 / ULID are preferable if index efficiency on the storage side matters |
| Request fingerprinting | SHA-256 hash of request body stored alongside key | None (key-only) | Modelled on Stripe's approach of comparing a retry against the original request: if a key is reused with a different request body, return 422 Unprocessable Entity; this prevents accidental key reuse from masking a different operation |
| Middleware / library | Custom middleware (Go, NestJS interceptor, Django middleware) | AWS Lambda Powertools idempotency utility | Implement as a reusable middleware layer, not per-handler; every handler that needs idempotency applies the middleware rather than re-implementing the pattern |
| TTL | 24 hours (payment APIs) to 7 days (async / long-running) | Per-operation configuration | TTL must be longer than the client's retry window; for async operations where the client polls for a result, extend the TTL to cover the full async processing SLA |
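
The request fingerprinting row can be illustrated as follows. This is a sketch under assumptions: the `fingerprint` and `check_reuse` helper names are hypothetical, and the body is assumed to be JSON that is canonicalised (sorted keys, fixed separators) before hashing so that two encodings of the same object produce the same SHA-256 digest.

```python
import hashlib
import json


def fingerprint(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order and whitespace do not matter."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_reuse(stored_fingerprint: str, body: dict) -> int:
    """Return 200 to replay the cached response, or 422 if the key was reused with a different body."""
    return 200 if fingerprint(body) == stored_fingerprint else 422


original = fingerprint({"amount": 100, "currency": "EUR"})
print(check_reuse(original, {"currency": "EUR", "amount": 100}))  # same body, different key order
print(check_reuse(original, {"currency": "EUR", "amount": 101}))  # changed amount
```

Canonicalisation only normalises key order and whitespace. Values that are numerically equal but encoded differently (for example `100` and `100.0`) still hash differently, so clients should resend the body exactly as first submitted.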

5. Non-functional characteristics

| Concern | Profile |
|---|---|
| Correctness guarantee | Exactly-once execution of the business operation when the client retries with the same key within the TTL window. After TTL expiry, a resubmission with the same key is treated as a new request. |
| Latency overhead | One Redis read on every request (< 1 ms). On a cache miss: one Redis write after processing (< 1 ms). Total overhead on the hot path: < 2 ms, plus one additional Redis round-trip for lock acquisition on the first request for a key. A retry that hits the cache pays only the read. |
| Availability impact | Redis is in the read path for every idempotent API call. A Redis outage requires a fallback decision: fail open (process without the idempotency check, at the risk of duplicates) or fail closed (return 503: no duplicates, but the service is unavailable). Financial APIs must fail closed. |
| Storage cost | Redis memory per key: ~200–500 bytes (key + response body + fingerprint + metadata). At 1 million transactions per day with a 24h TTL: approximately 200–500 MB of Redis memory. Negligible for most deployments. |
| Security posture | Idempotency keys are client-supplied and may be guessable if short or sequential. Mitigation: document that keys must be UUIDs or equivalently unguessable; validate format server-side; scope keys per client credential (API key / tenant) so a key from client A cannot collide with client B. |
| Compliance fit | The idempotency key log is an audit record of every attempted operation and its outcome. Retain it alongside the transaction record. For payment APIs, regulators may require evidence that duplicate submissions were detected and suppressed; the key store is that evidence. |
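
The storage figure follows from simple arithmetic: the number of keys alive at any moment is the arrival rate multiplied by the TTL. A small sketch, assuming a steady arrival rate and taking the per-key size range quoted above at face value (the `store_memory_bytes` helper is hypothetical):

```python
def store_memory_bytes(ops_per_day: int, ttl_hours: float, bytes_per_key: int) -> int:
    """Steady-state memory: keys alive at once = daily arrivals x (TTL in days)."""
    live_keys = ops_per_day * ttl_hours / 24
    return int(live_keys * bytes_per_key)


low = store_memory_bytes(1_000_000, 24, 200)
high = store_memory_bytes(1_000_000, 24, 500)
print(f"{low / 1e6:.0f}-{high / 1e6:.0f} MB")  # prints "200-500 MB"
```

Doubling the TTL doubles the live key count, so the same load with a 48h TTL needs up to about 1 GB at 500 bytes per key.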

6. Cost ballpark

Idempotency infrastructure is shared across all APIs; cost is dominated by Redis.

| Scale | Operations / day | Incremental monthly cost | Notes |
|---|---|---|---|
| Small | < 100,000 | $0 - $50 | Redis already provisioned for sessions/cache; idempotency keys add < 50 MB memory overhead |
| Medium | 100k - 5M | $50 - $300 | Dedicated Redis cluster for idempotency (isolation from cache eviction); 1-2 GB memory |
| Large | 5M+ | $300 - $1,500 | Redis cluster with replication, persistence (AOF) for key durability, automated TTL monitoring |

7. LLM-assisted development fit

| Aspect | Rating | Notes |
|---|---|---|
| Idempotency middleware boilerplate (Redis check, lock, store) | ★★★★★ | Excellent: the core check-lock-process-store pattern is well-represented in Go, Python, and TypeScript |
| Request fingerprinting (SHA-256 of body + key mismatch detection) | ★★★★ | Good: straightforward hashing; validate that the comparison logic handles JSON key ordering correctly |
| Distributed lock with TTL (Redis SET NX) | ★★★★ | Good: gets the atomic SET NX right; verify the lock TTL is longer than the maximum expected processing time |
| In-flight concurrent request handling (409 vs wait) | ★★★ | Knows both approaches; the choice between returning 409 immediately and polling-wait is a product decision the LLM cannot make |
| Architecture decisions | | Don't outsource. Fail-open vs fail-closed during a Redis outage is a business risk decision, not a technical one. |

Recommended workflow: Implement idempotency as shared middleware before building the first payment or account API — retrofitting it later requires re-testing every handler. Test four scenarios before launch: (1) first request succeeds, (2) retry returns cached response without re-executing, (3) concurrent duplicate requests — only one executes, (4) Redis outage — fail-closed behaviour confirmed.
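
Scenario (3) is the one most often skipped, so here is a sketch of how it might be exercised. Everything in it is an assumption for illustration: `KeyStore.set_if_absent` is an in-memory stand-in for Redis `SET key value NX`, and `run_concurrent_duplicates` is a hypothetical test harness, not part of any library.

```python
import threading


class KeyStore:
    """In-memory stand-in for Redis; set_if_absent mimics SET key value NX."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set_if_absent(self, key: str, value: str) -> bool:
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


def run_concurrent_duplicates(n_threads: int = 20) -> tuple:
    store = KeyStore()
    executions = []                          # one entry per handler execution
    statuses = []                            # one status per submitted request
    barrier = threading.Barrier(n_threads)   # release all threads together

    def submit():
        barrier.wait()
        if store.set_if_absent("uuid-1234", "in-flight"):
            executions.append("debit")       # only the lock winner runs the handler
            statuses.append(201)
        else:
            statuses.append(409)             # every other duplicate is rejected

    threads = [threading.Thread(target=submit) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executions, statuses
```

The assertion to make in a real suite is the same as here: regardless of how many duplicates arrive together, exactly one execution is recorded and every other request gets the in-flight response.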

8. Reference implementations

  • Public reference: Stripe, "Idempotent requests". Stripe's production idempotency implementation; documents the Idempotency-Key header, 24h TTL, request fingerprinting (same key + different body is rejected), and safe retry behaviour across all API operations
  • Public reference: Adyen, "API idempotency". Adyen's payment API idempotency design; shows how a major card acquirer handles duplicate authorization and capture requests from merchants and schemes
  • Public reference: AWS Builders Library, "Making retries safe with idempotent APIs". Amazon's internal engineering guidance on idempotency; covers token design, storage strategies, and the in-flight deduplication problem at AWS scale
  • Public reference: RFC 7231 §4.2.2, "Idempotent methods". The HTTP specification's definition of idempotency; foundational context for why POST requires explicit idempotency handling while PUT and DELETE do not
  • Internal case studies: Digital banking — payment initiation H2H and card authorization (see below)

Internal case study — Corporate H2H payment initiation

A corporate treasury management system sends payment files to the bank via a Host-to-Host (H2H) API. Files contain 50–500 individual payment instructions, each submitted as a separate POST request. The corporate's ERP system retries any request that does not receive a 2xx response within 30 seconds — a standard behaviour in corporate banking middleware that the bank cannot change.

Before idempotency was implemented, two duplicate-payment incidents occurred in the same quarter. In one of them, a network link degradation caused 140 payment instructions to be submitted twice; both submissions were processed, and 140 corporate accounts were debited for double the intended amount. The remediation took 4 days of manual reversal processing and generated regulatory correspondence.

What changed

Every payment instruction endpoint now requires an Idempotency-Key header. The corporate's middleware was updated to generate a UUID v4 per instruction and persist it alongside the instruction in the ERP (so the key survives an ERP restart). The bank's API middleware checks Redis on every POST; a hit returns the original response without invoking the payment engine.

flowchart LR
    ERP([Corporate ERP]) -->|POST /payments\nIdempotency-Key: uuid-per-instruction| API[Payment API]
    API -->|check| Redis[(Redis\nkey store 24h TTL)]
    Redis -->|miss — first submission| Engine[Payment Engine\ndebit + post]
    Engine -->|store result| Redis
    Engine -->|response| ERP
    Redis -->|hit — retry| API
    API -->|cached response\nno engine call| ERP

Key design decisions

| Decision | Choice | Reason |
|---|---|---|
| Key scope | Per client credential + per key value | Key abc-123 from Client A cannot collide with abc-123 from Client B |
| Request fingerprinting | SHA-256 of instruction body | A retry with a changed amount on the same key returns 422; this prevents accidental key reuse from hiding a different instruction |
| Redis outage behaviour | Fail closed (503) | Duplicate payment is worse than temporary unavailability; corporate client retries later |
| TTL | 48 hours | Corporate batch windows can span overnight; 24h was insufficient for batches submitted at 23:59 |
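
Two of these decisions, key scope and fail-closed behaviour, can be sketched together. This is an illustration under assumptions, not the bank's implementation: `StoreUnavailable`, `scoped_key`, and `handle_payment` are hypothetical names, `store` is any object with `get` and `set` methods, and locking and fingerprinting are left out to keep the outage path visible.

```python
class StoreUnavailable(Exception):
    """Raised by the (hypothetical) key store client when Redis cannot be reached."""


def scoped_key(client_id: str, idempotency_key: str) -> str:
    # Scope by client credential so abc-123 from Client A never matches Client B.
    return f"{client_id}:{idempotency_key}"


def handle_payment(store, client_id: str, idempotency_key: str, execute):
    key = scoped_key(client_id, idempotency_key)
    try:
        cached = store.get(key)
    except StoreUnavailable:
        # Fail closed: refuse the request rather than risk a duplicate payment.
        return 503, {"error": "idempotency store unavailable; retry later"}
    if cached is not None:
        return cached            # retry: replay without calling the payment engine
    result = execute()           # first submission: call the payment engine
    store.set(key, result)       # a failure here needs its own handling (not shown)
    return result
```

The important property is that `execute` is never reached when the store read fails. A store failure after execution is a separate hazard that this sketch deliberately does not solve.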

Outcomes

| Metric | Before | After |
|---|---|---|
| Duplicate payment incidents | 2 in one quarter | 0 in 18 months post-launch |
| Manual reversal processing per incident | 4 days | N/A |
| Retry transparency | None: corporate could not tell if retry was safe | Idempotent retry confirmed in API documentation and tested in integration suite |

Gotchas observed

  • ERP did not persist the idempotency key across restarts — after an ERP failover, the key was lost; the retry generated a new key and created a duplicate payment. Fixed by requiring the ERP to persist the key in its own database before submitting the first request, and by documenting this as a client integration requirement in the API spec.
  • Key TTL too short for end-of-month batch — a 24h TTL meant that a payment instruction submitted at 23:30 and retried at 00:15 (after month-end rollover processing) was treated as a new request. Resolved by extending TTL to 48h and documenting the TTL prominently in the API reference.

Internal case study — Card authorization deduplication

A card authorization request from the international card scheme arrives at the bank's authorization engine. The scheme's network stack retries the authorization if it does not receive a response within 100 ms, the scheme's timeout for ISO 8583 authorization messages. If the bank's engine processes the first authorization (debiting available credit) but the response is delayed by a GC pause, the scheme sends a second identical authorization. Without idempotency, the bank debits the cardholder twice for a single tap.

Card authorization idempotency differs from payment initiation in two critical ways: the key is scheme-supplied (the STAN, or Systems Trace Audit Number, combined with the terminal ID and date), not client-generated; and the latency budget is 100 ms, not 30 seconds.

What changed

The authorization engine computes a composite idempotency key from STAN + terminal_id + processing_date. On every authorization request, it performs an atomic Redis SET NX with a 60-second TTL. If the key already exists, the cached authorization response is returned immediately. The Redis check adds < 1ms to the authorization path.
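
A sketch of the composite key and the set-if-absent check follows. The names `auth_key`, `TtlStore`, and `authorize` are hypothetical, the key layout is an assumption, and the in-memory store is a single-threaded stand-in for Redis `SET NX` with expiry. The clock is injectable so that TTL expiry can be tested without sleeping; the default TTL of 180 seconds is the value the case study eventually settled on.

```python
import time

PENDING = "PENDING"  # placeholder stored while the first authorization is in flight


def auth_key(stan: str, terminal_id: str, processing_date: str) -> str:
    # processing_date keeps a reused STAN on a later day from matching this key.
    return f"auth:{processing_date}:{terminal_id}:{stan}"


class TtlStore:
    """Single-threaded stand-in for Redis SET NX with expiry; clock is injectable."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            return None                      # missing or expired
        return entry[1]

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def set_nx(self, key, value, ttl_seconds) -> bool:
        if self.get(key) is not None:
            return False                     # key already claimed and not expired
        self.set(key, value, ttl_seconds)
        return True


def authorize(store, msg: dict, decide, ttl_seconds: int = 180):
    key = auth_key(msg["stan"], msg["terminal_id"], msg["processing_date"])
    if not store.set_nx(key, PENDING, ttl_seconds):
        # Duplicate: return the stored response (PENDING if the first is still running).
        return store.get(key)
    response = decide(msg)                   # first sighting: run the authorization
    store.set(key, response, ttl_seconds)    # cache the response for scheme retries
    return response
```

How a caller should treat a `PENDING` result (wait briefly or decline) is a policy choice that the sketch leaves open.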

Gotchas observed

  • STAN values repeat: the Systems Trace Audit Number is a 6-digit counter that wraps after 999999 and resets to 000001 at the end of each processing day. Without the processing_date component in the composite key, STANs from the previous day collided with the first authorizations of the new day. Fixed by including processing_date in the key.
  • 60-second TTL was too short for scheme retry window — some scheme retry configurations extend to 120 seconds. Authorizations submitted near the 60s boundary were re-processed. Extended TTL to 180 seconds with no material memory impact.

9. Candidate ADRs

  • Candidate ADR: Fail-open vs fail-closed during Redis idempotency store outage. The answer differs per API risk tier (informational vs financial); record when your organisation sets the policy
  • Candidate ADR: Idempotency key TTL policy by API category. Payment APIs, account management APIs, and reporting APIs have different retry windows and different risk profiles

10. Known risks & gotchas

  • Client does not persist the key before submitting — the client generates a key, submits the request, crashes before receiving a response, restarts, generates a new key, and submits again. The server sees two different keys and processes both. Mitigation: document in the API contract that clients must persist the idempotency key in durable storage before making the first request; provide client SDK examples that enforce this.
  • Caching 5xx responses locks out legitimate retries — a transient server error (database connection timeout) produces a 500; if that response is cached, every retry for the next 24 hours returns 500 without retrying the operation. Mitigation: never cache 5xx responses; only cache successful responses (2xx) and client-error responses (4xx) where the error is deterministic (validation failure, insufficient funds).
  • Redis eviction under memory pressure silently removes keys — if Redis is under memory pressure and the eviction policy is allkeys-lru, idempotency keys can be silently evicted before their TTL expires. A retry after eviction re-executes the operation. Mitigation: use a dedicated Redis instance for idempotency keys with maxmemory-policy noeviction; size it with headroom; alert before it reaches capacity.
  • Concurrent duplicate requests both pass the initial check — two requests with the same key arrive within microseconds of each other; both check Redis, both find a miss, both proceed to the distributed lock. Only one acquires the lock; the other must wait or return 409. Mitigation: the distributed lock (Redis SET NX) is mandatory — a simple read-then-write is not sufficient; test concurrent submission explicitly.
  • Key namespace collision across API versions: /v1/payments and /v2/payments share the same Redis key namespace, so a key submitted to v1 prevents the same key being used on v2. Mitigation: namespace the stored key by API endpoint and version ({api_version}:{endpoint}:{client_id}:{key}).
  • Idempotency applied inconsistently across a microservices estate — some services implement it, some do not; consumers cannot tell which APIs are safe to retry. Mitigation: make idempotency a platform-level default for all POST endpoints; enforce via a gateway policy or shared middleware that every service inherits; document exceptions explicitly in the API spec.
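
The first risk in this list, a client that does not persist its key before submitting, is the one the server cannot fix on its own. A minimal client-side sketch follows, assuming SQLite as the client's durable store; the `outbox` table and the `open_outbox` and `key_for` helpers are hypothetical and stand in for whatever database the client already uses.

```python
import sqlite3
import uuid


def open_outbox(path: str) -> sqlite3.Connection:
    """Open (or create) the durable table that maps instructions to idempotency keys."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        " instruction_id TEXT PRIMARY KEY,"
        " idempotency_key TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def key_for(conn: sqlite3.Connection, instruction_id: str) -> str:
    """Return the persisted key for this instruction, committing a new one first if absent."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO outbox (instruction_id, idempotency_key) VALUES (?, ?)",
            (instruction_id, str(uuid.uuid4())),
        )
    row = conn.execute(
        "SELECT idempotency_key FROM outbox WHERE instruction_id = ?",
        (instruction_id,),
    ).fetchone()
    return row[0]
```

The client calls `key_for` before the first HTTP request and again before every retry. Because the key is committed before anything is sent, a crash and restart between submission and response yields the same key rather than a fresh one.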