Pattern: Idempotent API

Quick facts

Category: Backend & Distributed Systems
Maturity: Adopt
Typical team size: 1-2 engineers per API surface
Typical timeline to MVP: 1-3 weeks (per API; shared infrastructure is reusable)
Last reviewed: 2026-05-19 by Architecture Team

1. Context

Use this pattern when:

A state-changing API operation has consequences that are unacceptable to duplicate — debiting an account twice, issuing two cards, opening two accounts for the same request
The network path between client and server is unreliable and the client will retry on timeout or connection failure
The server processes operations asynchronously and the client cannot distinguish "request lost in transit" from "request received, processing"
An upstream system (card scheme, payment rail, correspondent bank) retries operations without being able to confirm whether the original was processed

Do NOT use this pattern when:

The operation is naturally idempotent already — GET, PUT with full resource replacement, and DELETE are idempotent by HTTP semantics; this pattern is for POST and non-idempotent PATCH operations
Duplicate execution is harmless — a log entry, a read-only analytics event, a cache warm-up; the overhead of idempotency key storage is not justified
The client is a browser form submission with no retry logic — browser POST-redirect-GET is the correct pattern for those cases

Relationship to other patterns

The Idempotent API pattern is not an architectural style in the same sense as Microservices or Event-Driven Architecture. It is a mandatory safety layer applied to any state-changing API. It appears inside every other pattern in this category:

Where idempotency appears	Why
Saga Pattern	Every saga step (Activity) must be idempotent; the orchestrator retries timed-out steps
CQRS with CDC-Driven Read Models	Projection upserts must be idempotent; Kafka guarantees at-least-once delivery
Canonical Model + Pluggable Adapters	Outbound adapter dispatch to payment rails must be idempotent; rails retry on network timeout
Event-Driven Architecture	Every Kafka consumer must be idempotent; messages can be re-delivered after rebalance or replay
Workflow Orchestration	Activity implementations called by Temporal must be idempotent; Temporal retries on failure

2. Problem it solves

A corporate treasury system sends a payment instruction to the bank's H2H API. The bank processes the payment and debits the account, but its response is lost in transit: the corporate's network times out before receiving it. The corporate system cannot tell whether the payment was processed, so it retries. Without idempotency, the bank processes the same instruction twice and debits the account for double the amount, and the corporate system has no way to detect this until the account statement arrives.

The Idempotent API pattern prevents this by having the client supply a unique key with every request. The server uses that key to detect and short-circuit duplicate submissions, returning the original response without re-executing the operation.

3. Solution overview

Request lifecycle

flowchart TD
    Client([Client]) -->|POST /payments\nIdempotency-Key: uuid-1234| API[API Server]

    API --> Check{Key exists\nin store?}

    Check -->|Yes — not in-flight| Return[Return cached\nresponse]
    Check -->|Yes — in-flight| Wait[Return 409 Conflict\nor wait + return]
    Check -->|No| Lock[Acquire distributed lock\non key]

    Lock --> Process[Process request\nwrite to DB]
    Process --> Store[Store key + response\n+ request fingerprint]
    Store --> Release[Release lock]
    Release --> Respond[Return response\nto client]

    Return --> Client
    Wait --> Client
    Respond --> Client
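
The lifecycle above can be sketched in a few lines of Python. This is a minimal single-process illustration, not a production implementation: it assumes an in-memory store in place of Redis, and the names `IdempotencyStore`, `process_request`, and the `handler` callable are hypothetical.

```python
import threading


class IdempotencyStore:
    """In-memory stand-in for Redis: cached responses plus in-flight locks."""

    def __init__(self):
        self._mutex = threading.Lock()
        self.responses = {}
        self.in_flight = set()

    def try_lock(self, key):
        # Mimics SET NX: succeeds only if nobody else holds the key.
        with self._mutex:
            if key in self.in_flight:
                return False
            self.in_flight.add(key)
            return True

    def release(self, key):
        with self._mutex:
            self.in_flight.discard(key)


def process_request(store, key, body, handler):
    """Follow the check, lock, process, store, release flow; return (status, response)."""
    if key in store.responses:           # key exists, not in flight
        return 200, store.responses[key]
    if not store.try_lock(key):          # key exists, in flight
        return 409, {"error": "a request with this key is still in progress"}
    try:
        if key in store.responses:       # completed while we acquired the lock
            return 200, store.responses[key]
        response = handler(body)         # business logic runs once per key
        store.responses[key] = response  # only reached if the handler succeeded
        return 200, response
    finally:
        store.release(key)               # released even if the handler raised
```

If the handler raises, nothing is cached and the lock is released, so the client can retry with the same key. The sketch returns 409 for in-flight duplicates; the "wait + return" branch of the diagram would replace that line with a bounded poll on the store.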

Container view (C4 Level 2)

flowchart TB
    subgraph Client["Client"]
        ClientApp[Client Application\ngenerates UUID v4 per operation]
    end

    subgraph API["API Layer"]
        Gateway[API Gateway\nor middleware]
        IdempMiddleware[Idempotency\nMiddleware]
        Handler[Request Handler\nbusiness logic]
    end

    subgraph Store["Idempotency Store"]
        Redis[(Redis\nkey TTL store)]
        LockStore[Redis SETNX\ndistributed lock]
    end

    subgraph DB["System of Record"]
        MainDB[(Primary DB\ntransaction + idempotency log)]
    end

    ClientApp -->|Idempotency-Key header| Gateway
    Gateway --> IdempMiddleware
    IdempMiddleware -->|check key| Redis
    Redis -->|miss| LockStore
    LockStore -->|lock acquired| Handler
    Handler -->|write result| MainDB
    Handler -->|store key + response| Redis
    LockStore -->|lock released| IdempMiddleware
    Redis -->|hit — return cached| IdempMiddleware
    IdempMiddleware --> Gateway

Key lifecycle states

stateDiagram-v2
    [*] --> Missing: first request with this key
    Missing --> InFlight: lock acquired, processing started
    InFlight --> Completed: processing finished, response cached
    InFlight --> Failed: processing failed (5xx)
    Completed --> Completed: subsequent requests return cached response
    Failed --> Missing: 5xx responses are NOT cached — client may retry
    Completed --> [*]: TTL expires
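
The state diagram can be expressed as a small transition table, which is useful for unit-testing middleware behaviour. This is an illustrative sketch: the event names (`lock_acquired`, `failed_5xx`, and so on) are assumptions, and TTL expiry is modelled as a return to `MISSING` because an expired key is treated as new on its next submission.

```python
from enum import Enum


class KeyState(Enum):
    MISSING = "missing"
    IN_FLIGHT = "in_flight"
    COMPLETED = "completed"
    FAILED = "failed"


# (current state, event) -> next state, mirroring the diagram's arrows.
TRANSITIONS = {
    (KeyState.MISSING, "lock_acquired"): KeyState.IN_FLIGHT,
    (KeyState.IN_FLIGHT, "succeeded"): KeyState.COMPLETED,
    (KeyState.IN_FLIGHT, "failed_5xx"): KeyState.FAILED,
    (KeyState.COMPLETED, "duplicate_request"): KeyState.COMPLETED,
    (KeyState.FAILED, "cleared"): KeyState.MISSING,       # 5xx is not cached
    (KeyState.COMPLETED, "ttl_expired"): KeyState.MISSING,  # key no longer known
}


def next_state(state, event):
    """Return the next state, or raise ValueError for a transition the diagram forbids."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.name} on {event!r}") from None
```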

4. Technology stack

Layer	Primary choice	Alternatives	Notes
Idempotency key storage	Redis (with TTL)	PostgreSQL `idempotency_keys` table, DynamoDB	Redis for sub-millisecond key lookup and native TTL expiry; PostgreSQL for financial-grade durability where key loss is unacceptable (store key in same transaction as the operation itself — the only way to guarantee atomicity)
Distributed lock (in-flight protection)	Redis `SET NX PX` (atomic set-if-not-exists with TTL)	Redlock (multi-node Redis), database row lock	The lock prevents two concurrent requests with the same key from both proceeding; must expire automatically to handle crashed workers
Key header standard	`Idempotency-Key` (IETF draft)	`X-Idempotency-Key` (pre-standard convention)	Use `Idempotency-Key` — this is the emerging IETF standard adopted by Stripe, Adyen, and others; include it in your OpenAPI spec
Key format	UUID v4 (client-generated)	UUID v7 (time-ordered), ULID	Client-generated UUIDs give clients full control and require no server round-trip before the operation; v4 is universally supported; v7 / ULID are preferable if index efficiency on the storage side matters
Request fingerprinting	SHA-256 hash of request body stored alongside key	None (key-only)	Stripe's approach: if a key is reused with a different request body, return 422 Unprocessable Entity; prevents accidental key reuse masking a different operation
Middleware / library	Custom middleware (Go, NestJS interceptor, Django middleware)	AWS Lambda Powertools idempotency utility	Implement as a reusable middleware layer, not per-handler; every handler that needs idempotency applies the middleware rather than re-implementing the pattern
TTL	24 hours (payment APIs) to 7 days (async / long-running)	Per-operation configuration	TTL must be longer than the client's retry window; for async operations where the client polls for a result, extend the TTL to cover the full async processing SLA

5. Non-functional characteristics

Concern	Profile
Correctness guarantee	Exactly-once execution of the business operation when the client retries with the same key within the TTL window. After TTL expiry, a new submission with the same key is treated as a new request.
Latency overhead	One Redis read on every request (typically < 1 ms). The first request for a key adds two more round-trips: the lock acquisition before processing and the response write after it (each typically < 1 ms), so roughly 3 ms in total. A retry that hits the cache pays only the single read.
Availability impact	Redis is in the read path for every idempotent API call. A Redis outage requires a fallback decision: fail open (process without idempotency check — risk of duplicates) or fail closed (return 503 — no duplicates, but service unavailable). Financial APIs must fail closed.
Storage cost	Redis memory per key: ~200–500 bytes (key + response body + fingerprint + metadata). At 1 million transactions per day with a 24h TTL, about 1 million keys are live at once: approximately 200–500 MB of Redis memory. Negligible for most deployments.
Security posture	Idempotency keys are client-supplied and may be guessable if short or sequential. Mitigation: document that keys must be UUIDs or equivalently unguessable; validate format server-side; keys are scoped per client credential (API key / tenant) so a key from client A cannot collide with client B.
Compliance fit	The idempotency key log is an audit record of every attempted operation and its outcome. Retain alongside the transaction record. For payment APIs, regulators may require evidence that duplicate submissions were detected and suppressed — the key store is that evidence.
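
The fail-open versus fail-closed choice in the availability row comes down to a single branch in the middleware. The sketch below is illustrative only: `StoreUnavailable`, `handle_with_policy`, and the `lookup` and `handler` callables are hypothetical names, not part of any library.

```python
class StoreUnavailable(Exception):
    """Hypothetical error raised when the idempotency store cannot be reached."""


def handle_with_policy(lookup, handler, key, body, fail_closed=True):
    """Return (status, response), applying the chosen outage policy."""
    try:
        cached = lookup(key)
    except StoreUnavailable:
        if fail_closed:
            # No duplicate risk, but the caller must retry later.
            return 503, {"error": "idempotency store unavailable"}
        cached = None  # fail open: proceed without duplicate protection
    if cached is not None:
        return 200, cached
    return 200, handler(body)
```

With `fail_closed=True` the handler is never invoked during an outage, which is the behaviour the table requires for financial APIs.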

6. Cost ballpark

Idempotency infrastructure is shared across all APIs; cost is dominated by Redis.

Scale	Operations / day	Incremental monthly cost	Notes
Small	< 100,000	$0 - $50	Redis already provisioned for sessions/cache; idempotency keys add < 50 MB memory overhead
Medium	100k - 5M	$50 - $300	Dedicated Redis cluster for idempotency (isolation from cache eviction); 1-2 GB memory
Large	5M+	$300 - $1,500	Redis cluster with replication, persistence (AOF) for key durability, automated TTL monitoring
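
The memory figures in the table follow from simple arithmetic. A rough estimator, assuming keys arrive at a steady rate and each lives for exactly one TTL (the function name and the steady-state assumption are illustrative):

```python
def redis_memory_mb(keys_per_day, ttl_hours, bytes_per_key):
    """Estimate steady-state memory: keys alive at once multiplied by size per key."""
    live_keys = keys_per_day * ttl_hours / 24
    return live_keys * bytes_per_key / 1_000_000


small = redis_memory_mb(100_000, 24, 500)     # 50.0 MB
medium = redis_memory_mb(1_000_000, 24, 500)  # 500.0 MB
longer = redis_memory_mb(1_000_000, 48, 500)  # 1000.0 MB with a 48h TTL
```

Doubling the TTL doubles the number of live keys, so TTL changes should be reflected in capacity planning.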

7. LLM-assisted development fit

Aspect	Rating	Notes
Idempotency middleware boilerplate (Redis check, lock, store)	★★★★★	Excellent — the core check-lock-process-store pattern is well-represented in Go, Python, and TypeScript
Request fingerprinting (SHA-256 of body + key mismatch detection)	★★★★	Good — straightforward hashing; validate that the comparison logic handles JSON key ordering correctly
Distributed lock with TTL (Redis SET NX)	★★★★	Good — gets the atomic SET NX right; verify the lock TTL is longer than the maximum expected processing time
In-flight concurrent request handling (409 vs wait)	★★★	Knows both approaches; the choice between returning 409 immediately vs polling-wait is a product decision the LLM cannot make
Architecture decisions	★	Don't outsource. Fail-open vs fail-closed during Redis outage is a business risk decision, not a technical one.

Recommended workflow: Implement idempotency as shared middleware before building the first payment or account API — retrofitting it later requires re-testing every handler. Test four scenarios before launch: (1) first request succeeds, (2) retry returns cached response without re-executing, (3) concurrent duplicate requests — only one executes, (4) Redis outage — fail-closed behaviour confirmed.
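
Scenario (3), concurrent duplicates, is the one most often left untested. A minimal harness might look like the following; it assumes an in-memory store in place of Redis, and `Store`, `call`, and `run_concurrent_duplicates` are hypothetical names. Handler failure handling is omitted for brevity.

```python
import threading
import time


class Store:
    """In-memory stand-in for the key store; one mutex makes each step atomic."""

    def __init__(self):
        self._mutex = threading.Lock()
        self.responses = {}
        self.locked = set()

    def begin(self, key):
        """Atomically classify a request as 'cached', 'in_flight', or 'acquired'."""
        with self._mutex:
            if key in self.responses:
                return "cached"
            if key in self.locked:
                return "in_flight"
            self.locked.add(key)
            return "acquired"

    def complete(self, key, response):
        with self._mutex:
            self.responses[key] = response
            self.locked.discard(key)


def call(store, key, handler):
    outcome = store.begin(key)
    if outcome == "cached":
        return 200, store.responses[key]
    if outcome == "in_flight":
        return 409, None
    response = handler()  # only the thread that acquired the key gets here
    store.complete(key, response)
    return 200, response


def run_concurrent_duplicates(n=8):
    """Fire n simultaneous requests with one key; return (executions, results)."""
    store, executions, results = Store(), [], []

    def handler():
        executions.append(1)
        time.sleep(0.05)  # simulate slow processing so requests overlap
        return {"payment_id": "p-1"}

    def worker():
        results.append(call(store, "uuid-1234", handler))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executions, results
```

The assertion to make in the test suite is that `executions` has exactly one entry regardless of how many requests were sent.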

8. Reference implementations

Public reference: Stripe, Idempotent requests. Stripe's production idempotency implementation; documents the Idempotency-Key header, 24h TTL, request fingerprinting (same key + different body = 422), and safe retry behaviour across all API operations
Public reference: Adyen, API idempotency. Adyen's payment API idempotency design; shows how a major card acquirer handles duplicate authorization and capture requests from merchants and schemes
Public reference: AWS Builders Library, Making retries safe with idempotent APIs. Amazon's internal engineering guidance on idempotency; covers token design, storage strategies, and the in-flight deduplication problem at AWS scale
Public reference: RFC 7231 §4.2.2, Idempotent methods. The HTTP specification definition of idempotency; foundational context for why POST requires explicit idempotency handling while PUT and DELETE do not
Internal case studies: Digital banking — payment initiation H2H and card authorization (see below)

Internal case study: Corporate H2H payment initiation

A corporate treasury management system sends payment files to the bank via a Host-to-Host (H2H) API. Files contain 50–500 individual payment instructions, each submitted as a separate POST request. The corporate's ERP system retries any request that does not receive a 2xx response within 30 seconds — a standard behaviour in corporate banking middleware that the bank cannot change.

Before idempotency was implemented, two incidents occurred in the same quarter. A network link degradation caused 140 payment instructions to be submitted twice; both submissions were processed, and 140 corporate accounts were debited for double the intended amount. Remediation took 4 days of manual reversal processing and generated regulatory correspondence.

What changed

Every payment instruction endpoint now requires an Idempotency-Key header. The corporate's middleware was updated to generate a UUID v4 per instruction and persist it alongside the instruction in the ERP (so the key survives an ERP restart). The bank's API middleware checks Redis on every POST; a hit returns the original response without invoking the payment engine.

flowchart LR
    ERP([Corporate ERP]) -->|POST /payments\nIdempotency-Key: uuid-per-instruction| API[Payment API]
    API -->|check| Redis[(Redis\nkey store 24h TTL)]
    Redis -->|miss — first submission| Engine[Payment Engine\ndebit + post]
    Engine -->|store result| Redis
    Engine -->|response| ERP
    Redis -->|hit — retry| API
    API -->|cached response\nno engine call| ERP

Key design decisions

Decision	Choice	Reason
Key scope	Per client credential + per key value	Key `abc-123` from Client A cannot collide with `abc-123` from Client B
Request fingerprinting	SHA-256 of instruction body	A retry with a changed amount on the same key returns 422 — prevents accidental key reuse hiding a different instruction
Redis outage behaviour	Fail closed (503)	Duplicate payment is worse than temporary unavailability; corporate client retries later
TTL	48 hours	Corporate batch windows can span overnight; 24h was insufficient for batches submitted at 23:59
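
The key-scope and fingerprinting decisions combine into a single lookup. The sketch below is a minimal illustration, assuming a plain dictionary in place of Redis; the `idem:` key prefix and the function names are hypothetical. Serialising with `sort_keys=True` makes the fingerprint independent of JSON key order.

```python
import hashlib
import json


def fingerprint(body):
    """SHA-256 of a canonical JSON form, so key order does not change the hash."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def scoped_key(client_id, idempotency_key):
    """Namespace the key by client credential so clients cannot collide."""
    return f"idem:{client_id}:{idempotency_key}"


def classify(store, client_id, idempotency_key, body):
    """Return 'new', 'replay', or 'mismatch' (the case answered with 422)."""
    key = scoped_key(client_id, idempotency_key)
    fp = fingerprint(body)
    stored = store.get(key)
    if stored is None:
        store[key] = fp
        return "new"
    return "replay" if stored == fp else "mismatch"
```

A retry of the same instruction is classified as `replay`, while the same key with a changed amount is classified as `mismatch`.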

Outcomes

Metric	Before	After
Duplicate payment incidents	2 in one quarter	0 in 18 months post-launch
Manual reversal processing per incident	4 days	N/A
Retry transparency	None — corporate could not tell if retry was safe	Idempotent retry confirmed in API documentation and tested in integration suite

Gotchas observed

ERP did not persist the idempotency key across restarts — after an ERP failover, the key was lost; the retry generated a new key and created a duplicate payment. Fixed by requiring the ERP to persist the key in its own database before submitting the first request, and by documenting this as a client integration requirement in the API spec.
Key TTL too short for end-of-month batch — a 24h TTL meant that a payment instruction submitted at 23:30 and retried at 00:15 (after month-end rollover processing) was treated as a new request. Resolved by extending TTL to 48h and documenting the TTL prominently in the API reference.
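
The first gotcha is a client-side requirement: the key must be durable before the first request leaves the client. A sketch of what that could look like on the client, assuming `sqlite3` as a stand-in for the ERP's own database; the `outbox` table and function names are hypothetical.

```python
import sqlite3
import uuid


def open_outbox(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "instruction_id TEXT PRIMARY KEY, idempotency_key TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def key_for_instruction(conn, instruction_id):
    """Return the persisted key for an instruction, creating it on first use."""
    row = conn.execute(
        "SELECT idempotency_key FROM outbox WHERE instruction_id = ?",
        (instruction_id,),
    ).fetchone()
    if row:
        return row[0]  # a retry, or a restart after failover, reuses the same key
    key = str(uuid.uuid4())
    with conn:  # commit before the request is sent
        conn.execute("INSERT INTO outbox VALUES (?, ?)", (instruction_id, key))
    return key
```

With a file-backed database instead of `:memory:`, the same key is returned after a process restart, so a retry after failover is recognised by the server as a duplicate.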

Internal case study: Card authorization deduplication

A card authorization request from the international card scheme arrives at the bank's authorization engine. The scheme's network stack retries the authorization if it does not receive a response within 100 ms, the timeout applied to these ISO 8583 messages. If the bank's engine processes the first authorization (debiting available credit) but the response is delayed by a GC pause, the scheme sends a second identical authorization. Without idempotency, the bank debits the cardholder twice for a single tap.

Card authorization idempotency differs from payment initiation in two critical ways: the key is scheme-supplied (the STAN, or Systems Trace Audit Number, combined with the terminal ID and date) rather than client-generated, and the latency budget is 100 ms, not 30 seconds.
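
A scheme-supplied key of this kind can be derived directly from the message fields. The sketch below is illustrative: the `AuthorizationRequest` type, its field names, and the `auth:` key layout are assumptions, not the bank's actual message mapping.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AuthorizationRequest:
    stan: str                 # Systems Trace Audit Number from the scheme message
    terminal_id: str
    transmission_date: date


def dedup_key(req):
    """Compose the deduplication key from STAN, terminal ID, and date."""
    return f"auth:{req.transmission_date.isoformat()}:{req.terminal_id}:{req.stan}"
```

Two authorizations carrying the same STAN, terminal ID, and date map to the same key, so the scheme's retry is recognised as a duplicate of the first request.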