Pattern: Canonical Model + Pluggable Adapters
Quick facts
- Category: Backend & Distributed Systems
- Maturity: Adopt
- Typical team size: 3-6 engineers
- Typical timeline to MVP: 8-14 weeks (first adapter pair); 2-4 weeks per additional adapter
- Last reviewed: 2026-05-19 by Architecture Team
1. Context
Use this pattern when:
- A core processing hub must integrate with multiple external systems that each use different data formats, protocols, or naming conventions
- The number of integration formats is growing and each new format must not require changes to core processing logic
- Format or protocol variations are external constraints (regulatory standards, vendor choices, legacy systems) rather than design choices you control
- A single authoritative data model needs to be enforced across a portfolio of services — any service that deviates must be caught at the adapter boundary, not deep in core logic
Do NOT use this pattern when:
- There are only two systems to integrate — a direct adapter between them is simpler; the canonical model layer earns its overhead at three or more distinct external formats
- All systems already share the same format and schema — adding a canonical translation layer is pure indirection with no decoupling benefit
- The integration is purely consumer-facing API shaping (different clients need different API shapes) — that is the API Gateway + BFF pattern, not this one
How this pattern relates to others in this category
This is one of the most common integration patterns in financial systems, but it is rarely named. Knowing where it ends and where adjacent patterns begin saves significant design time.
| Question | Pattern to reach for |
|---|---|
| Different client types (web, mobile, partner) need different API shapes from the same backend? | API Gateway + BFF — consumer-facing API composition |
| Multiple services need to react asynchronously to state changes without tight coupling? | Event-Driven Architecture — async decoupling; often combined with this pattern (canonical events on Kafka) |
| A multi-service business transaction needs all-or-nothing rollback? | Saga Pattern — distributed transaction semantics; the canonical model defines what flows between saga steps |
| A long-running process needs central state tracking and retry? | Workflow Orchestration — the orchestrator can use a canonical model for all activity inputs/outputs |
| One system with clean internal boundaries, no shared canonical model across multiple external systems? | Hexagonal Architecture (Ports & Adapters) is sufficient — the canonical model layer is what makes this pattern specific |
The canonical model pattern is a domain-specific application of Hexagonal Architecture where the ports are defined by a shared internal standard that multiple systems all translate to and from.
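As a rough illustration of "ports defined by a shared internal standard", the sketch below shows one inbound and one outbound adapter meeting at a canonical type. Everything in it is an assumption made for illustration: `CanonicalPayment` is a heavily simplified stand-in (not real pacs.008), and the partner JSON keys and the pipe-delimited rail format are invented.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Protocol


@dataclass(frozen=True)
class CanonicalPayment:
    """Hypothetical, heavily simplified canonical message (not real pacs.008)."""
    message_id: str
    debtor_account: str
    creditor_account: str
    amount: Decimal
    currency: str


class InboundAdapter(Protocol):
    """Port: external format in, canonical form out."""
    def to_canonical(self, raw: bytes) -> CanonicalPayment: ...


class OutboundAdapter(Protocol):
    """Port: canonical form in, external wire format out."""
    def from_canonical(self, payment: CanonicalPayment) -> bytes: ...


class PartnerJsonAdapter:
    """Inbound adapter for an invented partner JSON shape."""
    def to_canonical(self, raw: bytes) -> CanonicalPayment:
        doc = json.loads(raw)
        return CanonicalPayment(
            message_id=doc["id"],
            debtor_account=doc["from"],
            creditor_account=doc["to"],
            amount=Decimal(doc["amt"]),  # amounts arrive as strings to avoid float rounding
            currency=doc["ccy"],
        )


class FlatFileRailAdapter:
    """Outbound adapter for an invented pipe-delimited rail format."""
    def from_canonical(self, payment: CanonicalPayment) -> bytes:
        fields = [
            payment.message_id,
            payment.debtor_account,
            payment.creditor_account,
            f"{payment.amount:.2f}",
            payment.currency,
        ]
        return "|".join(fields).encode("ascii")
```

Neither adapter knows the other exists; each depends only on `CanonicalPayment`, which is what lets a new channel or rail be added without touching the opposite side.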
2. Problem it solves
A payment hub must receive instructions from mobile banking, internet banking, a corporate batch file upload, and a third-party partner API, each in a different format. It must dispatch to SWIFT, a domestic real-time rail, a card network, and a correspondent bank, each expecting a different wire format. Without a canonical model, every inbound format needs its own translation to every outbound format: four inbound times four outbound equals sixteen bespoke translation paths, each one a maintenance liability. Adding a fifth inbound integration means building four more paths, one per outbound format. The canonical model collapses this from N×M to N+M paths (four inbound adapters translate to canonical; four outbound adapters translate from canonical; eight in total), and the hub's core logic operates exclusively on the canonical form, insulated from external format churn.
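The path arithmetic can be checked with a few lines of Python. This is only a worked calculation; the function names are made up for the example.

```python
def point_to_point_paths(inbound: int, outbound: int) -> int:
    """Every inbound format translated directly to every outbound format."""
    return inbound * outbound


def canonical_paths(inbound: int, outbound: int) -> int:
    """One adapter per external format, all meeting at the canonical model."""
    return inbound + outbound


def cost_of_new_inbound(inbound: int, outbound: int) -> tuple[int, int]:
    """Extra translation paths needed when one more inbound format is added."""
    direct = point_to_point_paths(inbound + 1, outbound) - point_to_point_paths(inbound, outbound)
    via_canonical = canonical_paths(inbound + 1, outbound) - canonical_paths(inbound, outbound)
    return direct, via_canonical
```

For the four-by-four hub described above, `cost_of_new_inbound(4, 4)` gives four new direct paths versus one new adapter, and the gap widens with every rail added.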
3. Solution overview
System context (C4 Level 1)
flowchart LR
subgraph Inbound["Inbound channels"]
MobileBanking[Mobile Banking]
CorporateBatch[Corporate Batch\nISO 20022 pain.001]
PartnerAPI[Partner API\nJSON REST]
Legacy[Legacy Core\nproprietary flat file]
end
subgraph Hub["Processing Hub"]
Canonical[Canonical Model\nISO 20022 pacs.008]
CoreLogic[Core Processing\nvalidation, routing, enrichment]
end
subgraph Outbound["Outbound rails"]
SWIFT[SWIFT MT / MX]
FastRail[Domestic Fast Rail\ne.g. FAST, PayNow, FPS]
CardNet[Card Network\nISO 8583]
Correspondent[Correspondent Bank\nMEPS+ / proprietary]
end
MobileBanking -->|inbound adapter| Canonical
CorporateBatch -->|inbound adapter| Canonical
PartnerAPI -->|inbound adapter| Canonical
Legacy -->|inbound adapter| Canonical
Canonical --> CoreLogic
CoreLogic -->|outbound adapter| SWIFT
CoreLogic -->|outbound adapter| FastRail
CoreLogic -->|outbound adapter| CardNet
CoreLogic -->|outbound adapter| Correspondent
Container view (C4 Level 2)
flowchart TB
subgraph Inbound["Inbound Adapters"]
InboundA[ISO 20022 pain.001\nAdapter]
InboundB[JSON REST\nAdapter]
InboundC[Legacy Flat-file\nAdapter]
end
subgraph Core["Hub Core"]
Validator[Schema Validator\ncanonical model]
Enricher[Enricher\nrate lookup, BIC resolution]
Router[Router\nrail selection rules]
AuditLog[(Audit Log\nevery canonical message)]
end
subgraph SchemaLayer["Schema Layer"]
SchemaReg[Schema Registry\nAvro / Protobuf]
end
subgraph Outbound["Outbound Adapters"]
OutboundA[SWIFT MT Adapter\ngenerates MT103]
OutboundB[Fast Rail Adapter\ngenerates FAST ISO 20022]
OutboundC[Card Network Adapter\ngenerates ISO 8583]
end
subgraph Ops
DLQ[Dead-letter Queue\nfailed translations]
Monitor[Adapter Health\nper-rail error rates]
end
InboundA -->|canonical message| Validator
InboundB -->|canonical message| Validator
InboundC -->|canonical message| Validator
Validator --> SchemaReg
Validator --> Enricher
Enricher --> Router
Router --> AuditLog
Router --> OutboundA
Router --> OutboundB
Router --> OutboundC
OutboundA -->|on failure| DLQ
OutboundB -->|on failure| DLQ
OutboundC -->|on failure| DLQ
DLQ --> Monitor
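The container flow (validate, enrich, route, audit, dispatch, dead-letter on failure) can be sketched as a single in-process pipeline. This is a minimal sketch under stated assumptions: messages are plain dicts, the required-field list, the hard-coded BIC, and the currency-based routing rule are placeholders, and the audit log and dead-letter queue are in-memory lists rather than Kafka topics.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical minimum field set for a canonical message.
REQUIRED_FIELDS = ("message_id", "amount", "currency", "creditor_account")


def validate(message: dict) -> None:
    """Reject messages lacking any field the canonical schema requires."""
    missing = [name for name in REQUIRED_FIELDS if name not in message]
    if missing:
        raise ValueError(f"canonical validation failed, missing: {missing}")


def enrich(message: dict) -> dict:
    """Return a copy with derived data added; the BIC is a hard-coded stand-in."""
    return {**message, "creditor_bic": "EXAMPLEBIC"}


def route(message: dict) -> str:
    """Pick a rail id; this currency rule is purely illustrative."""
    return "fast" if message["currency"] == "SGD" else "swift"


@dataclass
class Hub:
    outbound: dict[str, Callable[[dict], bytes]]  # rail id -> outbound adapter
    audit_log: list = field(default_factory=list)
    dead_letters: list = field(default_factory=list)

    def process(self, message: dict) -> bool:
        validate(message)                      # malformed input never reaches core logic
        enriched = enrich(message)
        rail_id = route(enriched)
        self.audit_log.append((rail_id, enriched))  # every routed canonical message is audited
        try:
            self.outbound[rail_id](enriched)
        except Exception as exc:               # adapter failure: park the message, keep the hub up
            self.dead_letters.append((rail_id, enriched, repr(exc)))
            return False
        return True
```

Validation failures raise to the caller (the inbound adapter), while outbound failures are captured in `dead_letters`, mirroring the split between the Validator and the Dead-letter Queue in the diagram.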
4. Technology stack
| Layer | Primary choice | Alternatives | Notes |
|---|---|---|---|
| Canonical schema definition | Protobuf (Protocol Buffers) | Apache Avro + Schema Registry, ISO 20022 XSD | Protobuf gives strongly-typed, backward-compatible schemas with generated code in every language; Avro for Kafka-heavy hubs where Schema Registry is already in place; ISO 20022 XSD directly when regulatory compliance requires the exact standard |
| Schema registry / governance | Confluent Schema Registry | AWS Glue Schema Registry, Buf Schema Registry (Protobuf) | Centralise schema evolution rules; enforce backward compatibility before any adapter ships a schema change |
| Adapter implementation language | Go | Java (Spring Integration), Python | Go for high-throughput, low-latency adapters with a small memory footprint; Java/Spring Integration for complex transformation pipelines with a large library of pre-built connectors (file formats, legacy protocols) |
| Message bus (between adapters and hub) | Apache Kafka | AWS SQS, RabbitMQ | Kafka gives a durable, replayable bus; the canonical message can be inspected, replayed, and re-routed without touching adapters; SQS for simpler hubs that do not need replay |
| Inbound validation | JSON Schema (REST adapters) / Protobuf schema (internal) | Apache Camel validation, custom | Validate against the canonical schema at the adapter boundary — reject malformed messages before they enter hub core logic |
| Routing rules | Rules engine (Drools, easy-rules) | Database-driven routing table, code-based switch | Externalise rail-selection logic into a rules engine or DB table; hard-coded routing becomes a maintenance problem as rail count grows |
| Observability | OpenTelemetry (messageId and railId in every span) | Datadog, Grafana Tempo | Every canonical message must carry a correlationId traceable end-to-end across adapter boundaries |
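To make the "externalise rail-selection logic" row concrete, here is a minimal sketch of a data-driven routing table of the kind that could be loaded from a database. The rule shape, rail ids, and amount limit are all invented for illustration; a rules engine such as Drools would express the same idea differently.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class RoutingRule:
    """One row of a hypothetical routing table; None means 'matches anything'."""
    priority: int                       # lower number is evaluated first
    rail_id: str
    currency: Optional[str] = None
    max_amount: Optional[Decimal] = None


def select_rail(rules: list[RoutingRule], currency: str, amount: Decimal) -> str:
    """Return the rail of the first matching rule in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.currency is not None and rule.currency != currency:
            continue
        if rule.max_amount is not None and amount > rule.max_amount:
            continue
        return rule.rail_id
    raise LookupError(f"no rail configured for {currency} {amount}")


# Example table; the limit and rail names are made up.
EXAMPLE_RULES = [
    RoutingRule(priority=10, rail_id="fast", currency="SGD", max_amount=Decimal("200000")),
    RoutingRule(priority=20, rail_id="rtgs", currency="SGD"),
    RoutingRule(priority=100, rail_id="swift"),
]
```

Because the rules are data, onboarding a rail becomes a new row plus a new outbound adapter rather than a change to routing code.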
5. Non-functional characteristics
| Concern | Profile |
|---|---|
| Scalability | Inbound and outbound adapters scale independently. The hub core (validator, enricher, router) is stateless and scales horizontally. Throughput bottleneck is typically the slowest outbound rail, not the hub itself. |
| Availability target | 99.99% for payment hubs — treat the canonical message bus as critical infrastructure. The Kafka-backed design means an outbound rail outage queues messages rather than rejecting them; the hub continues to accept inbound traffic. |
| Latency target | Depends on the rail. Real-time rails (FAST, card networks): p95 end-to-end < 2 s including adapter translation. Batch rails (SWIFT MT, GIRO): latency measured in settlement cycles, not milliseconds. Design the hub to meet the tightest rail SLA without penalising all rails. |
| Security posture | Each inbound adapter authenticates its channel independently. Canonical messages are encrypted in transit (TLS) and at rest on the Kafka brokers. Financial message fields (account numbers, amounts) are not logged in plain text. Outbound adapters hold rail credentials in a secrets manager, rotated on a defined schedule. |
| Data residency | The canonical message bus is the system of record for in-flight transactions. Ensure Kafka topics and the audit log are pinned to the correct regulatory jurisdiction. Cross-border payment hubs may need separate canonical buses per regulatory region. |
| Compliance fit | ISO 20022 adoption mandated by SWIFT for cross-border payments from 2025 — the canonical model aligns directly. SOC 2 — the audit log provides a timestamped, immutable record of every message. PCI-DSS — card data (PAN, CVV) must never appear in the canonical model in plain text; tokenise at the inbound adapter before the message enters the hub. |
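The two data-handling rules in the table (no plain-text account numbers in logs; card data tokenised at the inbound adapter) can be sketched as follows. This is an illustration only: `InMemoryTokenVault` is a stand-in for a real tokenisation service, and the field names `pan`, `cvv`, and `card_token` are assumptions.

```python
import secrets


class InMemoryTokenVault:
    """Stand-in for a real token vault; a production vault would be a separate, hardened service."""

    def __init__(self) -> None:
        self._token_by_pan: dict[str, str] = {}

    def tokenise(self, pan: str) -> str:
        """Return a stable random token for a PAN; the token reveals nothing about the PAN."""
        if pan not in self._token_by_pan:
            self._token_by_pan[pan] = "tok_" + secrets.token_hex(16)
        return self._token_by_pan[pan]


def mask_account(account: str, visible: int = 4) -> str:
    """Mask all but the last few characters so the value is safe to log."""
    if len(account) <= visible:
        return "*" * len(account)
    return "*" * (len(account) - visible) + account[-visible:]


def tokenise_inbound(raw: dict, vault: InMemoryTokenVault) -> dict:
    """Build the hub-bound message without PAN or CVV; only the token crosses the boundary."""
    canonical = {key: value for key, value in raw.items() if key not in ("pan", "cvv")}
    canonical["card_token"] = vault.tokenise(raw["pan"])
    return canonical
```

The point of doing this in the inbound adapter is that nothing downstream of it (bus, audit log, outbound adapters) ever holds the raw card data.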
6. Cost ballpark
Indicative monthly USD cost. Kafka and adapter compute are the dominant costs; schema registry and rules engine add modest overhead.
| Scale | Messages / day | Monthly cost | Cost drivers |
|---|---|---|---|
| Small | < 100,000 | $500 - $1,500 | 3-node Kafka, schema registry, 2-4 adapter containers, routing rules DB |
| Medium | 100k - 5M | $2,000 - $8,000 | Larger Kafka cluster, dedicated adapter fleets per rail, full observability, HA routing engine |
| Large | 5M+ | $10,000 - $40,000 | Multi-region Kafka with MirrorMaker, redundant adapter pairs per rail, HSM for credential management, regulatory archival storage |
7. LLM-assisted development fit
| Aspect | Rating | Notes |
|---|---|---|
| Adapter boilerplate (parse external format to canonical struct) | ★★★★★ | Excellent — format conversion code (XML/JSON/flat-file to struct) is well-handled; validate the mapping against the actual spec |
| Protobuf / Avro schema definition | ★★★★ | Good — generates clean schemas; verify field optionality and backward-compatibility rules manually |
| Routing rules implementation | ★★★ | Gets simple rules right; complex conditional routing (multi-rail fallback, regulatory overrides) requires human design |
| ISO 20022 message construction | ★★★ | Knows the standard; field-level compliance rules (mandatory vs optional per message type, character set restrictions) need human verification against the spec |
| Architecture decisions | ★ | Don't outsource. Use ADRs. Schema evolution and rail onboarding decisions are expensive to reverse. |
Recommended workflow: Define and publish the canonical schema before building any adapter. The schema is the contract; every adapter team signs up to it. Add one inbound and one outbound adapter as a proof of concept before committing to the hub architecture. Validate that the canonical model can express the most complex message each rail supports — edge cases surface format gaps that are cheap to fix before the schema is frozen.
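One way to run the "can the canonical model express every rail?" check before freezing the schema is a simple set comparison. The sketch below assumes each rail team can list the canonical fields its wire format needs; the field and rail names in the usage note are invented.

```python
def coverage_gaps(
    canonical_fields: set[str],
    rail_requirements: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Return, per rail, the fields that rail needs but the canonical model lacks."""
    gaps: dict[str, set[str]] = {}
    for rail, required in rail_requirements.items():
        missing = set(required) - set(canonical_fields)
        if missing:
            gaps[rail] = missing
    return gaps
```

For example, with canonical fields `{"message_id", "amount", "currency"}` and a rail that also requires `"settlement_priority"`, the function reports that one field against that rail; an empty result means every listed requirement has a canonical slot.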
8. Reference implementations
- Public reference: enterpriseintegrationpatterns.com, Canonical Data Model. The original pattern definition by Gregor Hohpe and Bobby Woolf; covers the motivation, the N×M path problem, and schema governance trade-offs
- Public reference: github.com/Sairyss/domain-driven-hexagon. A TypeScript/NestJS reference implementation of Hexagonal Architecture (Ports & Adapters) with DDD; demonstrates the port/adapter boundary that the canonical model pattern formalises across multiple systems
- Internal case studies: Digital banking — Payment Hub and Customer Master Data Hub (see below)
Internal case study: Payment Hub (ISO 20022 canonical with multi-rail adapters)
A retail and corporate banking payment hub processes domestic and cross-border payments from five inbound channels (mobile banking, internet banking, corporate bulk upload, internal teller system, third-party partner API) and dispatches to four outbound rails (SWIFT MX, domestic fast payment rail, card network, correspondent bank via MEPS+).
The original design had grown organically: each new rail integration was built as a bespoke point-to-point connector. By the time the hub reached five rails, there were eleven bespoke translation paths in production, each implemented by a different team, none sharing a data model. Adding ISO 20022 compliance for SWIFT MX migration required touching every single connector.
What changed
ISO 20022 pacs.008 (credit transfer) was adopted as the canonical model for all payment messages inside the hub. Every inbound channel received a dedicated adapter that translates to canonical form; every outbound rail received an adapter that translates from canonical form. The hub core (validation, sanctions screening, routing, enrichment, audit) operates exclusively on the canonical message and has no knowledge of any external format.
flowchart LR
subgraph Inbound["Inbound Adapters"]
MobileIn[Mobile Banking\nJSON → pacs.008]
CorpIn[Corporate Bulk\npain.001 → pacs.008]
PartnerIn[Partner API\nREST/JSON → pacs.008]
LegacyIn[Teller System\nFlat-file → pacs.008]
end
Canonical[Canonical Hub\npacs.008 + ISO 20022]
subgraph Outbound["Outbound Adapters"]
SWIFTOut[SWIFT MX\npacs.008 → MX]
FastOut[Fast Rail\npacs.008 → FAST ISO 20022]
CardOut[Card Network\npacs.008 → ISO 8583]
MEPSOut[Correspondent\npacs.008 → MEPS+ proprietary]
end
MobileIn --> Canonical
CorpIn --> Canonical
PartnerIn --> Canonical
LegacyIn --> Canonical
Canonical --> SWIFTOut
Canonical --> FastOut
Canonical --> CardOut
Canonical --> MEPSOut
Outcomes
| Metric | Before | After |
|---|---|---|
| Integration paths in production | 11 bespoke connectors | 8 adapters (4 inbound + 4 outbound) |
| Time to onboard a new rail | 8-12 weeks (new bespoke connector) | 2-4 weeks (new outbound adapter only) |
| SWIFT ISO 20022 migration scope | All 11 connectors required changes | 1 outbound adapter updated; hub core unchanged |
| Sanctions screening coverage | Inconsistent (each connector applied it differently) | 100% — enforced once at canonical validation stage |
Gotchas observed
- Canonical model was too narrow at first — the initial pacs.008 subset omitted fields that MEPS+ required. Discovered when the first MEPS+ adapter was built. Mitigation: run a field-coverage workshop with every outbound rail team before freezing the canonical schema; the most constrained rail defines the minimum viable model.
- Legacy flat-file adapter became a translation monolith — the teller system format was undocumented; the adapter grew to 4,000 lines of defensive parsing code. Mitigated by isolating it behind a strict canonical validation step — the adapter can be ugly internally as long as it produces a valid canonical message at its output boundary.
- Schema evolution broke an adapter silently — a new mandatory field added to the canonical schema failed Protobuf validation in one adapter that had not been updated. Fixed by enforcing backward-compatibility in the schema registry; new fields must have defaults and be optional until all adapters are updated.
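The rule from the last gotcha (new fields must be optional with defaults; nothing is removed) can be expressed as a small CI check. This is a hand-rolled sketch over an invented dict-based schema description, not the compatibility logic of any particular schema registry, which should remain the real enforcement point.

```python
def compatibility_violations(old: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """Compare two schema versions described as {field_name: {"required": bool, "default": ...}}."""
    violations: list[str] = []
    # Removing a field breaks adapters that still read or write it.
    for name in sorted(old.keys() - new.keys()):
        violations.append(f"{name}: removed (deprecate and add a replacement instead)")
    # A new field is only safe if adapters that do not know it can ignore it.
    for name in sorted(new.keys() - old.keys()):
        spec = new[name]
        if spec.get("required", False) or "default" not in spec:
            violations.append(f"{name}: new fields must be optional with a default")
    # Tightening an existing field has the same effect as adding a mandatory one.
    for name in sorted(old.keys() & new.keys()):
        if new[name].get("required", False) and not old[name].get("required", False):
            violations.append(f"{name}: optional field became required")
    return violations
```

An empty list means the change is safe to ship before every adapter has been updated; any entry should fail the build.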
Internal case study: Customer Master Data Hub (canonical customer across core, CRM, and regulatory reporting)
A bank maintained customer records in three systems: a legacy core banking system (proprietary schema), a CRM (Salesforce), and a regulatory reporting platform (FATCA/CRS XML schema). Every system had a slightly different definition of "customer" — different field names, different address formats, different identifier types. Regulatory reporting failures traced back to data that had been transformed incorrectly between systems at least twice before reaching the reporting platform.
A canonical customer model was defined (drawing on ISO 20022 Party and internal KYC extensions). Three adapters translate inbound updates from each system into the canonical form; three adapters translate outbound from canonical into each system's native format. The canonical customer record in the hub became the single source of truth; each system's native record is a derived projection.
Gotchas observed
- "Customer" meant different things in each system: the CRM counted prospects; the core counted active account holders; regulatory reporting counted legal entities. The canonical model required an explicit partyType field and separate handling per type; the hub cannot treat all three as interchangeable.
- Address format normalisation was harder than expected: international addresses do not fit neatly into street/city/postcode/country; FATCA required country of tax residence that none of the source systems stored consistently. Resolved by adding explicit address normalisation logic in each inbound adapter rather than in hub core.
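A minimal sketch of what an explicit party type might look like in the canonical customer record. The enum members follow the three meanings of "customer" described above; the class and field names are assumptions, and the actual per-type handling is left out because the case study does not specify it.

```python
from dataclasses import dataclass
from enum import Enum


class PartyType(Enum):
    PROSPECT = "prospect"              # what the CRM counted
    ACCOUNT_HOLDER = "account_holder"  # what the core counted
    LEGAL_ENTITY = "legal_entity"      # what regulatory reporting counted


@dataclass(frozen=True)
class CanonicalParty:
    party_id: str
    name: str
    party_type: PartyType

    def __post_init__(self) -> None:
        # Refuse free-text types so an adapter cannot smuggle in its own definition.
        if not isinstance(self.party_type, PartyType):
            raise TypeError("party_type must be a PartyType member")


def group_by_type(parties: list[CanonicalParty]) -> dict[PartyType, list[CanonicalParty]]:
    """Split records by type so each type can be handled separately downstream."""
    groups: dict[PartyType, list[CanonicalParty]] = {t: [] for t in PartyType}
    for party in parties:
        groups[party.party_type].append(party)
    return groups
```

Making the type a mandatory, closed enum forces every inbound adapter to state which meaning of "customer" it is sending.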
9. Related decisions (ADRs)
- ADR-0001: Tenant isolation via PostgreSQL Row-Level Security — the canonical model hub audit log must follow the same RLS policy as all tenant-scoped tables
- Candidate ADR: Canonical schema format — Protobuf vs Avro vs ISO 20022 XSD — record when your organisation makes a committed schema decision
- Candidate ADR: ISO 20022 message subset selection — which message types and which optional fields to include in the canonical model
10. Known risks & gotchas
- The canonical model becomes a kitchen-sink schema — every team wants their fields in the canonical model; after 18 months it contains every field from every system and is owned by no one. Mitigation: assign a named schema owner (an architect or domain lead) who controls changes; treat schema changes as a formal ADR; additions require proof that at least two adapters need the field.
- Inbound adapters leak business logic — an adapter starts with format translation, then someone adds a routing rule, then a validation exception for one channel. Mitigation: strict boundary rule — adapters translate format and validate schema; no routing, no business rules, no database calls. Any logic that would affect a different adapter belongs in hub core.
- Schema evolution breaks adapters across teams — a canonical schema change requires all adapter teams to update simultaneously, causing coordination overhead. Mitigation: enforce backward-compatible evolution only (new fields must be optional with defaults); use a schema registry with compatibility checks in CI; never rename or remove a field — deprecate it and add a replacement.
- Audit log grows without a retention policy — the canonical message bus stores every message; at high volume this becomes a compliance risk (retaining PII longer than permitted) and a cost problem. Mitigation: define per-message-type retention policies at launch; encrypt PII fields at rest; implement crypto-shredding for customer erasure requests (GDPR).
- The hub becomes a bottleneck for rail onboarding — every new integration must go through the hub team to get a canonical model extension approved, slowing the business. Mitigation: publish the schema extension process and SLA; empower adapter teams to propose schema extensions via PR; the schema owner reviews, not gatekeeps.
- Partial canonical coverage hides data loss — an inbound adapter silently drops a field because the canonical model has no slot for it; the field is missing from all outbound adapters. Mitigation: inbound adapters must fail loudly on unmapped fields (not silently drop them) unless the field is explicitly marked as discarded in the adapter spec.
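The "fail loudly on unmapped fields" mitigation from the last item can be sketched as a strict translation step. The mapping table, discard list, and field names are hypothetical; the point is only that a source field must be either mapped or explicitly discarded.

```python
class UnmappedFieldError(ValueError):
    """Raised when a source field has neither a canonical slot nor an explicit discard entry."""


def translate_strict(raw: dict, field_map: dict[str, str], discarded: set[str]) -> dict:
    """Rename source fields to canonical names, refusing to drop anything silently."""
    unmapped = set(raw) - set(field_map) - set(discarded)
    if unmapped:
        raise UnmappedFieldError(f"no canonical slot for: {sorted(unmapped)}")
    return {field_map[key]: value for key, value in raw.items() if key in field_map}


# Hypothetical adapter spec for one inbound channel.
EXAMPLE_FIELD_MAP = {"id": "message_id", "amt": "amount", "ccy": "currency"}
EXAMPLE_DISCARDED = {"ui_session"}  # known, reviewed, and intentionally dropped
```

With this in place, a channel that starts sending a new field fails at the adapter boundary on the first message instead of losing data unnoticed across every outbound rail.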