
Pattern: CI/CD Pipeline

Quick facts

  • Category: Scripts & Automation
  • Maturity: Adopt
  • Typical team size: 1-2 engineers (to build), whole engineering team (to consume)
  • Typical timeline to MVP: 1-2 weeks
  • Last reviewed: 2026-05-02 by Architecture Team

1. Context

Use this pattern when:

  • Automating the build, test, security scanning, and deployment of any software project
  • You want every merge to main to be deployable (or deployed) without manual steps
  • The team has more than one engineer — at that point, manual deploy coordination becomes a bottleneck and a source of incidents

Do NOT use this pattern when:

  • The "deployment" is publishing a static artifact (Python package, npm package) with no infrastructure changes — the CI stages still apply, but the entire deploy half of the pattern collapses to a single registry publish
  • You need complex stateful workflows with human approval, compensation, and retries across days — that is an orchestrated workflow (Temporal, Step Functions), not a CI/CD pipeline
  • You are in a highly regulated environment where the deployment tool must produce a tamper-evident audit trail with hardware attestation — evaluate specialised release tooling (e.g., SLSA Level 3 build systems)

2. Problem it solves

Without automation, deployments are artisanal: engineers SSH into servers, run scripts from memory, and skip steps when under pressure. Bugs that automated tests would catch reach production. Security patches sit undeployed for weeks. A CI/CD pipeline makes every change go through the same repeatable gates — build, test, scan, package, deploy — and creates an audit trail of exactly what was deployed, when, and by whom.

3. Solution overview

System context (C4 Level 1)

flowchart LR
    Dev((Developer)) -->|git push / PR| Repo[Git Repository\nGitHub]
    Repo -->|webhook| Pipeline[CI/CD Pipeline\nGitHub Actions]
    Pipeline -->|push image| Registry[Container Registry\nECR / GHCR]
    Pipeline -->|deploy| Dev_Env[Dev Environment]
    Pipeline -->|deploy| Staging[Staging Environment]
    Pipeline -->|deploy with approval| Prod[Production Environment]
    Pipeline --> Notify[Notifications\nSlack / PagerDuty]

Container view (C4 Level 2)

flowchart TB
    subgraph Triggers
        PR[Pull Request\nor push to branch]
        MergeMain[Merge to main]
        TagPush[Git tag push\ne.g. v1.2.3]
        Schedule[Scheduled run\nnightly scan]
    end
    subgraph CI Stages - always run on PR
        Build[Build\ncompile / bundle / lint]
        UnitTest[Unit Tests\npytest / jest / go test]
        IntTest[Integration Tests\ndocker-compose test env]
        SAST[SAST\nCodeQL / Semgrep]
        DepScan[Dependency Scan\nDependabot + Trivy]
        SecretScan[Secret Scan\ngitleaks]
    end
    subgraph CD Stages - main and tags only
        DockerBuild[Docker Build\nBuildKit multi-stage]
        DockerPush[Push to Registry\nECR / GHCR]
        DeployDev[Deploy to Dev\nauto on merge]
        DeployStaging[Deploy to Staging\nauto on merge]
        DeployProd[Deploy to Prod\nmanual approval gate]
    end
    subgraph Notifications
        SlackOK[Slack — success\non prod deploy]
        SlackFail[Slack — failure\non any stage]
    end

    PR --> Build
    MergeMain --> Build
    TagPush --> Build
    Schedule --> DepScan
    Build --> UnitTest
    UnitTest --> IntTest
    IntTest --> SAST
    SAST --> DepScan
    DepScan --> SecretScan
    SecretScan -->|on main/tag| DockerBuild
    DockerBuild --> DockerPush
    DockerPush --> DeployDev
    DeployDev --> DeployStaging
    DeployStaging -->|approval required| DeployProd
    DeployProd --> SlackOK
    SecretScan -->|on failure| SlackFail
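The CI/CD split in the diagram maps to a single workflow file. A minimal sketch, assuming a Makefile with `build`, `lint`, and `test` targets (placeholders for your own commands) and leaving registry login unwired:

```yaml
name: ci
on:
  pull_request:
  push:
    branches: [main]
    tags: ['v*']

permissions:
  contents: read   # default-deny; widen per job only where needed

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # pin to a commit SHA in production
      - name: Build and lint
        run: make build lint        # placeholder commands
      - name: Unit and integration tests
        run: make test

  docker:
    # CD stages run only for main and version tags, mirroring the diagram
    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
    needs: ci
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/build-push-action@v6
        with:
          push: false   # wire up registry login and image tags for your target
```

The security-scanning jobs (SAST, dependency scan, secret scan) slot in as additional parallel jobs alongside `ci`, with `docker` listing them all in `needs`.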

4. Technology stack

| Layer | Primary choice | Alternatives | Notes |
| --- | --- | --- | --- |
| CI/CD platform | GitHub Actions | GitLab CI, CircleCI, Buildkite, Jenkins | GitHub Actions if code is on GitHub; Buildkite for large self-hosted fleets needing fine-grained billing; avoid Jenkins for new projects |
| Container build | Docker with BuildKit | Podman, Kaniko | BuildKit enables layer caching, multi-platform builds, and `--secret` mounts; Kaniko for building inside Kubernetes without a Docker daemon |
| Container registry | AWS ECR | GitHub Container Registry (GHCR), Docker Hub | ECR if deploying to AWS; GHCR is free for public repos and integrates natively with Actions; Docker Hub rate-limits unauthenticated pulls |
| Cloud credentials | OIDC (federated identity) | Long-lived AWS access keys as secrets | OIDC lets GitHub Actions assume an IAM role without storing long-lived credentials; set this up first — retrofitting is painful |
| Dependency scanning | Dependabot + Trivy | Snyk, Grype, Renovate | Dependabot for automated PR updates; Trivy for container image CVE scanning in the pipeline; Renovate for monorepos or non-GitHub hosts |
| SAST | CodeQL | Semgrep, SonarQube | CodeQL is free for public repos and covers most languages; Semgrep for custom organisation-specific rules |
| Secrets scanning | gitleaks | TruffleHog, GitHub Secret Scanning | Run gitleaks on every push — pre-commit hook and CI; GitHub Secret Scanning catches pushed secrets but only after the fact |
| Deployment target | AWS ECS (rolling update) or Lambda | Kubernetes + kubectl, AWS CodeDeploy, Heroku | A rolling update on ECS is the simplest zero-downtime deploy; ArgoCD if the org has Kubernetes and wants GitOps |
| Notifications | Slack (GitHub Actions webhook step) | PagerDuty, email, Microsoft Teams | Notify on every production deploy success and on any pipeline failure |
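The OIDC row translates to a short job fragment using `aws-actions/configure-aws-credentials`. The role ARN and region below are placeholders, and the IAM role's trust policy for the GitHub OIDC provider must be created on the AWS side first:

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # required for the OIDC token exchange
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Placeholder ARN — substitute the role your trust policy allows
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy
          aws-region: eu-west-1   # placeholder region
      - name: Sanity check the assumed role
        run: aws sts get-caller-identity
```

No `aws-access-key-id`/`aws-secret-access-key` inputs appear anywhere — that is the point of the OIDC setup.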

5. Non-functional characteristics

| Concern | Profile |
| --- | --- |
| Scalability | GitHub Actions scales runners on demand (GitHub-hosted) or requires fleet management (self-hosted). GitHub-hosted runners handle most workloads; add self-hosted runners only when build times exceed 20 minutes or costs become significant. |
| Availability target | The pipeline is a critical path for deploying fixes. GitHub Actions has a 99.9% SLA. Mitigation for downtime: keep a documented manual deploy runbook; the pipeline should be a convenience, not the only path to production. |
| Latency target | PR feedback in under 10 minutes is the target for developer experience. Parallelise independent stages (unit tests, SAST, dependency scan) rather than running them sequentially. Cache pip/npm/cargo dependencies between runs. |
| Security posture | Use OIDC for cloud credentials. Pin third-party Actions to a full commit SHA rather than a mutable tag (a tag like `actions/checkout@v4` can be moved to a different commit; a SHA cannot). Restrict `GITHUB_TOKEN` permissions at the workflow level. Never print secrets in logs; use `::add-mask::` for dynamic secrets. |
| Data residency | GitHub-hosted runners execute in Azure datacenters; ensure this is acceptable for your data classification. For sensitive workloads, use self-hosted runners inside your VPC. |
| Compliance fit | SOC 2 ✓ — the pipeline creates an immutable audit trail of every deploy. GDPR: ensure build logs do not contain PII (test data, error messages with real user data). PCI-DSS: the pipeline must not have access to production card data; use environment isolation. |
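The two token-hardening points in the security row can be sketched together: a default-deny `permissions` block at workflow level, and masking of any secret fetched at runtime before it can leak into logs. The vault script is a placeholder for whatever secret source you use:

```yaml
permissions:
  contents: read   # default-deny; grant write scopes per job only where needed

jobs:
  fetch-secret:
    runs-on: ubuntu-latest
    steps:
      - name: Fetch and mask a dynamic secret
        run: |
          TOKEN=$(./fetch-from-vault.sh)   # placeholder for your secret source
          echo "::add-mask::$TOKEN"        # mask before any later step can log it
          echo "API_TOKEN=$TOKEN" >> "$GITHUB_ENV"
```

Masking must happen in the same step that obtains the value; a secret echoed before `::add-mask::` is already in the log.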

6. Cost ballpark

Indicative monthly USD cost. GitHub-hosted runner minutes are the main variable; free tier is generous for small teams.

| Scale | Pipeline runs / month | Monthly cost | Cost drivers |
| --- | --- | --- | --- |
| Small | < 1,000 | $0 - $20 | GitHub free tier (minutes are free for public repos; 2,000 min/month for private repos on the Free plan); ECR storage |
| Medium | 1,000 - 10,000 | $50 - $300 | GitHub Actions minutes overage, ECR data transfer, self-hosted runner EC2 if used |
| Large | 10,000+ | $300 - $2,000 | Self-hosted runner fleet, artifact storage, Snyk or SonarQube commercial tier, multiple environment accounts |

7. LLM-assisted development fit

| Aspect | Rating | Notes |
| --- | --- | --- |
| Workflow YAML scaffolding | ★★★★★ | Excellent — GitHub Actions syntax and common job patterns are very well-represented. Generate a full workflow from a description of stages. |
| Security scanning integration | ★★★★ | Correctly wires up CodeQL, Trivy, and gitleaks; always review the permissions granted to each step. |
| Docker multi-stage build files | ★★★★ | Generates efficient multi-stage Dockerfiles; verify the final stage does not include build tools or development dependencies. |
| Secret management and OIDC setup | ★★★ | Knows the concept; OIDC trust policy JSON and IAM role ARN wiring require careful manual review — mistakes here create privilege escalation paths. |
| Architecture decisions | n/a | Don't outsource — platform choice, runner strategy, and approval gate design have long-term team-wide consequences. Use ADRs. |

Recommended workflow: Generate the workflow YAML and Dockerfile with the LLM, then manually audit every permissions: block and every third-party Action used. Pin Actions to commit SHAs before merging. Add the gitleaks scan first — it's the fastest return on investment.
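SHA pinning and the gitleaks job look like this in practice. The checkout SHA below is the one cited later in this document; resolve the current SHA for any tag yourself before pinning:

```yaml
jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
      # Pinned to a full commit SHA rather than a mutable tag; the trailing
      # comment records which tag the SHA was resolved from.
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4
        with:
          fetch-depth: 0   # full history so gitleaks can scan past commits
      - uses: gitleaks/gitleaks-action@v2   # resolve to a SHA the same way before merging
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Dependabot understands SHA-pinned `uses:` lines with a version comment and will keep both in sync.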

8. Reference implementations

  • Public reference: actions/starter-workflows — official GitHub Actions starter templates for dozens of languages and deployment targets
  • Public reference: docker/build-push-action — canonical Action for BuildKit image builds with cache and multi-platform support
  • Public reference: trufflesecurity/trufflehog and gitleaks/gitleaks — the two leading open-source secret scanners, both with GitHub Actions integrations
  • Internal case study: Add your anonymised internal example here
  • No ADRs recorded yet. Candidates: self-hosted vs GitHub-hosted runners, ArgoCD GitOps vs Actions-push deploys, monorepo pipeline strategy.

9. Known risks & gotchas

  • Mutable Action tags create supply-chain risk — uses: some-org/some-action@v3 resolves to whatever commit that tag points to today, which can change. A compromised Action maintainer can inject malicious code. Mitigation: pin all third-party Actions to a full commit SHA (uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683); use Dependabot to keep SHAs up to date.
  • Long-lived AWS credentials in secrets are a liability — Rotating them is manual and often forgotten; a leak compromises the account. Mitigation: set up OIDC federation on day one (aws-actions/configure-aws-credentials with role-to-assume); delete any existing long-lived access keys from CI.
  • Flaky integration tests slow down every PR — A test that fails intermittently trains developers to re-run pipelines rather than investigate, and hides real failures. Mitigation: quarantine flaky tests immediately into a separate job with a continue-on-error: true flag and a tracking issue; fix or delete within one sprint.
  • Build cache invalidation causes silently stale artifacts — A Docker layer cache hit may include a dependency that was patched for a CVE. Mitigation: run a full no-cache build on a nightly schedule (not just PR builds) and treat Trivy scan failures as pipeline failures.
  • Approval gates bypassed under incident pressure — When production is on fire, engineers merge directly to bypass staging. Mitigation: require approval gates via GitHub Environments (not just convention) and document a formal break-glass procedure that still creates an audit trail (e.g., a required incident ticket number in the approval comment).
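The nightly no-cache rebuild from the cache-invalidation bullet can be sketched as a scheduled workflow; the image name and cron expression are placeholders:

```yaml
name: nightly-rebuild
on:
  schedule:
    - cron: '0 3 * * *'   # placeholder: nightly at 03:00 UTC

jobs:
  rebuild-and-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Full rebuild, bypassing the layer cache
        run: docker build --no-cache -t myapp:nightly .   # placeholder image name
      - name: Scan the fresh image and fail on findings
        uses: aquasecurity/trivy-action@master   # pin to a release SHA in production
        with:
          image-ref: myapp:nightly
          exit-code: '1'   # non-zero exit turns CVE findings into a pipeline failure
          severity: CRITICAL,HIGH
```

Because the rebuild ignores the cache, a CVE patched upstream but hidden by a cached layer surfaces here rather than in production.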