Pattern: CLI Tool¶
Quick facts
- Category: Scripts & Automation
- Maturity: Adopt
- Typical team size: 1-2 engineers
- Typical timeline to MVP: 1-3 weeks
- Last reviewed: 2026-05-02 by Architecture Team
1. Context¶
Use this pattern when:
- Building internal developer tooling, ops automation, or utilities that engineers run from a terminal
- The interaction model is command-driven (verbs + flags), not a UI or long-running daemon
- The tool needs to be distributed to teammates or CI environments, not just run locally by the author
Do NOT use this pattern when:
- The operation takes more than a few minutes and needs background execution — wrap it in a scheduled job or serverless function instead
- The primary audience is non-technical users who need a GUI
- The tool needs to maintain long-running state or listen for events — that is a service, not a CLI tool
- You only need to run the script once — write a plain script, not a packaged CLI
2. Problem it solves¶
Engineering teams accumulate repetitive manual operations: rotating credentials, triggering deployments, querying internal systems, seeding test data. Each of these starts as a README step, becomes a Bash one-liner, then grows until nobody trusts it. A properly packaged CLI tool gives the operation a stable interface, a help system, testable code, and a version history — so the team can confidently hand it to a new hire on day one.
3. Solution overview¶
System context (C4 Level 1)¶
flowchart LR
Dev((Developer\nor CI runner)) --> CLI[CLI Tool]
CLI --> API[Internal / External API]
CLI --> FS[Local Filesystem]
CLI --> DB[(Database)]
CLI --> Cloud[Cloud Provider API\nAWS / GCP / Azure]
Container view (C4 Level 2)¶
flowchart TB
subgraph CLI Process
Entry[Entry point\ntyper app]
ArgParser[Argument & Option Parser\nTyper + type hints]
ConfigLoader[Config Loader\nTOML + env var override]
CmdModules[Command Modules\none per sub-command group]
HTTPClient[HTTP Client\nhttpx]
OutputFmt[Output Formatter\nRich tables / JSON / plain]
CredStore[Credential Store\nkeyring / env vars]
end
subgraph External
API[REST / GraphQL API]
DB[(Database)]
CloudAPI[Cloud SDK\nboto3 / google-cloud]
end
Entry --> ArgParser
ArgParser --> ConfigLoader
ConfigLoader --> CmdModules
CmdModules --> HTTPClient
CmdModules --> OutputFmt
CmdModules --> CredStore
HTTPClient --> API
CmdModules --> DB
CmdModules --> CloudAPI
4. Technology stack¶
| Layer | Primary choice | Alternatives | Notes |
|---|---|---|---|
| Language | Python 3.12+ with uv | Go, Rust | Python for tools used by Python teams with rich library dependencies; Go or Rust when distributing a single binary to machines without Python — see ADR-0002 |
| Argument parsing | Typer | Click, argparse | Typer is Click with type annotations; auto-generates --help from type hints; argparse is stdlib but verbose for sub-commands |
| Configuration | tomllib (stdlib) + python-dotenv | pydantic-settings, Dynaconf | tomllib (Python 3.11+) for file config; dotenv for environment overrides; pydantic-settings if you need validation and env-var casting |
| Terminal output | Rich | colorama, tabulate, termcolor | Rich renders tables, progress bars, and syntax-highlighted output with zero config; outputs plain text automatically when stdout is piped |
| HTTP client | httpx | requests | httpx for async-capable and HTTP/2 clients; requests is fine for simple sync scripts |
| Credential storage | keyring | Environment variables, ~/.netrc |
keyring uses the OS credential store (macOS Keychain, Windows DPAPI, SecretService on Linux); fall back to env vars in CI |
| Testing | pytest + Typer's CliRunner |
unittest | Typer's CliRunner (inherited from Click) lets you invoke commands in-process without subprocess overhead |
| Packaging | uv build → PyPI or private index | Go goreleaser for binaries, Homebrew tap |
uv can publish to PyPI; for internal tools, a private PyPI (AWS CodeArtifact or Nexus) avoids leaking tooling to the public index |
| CI/CD | GitHub Actions | GitLab CI | Lint (ruff), type-check (mypy), test, publish on tag push |
5. Non-functional characteristics¶
| Concern | Profile |
|---|---|
| Scalability | A CLI runs as a single process per invocation. There is no scalability dimension in the traditional sense — if a single invocation becomes slow, profile and optimise the bottleneck (usually a slow API call or large local file), or offload to a background job. |
| Availability target | Not a service. Availability = "the binary installs correctly and exits 0 on a valid invocation". Track this with a smoke test in CI. |
| Latency target | Cold start (import time) should be under 300ms for interactive tools. Avoid importing heavy libraries (pandas, boto3) at the top level; import inside the command function so unrelated sub-commands stay fast. |
| Security posture | Credentials must never be in command-line flags (they appear in ps output and shell history). Use --password-stdin, environment variables, or keyring. Log the command invoked (minus secrets) for audit purposes. Validate all inputs before passing to shell or SQL. |
| Data residency | CLI tools typically operate on data in transit between systems; they do not store data themselves. Ensure any temp files written to disk are cleaned up and respect local data handling policies. |
| Compliance fit | CLI tools are internal tooling; formal compliance certifications rarely apply. Exception: tools that access PII or production databases should log invocations for SOC 2 audit trails. |
6. Cost ballpark¶
CLI tools have near-zero infrastructure cost — they run on the developer's machine or in a CI runner.
| Scale | Users | Monthly cost | Cost drivers |
|---|---|---|---|
| Small | 1-10 internal users | $0 - $10 | GitHub Actions CI minutes; no infrastructure |
| Medium | 10-100 users, private package index | $10 - $100 | AWS CodeArtifact or Nexus private registry; CI compute |
| Large | 100+ users, cross-team distribution | $50 - $300 | Private registry, code signing, multi-platform binary builds, documentation hosting |
7. LLM-assisted development fit¶
| Aspect | Rating | Notes |
|---|---|---|
| Command scaffolding and help text | ★★★★★ | Excellent — Typer / Click patterns are extremely well-represented. Generate a full sub-command skeleton from a spec. |
| Argument parsing and validation | ★★★★★ | Generates correct type annotations, validators, and --help strings reliably. |
| Output formatting with Rich | ★★★★ | Good; verify table column alignment and colour choices manually in a real terminal. |
| Error handling and user-facing messages | ★★★★ | Produces reasonable error messages; review for clarity with a non-expert user in mind. |
| Architecture decisions | ★ | Don't outsource — specifically the Python vs Go decision has real deployment consequences. Use ADRs. |
Recommended workflow: Write a spec.md listing every sub-command, its flags, and expected output format. Feed it to the LLM to generate the Typer skeleton, then fill in the business logic. Add a CliRunner test for every command's happy path before merging.
8. Reference implementations¶
- Public reference: tiangolo/typer — Typer itself is the canonical reference; the
docs_src/directory contains runnable examples for every feature - Public reference: pallets/click — examples — Click examples including multi-command applications and complex option types
- Public reference: cli/cli — GitHub's own
ghCLI, written in Go; excellent reference for Go-based CLI architecture with Cobra - Internal case study: Add your anonymised internal example here
9. Related decisions (ADRs)¶
10. Known risks & gotchas¶
- Credentials leaking into shell history or process list — A flag like
--api-key=secretis visible inps auxand persists in~/.zsh_history. Mitigation: never accept secrets as positional arguments or named flags; use--token-stdin(read from stdin), environment variables, orkeyring.get_password(). - Import-time slowness makes the tool feel sluggish — Importing
boto3,pandas, orgoogle-cloud-*at module level adds 300–800ms to every invocation, even for commands that do not use them. Mitigation: lazy-import heavy libraries inside the command function body, not at the top of the file. - Breaking changes in sub-command interfaces — A renamed flag breaks teammates' shell aliases and CI scripts. Mitigation: treat the CLI interface as a public API; version it, publish a changelog, and keep the old flag as a deprecated alias for at least one release cycle.
- Untested error paths produce confusing output — LLM-generated code tends to re-raise raw exceptions, printing Python tracebacks to end users. Mitigation: catch known error types and print a human-readable message with
typer.echo(..., err=True)andraise typer.Exit(1); reserve raw tracebacks for a--debugflag. - Single-platform packaging — A tool built on macOS with
uv buildmay fail on Linux CI or on a Windows developer machine due to platform-specific dependencies. Mitigation: runuv sync+ smoke tests in CI on all target platforms (matrix build); document supported platforms in the README.