The Backstory and Where We All Come From

Passwords, private keys, and API keys are spread across systems and easy to compromise; that’s the normal situation in many environments. Most organizations have rogue credentials scattered across servers, and the workforce still relies on passwords rather than moving toward Zero Trust IAM.

This must change, and fast.

We all know this problem exists somewhere in our environment. It’s easier to contain in a cloud or PaaS setting, and platforms such as OpenShift or Kubernetes make it more manageable. But an architecture that has to span a wide selection of runtimes, operating systems, and cloud or on-prem server farms is hard to get right. So hard, in fact, that I regularly stumble upon servers with plaintext files containing passwords, API keys, and certificates.

We need to design an architecture that makes it hard to make mistakes but easy to onboard new services and applications. That’s where a proper secrets management architecture comes in.


What Is a Secret?

Before designing anything, agree on a definition. A secret is any credential that grants access to a system or resource:

  • Passwords and passphrases
  • API keys and tokens (including short-lived JWTs)
  • TLS/SSL private keys and certificates
  • SSH keys
  • Database connection strings
  • Encryption keys (symmetric and asymmetric)
  • OAuth client secrets

If it grants access, it’s a secret and needs to be managed accordingly.


The Secret Lifecycle

Every secret goes through the same lifecycle, and your architecture needs to handle all of it:

  1. Creation — who or what generates the secret, and how?
  2. Storage — where does it live, encrypted at rest?
  3. Distribution — how does it get to the runtime that needs it?
  4. Rotation — how and how often does it change?
  5. Revocation — how quickly can you kill a compromised secret?
  6. Audit — who accessed what, and when?

Weak handling at any of these stages creates risk. In practice, the two stages most often handled badly are distribution (secrets in environment variables, config files, or container images) and rotation (secrets that are never rotated).
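To make the six stages concrete, here is a minimal Python sketch of the metadata an inventory might track for each secret. The class and field names (SecretRecord, owner, max_age, access_log) are illustrative assumptions, not the schema of any particular product, and the record deliberately holds no secret value.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class SecretRecord:
    """Hypothetical lifecycle metadata for one secret (never the value itself)."""
    name: str
    owner: str                    # creation: who or what generated it
    created_at: datetime
    max_age: timedelta            # rotation: how often it must change
    rotated_at: datetime | None = None
    revoked_at: datetime | None = None
    access_log: list[tuple[datetime, str]] = field(default_factory=list)  # audit

    def record_access(self, identity: str, when: datetime) -> None:
        self.access_log.append((when, identity))

    def rotate(self, when: datetime) -> None:
        self.rotated_at = when

    def revoke(self, when: datetime) -> None:
        self.revoked_at = when

    def is_usable(self, now: datetime) -> bool:
        """A secret is usable if it is not revoked and not older than max_age."""
        if self.revoked_at is not None:
            return False
        last_change = self.rotated_at or self.created_at
        return now - last_change <= self.max_age


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
record = SecretRecord("db-password", "payments-team", created, timedelta(days=90))
```

Storage and distribution are left out on purpose: they belong to the secrets platform, while a record like this is what lets you answer the rotation, revocation, and audit questions.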


Architecture Principles

Centralize, Don’t Consolidate

Use a single secrets management platform — HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, Infisical, or similar — as the authoritative source. The goal is a single place to audit, rotate, and revoke. This is not the same as putting all eggs in one basket: access policies, namespaces, and secret engines let you isolate secrets by team, environment, and classification even within a single platform.

Short-Lived Over Long-Lived

Prefer dynamic, short-lived credentials over static, long-lived ones wherever possible. A database credential that expires in 15 minutes has a dramatically smaller blast radius than one that’s been sitting in a .env file for three years. Vault’s dynamic secrets, for example the credentials generated on demand by its database secrets engine, are a good example of this pattern.
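The idea can be sketched in a few lines of Python. This is not how Vault implements it; Lease and issue_lease are hypothetical names, and the sketch only models the part that matters for blast radius: every credential is generated on demand and carries its own expiry.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Lease:
    """Hypothetical short-lived credential handed to a service."""
    username: str
    password: str
    expires_at: datetime

    def is_valid(self, now: datetime) -> bool:
        return now < self.expires_at


def issue_lease(role: str, now: datetime,
                ttl: timedelta = timedelta(minutes=15)) -> Lease:
    # A real secrets manager would also create this user in the target
    # database and remove it when the lease expires; that is omitted here.
    return Lease(
        username=f"{role}-{secrets.token_hex(4)}",
        password=secrets.token_urlsafe(32),
        expires_at=now + ttl,
    )


lease = issue_lease("orders-readonly", datetime.now(timezone.utc))
```

Because each caller gets a unique username, a leaked credential can be traced to one service and dies on its own within the TTL.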

Least Privilege at Every Layer

Each application or service should only have access to the secrets it needs, scoped to the environment it runs in. A dev service should never be able to read prod secrets. Machine identities (service accounts, IAM roles, Kubernetes service accounts) should be used instead of shared credentials.
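As a rough illustration, the rule can be expressed as a deny-by-default check on the secret’s path. The path layout (environment/service/name) and the names MachineIdentity and can_read are assumptions made for this sketch; real platforms express the same idea through their own policy languages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineIdentity:
    """Hypothetical identity attested by the platform, not a shared credential."""
    service: str
    environment: str  # e.g. "dev", "staging", "prod"


def can_read(identity: MachineIdentity, secret_path: str) -> bool:
    """Allow reads only under <environment>/<service>/ owned by the caller."""
    parts = secret_path.strip("/").split("/")
    if len(parts) < 3:
        return False  # deny anything that does not match the expected layout
    env, service = parts[0], parts[1]
    return env == identity.environment and service == identity.service


dev_billing = MachineIdentity(service="billing", environment="dev")
print(can_read(dev_billing, "dev/billing/db-password"))   # True
print(can_read(dev_billing, "prod/billing/db-password"))  # False
```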

No Secrets in Code or Config

Secrets must never be committed to source control, not even in private repositories. Use pre-commit hooks and secret scanning in CI (e.g., infisical scan, trufflehog, gitleaks) to catch mistakes before they land. If a secret is ever committed, treat it as compromised and rotate it immediately. Rewriting history does not change that: the old commit may already exist in clones, forks, and CI caches.
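To show what such a scanner does at its core, here is a deliberately tiny Python sketch with two illustrative patterns. It is no substitute for the dedicated tools above, which ship far larger rule sets plus entropy checks; PATTERNS and scan_text are names invented for this example.

```python
from __future__ import annotations

import re

# Two well-known shapes, chosen for illustration only.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

A pre-commit hook would run a check like this over the staged files and exit non-zero on any finding, so the commit is blocked before the secret ever reaches the repository.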

Inject at Runtime

Secrets should be injected at runtime, not baked in at build time. Common patterns:

  • Environment variable injection via a secrets manager sidecar or init container (Kubernetes)
  • Direct API calls from the application using an SDK with short-lived tokens
  • File-based injection into a tmpfs or in-memory volume (never on disk in production)
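The file-based pattern is the easiest one to sketch from the application side. The example below assumes the platform mounts one file per secret under a directory such as /run/secrets; that default and the helper name read_secret are assumptions, so adjust them to wherever your sidecar or orchestrator places the files.

```python
from pathlib import Path


def read_secret(name: str, base_dir: str = "/run/secrets") -> str:
    """Read a secret from a file that the platform mounted at runtime."""
    # Reject empty names and anything that could escape the mount directory.
    if not name or Path(name).name != name:
        raise ValueError(f"invalid secret name: {name!r}")
    path = Path(base_dir) / name
    try:
        # Strip the trailing newline that many tools append when writing files.
        return path.read_text(encoding="utf-8").rstrip("\n")
    except FileNotFoundError:
        raise RuntimeError(
            f"secret {name!r} is not mounted under {base_dir}"
        ) from None
```

Failing loudly when the file is missing is a deliberate choice: silently falling back to a default or to a value baked into the image would reintroduce the build-time secret this pattern is meant to avoid.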


Compliance Considerations

Depending on your industry and geography, secrets management intersects with several regulatory frameworks:

  • GDPR: Article 32 lists encryption among the appropriate safeguards for personal data, so the keys behind that encryption should be auditable and revocable
  • PCI DSS: cryptographic keys must be rotated at the end of their defined cryptoperiod, with documented procedures
  • SOC 2: access to credentials must be logged and reviewable
  • ISO 27001: expects a documented policy on cryptography and key management

Regardless of which frameworks apply, the common thread is: audit logs, rotation schedules, and access control policies must be documented and demonstrable. Your secrets manager should produce logs that feed into your SIEM.
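What a SIEM-friendly log entry looks like depends on your platform, but the shape is usually similar: one structured line per access decision. The sketch below is an assumption about a reasonable minimum; the field names and the audit_event helper are hypothetical and do not reflect any product’s log format.

```python
import json
from datetime import datetime, timezone


def audit_event(actor: str, action: str, secret_path: str,
                allowed: bool, when: datetime) -> str:
    """Serialize one access decision as a single JSON line."""
    event = {
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "actor": actor,               # machine or human identity
        "action": action,             # e.g. "read", "rotate", "revoke"
        "secret_path": secret_path,   # the path only, never the secret value
        "allowed": allowed,           # denied attempts are logged too
    }
    return json.dumps(event, sort_keys=True)


print(audit_event("svc-billing", "read", "prod/billing/db-password",
                  True, datetime.now(timezone.utc)))
```

Logging denied requests alongside allowed ones matters for audits: a burst of denials against prod paths from a dev identity is exactly the signal you want your SIEM to alert on.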


Rotation Strategy

Rotation is where most organizations fail. The usual pattern is: rotation is planned, then deprioritized, then forgotten until an incident forces it.

Build rotation in from the start:

  • Automated rotation for all secrets where the target system supports it (databases, cloud IAM, etc.)
  • Rotation windows defined per secret class (e.g., API keys: every 90 days; TLS certs: at least 30 days before expiry)
  • Break-glass procedures for emergency revocation — tested, documented, and rehearsed
  • Alerting when secrets approach expiry or haven’t been rotated within policy

If you can’t automate rotation for a given secret, that’s a signal to push the upstream system to support it, or to accept the risk explicitly and document it.
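The alerting side of this is simple enough to sketch. The example below uses the windows mentioned above (90 days for API keys, renewal 30 days before a TLS certificate expires); the "db-password" entry and all function names are assumptions added for illustration, not recommendations.

```python
from datetime import datetime, timedelta, timezone

# Maximum age per secret class. Example values only.
ROTATION_WINDOWS = {
    "api-key": timedelta(days=90),
    "db-password": timedelta(days=30),  # hypothetical class for illustration
}
TLS_RENEW_BEFORE_EXPIRY = timedelta(days=30)


def rotation_due(secret_class: str, last_rotated: datetime, now: datetime) -> bool:
    """True once a secret has reached the maximum age for its class."""
    return now - last_rotated >= ROTATION_WINDOWS[secret_class]


def tls_renewal_due(not_after: datetime, now: datetime) -> bool:
    """True once a certificate is inside its renewal window."""
    return now >= not_after - TLS_RENEW_BEFORE_EXPIRY


now = datetime.now(timezone.utc)
print(rotation_due("api-key", now - timedelta(days=120), now))  # True
```

Run on a schedule against the inventory, a check like this turns "we forgot to rotate" into a ticket or a page well before the policy is breached.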


Operational Runbook Checklist

When rolling out a secrets management architecture, work through this list:

  • Inventory all existing secrets and their current storage locations
  • Classify secrets by sensitivity and owning team
  • Define access policies per environment (dev / staging / prod)
  • Set up machine identities for all services — no shared credentials
  • Enable audit logging and ship logs to SIEM
  • Deploy pre-commit hooks and CI secret scanning
  • Define and document rotation schedules
  • Test emergency revocation end-to-end
  • Review and sign off on compliance mapping


Summary

Secrets management is not a product you buy — it’s an architecture you design and operate. A platform like Vault or Secrets Manager gives you the primitives, but the hard work is in the policies, the lifecycle management, and the culture shift required to stop treating credentials as an afterthought.

Start with an inventory, centralize early, automate rotation, and instrument everything for audit. The cost of getting this right is much lower than the cost of a breach.