QCecuring - Enterprise Security Solutions

Certificate Management for DevOps Teams: Stop Treating Certs as an Afterthought

Devops 25 Feb, 2026 · 05 Mins read

DevOps teams deploy 50 services a week but manage certificates like it's 2010. Here's how to integrate certificate lifecycle into your CI/CD, IaC, and monitoring stack — the DevOps way.


You’ve automated everything else. Infrastructure is Terraform. Deployments are CI/CD. Monitoring is Prometheus + Grafana. Secrets are in Vault. Scaling is automatic.

But certificates? Somebody manually requests them from a portal. Somebody downloads a ZIP file. Somebody SCPs it to a server. Somebody remembers to renew it in 11 months. Maybe.

This is the gap. DevOps teams that deploy 50 services a week still manage certificates like it’s a manual IT process from 2010. And then they’re surprised when an expired cert takes down production at 2 AM on a Saturday.

Here’s how to fix it — treating certificates as infrastructure, not tickets.


The DevOps Certificate Manifesto

Certificates should be:

  1. Declared in code (not requested via email/portal)
  2. Provisioned automatically (not downloaded and uploaded manually)
  3. Renewed without human intervention (not tracked in spreadsheets)
  4. Monitored like any other infrastructure (not discovered during outages)
  5. Ephemeral where possible (short-lived, disposable, auto-replaced)

If your certificate process requires a human to do anything other than write the initial configuration, it’s not automated enough.


Pattern 1: Certificates as Code (Terraform)

Declare certificates in your infrastructure code. They’re provisioned alongside the infrastructure that uses them.

AWS (ACM + Route53 + ALB)

# Certificate declared in Terraform
resource "aws_acm_certificate" "api" {
  domain_name               = "api.example.com"
  subject_alternative_names = ["api-v2.example.com"]
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# DNS validation (automatic)
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for dvo in aws_acm_certificate.api.domain_validation_options : dvo.domain_name => dvo
  }
  zone_id = data.aws_route53_zone.main.zone_id
  name    = each.value.resource_record_name
  type    = each.value.resource_record_type
  records = [each.value.resource_record_value]
  ttl     = 60
}

# Wait for validation
resource "aws_acm_certificate_validation" "api" {
  certificate_arn         = aws_acm_certificate.api.arn
  validation_record_fqdns = [for r in aws_route53_record.cert_validation : r.fqdn]
}

# Attach to ALB (certificate auto-renews via ACM)
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = aws_acm_certificate_validation.api.certificate_arn
  # ...
}

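Result: terraform apply creates the certificate, validates it via DNS, and attaches it to the load balancer. ACM then renews it automatically for as long as the certificate stays in use and the validation records remain in DNS, so there is no routine maintenance.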

GCP (Managed Certificate + Load Balancer)

resource "google_compute_managed_ssl_certificate" "api" {
  name = "api-cert"
  managed {
    domains = ["api.example.com"]
  }
}

resource "google_compute_target_https_proxy" "api" {
  name             = "api-proxy"
  url_map          = google_compute_url_map.api.id
  ssl_certificates = [google_compute_managed_ssl_certificate.api.id]
}

Pattern 2: Certificates in Kubernetes (cert-manager)

For Kubernetes workloads, cert-manager is the standard. Certificates are Kubernetes resources — managed the same way as Deployments and Services.

The GitOps Way

# In your Helm chart or Kustomize overlay:
# charts/my-app/templates/certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: {{ .Release.Name }}-tls
  namespace: {{ .Release.Namespace }}
spec:
  secretName: {{ .Release.Name }}-tls-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  {{- range .Values.ingress.hosts }}
    - {{ . }}
  {{- end }}
  privateKey:
    algorithm: ECDSA
    size: 256

# values.yaml
ingress:
  hosts:
    - api.example.com
    - api-v2.example.com

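Result: Deploy the app → certificate is automatically provisioned. Delete the app → the Certificate resource is removed with it (the Secret it created is kept unless cert-manager runs with --enable-certificate-owner-ref). Scale to 10 environments → each gets its own certificate automatically.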

Monitoring cert-manager (Prometheus)

# ServiceMonitor for cert-manager metrics
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cert-manager
spec:
  selector:
    matchLabels:
      app: cert-manager
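  endpoints:
  # The port name must match the one on your cert-manager metrics Service;
  # it varies by install method, so adjust it if no targets appear.
  - port: http-metrics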

# Alert rules
- alert: CertificateNotReady
  expr: certmanager_certificate_ready_status{condition="False"} == 1
  for: 15m
  annotations:
    summary: "Certificate {{ $labels.name }} failed to issue"

- alert: CertificateExpiringSoon
  expr: (certmanager_certificate_expiration_timestamp_seconds - time()) / 86400 < 7
  annotations:
    summary: "Certificate {{ $labels.name }} expires in < 7 days"
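The `for: 15m` clause on CertificateNotReady means the condition has to hold continuously before the alert fires, which filters out certificates that are briefly not ready while they are being issued. A rough Python model of that hold behaviour follows. It assumes evaluations arrive in time order; the function name and sample data are illustrative, and this is not Prometheus code.

```python
def first_firing_time(samples, hold_seconds):
    """Return the timestamp at which the alert would start firing, or None.

    `samples` is a time-ordered list of (timestamp, condition_is_true) pairs,
    one per rule evaluation. The alert is pending from the first true sample
    and fires once the condition has stayed true for `hold_seconds`.
    """
    pending_since = None
    for timestamp, condition_is_true in samples:
        if not condition_is_true:
            # Any false evaluation resets the pending timer.
            pending_since = None
            continue
        if pending_since is None:
            pending_since = timestamp
        if timestamp - pending_since >= hold_seconds:
            return timestamp
    return None


# Hypothetical evaluations every 5 minutes with one recovery at t=600.
SAMPLES = [(0, True), (300, True), (600, False), (900, True),
           (1200, True), (1500, True), (1800, True)]
```

With a 15-minute hold (900 seconds), the recovery at t=600 resets the timer, so this sample series only fires at t=1800.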

Pattern 3: Certificates in CI/CD Pipelines

For services that run outside Kubernetes and aren’t fronted by a cloud-managed load balancer:

GitHub Actions: Request + Deploy + Verify

name: Deploy with Certificate
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Request certificate (if needed)
        run: |
          # Check if current cert expires within 30 days
          EXPIRY=$(ssh deploy@server "openssl x509 -enddate -noout -in /etc/ssl/certs/app.pem" | cut -d= -f2)
          EXPIRY_EPOCH=$(date -d "$EXPIRY" +%s)
          NOW_EPOCH=$(date +%s)
          DAYS_LEFT=$(( (EXPIRY_EPOCH - NOW_EPOCH) / 86400 ))
          
          # Quote the variable so an empty value fails loudly instead of breaking the test
          if [ "$DAYS_LEFT" -lt 30 ]; then
            echo "Certificate expires in $DAYS_LEFT days — renewing"
            ssh deploy@server "certbot renew --deploy-hook 'systemctl reload nginx'"
          fi

      - name: Deploy application
        run: |
          # Your normal deployment steps
          ssh deploy@server "cd /app && git pull && docker-compose up -d"

      - name: Verify certificate
        run: |
          sleep 10
          CERT_INFO=$(echo | openssl s_client -connect app.example.com:443 -servername app.example.com 2>/dev/null | openssl x509 -noout -subject -enddate)
          echo "$CERT_INFO"
          echo "$CERT_INFO" | grep -q "app.example.com" || exit 1
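The expiry check in the first step can also be written without relying on `date -d`, which behaves differently across platforms. A minimal Python sketch, assuming the input is the `notAfter=...` line printed by `openssl x509 -enddate -noout`; the function names and the 30-day default are illustrative and not part of any tool.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_left(enddate_line, now=None):
    """Return whole days until expiry from an `openssl x509 -enddate` line.

    Expects input such as 'notAfter=Jan  5 09:34:43 2018 GMT'.
    """
    # Keep only the timestamp after 'notAfter=' (same idea as `cut -d= -f2`).
    cert_time = enddate_line.strip().split("=", 1)[1]
    # ssl.cert_time_to_seconds parses OpenSSL's GMT timestamp format.
    expiry_epoch = ssl.cert_time_to_seconds(cert_time)
    now_epoch = time.time() if now is None else now
    # Floor division: a certificate that expired an hour ago reports -1.
    return int((expiry_epoch - now_epoch) // SECONDS_PER_DAY)


def needs_renewal(enddate_line, threshold_days=30, now=None):
    """True when fewer than `threshold_days` whole days remain."""
    return days_left(enddate_line, now=now) < threshold_days
```

Passing `now` explicitly keeps the function easy to test; in a pipeline step it would be left at its default.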

Pattern 4: Certificate Monitoring as Code

Your monitoring stack should treat certificate expiry the same as disk space or memory usage:

Prometheus + Blackbox Exporter

# prometheus.yml — probe all TLS endpoints
scrape_configs:
  - job_name: 'tls-certificates'
    metrics_path: /probe
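    params:
      # The module must be a TCP prober with `tls: true` in blackbox.yml;
      # the stock tcp_connect module does no TLS handshake, so it exports
      # no probe_ssl_* metrics.
      module: [tls_connect]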
    static_configs:
      - targets:
        - api.example.com:443
        - app.example.com:443
        - admin.example.com:443
        - payments.example.com:443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115

# Alert rules
groups:
- name: certificates
  rules:
  - alert: TLSCertExpiring30Days
    expr: (probe_ssl_earliest_cert_expiry - time()) / 86400 < 30
    labels:
      severity: warning
    annotations:
      summary: "TLS cert for {{ $labels.instance }} expires in < 30 days"

  - alert: TLSCertExpiring7Days
    expr: (probe_ssl_earliest_cert_expiry - time()) / 86400 < 7
    labels:
      severity: critical
    annotations:
      summary: "CRITICAL: TLS cert for {{ $labels.instance }} expires in < 7 days"
      runbook: "https://wiki.internal/runbooks/certificate-renewal"
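The two rules above apply the same arithmetic with different thresholds. A small Python sketch of that tiering, assuming the expiry is available as a Unix timestamp (as `probe_ssl_earliest_cert_expiry` is); the function names are illustrative. Unlike Prometheus, where both rules fire below 7 days, this sketch reports only the highest tier.

```python
import time

SECONDS_PER_DAY = 86400
# Thresholds mirror the two alert rules: 30 days warning, 7 days critical.
WARNING_DAYS = 30
CRITICAL_DAYS = 7


def days_until_expiry(expiry_epoch, now=None):
    """Fractional days between `now` and the certificate's expiry."""
    now_epoch = time.time() if now is None else now
    return (expiry_epoch - now_epoch) / SECONDS_PER_DAY


def severity(expiry_epoch, now=None):
    """Return 'critical', 'warning', or None for a certificate expiry time."""
    days = days_until_expiry(expiry_epoch, now)
    if days < CRITICAL_DAYS:
        return "critical"
    if days < WARNING_DAYS:
        return "warning"
    return None
```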

Grafana Dashboard

# Days until expiry for all monitored endpoints
(probe_ssl_earliest_cert_expiry - time()) / 86400

# Count of certificates expiring within 30 days
count(probe_ssl_earliest_cert_expiry - time() < 86400 * 30)

# Certificate issuer distribution
count by (issuer) (probe_ssl_last_chain_info)
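For readers less familiar with PromQL, the same three views can be sketched over a plain list of records. The sample data, field names, and issuer names below are hypothetical; real values would come from your probes.

```python
from collections import Counter

SECONDS_PER_DAY = 86400
SAMPLE_NOW = 1_700_000_000  # fixed reference time so the example is repeatable

# Hypothetical probe results: one record per monitored endpoint.
SAMPLE_CERTS = [
    {"instance": "api.example.com:443",
     "expiry_epoch": SAMPLE_NOW + 10 * SECONDS_PER_DAY, "issuer": "CA One"},
    {"instance": "app.example.com:443",
     "expiry_epoch": SAMPLE_NOW + 25 * SECONDS_PER_DAY, "issuer": "CA One"},
    {"instance": "admin.example.com:443",
     "expiry_epoch": SAMPLE_NOW + 80 * SECONDS_PER_DAY, "issuer": "CA Two"},
]


def days_remaining(certs, now):
    """Days until expiry per endpoint (first query)."""
    return {c["instance"]: (c["expiry_epoch"] - now) / SECONDS_PER_DAY
            for c in certs}


def count_expiring(certs, within_days, now):
    """Number of certificates expiring within `within_days` (second query)."""
    limit = within_days * SECONDS_PER_DAY
    return sum(1 for c in certs if c["expiry_epoch"] - now < limit)


def issuer_distribution(certs):
    """Certificates grouped by issuer (third query)."""
    return Counter(c["issuer"] for c in certs)
```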

Pattern 5: Internal Certificates with Vault

For internal services that need mTLS or private certificates:

# In your deployment script or Helm chart:
# 1. Authenticate to Vault (using K8s service account or CI JWT)
export VAULT_TOKEN=$(vault write -field=token auth/kubernetes/login \
  role=my-app jwt=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token))

# 2. Request a short-lived certificate
#    (umask 077 keeps the files created below readable by the owner only)
umask 077
vault write -format=json pki/issue/internal-service \
  common_name="my-app.production.svc.cluster.local" \
  ttl="72h" > /tmp/cert.json

# 3. Extract cert and key
jq -r '.data.certificate' /tmp/cert.json > /etc/ssl/app.pem
jq -r '.data.private_key' /tmp/cert.json > /etc/ssl/app-key.pem
jq -r '.data.ca_chain[]' /tmp/cert.json >> /etc/ssl/app.pem

# 4. Clean up
rm /tmp/cert.json
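The extraction in step 3 can also be done in one pass that sets file permissions explicitly. A minimal Python sketch, assuming the JSON layout used by the jq commands above (`data.certificate`, `data.private_key`, and an optional `data.ca_chain`); the function names are hypothetical.

```python
import json
import os


def write_file(path, text, mode):
    """Write text to path; a newly created file gets permission bits `mode`.

    An existing file keeps whatever mode it already had.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)


def split_issue_response(raw_json, cert_path, key_path):
    """Write leaf plus chain to cert_path and the private key to key_path."""
    data = json.loads(raw_json)["data"]
    # Leaf certificate first, then any intermediates from ca_chain.
    chain = [data["certificate"]] + list(data.get("ca_chain") or [])
    write_file(cert_path, "\n".join(chain) + "\n", 0o644)
    # The private key is owner-readable only.
    write_file(key_path, data["private_key"] + "\n", 0o600)
```

Feeding the response to this function from standard input rather than from a file would avoid leaving the private key in /tmp at all.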

With Vault Agent running as a sidecar, issuance and renewal both happen automatically:

# vault-agent-config.hcl
template {
  source      = "/vault/templates/cert.tpl"
  destination = "/etc/ssl/app.pem"
  command     = "nginx -s reload"
  # Vault Agent re-renders template when cert approaches expiry
  # Nginx reloads automatically
}
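How early to replace a short-lived certificate is a policy choice. The sketch below assumes a hypothetical rule of re-issuing once two thirds of the validity window has elapsed; that fraction is an illustration only and not a documented Vault Agent setting.

```python
from datetime import datetime, timedelta, timezone


def reissue_at(not_before, not_after, fraction=2 / 3):
    """Return the time at which `fraction` of the validity window has elapsed."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    lifetime = not_after - not_before
    return not_before + lifetime * fraction


def should_reissue(not_before, not_after, now, fraction=2 / 3):
    """True once `now` has reached the re-issue point."""
    return now >= reissue_at(not_before, not_after, fraction)


# Example: a 72-hour certificate, matching the ttl used in the Vault request.
issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(hours=72)
print(reissue_at(issued, expires, fraction=0.5))
```

Replacing the certificate well before expiry leaves room for retries if the issuing backend is briefly unavailable.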

The Anti-Patterns (What NOT to Do)

❌ Certificates in Git

# NEVER commit certificates or keys to source control
# .gitignore should include:
*.pem
*.key
*.crt
*.pfx
*.p12
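Ignore patterns only catch files with the expected extensions, so a key saved under another name still slips through. A content check can back them up. The following is a minimal Python sketch of such a scan, for example as part of a pre-commit hook; the function names are hypothetical and the pattern covers PEM-encoded keys only, not binary formats such as PKCS#12.

```python
import re

# Matches PEM headers such as 'BEGIN PRIVATE KEY', 'BEGIN RSA PRIVATE KEY',
# 'BEGIN EC PRIVATE KEY' and 'BEGIN ENCRYPTED PRIVATE KEY'.
PRIVATE_KEY_HEADER = re.compile(r"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def contains_private_key(text):
    """True if the text contains a PEM private key header."""
    return PRIVATE_KEY_HEADER.search(text) is not None


def find_offending_files(paths):
    """Return the subset of `paths` whose contents include a private key."""
    offenders = []
    for path in paths:
        try:
            with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                if contains_private_key(handle.read()):
                    offenders.append(path)
        except OSError:
            # Unreadable paths (deleted files, directories) are skipped.
            continue
    return offenders
```

A hook would pass the list of staged files to `find_offending_files` and abort the commit if the result is non-empty.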

❌ Long-Lived Certificates for Dynamic Infrastructure

If your infrastructure scales up/down daily, don’t use 1-year certificates that require manual renewal. Use short-lived certificates (hours/days) that are issued at deploy time and expire naturally.

❌ Shared Wildcard Certificates Across Environments

# DON'T: Same wildcard cert on dev, staging, and production
*.example.com → deployed everywhere

# DO: Separate certificates per environment
dev.example.com     → cert from Let's Encrypt (auto-renewed)
staging.example.com → cert from Let's Encrypt (auto-renewed)
api.example.com     → cert from Let's Encrypt (auto-renewed)

With a shared wildcard, one compromised environment exposes the private key used by every environment.
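To see how far one wildcard reaches, it helps to list the hostnames it covers. A minimal Python sketch, assuming the common rule that `*` stands for exactly one left-most label; partial wildcards such as `f*.example.com` are deliberately not handled, and the function names and host list are illustrative.

```python
def wildcard_matches(pattern, hostname):
    """True if `hostname` is covered by `pattern` (left-most-label wildcard)."""
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # The wildcard stands for exactly one non-empty left-most label.
        return host_labels[0] != "" and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def exposed_hosts(pattern, hostnames):
    """Hostnames that would share the key of a certificate for `pattern`."""
    return [h for h in hostnames if wildcard_matches(pattern, h)]


# Hypothetical inventory of hostnames.
HOSTS = ["dev.example.com", "staging.example.com", "api.example.com",
         "example.com", "db.dev.example.com"]
print(exposed_hosts("*.example.com", HOSTS))
```

Every hostname in the result depends on the same private key, which is the shared risk described above.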

❌ Manual Renewal Reminders

If your certificate management strategy involves calendar reminders or Jira tickets for renewal, it’s not automated — it’s a human process pretending to be managed. Automate it or accept that you’ll have outages.


The Maturity Model for DevOps Certificate Management

Level  Description  Characteristics
0      Chaos        Manual everything. Certs expire without warning.
1      Tracked      Spreadsheet/monitoring exists. Still manual renewal.
2      Automated    ACME/cert-manager handles renewal. Monitoring alerts on failure.
3      Codified     Certificates declared in IaC. Provisioned with infrastructure.
4      Ephemeral    Short-lived certs. No renewal needed. Issued at deploy, expire naturally.

Most DevOps teams are at Level 1-2. The goal is Level 3-4.


FAQ

Q: Should every service have its own certificate?
A: Yes. One certificate per service (or per endpoint). Shared certificates (especially wildcards) create shared risk. If one service’s key is compromised, all services sharing that certificate are affected.

Q: How do I handle certificates for local development?
A: Use mkcert. It generates locally-trusted certificates for localhost and custom domains. No browser warnings, no self-signed cert hacks. For team-wide dev environments, use a shared private CA with cert-manager.

Q: What about certificates for non-HTTP services (databases, message queues)?
A: Same principles apply. Use cert-manager Certificate resources (mount the Secret as a volume), Vault PKI (request at startup), or your CLM platform’s agent. The protocol doesn’t matter; the lifecycle management is the same.

Q: How do I convince my team to invest in certificate automation?
A: Calculate the cost of your last certificate outage (or the next one). Include: engineer time (emergency response at 2 AM), revenue loss, customer trust impact, and the post-mortem time. Compare to the cost of setting up cert-manager (a few hours) or ACME (an afternoon). The ROI is immediate.
