QCecuring - Enterprise Security Solutions

Kubernetes Certificate Management: cert-manager, Service Mesh, and Beyond

Pki 15 Mar, 2026 · 04 Mins read

Kubernetes uses certificates at every layer — cluster infrastructure, ingress, and service-to-service. Here's how to manage them all with cert-manager, Istio, and proper monitoring to prevent outages.


Kubernetes has a certificate problem that most teams don’t realize until something breaks.

The cluster itself uses certificates for internal authentication (API server, kubelets, etcd). Your applications need certificates for ingress TLS termination. Your microservices need certificates for mTLS between pods. And each layer has different management mechanisms, different expiry timelines, and different failure modes.

A certificate expiring at the cluster infrastructure layer takes down the entire cluster (kubectl stops working, pods can’t be scheduled, nothing deploys). A certificate expiring at the ingress layer takes down your public-facing services. A certificate expiring in the service mesh breaks internal communication.

This guide covers all three layers: what certificates exist, how to manage them, and how to monitor them so nothing expires without warning.


Layer 1: Cluster Infrastructure Certificates

These are the certificates Kubernetes itself uses for internal component authentication. They’re created at cluster initialization. With kubeadm defaults, the leaf certificates expire after 1 year, while the CA certificates are valid for 10 years.

What Exists

/etc/kubernetes/pki/
├── ca.crt + ca.key                    # Cluster CA (signs everything below)
├── apiserver.crt + apiserver.key      # API server TLS (what kubectl connects to)
├── apiserver-kubelet-client.crt       # API server → kubelet authentication
├── apiserver-etcd-client.crt          # API server → etcd authentication
├── front-proxy-ca.crt + key           # Aggregation layer CA
├── front-proxy-client.crt + key       # Aggregation layer client
├── etcd/
│   ├── ca.crt + ca.key               # etcd CA (separate from cluster CA)
│   ├── server.crt + server.key       # etcd server TLS
│   ├── peer.crt + peer.key           # etcd peer communication
│   └── healthcheck-client.crt + key  # etcd health checks
└── sa.key + sa.pub                    # Service account token signing
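
To see the real expiry dates behind these files, something like the following can help. It is a minimal sketch, assuming a POSIX shell and the openssl CLI on a control-plane node; the function names list_cert_files and report_cert_expiry are made up for illustration and are not part of kubeadm.

```shell
#!/bin/sh
# List every *.crt file under a PKI directory, sorted by path.
# (kubeadm's default directory is /etc/kubernetes/pki)
list_cert_files() {
    find "$1" -type f -name '*.crt' | sort
}

# Print "<path>: notAfter=<date>" for each certificate found.
# Needs the openssl CLI; only the first certificate in each file is read.
report_cert_expiry() {
    list_cert_files "$1" | while IFS= read -r crt; do
        printf '%s: %s\n' "$crt" "$(openssl x509 -in "$crt" -noout -enddate)"
    done
}

# Example (run as root on a control-plane node):
#   report_cert_expiry /etc/kubernetes/pki
```

The sa.key and sa.pub files are a plain key pair rather than certificates, so they have no expiry date and are skipped by the *.crt filter.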

The Danger

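kubeadm clusters: Leaf certificates expire after 1 year. kubeadm renews them as part of kubeadm upgrade, so clusters that are upgraded at least once a year rarely hit this. If nobody upgrades or renews them: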

  • kubectl returns “certificate has expired”
  • New pods can’t be scheduled
  • Existing pods continue running but can’t be managed
  • The cluster is effectively frozen

Managed clusters (EKS, GKE, AKS): The provider generally handles control-plane certificate rotation for you, and you don’t manage these files directly. Check your provider’s documentation for any cases that still need a manual rotation.

How to Manage

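# Check expiry dates (kubeadm)
kubeadm certs check-expiration

# Renew all certificates (run on each control-plane node)
kubeadm certs renew all

# Restarting the kubelet alone is not enough: the control-plane static pods
# (kube-apiserver, kube-controller-manager, kube-scheduler, etcd) must be
# restarted to load the renewed certificates, for example by temporarily
# moving their manifests out of /etc/kubernetes/manifests/ and back.
systemctl restart kubelet

# Automate with a cron job (run monthly); the static pod restart above still applies
0 0 1 * * /usr/bin/kubeadm certs renew all && systemctl restart kubelet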

Monitoring

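# Prometheus alert for cluster cert expiry
# Note: kube-apiserver and kubelet do not export a metric with this name by default.
# The rule assumes an exporter that publishes each certificate's expiry as a Unix
# timestamp; adjust the metric name to match the exporter you run.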
- alert: KubernetesClusterCertExpiringSoon
  expr: |
    (kube_certificate_expiration_timestamp_seconds - time()) / 86400 < 30
  labels:
    severity: critical
  annotations:
    summary: "Kubernetes cluster certificate expires in < 30 days"

Layer 2: Ingress Certificates (cert-manager)

These are the TLS certificates that terminate HTTPS for your public-facing services. cert-manager is the standard tool for managing them.

Setup

# 1. Install cert-manager
# kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml

# 2. Create a ClusterIssuer (Let's Encrypt)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: platform-team@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
    - http01:
        ingress:
          class: nginx
    - dns01:
        cloudDNS:
          project: my-gcp-project
      selector:
        dnsNames:
        - "*.example.com"

# 3. Annotate your Ingress (automatic certificate provisioning)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls  # cert-manager creates and manages this Secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app
            port:
              number: 80

That’s it. cert-manager handles: key generation → CSR → ACME challenge → certificate issuance → Secret creation → renewal once 2/3 of the certificate’s lifetime has elapsed (the default when renewBefore is not set).
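
As a rough illustration of the 2/3 rule, the renewal point can be computed from the validity window. This is a sketch of the arithmetic only, not cert-manager's own code; renewal_time is a hypothetical name and both arguments are assumed to be Unix epoch seconds.

```shell
#!/bin/sh
# Renewal point for a certificate, as Unix epoch seconds.
# $1 = notBefore, $2 = notAfter. Renewal starts after 2/3 of the lifetime.
renewal_time() {
    echo $(( $1 + ($2 - $1) * 2 / 3 ))
}

# A 90-day certificate issued at t=0 (90 * 86400 = 7776000 seconds)
renewal_time 0 7776000    # prints 5184000, which is day 60
```

For a 90-day Let's Encrypt certificate this leaves a 30-day window in which renewal can be retried before the certificate expires.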

For Internal Services (Private CA)

# Vault issuer for internal certificates
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: vault-internal
spec:
  vault:
    server: https://vault.internal:8200
    path: pki_int/sign/internal-service
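    auth:
      kubernetes:
        role: cert-manager
        mountPath: /v1/auth/kubernetes
        # Kubernetes auth also needs a credential source (serviceAccountRef or secretRef).
        # The ServiceAccount name below is an assumption; use the one bound to your Vault role.
        serviceAccountRef:
          name: cert-manager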

---
# Internal service certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: payment-api-tls
  namespace: production
spec:
  secretName: payment-api-tls
  issuerRef:
    name: vault-internal
    kind: ClusterIssuer
  dnsNames:
  - payment-api.production.svc.cluster.local
  - payment-api.internal.example.com
  duration: 720h      # 30 days
  renewBefore: 240h   # Renew 10 days before expiry
  privateKey:
    algorithm: ECDSA
    size: 256
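
With duration 720h and renewBefore 240h, renewal starts 480h (20 days) after issuance. To confirm what actually landed in the Secret, a check along these lines may be useful. It is a sketch, assuming kubectl access to the namespace plus the base64 and openssl CLIs; decode_secret_cert and secret_cert_enddate are illustrative names.

```shell
#!/bin/sh
# Decode a base64-encoded tls.crt value (read from stdin) back into PEM text.
decode_secret_cert() {
    base64 -d
}

# Print the notAfter date of the certificate stored in a TLS Secret.
# $1 = Secret name, $2 = namespace. Needs kubectl and openssl.
secret_cert_enddate() {
    kubectl get secret "$1" -n "$2" -o jsonpath='{.data.tls\.crt}' \
        | decode_secret_cert \
        | openssl x509 -noout -enddate
}

# Example:
#   secret_cert_enddate payment-api-tls production
```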

Monitoring cert-manager

# Prometheus alerts for cert-manager
- alert: CertManagerCertNotReady
  expr: certmanager_certificate_ready_status{condition="False"} == 1
  for: 30m
  labels:
    severity: warning
  annotations:
    summary: "Certificate {{ $labels.name }} in {{ $labels.namespace }} is not ready"

- alert: CertManagerCertExpiringSoon
  expr: (certmanager_certificate_expiration_timestamp_seconds - time()) < 604800
  labels:
    severity: critical
  annotations:
    summary: "Certificate {{ $labels.name }} expires in < 7 days"

Layer 3: Service-to-Service mTLS (Service Mesh)

This layer encrypts and authenticates pod-to-pod traffic. A service mesh issues and rotates the workload certificates for you.

Istio (Most Common)

# Enable strict mTLS mesh-wide
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT

# With sidecar injection enabled for a namespace, Istio automatically:
# - Injects an Envoy sidecar into each new pod in that namespace
# - Issues per-pod certificates (24h validity by default, SPIFFE ID)
# - Rotates certificates before expiry
# - Encrypts traffic between meshed pods with mTLS
# - No application code changes required

Linkerd (Simpler Alternative)

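# Install Linkerd (mTLS on by default)
# Linkerd 2.12+ installs its CRDs as a separate first step
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -

# Inject into a namespace (existing pods must be restarted to get the proxy)
kubectl annotate namespace production linkerd.io/inject=enabled

# Verify mTLS is active (requires the viz extension)
linkerd viz edges deployment -n production
# Shows: source and destination of each connection and whether it is secured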

Monitoring Service Mesh Certificates

# Istio: check certificate status for a pod
istioctl proxy-config secret <pod-name> -n production
# Shows: certificate chain, expiry, SPIFFE ID

# Linkerd: check identity (pass a pod name or a label selector)
linkerd identity -n production <pod-name>
# Shows: the certificate the pod's proxy presents, including issuer and validity

The Complete Monitoring Stack

Monitor all three layers with a unified approach:

# Prometheus rules covering all certificate layers
groups:
- name: certificate-monitoring
  rules:
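  # Layer 1: Cluster infrastructure
  # apiserver_client_certificate_expiration_seconds is a histogram of the remaining
  # lifetime of client certificates presented to the API server, not a timestamp
  - alert: ClusterCertExpiring
    expr: histogram_quantile(0.01, sum by (le) (rate(apiserver_client_certificate_expiration_seconds_bucket[5m]))) < 30 * 86400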

  # Layer 2: Ingress (cert-manager)
  - alert: IngressCertExpiring
    expr: (certmanager_certificate_expiration_timestamp_seconds - time()) / 86400 < 7

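  # Layer 3: Service mesh
  # Metric names differ between Istio releases; confirm this one exists in your
  # istio-agent metrics before relying on the alert
  - alert: IstioCertError
    expr: increase(istio_agent_cert_rotation_failure_total[5m]) > 0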

  # External probe (what clients actually see)
  - alert: ExternalCertExpiring
    expr: (probe_ssl_earliest_cert_expiry - time()) / 86400 < 14
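
For a one-off check of what clients actually see, the same information can be pulled by hand. This is a minimal sketch, assuming network access to the endpoint and the openssl CLI; enddate_value and served_cert_enddate are hypothetical helper names.

```shell
#!/bin/sh
# Strip the "notAfter=" prefix that `openssl x509 -enddate` prints.
enddate_value() {
    printf '%s\n' "${1#notAfter=}"
}

# Fetch the certificate a server presents and print its expiry date.
# $1 = hostname, $2 = port (defaults to 443). Needs network access and openssl.
served_cert_enddate() {
    line=$(openssl s_client -connect "$1:${2:-443}" -servername "$1" </dev/null 2>/dev/null \
        | openssl x509 -noout -enddate) || return 1
    enddate_value "$line"
}

# Example:
#   served_cert_enddate app.example.com
```

Comparing this date with the one stored in the Secret is a quick way to spot an ingress controller that is still serving an old certificate.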

Common Failures and Fixes

cert-manager: “error presenting challenge”

Cause: DNS-01 challenge can’t create TXT record (wrong credentials, missing permissions).

# Debug
kubectl describe challenge <challenge-name> -n <namespace>
kubectl logs -n cert-manager deploy/cert-manager -f | grep -i error

Ingress serves old certificate after renewal

Cause: Ingress controller didn’t detect the Secret update.

# Force ingress controller to reload
kubectl rollout restart deployment ingress-nginx-controller -n ingress-nginx

# Or verify the controller watches Secrets (ingress-nginx does by default)

Istio: “upstream connect error or disconnect/reset before headers”

Cause: mTLS handshake failure between pods (certificate expired, CA mismatch, sidecar not injected).

# Check if sidecar is injected
kubectl get pod <pod> -o jsonpath='{.spec.containers[*].name}'
# Should include "istio-proxy"

# Check certificate validity (read the VALID CERT and NOT AFTER columns)
istioctl proxy-config secret <pod> -n <namespace>
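
The sidecar check can be extended to a whole namespace. The following is a sketch under assumptions: it needs kubectl access, it looks only at regular containers (a mesh configured to run the proxy as an init container would not match), and both function names are made up for illustration.

```shell
#!/bin/sh
# Succeeds when the space-separated container list in $1 contains "istio-proxy".
has_istio_sidecar() {
    case " $1 " in
        *" istio-proxy "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the pods in namespace $1 whose containers do not include istio-proxy.
pods_missing_sidecar() {
    kubectl get pods -n "$1" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}' \
    | while IFS="$(printf '\t')" read -r pod containers; do
        has_istio_sidecar "$containers" || echo "$pod"
    done
}

# Example:
#   pods_missing_sidecar production
```

Any pod this prints is outside the mesh, and under STRICT mode its plaintext connections to meshed pods are rejected.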

FAQ

Q: Do I need cert-manager if I use a service mesh? A: Yes — they handle different layers. cert-manager manages ingress certificates (public-facing TLS). The service mesh manages internal mTLS certificates (pod-to-pod). They don’t overlap.

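Q: What about Gateway API (replacing Ingress)? A: cert-manager supports Gateway API through its gateway-shim controller, which has to be enabled explicitly. Same concept: annotate your Gateway with the issuer, and cert-manager provisions a certificate for each TLS listener. The annotation goes on the Gateway, not on HTTPRoute resources.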

Q: How do I handle certificates for non-HTTP services (gRPC, databases)? A: Use cert-manager Certificate resources directly (not via Ingress annotations). Mount the resulting Secret as a volume in your pod. Your application loads the cert from the mounted path.

Q: Should I use Let’s Encrypt or a private CA for internal services? A: Private CA (Vault, AWS PCA, self-signed CA via cert-manager). Internal services don’t need public trust. Private CAs give you: no rate limits, custom validity periods, no external dependency, and no information leakage to CT logs.
