QCecuring - Enterprise Security Solutions

cert-manager Troubleshooting: Fix Certificate Not Ready, Stuck Orders & Failed Challenges

Kubernetes 12 May, 2026 · 06 Mins read

Diagnose and fix every common cert-manager issue — Certificate not ready, CertificateRequest pending, Order stuck, Challenge failing, Issuer not ready, and Secret not updating. Includes kubectl commands for each step in the resource chain.


“Certificate not ready” is the cert-manager equivalent of a blank stare. It tells you something is wrong but not what. The actual problem could be anywhere in the resource chain — the Issuer might be misconfigured, the ACME challenge might be failing, DNS might not be propagating, or the webhook might be rejecting the request.

This guide walks through the systematic debugging process: follow the resource chain from Certificate → CertificateRequest → Order → Challenge, and at each step run the kubectl commands that pinpoint the failure.


The Resource Chain

Every cert-manager certificate issuance creates a chain of resources. The failure is always at one specific point:

Certificate → CertificateRequest → Order → Challenge (Order and Challenge exist only for ACME issuers)

Debugging rule: Start at the Certificate and work down. The first resource with an error is where the problem lives.
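As a minimal sketch of that rule, the helper below lists every link of the chain for one namespace in top-down order. The function name cm_chain is hypothetical (it is not part of cert-manager or cmctl), and it assumes kubectl is already pointed at the right cluster.

```shell
# Sketch: show each link of the resource chain for one namespace, top-down.
# cm_chain is a hypothetical helper name, not a cert-manager command.
cm_chain() {
  ns="$1"
  for kind in certificate certificaterequest order challenge; do
    echo "== $kind =="
    # Orders and Challenges exist only for ACME issuers, so an empty
    # result for those kinds is normal with other issuer types.
    kubectl get "$kind" -n "$ns" 2>&1
  done
}
```

Running cm_chain prod prints the four resource kinds in order, so the first one showing an error or a non-ready state is the place to run kubectl describe.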


Step 1: Check the Certificate

# List all certificates and their status
kubectl get certificates -A

# Example output:
# NAMESPACE   NAME      READY   SECRET          AGE
# prod        api-tls   False   api-tls-secret  45m   ← PROBLEM

# Get details
kubectl describe certificate api-tls -n prod

What to look for in the output:

Status:
  Conditions:
    - Type: Ready
      Status: "False"
      Reason: "MissingData"  # or "Issuing", "RequestDenied", etc.
      Message: "Issuing certificate as Secret does not exist"
  # or
      Reason: "RequestDenied"
      Message: "The CSR was denied"

| Reason | Meaning | Next Step |
| --- | --- | --- |
| Issuing | Certificate is being issued (in progress) | Check CertificateRequest |
| MissingData | Secret doesn’t exist yet | Check CertificateRequest |
| RequestDenied | Issuer rejected the request | Check CertificateRequest for denial reason |
| DoesNotExist | Referenced issuer doesn’t exist | Check issuer name/kind/namespace |
| Pending | Waiting for approval (if approval required) | Check approval controller |
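
On a busy cluster it can help to filter the listing down to the problem rows first. The sketch below assumes the default column order shown in the example output above (NAMESPACE, NAME, READY, SECRET, AGE); not_ready_certs is a hypothetical helper name.

```shell
# Sketch: print only Certificates whose READY column is not "True".
# Assumes the default kubectl column order:
# NAMESPACE NAME READY SECRET AGE
not_ready_certs() {
  awk '$3 != "True" { print $1 "/" $2 }'
}

# Usage (needs cluster access):
#   kubectl get certificates -A --no-headers | not_ready_certs
```

Each printed namespace/name pair is a candidate for kubectl describe certificate.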

Step 2: Check the CertificateRequest

# Find the CertificateRequest for your Certificate
kubectl get certificaterequest -n prod

# Get details
kubectl describe certificaterequest api-tls-xxxxx -n prod

Common CertificateRequest issues:

“Issuer not found”

Events:
  Type: Warning
  Reason: IssuerNotFound
  Message: Referenced "ClusterIssuer" not found: clusterissuer.cert-manager.io "letsencrypt-prod" not found

Fix: Check the issuer name matches exactly (case-sensitive):

kubectl get clusterissuer
kubectl get issuer -n prod

# Common mistake: using "Issuer" kind when it's a "ClusterIssuer" or vice versa
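
To compare what the Certificate references with what actually exists, the issuerRef can be read straight from the spec. This is a sketch; issuer_ref is a hypothetical helper name, and it relies only on the spec.issuerRef.kind and spec.issuerRef.name fields.

```shell
# Sketch: print "<kind>/<name>" from a Certificate's spec.issuerRef.
# issuer_ref is a hypothetical helper name.
issuer_ref() {
  cert="$1"; ns="$2"
  kubectl get certificate "$cert" -n "$ns" \
    -o jsonpath='{.spec.issuerRef.kind}/{.spec.issuerRef.name}'
}

# Usage (needs cluster access):
#   issuer_ref api-tls prod
```

If the kind part prints empty, cert-manager treats the reference as a namespaced Issuer, so a ClusterIssuer with the same name will not be found.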

“Issuer not ready”

Events:
  Type: Warning
  Reason: IssuerNotReady
  Message: Referenced issuer does not have a Ready status condition

Fix: Check the issuer itself:

kubectl describe clusterissuer letsencrypt-prod

# Common causes:
# - ACME account registration failed (bad email, server unreachable)
# - Vault token expired
# - CA secret doesn't exist

“Request denied by webhook”

Events:
  Type: Warning
  Reason: Denied
  Message: "cert-manager.io: Certificate request has been denied"

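Fix: A Denied condition is normally set by an approver (for example an approval policy controller) rather than by the admission webhook, so review your approval policies first. The webhook logs help when the request was rejected at admission: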

kubectl logs -n cert-manager deployment/cert-manager-webhook

Step 3: Check the Order (ACME Issuers Only)

kubectl get orders -n prod
kubectl describe order api-tls-xxxxx -n prod

Common Order issues:

Order stuck in “Pending”

Status:
  State: pending
  Authorizations:
    - URL: https://acme-v02.api.letsencrypt.org/acme/authz/...
      Identifier: api.example.com
      Challenges:
        - Type: http-01
          Status: pending

Meaning: The ACME challenge hasn’t been solved yet. Check the Challenge resource.

Order “Invalid”

Status:
  State: invalid
  Reason: "order is invalid"

Causes:

  • Challenge failed (timeout, wrong response)
  • Rate limit hit
  • Domain authorization expired

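Fix: Delete the Order so a fresh one can be created:

kubectl delete order api-tls-xxxxx -n prod
# cert-manager normally creates a new Order. If none appears, the parent
# CertificateRequest has probably already failed; trigger a new attempt with:
# cmctl renew api-tls -n prod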

Step 4: Check the Challenge (ACME Only)

kubectl get challenges -n prod
kubectl describe challenge api-tls-xxxxx -n prod

HTTP-01 Challenge Failing

“Waiting for HTTP-01 challenge propagation”

# Check if the solver pod is running
# (solver pods are created in the Challenge's namespace, not in cert-manager's)
kubectl get pods -n prod -l acme.cert-manager.io/http01-solver=true

# Check if the temporary ingress was created
kubectl get ingress -A | grep cm-acme

# Test the challenge URL from outside the cluster
curl -v http://api.example.com/.well-known/acme-challenge/<token>
# Should return the challenge response (not 404)

Common HTTP-01 fixes:

| Problem | Fix |
| --- | --- |
| Solver pod not running | Check cert-manager logs, resource limits |
| Ingress not created | Wrong ingressClassName in solver config |
| 404 on challenge URL | Ingress controller not routing /.well-known paths |
| Timeout | Firewall blocking port 80 from Let’s Encrypt |
| DNS not pointing to cluster | Update DNS A record to cluster ingress IP |
| Multiple ingress controllers | Specify the correct one in solver config |

# Check cert-manager controller logs for challenge errors
kubectl logs -n cert-manager deployment/cert-manager -f | grep -i "challenge\|error\|failed"
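
The external curl test can be reduced to a single status code, which maps directly onto the problems listed in the table. This is a sketch: http01_status is a hypothetical helper name, and the token must be copied from the Challenge resource.

```shell
# Sketch: print the HTTP status code returned for a challenge token URL.
# Run it from outside the cluster. http01_status is a hypothetical name.
http01_status() {
  host="$1"; token="$2"
  # -w '%{http_code}' prints only the status; curl reports 000 when it
  # received no HTTP response at all (connection refused, timeout, DNS).
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    "http://${host}/.well-known/acme-challenge/${token}"
}

# Usage:
#   http01_status api.example.com <token>
```

A 200 means the solver is reachable, a 404 points at ingress routing, and 000 suggests a firewall or DNS problem.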

DNS-01 Challenge Failing

“Waiting for DNS-01 challenge propagation”

# Check if the TXT record was created
dig -t TXT _acme-challenge.api.example.com @8.8.8.8

# If no record: check DNS provider credentials
kubectl logs -n cert-manager deployment/cert-manager | grep -i "dns\|route53\|cloudflare"

Common DNS-01 fixes:

| Problem | Fix |
| --- | --- |
| “Forbidden” / “Unauthorized” | API token lacks zone edit permissions |
| “Timeout” | DNS propagation delay; confirm which nameservers cert-manager checks (controller flag --dns01-recursive-nameservers) |
| “Zone not found” | Wrong zone ID or domain not in the configured zone |
| Record created but not resolving | DNS propagation delay: wait or check nameservers |
| Wrong DNS provider | Check solver selector matches your domain |

# Follow CNAME records when _acme-challenge is delegated to another zone.
# Note: cnameStrategy does not change propagation timing.
spec:
  acme:
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-token
              key: api-token
          cnameStrategy: Follow
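
Propagation problems are easier to spot when the same TXT lookup is repeated against several resolvers. The sketch below wraps the dig command from this section; txt_at_resolvers is a hypothetical helper name, and the resolver addresses are whatever you choose to pass in.

```shell
# Sketch: query the _acme-challenge TXT record at several resolvers so
# partial propagation is visible. txt_at_resolvers is a hypothetical name.
txt_at_resolvers() {
  domain="$1"; shift
  for resolver in "$@"; do
    answer="$(dig +short -t TXT "_acme-challenge.${domain}" "@${resolver}")"
    echo "${resolver}: ${answer:-<no record>}"
  done
}

# Usage:
#   txt_at_resolvers api.example.com 8.8.8.8 1.1.1.1
```

If one resolver returns the record and another does not, the record exists and the remaining delay is propagation rather than credentials.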

Step 5: Check cert-manager Controller Logs

When the resource chain doesn’t give enough info:

# Full cert-manager controller logs
kubectl logs -n cert-manager deployment/cert-manager --tail=100

# Filter for errors
kubectl logs -n cert-manager deployment/cert-manager | grep -i "error\|failed\|denied" | tail -30

# Watch logs in real-time
kubectl logs -n cert-manager deployment/cert-manager -f

Common log patterns:

| Log Message | Meaning | Fix |
| --- | --- | --- |
| Failed to determine zone | DNS provider can’t find the zone | Check zone permissions, domain spelling |
| context deadline exceeded | Timeout connecting to ACME server or DNS | Check network connectivity, firewall |
| rate limited | Let’s Encrypt rate limit hit | Wait, or use staging issuer for testing |
| account not found | ACME account key secret deleted | Delete issuer, recreate (new account) |
| no matching solver | No solver configured for this domain | Add solver selector for the domain |
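
The table can be turned into a small filter that annotates log lines as they stream past. This is only a sketch: classify_cm_log is a hypothetical helper name, and the patterns are copied from the table, so the exact wording in your cert-manager version may differ.

```shell
# Sketch: read controller log lines on stdin and print a hint for each
# line that matches a pattern from the table. Hypothetical helper name.
classify_cm_log() {
  while IFS= read -r line; do
    case "$line" in
      *"Failed to determine zone"*)  echo "zone: check zone permissions and domain spelling" ;;
      *"context deadline exceeded"*) echo "timeout: check network connectivity and firewall" ;;
      *"rate limited"*)              echo "rate-limit: wait, or use a staging issuer" ;;
      *"account not found"*)         echo "account: ACME account key secret may be missing" ;;
      *"no matching solver"*)        echo "solver: add a solver selector for the domain" ;;
    esac
  done
}

# Usage (needs cluster access):
#   kubectl logs -n cert-manager deployment/cert-manager | classify_cm_log
```

Lines that match none of the patterns are dropped, so the output stays short even on a noisy controller.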

Common Scenarios

Certificate Was Working, Now Stuck on Renewal

# Check if the old secret still exists
kubectl get secret api-tls-secret -n prod -o jsonpath='{.metadata.annotations}'

# Force renewal
cmctl renew api-tls -n prod

# Or delete the secret to trigger re-issuance
kubectl delete secret api-tls-secret -n prod

Certificate Works in One Namespace But Not Another

# Check if you're using Issuer (namespace-scoped) vs ClusterIssuer
# Issuer only works in its own namespace
kubectl get issuer -n prod
kubectl get clusterissuer

# If using Issuer, it must exist in the SAME namespace as the Certificate

Wildcard Certificate Not Issuing

# Wildcards REQUIRE DNS-01 challenge (HTTP-01 can't validate wildcards)
# Check your issuer uses dns01 solver:
kubectl describe clusterissuer letsencrypt-prod | grep -A 10 "solvers"

# Ensure the solver covers your domain:
# selector:
#   dnsZones:
#     - "example.com"

Certificate Issued But Ingress Still Shows Old Cert

# Check if the ingress references the correct secret
kubectl get ingress api-ingress -n prod -o jsonpath='{.spec.tls[0].secretName}'

# Check if the secret was updated
kubectl get secret api-tls-secret -n prod -o jsonpath='{.metadata.annotations.cert-manager\.io/certificate-name}'

# Restart ingress controller to pick up new cert (if it doesn't auto-reload)
kubectl rollout restart deployment ingress-nginx-controller -n ingress-nginx

The Nuclear Option: Full Reset

If nothing works and you want to start fresh for a specific certificate:

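# Delete the Certificate; the CertificateRequests, Orders, and Challenges it owns
# are normally garbage-collected with it through ownerReferences
kubectl delete certificate api-tls -n prod
# Extra cleanup; these selectors match only if the resources carry this label
kubectl delete certificaterequest -n prod -l cert-manager.io/certificate-name=api-tls
kubectl delete order -n prod -l cert-manager.io/certificate-name=api-tls
kubectl delete challenge -n prod -l cert-manager.io/certificate-name=api-tls
# Remove the stored key pair so issuance starts from scratch
kubectl delete secret api-tls-secret -n prod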

# Recreate the Certificate resource
kubectl apply -f certificate.yaml

# Watch it progress
kubectl get certificate api-tls -n prod -w

Monitoring cert-manager Health

# Quick health check
cmctl check api

# Check all certificates across cluster
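# (one quoted argument: indented continuation lines would split the column list,
#  and the Ready condition is selected by type rather than by position)
kubectl get certificates -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,EXPIRY:.status.notAfter,RENEWAL:.status.renewalTime'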

# Prometheus metrics (if enabled)
# certmanager_certificate_ready_status{condition="False"} — certificates not ready
# certmanager_certificate_expiration_timestamp_seconds — expiry timestamps
# certmanager_http_acme_client_request_count{status="error"} — ACME errors

FAQ

Q: How long should I wait before assuming something is stuck?

HTTP-01 challenges typically complete in 1-5 minutes. DNS-01 can take 5-15 minutes (DNS propagation). If a Certificate has been “not ready” for more than 15 minutes, something is wrong — start debugging.
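
That 15-minute threshold can be encoded with kubectl wait instead of watching by hand. The sketch below assumes the default timeout of 15m taken from the guidance above; wait_cert_ready is a hypothetical helper name.

```shell
# Sketch: block until the Certificate reports Ready or the timeout passes.
# wait_cert_ready is a hypothetical helper name; timeout defaults to 15m.
wait_cert_ready() {
  cert="$1"; ns="$2"; timeout="${3:-15m}"
  if kubectl wait --for=condition=Ready "certificate/${cert}" \
       -n "$ns" --timeout="$timeout"; then
    echo "ready"
  else
    echo "not ready after ${timeout}; start at the Certificate and work down"
    return 1
  fi
}

# Usage (needs cluster access):
#   wait_cert_ready api-tls prod
```

The non-zero return code makes the helper usable as a gate in a deploy script.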

Q: Can I use cmctl to debug?

Yes — cmctl (cert-manager CLI) is invaluable:

cmctl status certificate api-tls -n prod  # Shows full status chain
cmctl renew api-tls -n prod               # Force renewal
cmctl check api                           # Verify cert-manager is healthy

Q: Why does my certificate keep re-issuing every few minutes?

Usually a conflict: something is deleting the Secret (another controller, Helm upgrade, ArgoCD sync). Check if the Secret has ownerReferences pointing to the Certificate. Also check if renewBefore is set too close to duration.

Q: How do I debug in a cluster where I can’t access external URLs?

For HTTP-01: the solver pod must be reachable from the internet. If your cluster is private, use DNS-01 instead (doesn’t require inbound connectivity). For DNS-01: cert-manager needs outbound access to your DNS provider’s API.

Q: cert-manager webhook is CrashLooping. What do I do?

kubectl logs -n cert-manager deployment/cert-manager-webhook
# Common causes: TLS certificate for webhook expired, resource limits too low
# Fix: delete the webhook secret and restart
kubectl delete secret cert-manager-webhook-ca -n cert-manager
kubectl rollout restart deployment cert-manager-webhook -n cert-manager

Q: I hit Let’s Encrypt rate limits. Now what?

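Wait. Most Let’s Encrypt limits, such as certificates per registered domain, are counted over a 7-day window, while the failed-validation limit is counted per hour. In the meantime: (1) switch to the staging issuer for testing, (2) consolidate domains into fewer certificates, (3) check whether you’re accidentally creating duplicate Orders.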

