QCecuring - Enterprise Security Solutions

Certificate Expiry Monitoring with Prometheus & Grafana: Complete Setup

DevOps 15 May, 2026 · 05 Mins read

Set up certificate expiry monitoring using Prometheus exporters (x509-certificate-exporter, Blackbox exporter, cert-manager metrics), PromQL alerting rules, Grafana dashboards, and AlertManager notifications for Slack and PagerDuty.


Certificate expiry is the number one cause of preventable outages. Every year, major services go down because a certificate expired without anyone noticing — not because the team was negligent, but because monitoring wasn’t in place or didn’t cover every certificate in the environment.

Prometheus and Grafana give you a battle-tested stack for certificate expiry monitoring. This guide covers three approaches: the x509-certificate-exporter for file-based and Kubernetes certificates, the Blackbox exporter for probing external endpoints, and cert-manager’s built-in metrics. You’ll get working configs for each, PromQL queries, Grafana dashboards, and AlertManager rules that notify your team at 30, 14, and 7 days before expiry.


Monitoring Architecture Overview

Flowchart showing top-down process flow


Option 1: x509-certificate-exporter

The x509-certificate-exporter is a lightweight Prometheus exporter purpose-built for certificate monitoring. It watches certificate files on disk, Kubernetes TLS secrets, and kubeconfig files.

Installation via Helm (Kubernetes)

helm repo add enix https://charts.enix.io
helm repo update

helm install x509-exporter enix/x509-certificate-exporter \
  --namespace monitoring \
  --create-namespace \
  --set secretsExporter.enabled=true \
  --set hostPathsExporter.daemonSets[0].watchFiles[0]=/etc/ssl/certs \
  --set hostPathsExporter.daemonSets[0].watchFiles[1]=/etc/pki/tls/certs

Installation as Standalone Binary

# Download latest release
curl -LO https://github.com/enix/x509-certificate-exporter/releases/latest/download/x509-certificate-exporter-linux-amd64.tar.gz
tar xzf x509-certificate-exporter-linux-amd64.tar.gz
sudo mv x509-certificate-exporter /usr/local/bin/

# Run with watched paths
x509-certificate-exporter \
  --watch-file=/etc/ssl/certs/server.crt \
  --watch-file=/etc/letsencrypt/live/*/fullchain.pem \
  --listen-address=:9793

Systemd Service

# /etc/systemd/system/x509-certificate-exporter.service
[Unit]
Description=x509 Certificate Exporter
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/x509-certificate-exporter \
  --watch-file=/etc/ssl/certs/server.crt \
  --watch-file=/etc/letsencrypt/live/*/fullchain.pem \
  --watch-file=/etc/pki/tls/certs/*.crt \
  --listen-address=:9793
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable --now x509-certificate-exporter

Key Metrics

MetricDescription
x509_cert_not_afterCertificate expiry timestamp (Unix epoch)
x509_cert_not_beforeCertificate valid-from timestamp
x509_cert_expired1 if expired, 0 if valid
x509_cert_valid_since_secondsSeconds since certificate became valid
x509_cert_expires_in_secondsSeconds until expiry

Prometheus Scrape Config

# prometheus.yml
scrape_configs:
  - job_name: 'x509-certificate-exporter'
    static_configs:
      - targets: ['localhost:9793']
    scrape_interval: 60s
    scrape_timeout: 30s

Option 2: Blackbox Exporter (External Endpoint Probing)

The Blackbox exporter probes HTTPS endpoints and exposes the probe_ssl_earliest_cert_expiry metric — the Unix timestamp of the earliest certificate expiry in the chain.

Installation

# Download
curl -LO https://github.com/prometheus/blackbox_exporter/releases/latest/download/blackbox_exporter-0.25.0.linux-amd64.tar.gz
tar xzf blackbox_exporter-*.tar.gz
sudo mv blackbox_exporter-*/blackbox_exporter /usr/local/bin/

Blackbox Exporter Configuration

# /etc/blackbox_exporter/config.yml
modules:
  https_cert_check:
    prober: http
    timeout: 10s
    http:
      method: GET
      preferred_ip_protocol: ip4
      tls_config:
        insecure_skip_verify: false
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [200, 301, 302, 403]
      fail_if_ssl: false

Prometheus Scrape Config for Blackbox

# prometheus.yml
scrape_configs:
  - job_name: 'blackbox-ssl'
    metrics_path: /probe
    params:
      module: [https_cert_check]
    static_configs:
      - targets:
          - https://app.example.com
          - https://api.example.com
          - https://admin.example.com
          - https://mail.example.com
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115  # Blackbox exporter address
    scrape_interval: 300s  # Every 5 minutes is sufficient for cert checks

Key Metric

# Days until certificate expires
(probe_ssl_earliest_cert_expiry - time()) / 86400

Option 3: cert-manager Metrics (Kubernetes)

If you’re running cert-manager in Kubernetes, it already exposes certificate metrics. No additional exporter needed.

Enable Prometheus Metrics

# cert-manager Helm values
prometheus:
  enabled: true
  servicemonitor:
    enabled: true
    namespace: monitoring
    labels:
      release: prometheus

Or if cert-manager is already installed:

helm upgrade cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --set prometheus.enabled=true \
  --set prometheus.servicemonitor.enabled=true

Key Metrics

MetricDescription
certmanager_certificate_expiration_timestamp_secondsExpiry timestamp per certificate
certmanager_certificate_ready_status1 if certificate is ready, 0 if not
certmanager_certificate_renewal_timestamp_secondsWhen cert-manager will attempt renewal

Prometheus Scrape Config (if not using ServiceMonitor)

scrape_configs:
  - job_name: 'cert-manager'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: ['cert-manager']
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: cert-manager
        action: keep
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port]
        target_label: __address__
        regex: (.+)
        replacement: ${1}:9402

PromQL Queries for Certificate Monitoring

Days Until Expiry (x509-exporter)

# Days remaining for each certificate
x509_cert_expires_in_seconds / 86400

Days Until Expiry (Blackbox)

# Days remaining for probed endpoints
(probe_ssl_earliest_cert_expiry - time()) / 86400

Days Until Expiry (cert-manager)

# Days remaining for Kubernetes certificates
(certmanager_certificate_expiration_timestamp_seconds - time()) / 86400

Certificates Expiring Within 30 Days

# x509-exporter
x509_cert_expires_in_seconds < 2592000

# Blackbox
(probe_ssl_earliest_cert_expiry - time()) < 2592000

# cert-manager
(certmanager_certificate_expiration_timestamp_seconds - time()) < 2592000

Already Expired Certificates

x509_cert_expired == 1

AlertManager Rules

Create alert rules that fire at 30, 14, and 7 days before expiry:

# /etc/prometheus/rules/certificate-alerts.yml
groups:
  - name: certificate_expiry
    rules:
      - alert: CertificateExpiring30Days
        expr: |
          (x509_cert_expires_in_seconds < 2592000 and x509_cert_expires_in_seconds > 0)
          or
          ((probe_ssl_earliest_cert_expiry - time()) < 2592000 and (probe_ssl_earliest_cert_expiry - time()) > 0)
          or
          ((certmanager_certificate_expiration_timestamp_seconds - time()) < 2592000 and (certmanager_certificate_expiration_timestamp_seconds - time()) > 0)
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate expiring within 30 days"
          description: "Certificate {{ $labels.filepath }}{{ $labels.instance }}{{ $labels.name }} expires in {{ $value | humanizeDuration }}."

      - alert: CertificateExpiring14Days
        expr: |
          (x509_cert_expires_in_seconds < 1209600 and x509_cert_expires_in_seconds > 0)
          or
          ((probe_ssl_earliest_cert_expiry - time()) < 1209600 and (probe_ssl_earliest_cert_expiry - time()) > 0)
          or
          ((certmanager_certificate_expiration_timestamp_seconds - time()) < 1209600 and (certmanager_certificate_expiration_timestamp_seconds - time()) > 0)
        for: 1h
        labels:
          severity: high
        annotations:
          summary: "Certificate expiring within 14 days"
          description: "Certificate {{ $labels.filepath }}{{ $labels.instance }}{{ $labels.name }} expires in {{ $value | humanizeDuration }}. Immediate renewal required."

      - alert: CertificateExpiring7Days
        expr: |
          (x509_cert_expires_in_seconds < 604800 and x509_cert_expires_in_seconds > 0)
          or
          ((probe_ssl_earliest_cert_expiry - time()) < 604800 and (probe_ssl_earliest_cert_expiry - time()) > 0)
          or
          ((certmanager_certificate_expiration_timestamp_seconds - time()) < 604800 and (certmanager_certificate_expiration_timestamp_seconds - time()) > 0)
        for: 30m
        labels:
          severity: critical
        annotations:
          summary: "Certificate expiring within 7 days — CRITICAL"
          description: "Certificate {{ $labels.filepath }}{{ $labels.instance }}{{ $labels.name }} expires in {{ $value | humanizeDuration }}. Service outage imminent if not renewed."

      - alert: CertificateExpired
        expr: |
          x509_cert_expired == 1
          or
          (probe_ssl_earliest_cert_expiry - time()) <= 0
          or
          (certmanager_certificate_expiration_timestamp_seconds - time()) <= 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Certificate has EXPIRED"
          description: "Certificate {{ $labels.filepath }}{{ $labels.instance }}{{ $labels.name }} has expired. Service is likely experiencing TLS failures."

AlertManager Configuration: Slack & PagerDuty

# /etc/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  slack_api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
  pagerduty_url: 'https://events.pagerduty.com/v2/enqueue'

route:
  group_by: ['alertname', 'instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'slack-warnings'
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-critical'
      repeat_interval: 1h
    - match:
        severity: high
      receiver: 'slack-high'
      repeat_interval: 2h
    - match:
        severity: warning
      receiver: 'slack-warnings'
      repeat_interval: 12h

receivers:
  - name: 'slack-warnings'
    slack_configs:
      - channel: '#cert-alerts'
        title: '⚠️ Certificate Expiry Warning'
        text: |
          *Alert:* {{ .GroupLabels.alertname }}
          *Instance:* {{ .CommonLabels.instance }}
          *Description:* {{ .CommonAnnotations.description }}
          *Severity:* {{ .CommonLabels.severity }}
        send_resolved: true

  - name: 'slack-high'
    slack_configs:
      - channel: '#cert-alerts'
        title: '🔶 Certificate Expiry — Action Required'
        text: |
          *Alert:* {{ .GroupLabels.alertname }}
          *Instance:* {{ .CommonLabels.instance }}
          *Description:* {{ .CommonAnnotations.description }}
        send_resolved: true

  - name: 'pagerduty-critical'
    pagerduty_configs:
      - service_key: 'YOUR_PAGERDUTY_SERVICE_KEY'
        severity: critical
        description: '{{ .CommonAnnotations.summary }}'
        details:
          instance: '{{ .CommonLabels.instance }}'
          description: '{{ .CommonAnnotations.description }}'

Grafana Dashboard Setup

Import Pre-Built Dashboard

The community dashboard ID 13922 provides a ready-made certificate monitoring view for the x509-certificate-exporter:

  1. Open Grafana → DashboardsImport
  2. Enter dashboard ID: 13922
  3. Select your Prometheus data source
  4. Click Import

Custom Dashboard Panels

For a unified view across all three exporters, create custom panels:

Panel 1: Certificate Expiry Table

# Combine all sources into a single table
sort_desc(
  label_replace(x509_cert_expires_in_seconds / 86400, "source", "file", "", "")
  or
  label_replace((probe_ssl_earliest_cert_expiry - time()) / 86400, "source", "endpoint", "", "")
  or
  label_replace((certmanager_certificate_expiration_timestamp_seconds - time()) / 86400, "source", "k8s", "", "")
)

Panel 2: Certificates Expiring Soon (Stat Panel)

count(x509_cert_expires_in_seconds < 2592000 and x509_cert_expires_in_seconds > 0)
+
count((probe_ssl_earliest_cert_expiry - time()) < 2592000 and (probe_ssl_earliest_cert_expiry - time()) > 0)
+
count((certmanager_certificate_expiration_timestamp_seconds - time()) < 2592000 and (certmanager_certificate_expiration_timestamp_seconds - time()) > 0)

Panel 3: Expiry Timeline (Time Series)

x509_cert_expires_in_seconds / 86400

Set thresholds: Red < 7, Orange < 14, Yellow < 30, Green > 30.


Complete Docker Compose Stack

For a quick local setup or testing environment:

# docker-compose.yml
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./rules:/etc/prometheus/rules
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'

  alertmanager:
    image: prom/alertmanager:latest
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-data:/var/lib/grafana

  blackbox-exporter:
    image: prom/blackbox-exporter:latest
    ports:
      - "9115:9115"
    volumes:
      - ./blackbox.yml:/etc/blackbox_exporter/config.yml

  x509-exporter:
    image: enix/x509-certificate-exporter:latest
    ports:
      - "9793:9793"
    volumes:
      - /etc/ssl/certs:/certs:ro
    command:
      - --watch-dir=/certs
      - --listen-address=:9793

volumes:
  grafana-data:

Start the stack:

docker compose up -d

FAQ

Q: How often should Prometheus scrape certificate metrics?

For certificate expiry monitoring, every 60-300 seconds is sufficient. Certificates don’t change frequently, and scraping too often wastes resources. The Blackbox exporter can run at 5-minute intervals without issue. The x509-exporter watches file changes and updates metrics immediately, so even a 60-second scrape interval catches renewals quickly.

Q: Can I monitor certificates on Windows servers with this stack?

Yes. The x509-certificate-exporter has a Windows binary. Run it as a Windows service pointing to your certificate files, and configure Prometheus to scrape it. Alternatively, use the Blackbox exporter to probe Windows IIS endpoints externally.

Q: What’s the difference between x509-certificate-exporter and Blackbox exporter?

The x509-certificate-exporter reads certificate files directly (on disk or from Kubernetes secrets). It sees all certificates regardless of whether they’re actively serving traffic. The Blackbox exporter probes live HTTPS endpoints — it only sees certificates that are actually bound and serving. Use both: x509-exporter for internal/file-based certs, Blackbox for external endpoint validation.

Q: How do I monitor certificates behind a load balancer?

Use the Blackbox exporter pointed at the load balancer’s public endpoint. It will see whatever certificate the LB presents. For backend certificates (between LB and origin), use the x509-certificate-exporter on the backend servers, or probe internal endpoints if the Blackbox exporter has network access.

Q: Will these alerts fire during certificate renewal (brief overlap period)?

The for clause in alert rules prevents transient alerts. A 1-hour for duration means the condition must persist for a full hour before firing. During automated renewal (e.g., cert-manager or Certbot), the new certificate is typically in place within minutes, so the alert never fires.

Q: How do I handle alert fatigue from certificates I can’t control (third-party)?

Add label-based silences in AlertManager for known third-party certificates, or create a separate route with longer repeat_interval. You can also use recording rules to exclude specific instances from alert evaluation.


Related Reading:

Certificate Monitoring Checklist

Ensure you're monitoring every certificate in your infrastructure — internal, external, and container-based.

Get the Checklist

Related Insights

SSL/TLS

Fix 'The Certificate Chain Could Not Be Built to a Trusted Root Authority'

Fix the Windows certificate chain trust error. Covers missing root CA, intermediate certificate gaps, AIA/CDP issues, GPO trust distribution, and manual import — with certutil verification commands.

By Shivam sharma

15 May, 2026 · 06 Mins read

SSL/TLSTroubleshootingPKI

SSL/TLS

Fix 'Unable to Get Local Issuer Certificate' (OpenSSL, curl, Git, npm)

Fix the 'unable to get local issuer certificate' error in OpenSSL, curl, Git, npm, pip, and Docker. Covers missing CA bundles, corporate proxies, and trust store configuration for every platform.

By Sneha gupta

15 May, 2026 · 07 Mins read

SSL/TLSTroubleshootingDevOps

SSL/TLS

Fix 'PKIX Path Building Failed' in Java: Every Cause & Solution

Fix the PKIX path building failed error in Java. Covers keytool import, cacerts configuration, corporate proxies, Spring Boot, Maven/Gradle builds, and Docker containers — without disabling certificate validation.

By Shivam sharma

15 May, 2026 · 06 Mins read

SSL/TLSTroubleshootingDevOps

Ready to Secure Your Enterprise?

Experience how our cryptographic solutions simplify, centralize, and automate identity management for your entire organization.

Stay ahead on cryptography & PKI

Get monthly insights on certificate management, post-quantum readiness, and enterprise security. No spam.

We respect your privacy. Unsubscribe anytime.