
kube-prometheus-stack

Prometheus Deployment & Operations Guide


1. Overview

kube-prometheus-stack delivers production-ready Kubernetes monitoring built on the Prometheus Operator.

It provides:

  • Cluster metrics collection

  • Alerting and rule management

  • Pre-built Grafana dashboards

  • Node and workload monitoring

This Helm chart replaces the former prometheus-operator chart. The new name reflects that it installs the full monitoring stack, not only the operator.


2. What Is Included

By default, the stack deploys:

  • Prometheus Operator

  • Prometheus

  • Alertmanager

  • Grafana

  • kube-state-metrics

  • node-exporter

Not included:

  • Prometheus Adapter

  • Blackbox Exporter


3. Architecture Summary

The monitoring flow works like this:

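Exporters → Prometheus → Alertmanager (alert notifications)
Prometheus → Grafana (dashboard queries)

  • Exporters (node-exporter, kube-state-metrics) expose metrics.

  • Prometheus scrapes and stores metrics and evaluates alerting rules.

  • Alertmanager groups, deduplicates, and routes the alerts that Prometheus fires.

  • Grafana queries Prometheus and visualizes the results.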

Think of it as a health monitoring system for your cluster.


4. Prerequisites

Requirement    Version
Kubernetes     1.19+
Helm           3+


5. Installation

Install via OCI Registry (Recommended)

CODE
helm install <release-name> \
  oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace

View Configurable Values

CODE
helm show values oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack
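After installing, it can help to confirm that the release deployed and that its pods are starting. The helper below is a minimal sketch: the function name check_stack is hypothetical, and it assumes the monitoring namespace used in the install command above unless another one is passed.

```shell
# Hypothetical helper: summarise a release after `helm install`.
# $1 = release name, $2 = namespace (defaults to "monitoring").
check_stack() (
  release="$1"
  namespace="${2:-monitoring}"
  # Release state as recorded by Helm (deployed, failed, ...).
  helm status "$release" --namespace "$namespace"
  # Pods in the namespace; all of them should eventually become Ready.
  kubectl get pods --namespace "$namespace"
)
```

Usage: check_stack <release-name> monitoring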

6. High Availability (HA)

For production environments, enable multiple Prometheus replicas.

Example configuration:

CODE
prometheus:
  prometheusSpec:
    replicas: 2
    podAntiAffinity: "hard"
    externalLabels:
      cluster: prod-cluster

Important:

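  • Always use anti-affinity so that replicas are scheduled on different nodes.

  • Do not remove the replica external label (prometheus_replica by default); it distinguishes the series written by each replica.

  • For global query deduplication, use Thanos.

HA improves uptime but does not deduplicate samples: each replica scrapes the same targets independently and keeps its own copy of the data.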
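One way to check that anti-affinity took effect is to count the distinct nodes that host Prometheus pods. This is a sketch under assumptions: the function name is hypothetical, and the label selector app.kubernetes.io/name=prometheus is assumed to match your Prometheus pods, so adjust it if your pods are labelled differently.

```shell
# Hypothetical helper: count the distinct nodes hosting Prometheus pods.
# $1 = namespace (defaults to "monitoring").
# The label selector is an assumption; verify it against your pods.
prometheus_node_count() (
  namespace="${1:-monitoring}"
  kubectl get pods --namespace "$namespace" \
    --selector app.kubernetes.io/name=prometheus \
    --output custom-columns=NODE:.spec.nodeName --no-headers |
    sort -u | wc -l | tr -d ' '
)
```

With replicas: 2 and hard anti-affinity, the function should print 2. A result of 1 would mean both replicas share a node.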


7. Grafana Dashboards

  • Pre-configured dashboards are deployed automatically.

  • Dashboards are stored as Kubernetes ConfigMaps and loaded into Grafana by a sidecar container.

  • They are sourced from upstream monitoring mixins.

  • Custom dashboards can be added through Helm values.


8. Upgrading

CODE
helm upgrade <release-name> <chart>

Note:

  • Helm v3 does not upgrade CRDs during helm upgrade; apply updated CRDs manually when a release requires them.

  • Major version upgrades may require manual steps.

  • Review release notes before upgrading.
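Because Helm leaves CRDs untouched, a manual CRD update might look like the sketch below. The function name is hypothetical, and the version argument is a placeholder for the prometheus-operator tag named in the release notes of the chart version you are upgrading to. The CRD list is assumed to match recent operator releases; older releases ship fewer CRDs, so trim the list as needed.

```shell
# Hypothetical helper: apply the operator CRDs for one operator version.
# $1 = prometheus-operator git tag (placeholder; take it from release notes).
apply_operator_crds() (
  version="$1"
  base="https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/${version}/example/prometheus-operator-crd"
  # CRD list assumed from recent operator releases.
  for crd in alertmanagerconfigs alertmanagers podmonitors probes \
             prometheusagents prometheuses prometheusrules scrapeconfigs \
             servicemonitors thanosrulers; do
    # Server-side apply avoids the annotation size limit that
    # client-side apply can hit with large CRDs.
    kubectl apply --server-side --force-conflicts \
      -f "${base}/monitoring.coreos.com_${crd}.yaml"
  done
)
```

Run it before helm upgrade, for example: apply_operator_crds <operator-version>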


9. Uninstalling

CODE
helm uninstall <release-name>

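CRDs are not removed automatically.

Delete them manually if required. Deleting a CRD also deletes every custom resource of that type in the cluster. The commands below cover three commonly used CRDs; the chart installs others in the monitoring.coreos.com group (for example prometheuses and alertmanagers).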

CODE
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
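To see which operator CRDs actually remain before deleting anything, the cluster can be queried by API group. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: list every CRD in the monitoring.coreos.com group.
list_monitoring_crds() {
  kubectl get crd --output name | grep 'monitoring\.coreos\.com$'
}
```

Each line of output has the form customresourcedefinition.apiextensions.k8s.io/<name>, which kubectl delete accepts directly.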

10. Running Multiple Instances

You may deploy multiple Prometheus instances in one cluster.

Only one Prometheus Operator should run per cluster, so secondary releases should also set prometheusOperator.enabled to false.

Disable shared components for secondary releases:

CODE
kubeStateMetrics:
  enabled: false
nodeExporter:
  enabled: false
grafana:
  enabled: false

11. Private Cluster Considerations

In private clusters (e.g., private GKE):

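  • The control plane may not be able to reach the operator's admission webhooks.

  • Add firewall rules that allow the control plane to reach the webhook port.

Alternatively, disable the admission webhooks. PrometheusRule resources are then no longer validated on admission: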

CODE
prometheusOperator:
  admissionWebhooks:
    enabled: false

12. Persistent Volume Migration

To migrate without losing metrics:

  1. Patch PV reclaim policy to Retain.

  2. Delete PVC.

  3. Remove claimRef from PV.

  4. Reinstall stack with matching:

    • Storage size

    • Access mode

    • Storage class

    • Availability zone

All values must match exactly for successful re-binding.
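Steps 1 to 3 can be sketched with kubectl as follows. The function names are hypothetical, and the PV and PVC names must be looked up in your cluster first (for example with kubectl get pv and kubectl get pvc). The sketch assumes the pods using the claim have already been removed; otherwise the PVC deletion waits until they are gone.

```shell
# Step 1: keep the volume (and its data) when the claim is deleted.
# $1 = PersistentVolume name.
retain_pv() {
  kubectl patch pv "$1" \
    -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}

# Step 2: delete the claim; the PV then moves to the Released phase.
# $1 = namespace, $2 = PersistentVolumeClaim name.
delete_pvc() {
  kubectl delete pvc "$2" --namespace "$1"
}

# Step 3: remove the stale claimRef so the PV becomes Available again.
# $1 = PersistentVolume name.
release_pv() {
  kubectl patch pv "$1" --type json \
    -p '[{"op":"remove","path":"/spec/claimRef"}]'
}
```

Once the PV reports Available, reinstalling the stack with matching storage settings lets the new claim bind to it.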


13. ServiceMonitor & PodMonitor Discovery

By default, Prometheus discovers monitors:

  • In its namespace

  • Matching its release label

To discover all monitors in the namespace, regardless of their labels:

CODE
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
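For comparison, a ServiceMonitor that the default selector would pick up carries the release label. The sketch below prints such a manifest; the function name is hypothetical, and the app label and the metrics port name are assumptions about the target Service, not values the chart defines.

```shell
# Hypothetical helper: print a ServiceMonitor labelled for default discovery.
# $1 = monitor name, $2 = Helm release name of the stack, $3 = namespace.
render_service_monitor() {
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: $1
  namespace: $3
  labels:
    release: $2
spec:
  selector:
    matchLabels:
      app: $1          # assumed label on the target Service
  endpoints:
    - port: metrics    # assumed name of the Service port serving metrics
EOF
}
```

Usage: render_service_monitor my-app <release-name> monitoring | kubectl apply -f -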

14. Known Operational Considerations

kube-proxy Metrics

By default, kube-proxy binds its metrics endpoint to localhost, so Prometheus cannot scrape it. Default bind address:

CODE
127.0.0.1:10249

To enable scraping, bind the endpoint to all interfaces:

CODE
0.0.0.0:10249

Update the metricsBindAddress field in the kube-proxy ConfigMap via:

CODE
kubectl -n kube-system edit cm kube-proxy
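Running kube-proxy pods do not pick up the edited ConfigMap on their own. The sketch below checks the stored value and restarts kube-proxy; the function names are hypothetical, and it assumes kube-proxy runs as a DaemonSet named kube-proxy in kube-system, as in kubeadm-based clusters. Managed clusters may differ.

```shell
# Show the bind address currently stored in the kube-proxy ConfigMap.
kube_proxy_metrics_address() {
  kubectl --namespace kube-system get configmap kube-proxy --output yaml |
    grep metricsBindAddress
}

# Restart kube-proxy so its pods re-read the edited ConfigMap.
# Assumes a DaemonSet named "kube-proxy" (kubeadm layout).
restart_kube_proxy() {
  kubectl --namespace kube-system rollout restart daemonset kube-proxy
}
```

Calling kube_proxy_metrics_address before and after the edit shows whether the change was saved.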

15. Migration from Older Charts

Zero-downtime migration from stable/prometheus-operator is possible using:

CODE
helm upgrade prometheus-operator \
  prometheus-community/kube-prometheus-stack \
  --reuse-values \
  --set nameOverride=prometheus-operator

To rename the release entirely, follow the persistent volume migration process in section 12.
