Oct 4, 2024

Prometheus Kubernetes Architecture: Complete Overview

The entire monitoring stack can be deployed using a Helm chart called kube-prometheus-stack. Let's break down the Prometheus architecture and see how these different components come together to provide a powerful monitoring solution.

The Foundation: Prometheus Operator

It all starts with the Prometheus Operator. The Prometheus Operator extends the functionality of Kubernetes by watching for custom resources that essentially tell it what to do. The Prometheus custom resource signals the Prometheus Operator to create and manage a Prometheus server within the cluster.
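
To make the idea of a custom resource a bit more concrete, here is a minimal sketch of what a Prometheus custom resource might look like, built as a plain Python dictionary and printed as JSON (which `kubectl apply -f` accepts as well as YAML). The resource name, namespace, and the `release` label value are assumptions for illustration; check what your own kube-prometheus-stack installation actually uses.

```python
import json


def prometheus_manifest(name: str, namespace: str, release_label: str) -> dict:
    """Build a Prometheus custom resource as a plain dict.

    All argument values are illustrative assumptions, not chart defaults.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "Prometheus",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": 1,
            # Only ServiceMonitors carrying this label are picked up.
            "serviceMonitorSelector": {"matchLabels": {"release": release_label}},
        },
    }


manifest = prometheus_manifest("example", "monitoring", "my-release")
print(json.dumps(manifest, indent=2))
```

The operator watches for resources of this kind and creates or updates the Prometheus server to match.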

Core Components

Prometheus Server

The Prometheus server is responsible for scraping metrics from its targets and storing all of that metric data in a time-series database.

Exporters

Who's going to be exposing this metric data for Prometheus to scrape? Exporters:

  • Node Exporter: Extracts system-level metrics from every node in your cluster, such as CPU usage and memory consumption

  • kube-state-metrics: Collects metrics about the current status and health of Kubernetes resources like pods, deployments, and services

Metric Collection Flow

Both Node Exporter and kube-state-metrics expose metrics at specific endpoints. However, for Prometheus to scrape these metrics, we need to establish a connection between these exporters and the Prometheus server.
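
What an exporter serves at its endpoint is plain text, one sample per line. The sketch below parses a small, made-up excerpt in that text format. The metric names are real Node Exporter metrics, but the values are invented, and the parser is deliberately simplified: it ignores timestamps and escaped quotes inside label values.

```python
import re

# Invented sample values in the Prometheus text exposition format.
SAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_cpu_seconds_total{cpu="0",mode="user"} 67.25
node_memory_MemAvailable_bytes 2147483648
"""

LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text: str) -> list:
    """Return (metric name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and blank lines carry no samples
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a sample line this simplified parser understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


samples = parse_samples(SAMPLE)
for name, labels, value in samples:
    print(name, labels, value)
```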

ServiceMonitors are custom resources that signal the Prometheus Operator to reconfigure the Prometheus server and tell it exactly where to find the endpoints it needs to scrape. Once Prometheus knows which Services to target, it uses the Kubernetes API to discover the endpoints behind them and then scrapes those endpoints over HTTP.
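
Here is a rough sketch of a ServiceMonitor, again as a Python dictionary, together with a tiny helper that mimics how a `matchLabels` selector picks out a Service. The names, namespace, port name, and label values are all hypothetical, and the helper only illustrates the matching rule; the real matching is done by the operator and Prometheus, not by code like this.

```python
def service_monitor(name, namespace, app_label, port_name, release_label):
    """Build a ServiceMonitor custom resource as a plain dict (illustrative values)."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Label the Prometheus resource is assumed to select on.
            "labels": {"release": release_label},
        },
        "spec": {
            # Which Services to scrape: those carrying this label.
            "selector": {"matchLabels": {"app": app_label}},
            # Which named Service port and path to scrape, and how often.
            "endpoints": [{"port": port_name, "path": "/metrics", "interval": "30s"}],
        },
    }


def matches(selector: dict, labels: dict) -> bool:
    """True when every matchLabels pair is present in the given labels."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())


monitor = service_monitor("flask-app", "default", "flask-app", "http", "my-release")
selected = matches(monitor["spec"]["selector"], {"app": "flask-app", "tier": "backend"})
skipped = matches(monitor["spec"]["selector"], {"app": "other-app"})
print(selected, skipped)
```

Extra labels on the Service do not matter; only the pairs listed in the selector have to be present.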

As Prometheus scrapes metrics from various targets, it stores this data in its internal Time Series Database (TSDB).

Visualization with Grafana

Grafana connects to Prometheus and utilizes PromQL, Prometheus's query language, to retrieve metrics and create insightful visualizations. This setup allows us to monitor both system-level metrics collected by the Node Exporter and Kubernetes-related metrics from kube-state-metrics.
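
Under the hood, a dashboard panel boils down to a PromQL query sent to the Prometheus HTTP API. The sketch below builds the URL for an instant query against the `/api/v1/query` endpoint and reads values out of a response body. The base URL is a placeholder, the response is a hand-written example in the API's JSON shape, and no network call is made.

```python
import json
from urllib.parse import urlencode

PROMETHEUS_URL = "http://prometheus.example:9090"  # hypothetical address

# Approximate CPU usage percentage per node, from Node Exporter metrics.
CPU_USAGE = '100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def values_by_instance(response_body: str) -> dict:
    """Map each instance label to its sample value from an instant-vector response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in payload["data"]["result"]
    }


# Hand-written example response; the numbers are invented.
EXAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "node-1:9100"}, "value": [1728000000, "12.5"]},
            {"metric": {"instance": "node-2:9100"}, "value": [1728000000, "48.0"]},
        ],
    },
})

url = instant_query_url(PROMETHEUS_URL, CPU_USAGE)
usage = values_by_instance(EXAMPLE_RESPONSE)
print(url)
print(usage)
```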

Application Monitoring

We don't want to limit ourselves to system metrics; we also want to monitor our own applications. Let's say you deploy a Flask application. You can:

  1. Instrument the application to expose metrics at a particular path

  2. Deploy your own ServiceMonitor that signals the Prometheus Operator how to reconfigure Prometheus to discover your application's metrics endpoint and collect your application's metrics

  3. Create dashboards in Grafana that visualize these application-specific metrics

Alert Management

Collecting and visualizing metrics is essential, but being proactively notified about issues is just as critical.

The alert management system consists of:

  • Alertmanager custom resource: Signals the Prometheus Operator to create and manage an Alertmanager instance

  • PrometheusRule custom resource: Sets conditions for when alerts should be fired, such as when CPU usage reaches a certain level

Prometheus receives these rules via the operator, and when it detects that a certain threshold has been reached, it notifies Alertmanager, which sends alerts to the right channels like email, Slack, or PagerDuty based on predefined settings.
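
As a sketch of the CPU example, this is roughly what a PrometheusRule could look like, once more as a Python dictionary printed as JSON. The alert name, the 80% threshold, the five-minute duration, the severity, and the `release` label are all assumptions chosen for illustration; the expression relies on the `node_cpu_seconds_total` metric from Node Exporter.

```python
import json

# Fires when average non-idle CPU on a node stays above 80%.
CPU_EXPR = (
    '100 - (avg by (instance) '
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80'
)


def cpu_alert_rule(name: str, namespace: str, release_label: str) -> dict:
    """Build a PrometheusRule custom resource as a plain dict (illustrative values)."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {"release": release_label},
        },
        "spec": {
            "groups": [
                {
                    "name": "node.rules",
                    "rules": [
                        {
                            "alert": "HighCpuUsage",
                            "expr": CPU_EXPR,
                            # The condition must hold this long before the alert fires.
                            "for": "5m",
                            "labels": {"severity": "warning"},
                            "annotations": {
                                "summary": "CPU usage above 80% on {{ $labels.instance }}"
                            },
                        }
                    ],
                }
            ]
        },
    }


rule = cpu_alert_rule("node-cpu", "monitoring", "my-release")
print(json.dumps(rule, indent=2))
```

The labels attached to the alert, such as `severity` here, are what Alertmanager routing typically keys on when deciding which channel receives the notification.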

That's all, folks! I hope you enjoyed this Prometheus overview. If you enjoy my teaching style, make sure to check out our Kubernetes course, and I'll see you in the next one.

Kubernetes Training

If you found these guides helpful, check out The Complete Kubernetes Training course

Let’s keep in touch

Subscribe to the mailing list and receive the latest updates
