Oct 4, 2024

Prometheus Architecture in Kubernetes

Prometheus is an essential part of the Kubernetes ecosystem. It provides a simple mechanism to collect valuable metrics from Kubernetes components and applications. These metrics include resource usage, application performance, and system health data. Prometheus stores this information in a time-series database. Users can extend Prometheus's capabilities with tools like Alertmanager for advanced alerting and Grafana for rich visualizations.

This article will dive into the Prometheus architecture and explore how its components work together to create a powerful cloud-native monitoring solution.

Table of Contents

  1. Data Collection: The Pull Model

  2. Why Pull, Not Push?

  3. Service Monitors

  4. Metric Data Format

  5. Storage: Time-Series Database

  6. PromQL: Prometheus Query Language

  7. Alerting and Visualization

  8. Conclusion

1. Data Collection: The Pull Model

Prometheus is designed to pull metric data from Kubernetes components and applications.

Pulling Metrics from Web APIs

Applications built with web frameworks like Flask, FastAPI, Spring Boot, NestJS, and Express.js are instrumented using Prometheus client libraries. These libraries expose a /metrics endpoint from which Prometheus can pull (or "scrape") data.
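For a Python service, a minimal sketch of such instrumentation might look like the following (using the official prometheus_client library; the metric name and labels mirror the examples later in this article):

```python
from prometheus_client import Counter, generate_latest, start_http_server

# Counter labeled by method, status, and path, matching the exposition
# format shown in the "Metric Data Format" section of this article
REQUESTS = Counter(
    "http_request_total", "Total HTTP Requests",
    ["method", "status", "path"],
)

# Inside a request handler, the application increments the counter:
REQUESTS.labels(method="GET", status="200", path="/").inc()

# generate_latest() renders the text that a /metrics endpoint serves;
# start_http_server(8000) would expose it on port 8000 in a real app.
print(generate_latest().decode())
```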

Pulling Metrics from Third-party Software

For third-party software like MongoDB, MySQL, and Elasticsearch, exporters can be used to extract metrics and translate them into the Prometheus format. These exporters act on behalf of the software to expose a /metrics endpoint that Prometheus can pull data from.

Pulling Metrics from Kubernetes Components

Aside from monitoring our deployed applications, we also need to monitor the Kubernetes infrastructure itself. Kubernetes doesn't natively expose metrics in Prometheus format, so additional components are needed. kube-state-metrics, an exporter for Kubernetes API objects, provides cluster-level metrics about the state of objects such as Deployments and Pods, while Node Exporter collects system-level metrics (CPU, memory, disk, network) from each node. Both expose /metrics endpoints for Prometheus to scrape.

Putting it Together

Whether data is made available from a web API or a Prometheus Exporter, these are considered monitoring targets from which Prometheus actively collects metrics at regular intervals.

2. Why Pull, Not Push?

Prometheus uses a pull-based model, actively fetching data from pods, rather than a push-based model where pods send data to Prometheus. The pull model is particularly advantageous because a failed scrape directly reveals that a target has become unavailable: if Prometheus can't pull data from a target, it records the target's built-in up metric as 0, marking it as down.
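A toy sketch of this behavior in Python (illustrative only; Prometheus itself records the outcome of every scrape in its built-in up metric, it does not run code like this):

```python
import urllib.request
import urllib.error

def scrape(url, timeout=5, opener=urllib.request.urlopen):
    """Pull a /metrics endpoint; a failed pull marks the target down."""
    try:
        with opener(url, timeout=timeout) as resp:
            return True, resp.read().decode()  # up = 1: scrape succeeded
    except (urllib.error.URLError, OSError):
        return False, ""                       # up = 0: target considered down

# A target that cannot be reached is detected as down by the failed pull:
def unreachable(url, timeout=None):
    raise OSError("connection refused")

print(scrape("http://flask-api:8000/metrics", opener=unreachable))
```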

A push-based model would instead require pods to notify Prometheus with a final message before termination, which isn't feasible: pod termination in Kubernetes is sudden, frequent, and often outside the application's control.

3. Service Monitors

ServiceMonitors, custom resources provided by the Prometheus Operator, allow Prometheus to find its targets. Imagine a flask-api service, labeled app: flask-api, that exposes metrics on port 8000 and path /metrics.

apiVersion: v1
kind: Service
metadata:
  name: flask-api
  labels:
    app: flask-api
spec:
  selector:
    app: flask-api
  ports:
    - name: metrics
      port: 8000
      targetPort: 8000

To monitor this service, we create a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: flask-api-monitor
spec:
  selector:
    matchLabels:
      app: flask-api
  endpoints:
    - port: metrics
      path: /metrics
      interval: 15s

This ServiceMonitor tells Prometheus to:

  • Find services labeled app: flask-api

  • Scrape the metrics port (8000) at path /metrics

  • Repeat every 15 seconds

Prometheus uses this configuration to automatically discover and monitor the flask-api service. As you scale or update your application, Prometheus adapts its monitoring without manual intervention.

4. Metric Data Format

The following example shows the Prometheus exposition format, which is how instrumented applications expose metrics for Prometheus to scrape. This particular instance is from a Flask application:

# HELP http_request_total Total HTTP Requests
# TYPE http_request_total counter
http_request_total{method="GET",status="200",path="/"} 1.0
http_request_total{method="POST",status="200",path="/items"} 1.0
http_request_total{method="GET",status="200",path="/items/1"} 1.0
http_request_total{method="PUT",status="200",path="/items/1"} 1.0
http_request_total{method="DELETE",status="200",path="/items/1"} 1.0

# HELP http_request_duration_seconds HTTP Request Duration
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{method="GET",status="200",path="/",le="0.005"} 1.0
http_request_duration_seconds_bucket{method="GET",status="200",path="/",le="0.01"} 1.0
http_request_duration_seconds_count{method="GET",status="200",path="/"} 1.0

# HELP http_requests_in_progress HTTP Requests in progress
# TYPE http_requests_in_progress gauge
http_requests_in_progress{method="GET",path="/"} 0.0

# HELP process_cpu_usage Current CPU usage in percent
# TYPE process_cpu_usage gauge
process_cpu_usage 21.3

# HELP process_memory_usage_bytes Current memory usage in bytes
# TYPE process_memory_usage_bytes gauge
process_memory_usage_bytes 19136512.0

Let's focus on one metric: http_request_total

# HELP http_request_total Total HTTP Requests
# TYPE http_request_total counter
http_request_total{method="GET",status="200",path="/"} 1.0

After four more requests, the value updates to 5.0:

# HELP http_request_total Total HTTP Requests
# TYPE http_request_total counter
http_request_total{method="GET",status="200",path="/"} 5.0

# HELP provides a description of the metric and # TYPE indicates the metric type. This metric is a counter because its value only ever increases, incrementing whenever a successful GET request is made on the path /.
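To make the format concrete, here is a small stdlib-only Python function that renders counter samples in this text format (a simplification of what client libraries do internally):

```python
def render_counter(name, help_text, samples):
    """Render counter samples in the Prometheus text exposition format.

    samples: list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in labels.items())
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)

print(render_counter(
    "http_request_total", "Total HTTP Requests",
    [({"method": "GET", "status": "200", "path": "/"}, 5.0)],
))
```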

5. Storage: Time-Series Database

Prometheus scrapes data at regular intervals and stores it in a time-series database. A time series is a sequence of data points, typically consisting of successive measurements made over time.

Let's assume Prometheus is configured to scrape the previous target every 20 seconds, and across 10 scrapes (spanning 180 seconds), 5 requests are made. The data in Prometheus might look something like this:

{__name__="http_request_total", method="GET", status="200", path="/"}:

1623701320 1.0
1623701340 1.0
1623701360 2.0
1623701380 2.0
1623701400 3.0
1623701420 3.0
1623701440 4.0
1623701460 5.0
1623701480 5.0
1623701500 5.0

This time series applies to the metric http_request_total with labels:

method="GET", status="200", path="/"

The first number in each pair is a Unix timestamp (seconds since January 1, 1970), and the second number is the value of the counter at that time. Note how:

  1. The value starts at 1.0 and never decreases (characteristic of a counter)

  2. It doesn't change every scrape, reflecting periods where no new requests were made

  3. By the end, it reaches 5.0, representing the initial request plus the 4 additional requests made during this period
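The increase over the window can be read straight off such a series. A minimal Python sketch (assuming no counter resets within the window; a reset would need Prometheus-style correction):

```python
def counter_increase(samples):
    """Increase of a counter over a window of (timestamp, value) samples."""
    values = [v for _, v in samples]
    return values[-1] - values[0]

# The http_request_total series from above
series = [
    (1623701320, 1.0), (1623701340, 1.0), (1623701360, 2.0),
    (1623701380, 2.0), (1623701400, 3.0), (1623701420, 3.0),
    (1623701440, 4.0), (1623701460, 5.0), (1623701480, 5.0),
    (1623701500, 5.0),
]
print(counter_increase(series))  # 4.0 -- the four additional requests
```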

Labels

Labels allow Prometheus to track different aspects of the same metric. This second time series, distinguished by its path="/items" label, shows a different pattern of increments, reflecting the unique traffic to the "/items" endpoint.

{__name__="http_request_total", method="GET", status="200", path="/items"}:
1623701320 1.0
1623701340 1.0
1623701360 1.0
1623701380 2.0
1623701400 2.0
1623701420 3.0
1623701440 3.0
1623701460 3.0
1623701480 4.0
1623701500 4.0

Gauge Metric

This third time series tracks the gauge metric process_cpu_usage, which can both increase and decrease, representing the current value at each scrape:

{__name__="process_cpu_usage"}:
1623701320 21.3
1623701340 18.7
1623701360 22.1
1623701380 20.5
1623701400 19.8
1623701420 23.4
1623701440 21.9
1623701460 20.7
1623701480 22.8
1623701500 21.5

6. PromQL: Prometheus Query Language

PromQL allows you to query the time-series data stored in Prometheus. Different types of queries return different types of results. Let's explore some examples:

Instant Vector

An instant vector represents a set of time series, each containing a single sample for a given timestamp.

Query:

http_request_total

Output:

http_request_total{method="GET", path="/", status="200"} 5 
http_request_total{method="GET", path="/items", status="200"} 4 
http_request_total{method="POST", path="/items", status="200"} 2

This query returns the latest value for each time series of the http_request_total metric.
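An instant vector query can be modeled as picking the latest sample from every matching series (a toy in-memory model, not Prometheus's actual implementation):

```python
# Toy TSDB: (metric name, labels) -> list of (timestamp, value) samples
tsdb = {
    ("http_request_total", (("method", "GET"), ("path", "/"), ("status", "200"))):
        [(1623701480, 5.0), (1623701500, 5.0)],
    ("http_request_total", (("method", "GET"), ("path", "/items"), ("status", "200"))):
        [(1623701480, 4.0), (1623701500, 4.0)],
    ("http_request_total", (("method", "POST"), ("path", "/items"), ("status", "200"))):
        [(1623701480, 2.0), (1623701500, 2.0)],
}

def instant_query(db, metric):
    """Return the latest value of every series with the given metric name."""
    return {labels: samples[-1][1]
            for (name, labels), samples in db.items() if name == metric}

result = instant_query(tsdb, "http_request_total")
print(result)
```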

Range Vector

A range vector represents a set of time series containing a range of data points over time.

Query:

http_request_total[5m]

Output:

http_request_total{method="GET", path="/", status="200"} 
1623701460 4 
1623701520 4 
1623701580 5 
1623701640 5 
1623701700 5 
  
http_request_total{method="GET", path="/items", status="200"} 
1623701460 3 
1623701520 3 
1623701580 3 
1623701640 4 
1623701700 4

This query returns the values over the last 5 minutes for each time series of the http_request_total metric.

Aggregation Over Time

PromQL provides functions that aggregate values over time. These functions take a range vector and output an instant vector. For example, the rate function calculates the per-second average rate at which a counter is increasing over the window.

Query:

rate(http_request_total[5m])

Output:

{method="GET", path="/", status="200"} 0.0033
{method="GET", path="/items", status="200"} 0.0017 
{method="POST", path="/items", status="200"} 0.0008

This query shows the rate of increase for http_request_total over the last 5 minutes. Other functions in this category include increase, irate, and delta.
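A simplified version of this calculation in Python (real PromQL rate() also handles counter resets and extrapolates to the window boundaries, which this sketch ignores):

```python
def simple_rate(samples, window_seconds):
    """Per-second rate of increase of a counter over a window.

    samples: list of (timestamp, value) pairs inside the window.
    """
    increase = samples[-1][1] - samples[0][1]
    return increase / window_seconds

# The 5-minute range vector for path="/" shown above: one new request
samples = [(1623701460, 4.0), (1623701520, 4.0), (1623701580, 5.0),
           (1623701640, 5.0), (1623701700, 5.0)]
print(round(simple_rate(samples, 300), 4))  # 0.0033
```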

Scalar

A scalar is a simple numeric floating point value.

Query:

sum(http_request_total)

Output:

11

This query adds up the latest values of all http_request_total series (5 + 4 + 2 = 11). Strictly speaking, an aggregation like sum still returns an instant vector containing a single, label-less sample; the scalar() function converts such a one-element vector into a true scalar.
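The aggregation itself can be sketched as taking the latest value of each series and adding them up (a toy model using the values from the instant-vector example above):

```python
# Latest value per (method, path, status) series of http_request_total
latest = {
    ("GET", "/", "200"): 5,
    ("GET", "/items", "200"): 4,
    ("POST", "/items", "200"): 2,
}

# sum() collapses all series into a single value, discarding the labels
total = sum(latest.values())
print(total)  # 11
```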

7. Alerting and Visualization

Alerting rules in Prometheus use PromQL to define alert conditions; Alertmanager handles the resulting notifications. Grafana uses PromQL to create visualizations from the same data.

Alertmanager

Prometheus continuously evaluates PromQL alerting rules and forwards firing alerts to Alertmanager. For example:

alert: HighErrorRate
expr: rate(http_request_total{status="500"}[5m]) / rate(http_request_total[5m]) > 0.1
for: 10m

This alert fires if the error rate exceeds 10% continuously for 10 minutes. Prometheus evaluates the expression on every rule-evaluation cycle; Alertmanager then deduplicates, groups, and routes the resulting notifications.
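The effect of the for: clause can be modeled as requiring the condition to hold across consecutive rule evaluations (a simplified sketch, not Prometheus's actual alert state machine):

```python
def alert_state(history, for_intervals):
    """State of an alert given a history of boolean rule evaluations.

    The alert fires only once the condition has held for
    `for_intervals` consecutive evaluations (the `for:` clause).
    """
    if not history or not history[-1]:
        return "inactive"
    if len(history) >= for_intervals and all(history[-for_intervals:]):
        return "firing"
    return "pending"

# With 1-minute evaluations, for: 10m means 10 consecutive True results
print(alert_state([True] * 10, 10))  # firing
print(alert_state([True] * 5, 10))   # pending
print(alert_state([False], 10))      # inactive
```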

Grafana

Grafana creates visualizations by querying Prometheus with PromQL. For instance, to visualize HTTP request rates:

rate(http_request_total[5m])

This query, used in a Grafana panel, shows the per-second rate of HTTP requests over the last 5 minutes for each unique combination of labels. Grafana translates these queries into interactive graphs and dashboards.

Conclusion

Prometheus offers a comprehensive monitoring solution for Kubernetes environments:

  1. Data Collection: Prometheus uses a pull model to collect metrics from various sources, including applications, third-party software, and Kubernetes components.

  2. Service Discovery: ServiceMonitors in Kubernetes allow Prometheus to automatically discover and monitor targets as they are deployed or scaled.

  3. Data Storage: Metrics are stored as time-series data, with each series uniquely identified by its metric name and labels.

  4. Query Language: PromQL provides a powerful way to select and aggregate time-series data, supporting various query types like instant vectors, range vectors, and scalars.

  5. Alerting: Prometheus evaluates PromQL-based alerting rules, and Alertmanager deduplicates and routes the resulting notifications.

  6. Visualization: Tools like Grafana use PromQL to query Prometheus data and create rich, interactive dashboards.

This ecosystem allows for flexible, scalable monitoring that adapts to the dynamic nature of Kubernetes environments. From data collection to visualization and alerting, each component works together to provide deep insights into system performance and health.
