Apr 15, 2025

Loki, Prometheus, Grafana & Docker: Logging & Monitoring

In this lesson, we're going to build a comprehensive observability platform that combines logs and metrics in a single dashboard, allowing you to identify issues with metrics and diagnose them with related logs.

To begin, clone the repository:

The Big Picture

This setup collects, stores, and visualizes log data and metrics data in a unified dashboard. The key insight is that logs and metrics complement each other: metrics show you trends and patterns at scale, while logs provide detailed context for troubleshooting.

What's Actually Happening

At the core, this system has these main components working together:

Log Generation - The log-generator container creates sample JSON logs that simulate application activity
Log Collection - Promtail, Loki's agent, collects these logs and ships them to Loki, which indexes them by label for querying
Metrics Extraction - The metrics-exporter pulls information from the same logs and converts it to time-series metrics
Metrics Collection - Prometheus collects and stores these metrics
Unified Visualization - Grafana queries both systems and displays the information together

The beauty of this approach is that the metrics and logs are derived from the same source, allowing for direct correlation between high-level trends and detailed log entries.
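To make the metrics extraction step concrete, here is a minimal sketch of how JSON log lines could be turned into counters. It is not the repository's metrics-exporter; the sample lines and the field names "method" and "status" are assumptions chosen for illustration, and only the metric name http_requests_total comes from this lesson.

```python
import json
from collections import Counter

# Hypothetical sample lines; the "method", "status", and "path" fields are assumptions.
SAMPLE_LOGS = [
    '{"method": "GET", "status": 200, "path": "/api/users"}',
    '{"method": "POST", "status": 500, "path": "/api/orders"}',
    '{"method": "GET", "status": 200, "path": "/api/users"}',
    'not valid json',
]


def count_requests(lines):
    """Count log entries per (method, status) pair, skipping unusable lines."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # ignore JSON values that are not objects
        method = entry.get("method")
        status = entry.get("status")
        if method is None or status is None:
            continue  # ignore entries missing the fields we count on
        counts[(method, str(status))] += 1
    return counts


def to_exposition(counts, name="http_requests_total"):
    """Render the counts in the Prometheus text exposition format."""
    lines = [f"# TYPE {name} counter"]
    for (method, status), value in sorted(counts.items()):
        lines.append(f'{name}{{method="{method}",status="{status}"}} {value}')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_exposition(count_requests(SAMPLE_LOGS)))
```

Because the counters are derived from the same lines that Loki stores, a change in a metric can always be traced back to specific log entries.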

Running the Stack

Our Docker Compose file sets up:

Loki (with read, write, and backend components) - Handles log storage and querying
Prometheus - Collects and stores metrics
Grafana - Provides visualization for both logs and metrics
Log Generator - Creates sample logs for demonstration
Metrics Exporter - Converts logs to metrics
Promtail - Loki's agent for log collection

Start everything from the repository root with Docker Compose (for example, docker compose up -d):

Once running, you can access:

Grafana at http://localhost:3000
Prometheus at http://localhost:9090
Loki via Grafana (already configured as a data source)
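Before opening the UIs, it can help to confirm that the services respond. The sketch below assumes the two URLs listed above and uses the health endpoints that Grafana (/api/health) and Prometheus (/-/healthy) expose. The helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Base URLs come from the list above; the paths are the health endpoints
# exposed by Grafana and Prometheus.
HEALTH_URLS = {
    "grafana": "http://localhost:3000/api/health",
    "prometheus": "http://localhost:9090/-/healthy",
}


def check(url, timeout=3.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or non-2xx response


def check_all(urls=HEALTH_URLS, checker=check):
    """Run the checker against every service and collect the results."""
    return {name: checker(url) for name, url in urls.items()}


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(f"{name}: {'up' if ok else 'unreachable'}")
```

The checker is passed in as a parameter so the logic can be exercised without a running stack.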

Building Our Dashboard

Let's build a dashboard that demonstrates the power of combining metrics and logs. We'll start with two complementary panels.

Creating a Metrics Panel

First, let's add a panel showing HTTP request rates by method:

In Grafana, create a new dashboard and add a panel
Select "Prometheus" as the data source

Enter the PromQL query:

sum by(method) (rate(http_requests_total{method=~"$method_filter"}[5m]))

Set the visualization to "Time series"
Title it "Request Rate by Method"
Set the unit to "requests/sec (rps)"
In the Legend section, enable table mode and show Mean, Max, and Last values

This panel gives us a high-level view of traffic patterns across different HTTP methods (GET, POST, etc.).
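To show what the query computes, the following sketch approximates a per-second rate from raw counter samples and then sums the rates per method. It is a simplification: Prometheus's rate() also extrapolates to the window boundaries, which is omitted here. The sample data and the second label in each key are assumptions.

```python
def simple_rate(samples):
    """Approximate the per-second increase of a counter over a window.

    samples: list of (unix_seconds, counter_value) tuples, oldest first.
    A drop in value is treated as a counter reset, so the value seen
    after the reset is counted as new increase.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


def rate_by_method(series):
    """Sum per-series rates by method, similar in spirit to sum by(method).

    series: dict mapping (method, other_label) -> list of samples.
    """
    totals = {}
    for (method, _), samples in series.items():
        rate = simple_rate(samples)
        if rate is not None:
            totals[method] = totals.get(method, 0.0) + rate
    return totals


if __name__ == "__main__":
    # Hypothetical five-minute window of counter samples.
    series = {
        ("GET", "200"): [(0, 0), (300, 300)],
        ("GET", "500"): [(0, 0), (300, 30)],
        ("POST", "200"): [(0, 0), (300, 150)],
    }
    for method, rate in sorted(rate_by_method(series).items()):
        print(f"{method}: {rate:.2f} req/s")
```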

Adding a Corresponding Logs Panel

Now, let's add a complementary panel showing the raw logs:

Add another panel and select "Loki" as the data source
Enter the LogQL query:
Set the visualization to "Logs"
Title it "Raw Logs"
Enable "Show time" to see when logs were generated

This panel displays the actual log entries that correspond to the traffic metrics we're seeing in the first panel.
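A logs query for this kind of panel usually has three parts: a stream selector, a parser, and label filters. The sketch below composes such a query and the matching request URL for Loki's query_range HTTP API. It is an illustration under assumptions, not the lesson's exact query: the "method" and "status" JSON fields, the service name "log-generator", and the base URL http://localhost:3100 are all hypothetical.

```python
from urllib.parse import urlencode


def build_logql(service, method_regex=".*", status_regex=".*"):
    """Compose a LogQL query: stream selector, JSON parser, label filters."""
    return (
        f'{{service_name="{service}"}} | json '
        f'| method=~"{method_regex}" | status=~"{status_regex}"'
    )


def query_range_url(base_url, query, start_ns, end_ns, limit=100):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = urlencode(
        {"query": query, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{base_url}/loki/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Hypothetical service name and Loki address.
    query = build_logql("log-generator", "GET", "5.*")
    print(query)
    print(query_range_url("http://localhost:3100", query, 0, 60 * 10**9))
```

In a Grafana panel, the same shape of query would reference dashboard variables such as $service, $method_filter, and $status_filter in place of the literal values.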

Adding Template Variables for Filtering

To make our dashboard more interactive, let's add template variables:

Go to Dashboard Settings > Variables
Add a query variable named "service" that pulls from Loki's service_name label
Add a custom variable named "method_filter" with options: .*,GET,POST,PUT,DELETE,PATCH,HEAD
Add a custom variable named "status_filter" with options: .*,2.*,3.*,4.*,5.*

Once both panel queries reference these variables, the metrics and logs panels apply the same filters, which keeps them correlated.
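The filter options are regular expressions, and the =~ matcher in both PromQL and LogQL anchors them to the whole value. The sketch below mimics that behavior with Python's re.fullmatch. The sample entries are hypothetical, and Python's regex engine differs from the RE2 syntax those tools use, although simple patterns like these behave the same.

```python
import re

# Options taken from the status_filter variable described above.
STATUS_OPTIONS = [".*", "2.*", "3.*", "4.*", "5.*"]


def matches(pattern, value):
    """Mimic a fully anchored regex matcher (=~)."""
    return re.fullmatch(pattern, value) is not None


def filter_entries(entries, method_filter=".*", status_filter=".*"):
    """Keep entries whose method and status match both filters."""
    return [
        entry
        for entry in entries
        if matches(method_filter, entry["method"])
        and matches(status_filter, str(entry["status"]))
    ]


if __name__ == "__main__":
    # Hypothetical parsed log entries.
    entries = [
        {"method": "GET", "status": 200},
        {"method": "POST", "status": 502},
        {"method": "GET", "status": 404},
    ]
    print(filter_entries(entries, status_filter="5.*"))
    print(filter_entries(entries, method_filter="GET"))
```

Anchoring matters here: "2.*" matches 200 but not 502, even though 502 contains a 2.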

The Power of Correlation

With these two panels, we can now:

Observe a spike in traffic or errors in the metrics panel
Use the same time range and filters to see exactly which logs correspond to those events
Read the detailed log entries to diagnose what happened

This correlation between high-level metrics and detailed logs is the core strength of our solution.
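The same workflow can be sketched in code: find the moment a metric spikes, then pull the log entries around that timestamp. The threshold factor, the 60-second window, and the "ts" field on each log entry are assumptions made for illustration. In Grafana, the shared time range and filters do this for you.

```python
from statistics import mean


def find_spikes(points, factor=3.0):
    """Return timestamps whose value exceeds factor times the mean.

    points: list of (unix_seconds, value) tuples.
    """
    if not points:
        return []
    baseline = mean(value for _, value in points)
    if baseline <= 0:
        return []
    return [ts for ts, value in points if value > factor * baseline]


def logs_near(logs, timestamp, window=60):
    """Return log entries within window seconds of the timestamp."""
    return [log for log in logs if abs(log["ts"] - timestamp) <= window]


if __name__ == "__main__":
    # Hypothetical error-rate samples and parsed log entries.
    error_rate = [(0, 1), (60, 1), (120, 1), (180, 1), (240, 20)]
    logs = [
        {"ts": 10, "status": 200, "message": "ok"},
        {"ts": 230, "status": 500, "message": "upstream timeout"},
        {"ts": 250, "status": 500, "message": "upstream timeout"},
    ]
    for spike in find_spikes(error_rate):
        for log in logs_near(logs, spike, window=30):
            print(spike, log["status"], log["message"])
```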

Loading the Complete Dashboard

Rather than building every panel from scratch, let's import the full dashboard:

In Grafana, go to Dashboards > Import
Upload the grafana-dashboard.json file from the repository
Select your Prometheus and Loki data sources when prompted

Exploring the Full Dashboard

The complete dashboard includes several interconnected panels:

Request Rate by Method and Status Code Distribution - Give a high-level overview of traffic patterns
Response Time by Method - Shows performance metrics over time
Status Codes Over Time - Tracks error rates and response patterns
Response Size by Method - Monitors payload sizes
Top 10 Slowest Endpoints - Identifies performance bottlenecks
Status Codes from Logs - Shows the same information as the metrics but derived directly from logs
Log Volume by HTTP Method - Corresponds to request rate metrics
Raw Logs - Provides detailed information for investigating issues

The power of this dashboard becomes apparent when troubleshooting:

You notice a spike in 5xx errors in the Status Codes panel
You filter to "5.*" status codes using the template variable
All panels update to show only data related to those errors
The logs panel now shows the exact error messages at the time of the spike
You can quickly identify the root cause without switching between systems

Conclusion

What we've built is more than just a monitoring solution: it's a unified observability platform that combines the strengths of metrics and logs:

Metrics provide the big picture, helping identify when and where problems occur
Logs provide the details, helping determine why those problems occurred

By combining Loki, Prometheus, and Grafana in this Docker-based setup, we've created a powerful tool for both monitoring system health and troubleshooting issues when they arise.

The setup can be adapted to monitor real applications by replacing the log generator with actual application logs and expanding the metrics collection as needed.
