Apr 15, 2025
Loki, Prometheus, Grafana & Docker: Logging & Monitoring
In this lesson, we're going to build a comprehensive observability platform that combines logs and metrics in a single dashboard, allowing you to identify issues with metrics and diagnose them with related logs.
To begin, clone the repository:
The Big Picture
This setup creates an observability platform where log data and metrics data are collected, stored, and visualized in a unified dashboard. The key insight is that logs and metrics complement each other: metrics show you trends and patterns at scale, while logs provide detailed context for troubleshooting.
What's Actually Happening
At the core, this system has these main components working together:
Log Generation - The log-generator container creates sample JSON logs that simulate application activity
Log Collection - Loki and its agent Promtail collect and index these logs for querying
Metrics Extraction - The metrics-exporter pulls information from the same logs and converts it to time-series metrics
Metrics Collection - Prometheus collects and stores these metrics
Unified Visualization - Grafana queries both systems and displays the information together
The beauty of this approach is that the metrics and logs are derived from the same source, allowing for direct correlation between high-level trends and detailed log entries.
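To make the "same source" idea concrete, here is a minimal Python sketch of the first and third components: one function emits JSON log lines, another derives request counts from those same lines. The field names (`method`, `status`, `path`, `duration_ms`) and the `requests_total` metric name are assumptions for illustration; the repository's log-generator and metrics-exporter may use different ones.

```python
import json
import random
from collections import Counter

# Hypothetical value sets; the real log-generator may emit different ones.
METHODS = ["GET", "POST", "PUT", "DELETE"]
STATUSES = [200, 201, 301, 404, 500]
PATHS = ["/", "/api/users", "/api/orders"]


def generate_log_line(rng: random.Random) -> str:
    """Return one JSON log line resembling a simulated HTTP request."""
    record = {
        "method": rng.choice(METHODS),
        "status": rng.choice(STATUSES),
        "path": rng.choice(PATHS),
        "duration_ms": rng.randint(5, 500),
    }
    return json.dumps(record)


def extract_request_counts(lines):
    """Count requests per (method, status) pair, skipping malformed lines."""
    counts = Counter()
    for line in lines:
        try:
            record = json.loads(line)
            key = (record["method"], str(record["status"]))
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # not JSON, or missing the fields we need
        counts[key] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(42)
    logs = [generate_log_line(rng) for _ in range(100)]
    # Print the counts in a Prometheus-like text form.
    for (method, status), n in sorted(extract_request_counts(logs).items()):
        print(f'requests_total{{method="{method}",status="{status}"}} {n}')
```

Because the counters are computed from the very lines that Loki indexes, any number on the metrics side can be traced back to concrete log entries.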
Running the Stack
Our Docker Compose file sets up:
Loki (with read, write, and backend components) - Handles log storage and querying
Prometheus - Collects and stores metrics
Grafana - Provides visualization for both logs and metrics
Log Generator - Creates sample logs for demonstration
Metrics Exporter - Converts logs to metrics
Promtail - Loki's agent for log collection
Start everything with Docker Compose (typically `docker compose up -d`, run from the repository root).
Once running, you can access:
Grafana at http://localhost:3000
Prometheus at http://localhost:9090
Loki via Grafana (already configured as a data source)
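If you want to confirm the services are answering before opening a browser, a small readiness check could look like the sketch below. It assumes the ports listed above and uses Grafana's `/api/health` and Prometheus's `/-/ready` endpoints. The `opener` parameter is a hypothetical hook added only so the function can be exercised without a running stack.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Ports taken from the list above; adjust if your compose file maps others.
ENDPOINTS = {
    "grafana": "http://localhost:3000/api/health",
    "prometheus": "http://localhost:9090/-/ready",
}


def is_up(url, opener=urlopen, timeout=3.0):
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


def check_all(endpoints=ENDPOINTS, opener=urlopen):
    """Map each service name to its readiness as a boolean."""
    return {name: is_up(url, opener=opener) for name, url in endpoints.items()}
```

Calling `check_all()` once the containers have started returns a dictionary such as `{"grafana": True, "prometheus": True}` when both respond.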
Building Our Dashboard
Let's build a dashboard that demonstrates the power of combining metrics and logs. We'll start with two complementary panels.
Creating a Metrics Panel
First, let's add a panel showing HTTP request rates by method:
In Grafana, create a new dashboard and add a panel
Select "Prometheus" as the data source
Enter the PromQL query:
Set the visualization to "Time series"
Title it "Request Rate by Method"
Add a unit format of "requests per second"
In the Legend section, enable table mode and show Mean, Max, and Last values
This panel gives us a high-level view of traffic patterns across different HTTP methods (GET, POST, etc.).
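To build some intuition for what a "request rate" panel plots, the sketch below turns raw counter samples into per-second rates and then computes the Mean, Max, and Last values shown in the legend table. It is a simplified illustration, not Prometheus's actual `rate()` implementation, which also extrapolates to the edges of the query window. The sample data is made up.

```python
from statistics import mean


def per_second_rates(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-second rates.

    Simplified: one rate per pair of adjacent samples. A drop in the counter
    is treated as a reset, so the new value counts as the whole increase.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 <= t0:
            continue  # ignore out-of-order or duplicate timestamps
        increase = v1 - v0 if v1 >= v0 else v1
        rates.append(increase / (t1 - t0))
    return rates


def legend_stats(rates):
    """Mean, Max, and Last: the three values enabled in the legend table."""
    return {"mean": mean(rates), "max": max(rates), "last": rates[-1]}


if __name__ == "__main__":
    # Hypothetical GET request counter scraped every 15 seconds.
    get_samples = [(0, 0), (15, 30), (30, 90), (45, 105)]
    rates = per_second_rates(get_samples)
    print(rates)
    print(legend_stats(rates))
```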
Adding a Corresponding Logs Panel
Now, let's add a complementary panel showing the raw logs:
Add another panel and select "Loki" as the data source
Enter the LogQL query:
Set the visualization to "Logs"
Title it "Raw Logs"
Enable "Show time" to see when logs were generated
This panel displays the actual log entries that correspond to the traffic metrics we're seeing in the first panel.
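Behind the Logs panel, Grafana sends range queries to Loki's HTTP API. The sketch below only builds such a request URL for the `/loki/api/v1/query_range` endpoint. The base URL `http://localhost:3100` and the selector `{service_name="log-generator"}` are assumptions for illustration, since the lesson only reaches Loki through Grafana.

```python
import time
from urllib.parse import urlencode


def loki_query_range_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a query_range URL; start and end are Unix epoch nanoseconds."""
    params = {
        "query": logql,
        "start": start_ns,
        "end": end_ns,
        "limit": limit,
        "direction": "backward",  # newest entries first
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


if __name__ == "__main__":
    now_ns = time.time_ns()
    five_minutes_ns = 5 * 60 * 1_000_000_000
    print(
        loki_query_range_url(
            "http://localhost:3100",            # assumed Loki address
            '{service_name="log-generator"}',   # hypothetical selector
            now_ns - five_minutes_ns,
            now_ns,
        )
    )
```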
Adding Template Variables for Filtering
To make our dashboard more interactive, let's add template variables:
Go to Dashboard Settings > Variables
Add a query variable named "service" that pulls from Loki's service_name label
Add a custom variable named "method_filter" with options:
.*,GET,POST,PUT,DELETE,PATCH,HEAD
Add a custom variable named "status_filter" with options:
.*,2.*,3.*,4.*,5.*
Once the queries in both panels reference these variables (for example `$method_filter` and `$status_filter`), the metrics and logs panels share the same filtering, which keeps them correlated.
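The variable options above are regular expressions, so `.*` acts as "everything" and `5.*` selects the whole 5xx class. The sketch below mimics that behaviour in Python, assuming the variables are plugged into `=~` matchers, which Prometheus anchors at both ends (hence `re.fullmatch`). The record fields are hypothetical.

```python
import re


def matches(pattern, value):
    """Fully anchored regex match, as a `=~` label matcher would apply it."""
    return re.fullmatch(pattern, str(value)) is not None


def filter_records(records, method_filter=".*", status_filter=".*"):
    """Keep only records whose method and status match both filters."""
    return [
        r for r in records
        if matches(method_filter, r["method"]) and matches(status_filter, r["status"])
    ]


if __name__ == "__main__":
    records = [
        {"method": "GET", "status": 200},
        {"method": "GET", "status": 503},
        {"method": "POST", "status": 500},
        {"method": "POST", "status": 404},
    ]
    print(filter_records(records, status_filter="5.*"))
    print(filter_records(records, method_filter="GET", status_filter="5.*"))
```

Applying one pair of filter values to both data sources is what keeps the two panels looking at the same slice of traffic.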
The Power of Correlation
With these two panels, we can now:
Observe a spike in traffic or errors in the metrics panel
Use the same time range and filters to see exactly which logs correspond to those events
Read the detailed log entries to diagnose what happened
This correlation between high-level metrics and detailed logs is the core strength of our solution.
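The workflow above boils down to two steps: locate the time window where a metric misbehaves, then pull the logs from that same window. Here is a rough sketch of that idea with made-up data. The threshold, the `ts` field, and the helper names are all assumptions for illustration.

```python
def find_spike(series, threshold):
    """Return (start, end) timestamps of the first run above threshold, or None.

    `series` is a list of (timestamp, value) points sorted by timestamp.
    """
    start = end = None
    for ts, value in series:
        if value > threshold:
            if start is None:
                start = ts
            end = ts
        elif start is not None:
            break  # the first spike has ended
    return None if start is None else (start, end)


def logs_in_window(logs, window, padding=0):
    """Select log entries whose `ts` falls inside the window, plus padding."""
    start, end = window
    return [e for e in logs if start - padding <= e["ts"] <= end + padding]


if __name__ == "__main__":
    errors_per_minute = [(0, 1), (60, 2), (120, 40), (180, 55), (240, 3)]
    logs = [
        {"ts": 100, "msg": "ok"},
        {"ts": 130, "msg": "upstream timeout"},
        {"ts": 175, "msg": "upstream timeout"},
        {"ts": 300, "msg": "ok"},
    ]
    window = find_spike(errors_per_minute, threshold=10)
    print(window)
    print(logs_in_window(logs, window))
```

In Grafana, the shared dashboard time range plays the role of `window`: zooming into a spike on the metrics panel narrows the logs panel to the same interval.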
Loading the Complete Dashboard
Rather than building every panel from scratch, let's import the full dashboard:
In Grafana, go to Dashboards > Import
Upload the grafana-dashboard.json file from the repository
Select your Prometheus and Loki data sources when prompted
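Before importing, it can help to peek at what a dashboard file contains. Grafana dashboards are plain JSON with a `panels` array, so a few lines of Python can list each panel's title, visualization type, and data source. The inline sample below is a hypothetical, heavily trimmed dashboard, not the contents of the repository's grafana-dashboard.json.

```python
import json

# Hypothetical, trimmed dashboard JSON used only for illustration.
SAMPLE_DASHBOARD = """
{
  "title": "Example",
  "panels": [
    {"title": "Request Rate by Method", "type": "timeseries",
     "datasource": {"type": "prometheus"}},
    {"title": "Raw Logs", "type": "logs",
     "datasource": {"type": "loki"}}
  ]
}
"""


def summarize_panels(dashboard_json):
    """Return (title, panel type, data source) for each top-level panel."""
    dashboard = json.loads(dashboard_json)
    summary = []
    for panel in dashboard.get("panels", []):
        datasource = panel.get("datasource")
        # Newer exports use an object; older ones may use a plain string.
        ds = datasource.get("type") if isinstance(datasource, dict) else datasource
        summary.append((panel.get("title"), panel.get("type"), ds))
    return summary


if __name__ == "__main__":
    for title, panel_type, ds in summarize_panels(SAMPLE_DASHBOARD):
        print(f"{title}: {panel_type} panel backed by {ds}")
```

To inspect the real file, read it with `open("grafana-dashboard.json").read()` and pass the text to `summarize_panels`.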
Exploring the Full Dashboard
The complete dashboard includes several interconnected panels:
Request Rate by Method and Status Code Distribution - Give a high-level overview of traffic patterns
Response Time by Method - Shows performance metrics over time
Status Codes Over Time - Tracks error rates and response patterns
Response Size by Method - Monitors payload sizes
Top 10 Slowest Endpoints - Identifies performance bottlenecks
Status Codes from Logs - Shows the same information as the metrics but derived directly from logs
Log Volume by HTTP Method - Corresponds to request rate metrics
Raw Logs - Provides detailed information for investigating issues
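As an illustration of what a panel like "Top 10 Slowest Endpoints" computes, the sketch below averages response times per endpoint and keeps the slowest ones. It works on in-memory records with assumed `path` and `duration_ms` fields. The real panel does this with a query against its data source, which may aggregate differently.

```python
from collections import defaultdict


def slowest_endpoints(records, n=10):
    """Return up to n (path, mean duration in ms) pairs, slowest first."""
    totals = defaultdict(lambda: [0.0, 0])  # path -> [sum of durations, count]
    for r in records:
        entry = totals[r["path"]]
        entry[0] += r["duration_ms"]
        entry[1] += 1
    averages = {path: total / count for path, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    records = [
        {"path": "/api/orders", "duration_ms": 400},
        {"path": "/api/orders", "duration_ms": 200},
        {"path": "/api/users", "duration_ms": 50},
        {"path": "/", "duration_ms": 10},
    ]
    for path, avg in slowest_endpoints(records, n=2):
        print(f"{path}: {avg:.0f} ms")
```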
The power of this dashboard becomes apparent when troubleshooting:
You notice a spike in 500 errors in the Status Codes panel
You filter to "5.*" status codes using the template variable
All panels update to show only data related to those errors
The logs panel now shows the exact error messages at the time of the spike
You can quickly identify the root cause without switching between systems
Conclusion
What we've built is more than just a monitoring solution: it's a unified observability platform that combines the strengths of metrics and logs:
Metrics provide the big picture, helping identify when and where problems occur
Logs provide the details, helping determine why those problems occurred
By combining Loki, Prometheus, and Grafana in this Docker-based setup, we've created a powerful tool for both monitoring system health and troubleshooting issues when they arise.
The setup can be adapted to monitor real applications by replacing the log generator with actual application logs and expanding the metrics collection as needed.
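As a starting point for swapping in a real application, the sketch below shows one way a Python service could emit JSON logs with the standard `logging` module, so the same collection pipeline can pick them up. The field names (`method`, `path`, `status`, `duration_ms`) are assumptions and should match whatever your exporter and queries expect.

```python
import json
import logging

# Hypothetical request fields copied into the JSON output when present.
REQUEST_FIELDS = ("method", "path", "status", "duration_ms")


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        for field in REQUEST_FIELDS:
            if hasattr(record, field):  # set through the `extra` argument
                payload[field] = getattr(record, field)
        return json.dumps(payload)


def build_logger(name="app", stream=None):
    """Create a logger that writes JSON lines to the stream (stderr by default)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "request handled",
        extra={"method": "GET", "path": "/api/users", "status": 200, "duration_ms": 42},
    )
```

Writing one JSON object per line keeps the output easy to parse downstream, in the same way the sample generator's logs are.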