ECS-V1: Monitoring with Prometheus and Grafana

Monitoring containerized apps is straightforward once you understand the patterns and plug in some great open-source software. In this episode you’ll learn the architecture for exporting metrics from containers and collecting them in a central server.

We’ll use the leading tools to power collection and visualization of metrics: Prometheus and Grafana. You’ll see how to use them with Docker and Kubernetes and how they power a consistent monitoring approach for all your application components.

Here it is on YouTube - ECS-V1: Monitoring with Prometheus and Grafana

Pre-reqs

Docker Desktop - with Kubernetes enabled (Linux container mode if you’re running on Windows).

Demo 1 - scraping container metrics with Prometheus

The first step in monitoring containerized apps is to add metrics to every component.

The APOD app (source code) does this using Prometheus client libraries, with examples in Go, Node.js and Java.

Start the app:

cd demo1

docker-compose up -d

Browse to:

the app at http://localhost:8010/
the Go web server metrics at http://localhost:8010/metrics
the Node.js log API metrics at http://localhost:8012/metrics
the Java image API metrics at http://localhost:8011/actuator/prometheus
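Each of those metrics endpoints returns plain text in the Prometheus exposition format: one sample per line, with comment lines starting with #. As a rough illustration of what Prometheus reads on each scrape, here is a minimal Python parser for the simplest form of that format. The sample text is illustrative rather than real output from the APOD app, and http_requests_total is a made-up metric name.

```python
import re

# Matches: metric_name{label="value",...} 12.5   (the label block is optional).
# Lines carrying an optional trailing timestamp are not handled by this sketch.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # skip anything this simple sketch does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative input only; http_requests_total is a hypothetical metric.
SAMPLE_TEXT = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 12
# TYPE http_requests_total counter
http_requests_total{code="200",method="GET"} 42
"""

for sample in parse_metrics(SAMPLE_TEXT):
    print(sample)
```

The point is that instrumenting a component only means serving text like this over HTTP; the client libraries take care of generating it.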

Prometheus is the server component which collects and stores application metrics from the containers.

Run Prometheus in a container, defined in docker-compose-prometheus.yml. This is configured to collect metrics from all the APOD components:

docker-compose -f docker-compose-prometheus.yml up -d

Prometheus is configured using simple domain names in prometheus.yml; those are the container names specified in the app’s docker-compose.yml.
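A static configuration like this boils down to a list of host names and metrics paths. As a rough sketch of how a scrape job turns into the URLs Prometheus polls, the Python below models two jobs as plain dictionaries. The job names, host names and ports are hypothetical and not copied from the demo's prometheus.yml; the /metrics default path and http default scheme match Prometheus's own defaults.

```python
# Hypothetical jobs, loosely mirroring the shape of scrape jobs in prometheus.yml.
JOBS = [
    {"job_name": "web", "targets": ["web:80"]},
    {
        "job_name": "image-api",
        # Non-default path, like the /actuator/prometheus endpoint of the Java API.
        "metrics_path": "/actuator/prometheus",
        "targets": ["image-api:80"],
    },
]


def scrape_urls(job):
    """Build the URL polled for each target of a job."""
    scheme = job.get("scheme", "http")          # Prometheus defaults to http
    path = job.get("metrics_path", "/metrics")  # and to the /metrics path
    return [f"{scheme}://{target}{path}" for target in job["targets"]]


for job in JOBS:
    print(job["job_name"], scrape_urls(job))
```

Because the host names are resolved by Docker's DNS, this only works while each name maps to a single container.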

Browse to:

Prometheus config UI at http://localhost:9090/config
the query UI at http://localhost:9090/graph

Query some runtime OS metrics:

process_cpu_seconds_total
process_cpu_usage

And some platform metrics:

go_goroutines
http_server_requests_seconds_count

And custom application metrics:

access_log_total
iotd_api_image_load_total
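The same queries can be run outside the browser through the Prometheus HTTP API, which serves instant queries at /api/v1/query. The following is a minimal sketch using only the Python standard library. It assumes Prometheus is reachable at the demo address, and it keys results by the instance label, which is a choice made for this example only. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS = "http://localhost:9090"  # address published in this demo


def instant_query_url(expr, base=PROMETHEUS):
    """Build the URL for an instant query."""
    return f"{base}/api/v1/query?{urlencode({'query': expr})}"


def vector_values(payload):
    """Map each series' instance label to its current value."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for series in payload["data"]["result"]:
        instance = series["metric"].get("instance", "")
        # Each instant sample is [unix_timestamp, "value as a string"].
        values[instance] = float(series["value"][1])
    return values


def run_query(expr):
    """Call Prometheus and return {instance: value}; needs the demo running."""
    with urlopen(instant_query_url(expr), timeout=5) as resp:
        return vector_values(json.load(resp))


# Example, with the demo stack up:
#   print(run_query("go_goroutines"))
```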

Demo 2 - service discovery in Docker Swarm

Running at scale means you can’t use static domain names. Prometheus needs to scrape every container directly rather than go through a load balancer, which would typically route each scrape to a different container.
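To see why scraping through a load balancer gives misleading data, consider two replicas that each keep their own request counter. The sketch below uses made-up numbers and a simplified model of counter handling, where any drop in value is treated as a counter reset. Prometheus's own functions also extrapolate, so real figures would differ, but the distortion is the same in kind.

```python
def increase(samples):
    """Total increase of a counter series, treating any drop as a reset."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so count the new value.
        total += cur - prev if cur >= prev else cur
    return total


# Hypothetical request counters for two replicas over four scrapes.
replica_a = [100, 110, 120, 130]
replica_b = [10, 12, 14, 16]

# Scraping each container directly: two clean series.
direct = increase(replica_a) + increase(replica_b)

# Scraping through a round-robin load balancer: one series that
# alternates between the replicas on every scrape.
via_lb = [replica_a[0], replica_b[1], replica_a[2], replica_b[3]]
through_lb = increase(via_lb)

print("direct:", direct)          # 36.0 requests actually handled
print("through LB:", through_lb)  # 136.0, inflated by the false resets
```

The interleaved series jumps between unrelated counters, so every switch looks like either a reset or a burst of traffic.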

Prometheus supports dynamic service discovery for many platforms, including Docker Swarm.

Switch to Swarm mode and deploy Prometheus, configured with service discovery:

docker-compose -f docker-compose.yml -f docker-compose-prometheus.yml down

cd ../demo2

docker swarm init

docker config create prometheus config/prometheus.yml

docker stack deploy -c prometheus.yml prometheus

The configuration in prometheus.yml is more complex; it models an opt-in approach, where components use labels to state whether they want their metrics scraped.
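The opt-in logic can be sketched in a few lines of Python. This is only a model of the idea, not the relabelling rules from the demo's prometheus.yml: the label names metrics.scrape and metrics.path, and the service names, are hypothetical.

```python
def opt_in_targets(services, scrape_label="metrics.scrape", path_label="metrics.path"):
    """Return (service name, metrics path) for services that opted in."""
    targets = []
    for service in services:
        labels = service.get("labels", {})
        if labels.get(scrape_label) != "true":
            continue  # no label, or not set to "true": leave the service alone
        targets.append((service["name"], labels.get(path_label, "/metrics")))
    return targets


# Hypothetical discovered services and their labels.
DISCOVERED = [
    {"name": "web", "labels": {"metrics.scrape": "true"}},
    {"name": "image-api", "labels": {"metrics.scrape": "true",
                                     "metrics.path": "/actuator/prometheus"}},
    {"name": "proxy", "labels": {}},  # never opted in, so never scraped
]

print(opt_in_targets(DISCOVERED))
```

With opt-in, a component deployed without any labels is simply ignored, which is a safe default on a shared cluster.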

Browse to:

the Prometheus config at http://localhost:9090/config
discovered services at http://localhost:9090/service-discovery

Now deploy the APOD app as a Swarm stack:

docker stack deploy -c apod.yml apod

The apod.yml app definition includes the Prometheus setup in the service labels.

Browse to:

the new app at http://localhost:8010/
the updated service list at http://localhost:9090/service-discovery
graphs at http://localhost:9090/graph for the metrics image_gallery_requests_total and iotd_api_image_load_total

Refresh the app UI several times, then check the metrics again.

Switch to graph mode in the Prometheus UI.
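If refreshing the browser by hand gets tedious, the traffic can be scripted. Here is a small standard-library sketch that requests a URL repeatedly and counts the successful responses. The function names are hypothetical, and the fetch parameter exists only so the loop can be exercised without a running app.

```python
import time
from urllib.request import urlopen


def fetch_status(url):
    """Return the HTTP status code for a GET request."""
    with urlopen(url, timeout=5) as resp:
        return resp.status


def generate_load(url, count, fetch=fetch_status, pause=0.0):
    """Request url count times; return how many calls returned 200."""
    ok = 0
    for _ in range(count):
        try:
            if fetch(url) == 200:
                ok += 1
        except OSError:
            pass  # connection errors and HTTP errors count as failures
        if pause:
            time.sleep(pause)
    return ok


# Example, with the app from this demo running:
#   print(generate_load("http://localhost:8010/", 50, pause=0.1))
```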

Demo 3 - service discovery in Kubernetes

It’s the same principle in Kubernetes: you deploy Prometheus with a configuration that connects to the Kubernetes API for service discovery.

Clear down and check Kubernetes:

docker swarm leave -f

docker ps 

kubectl get nodes

kubectl get ns

The configuration in prometheus-config.yaml shows an alternative approach - an opt-out model, where Pods within the default namespace are included unless they have an annotation to exclude them.
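The opt-out rule can be modelled in Python as follows. This sketches the selection logic only, not the relabelling rules in prometheus-config.yaml. The annotation name prometheus.io/scrape and the Pod names are assumptions made for illustration.

```python
def opt_out_targets(pods, namespace="default", annotation="prometheus.io/scrape"):
    """Return names of Pods to scrape: all in the namespace unless excluded."""
    selected = []
    for pod in pods:
        if pod["namespace"] != namespace:
            continue  # only the chosen namespace is considered
        if pod.get("annotations", {}).get(annotation) == "false":
            continue  # the Pod explicitly opted out
        selected.append(pod["name"])
    return selected


# Hypothetical Pods as they might be returned by service discovery.
PODS = [
    {"name": "web-1", "namespace": "default"},
    {"name": "api-1", "namespace": "default",
     "annotations": {"prometheus.io/scrape": "false"}},
    {"name": "grafana-1", "namespace": "monitoring"},
]

print(opt_out_targets(PODS))
```

Compared with opt-in, new Pods are monitored by default, so nothing is missed, at the cost of scraping Pods that may not expose a metrics endpoint.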

Deploy Prometheus to Kubernetes:

cd ../demo3

kubectl apply -f ./prometheus/

kubectl get all -n monitoring

Prometheus needs access to query the Kubernetes API, so this deployment includes RBAC resources.

Browse to:

Prometheus service discovery at http://localhost:9091/service-discovery

Now deploy the APOD app into the default namespace:

kubectl apply -f ./apod/

kubectl get pods -n default

Browse to:

the new app at http://localhost:8014/
Prometheus target list at http://localhost:9091/targets
metrics at http://localhost:9091/graph

The raw metrics can be used to drive a Grafana dashboard. Grafana runs in a Pod and connects to the Prometheus API to run queries and visualize the results.
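Dashboard panels are typically driven by range queries, which the Prometheus HTTP API serves at /api/v1/query_range. As a rough sketch of that exchange, the Python below builds a range query URL and flattens a response into points that could be plotted. The canned response and the instance name are made up. The base URL is the address published on the host in this demo; inside the cluster Grafana would use the Prometheus Service address instead.

```python
from urllib.parse import urlencode


def range_query_url(base, expr, start, end, step):
    """Build a range query URL; start and end are Unix times, step is seconds."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    return f"{base}/api/v1/query_range?{urlencode(params)}"


def matrix_points(payload):
    """Flatten a range-query response into {labels: [(timestamp, value), ...]}."""
    series = {}
    for item in payload["data"]["result"]:
        key = tuple(sorted(item["metric"].items()))
        # Values arrive as [unix_timestamp, "value as a string"] pairs.
        series[key] = [(float(ts), float(val)) for ts, val in item["values"]]
    return series


# Made-up response in the shape the API returns for a range query.
CANNED = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {"__name__": "go_goroutines", "instance": "example:80"},
                "values": [[0, "10"], [15, "12"]],
            }
        ],
    },
}

print(range_query_url("http://localhost:9091", "go_goroutines", 0, 60, 15))
print(matrix_points(CANNED))
```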

Deploy Grafana, with a pre-configured APOD dashboard:

kubectl apply -f ./grafana/

kubectl get pods -n monitoring

Browse to:

http://localhost:3000
sign in with credentials ecs/ecs
check the APOD dashboard

Teardown

kubectl delete ns,svc,deploy,clusterrole,clusterrolebinding -l ecs=v1

Coming next

ECS-V2: Logging with Elasticsearch, Fluentd and Kibana

Elton's Container Show