Lab 00 — Bring Prometheus + Grafana up locally

Goal: a one-shot bootstrap of the local observability stack that subsequent labs send metrics and traces to.

Estimated time: 45–60 minutes (mostly waiting for image pulls).

Prereq: Docker + docker-compose installed on Fedora 43 (sudo dnf install docker docker-compose). User added to docker group. Phase 33 server runs locally on a known port (default 8000).


What you produce

A working infra/compose/observability.yml and a verified stack reachable at:

  • Prometheus UI: http://localhost:9090
  • Grafana UI: http://localhost:3000 (admin/admin on first start)
  • OTel-Collector OTLP/gRPC endpoint: localhost:4317
  • OTel-Collector OTLP/HTTP endpoint: localhost:4318
  • Tempo (no UI of its own; queried through the Grafana data source): HTTP API on localhost:3200

Plus:

  • infra/compose/prometheus.yml: scrape config for Borja's local Phase 33 server on port 8000.
  • infra/compose/otel-collector.yaml: pipeline of OTLP receiver → batch processor → exporters to Tempo and debug (console logging).
  • infra/grafana/provisioning/datasources/all.yaml: auto-provisions Prometheus and Tempo as data sources.
  • Justfile recipe: just serve-obs brings the stack up; just stop-obs tears it down.

TODOs

Block A — write the compose file

  • Services to include:
      • prometheus (prom/prometheus:v2.54.0 or current LTS)
      • grafana (grafana/grafana-oss:11.x)
      • tempo (grafana/tempo:2.x)
      • otel-collector (otel/opentelemetry-collector-contrib:0.x)
      • node-exporter (prom/node-exporter:v1.x): exposes the host's USE (utilization, saturation, errors) metrics for Prometheus to scrape
  • All services on the same docker network (obs-net).
  • Prometheus mount ./prometheus.yml:/etc/prometheus/prometheus.yml.
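  • Grafana mount ../grafana/provisioning:/etc/grafana/provisioning (the path is relative to the compose file in infra/compose/, and provisioning lives in infra/grafana/).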
  • Tempo with the minimal local-storage config.
  • node-exporter with --path.rootfs=/host and the host root bind-mounted (read-only).

Do not use docker-compose v1 syntax: target the Compose v2 plugin (docker compose, with a space) and omit the top-level version: line, which v2 treats as obsolete.
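The Block A requirements could be sketched roughly as follows. This is an illustrative starting point, not the reference solution. The image tags are the placeholders from the list above and must be replaced with real pinned tags or digests; the tempo.yaml file name, the in-container config paths passed via command, and the 127.0.0.1 port bindings are assumptions made for this sketch.

```yaml
# infra/compose/observability.yml (sketch; no top-level "version:" line)
services:
  prometheus:
    image: prom/prometheus:v2.54.0
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "127.0.0.1:9090:9090"          # bound to loopback: localhost-only stack
    extra_hosts:
      - "host.docker.internal:host-gateway"  # lets Prometheus reach a server on the host
    networks: [obs-net]

  grafana:
    image: grafana/grafana-oss:11.x     # placeholder tag; pin a real version or digest
    volumes:
      # Relative to this compose file; assumes provisioning lives in infra/grafana/provisioning
      - ../grafana/provisioning:/etc/grafana/provisioning:ro
    ports:
      - "127.0.0.1:3000:3000"
    networks: [obs-net]

  tempo:
    image: grafana/tempo:2.x            # placeholder tag
    command: ["-config.file=/etc/tempo.yaml"]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml:ro # assumed file name for the minimal Tempo config
    ports:
      - "127.0.0.1:3200:3200"
    networks: [obs-net]

  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.x  # placeholder tag
    command: ["--config=/etc/otel-collector.yaml"]
    volumes:
      - ./otel-collector.yaml:/etc/otel-collector.yaml:ro
    ports:
      - "127.0.0.1:4317:4317"           # OTLP/gRPC
      - "127.0.0.1:4318:4318"           # OTLP/HTTP
    networks: [obs-net]

  node-exporter:
    image: prom/node-exporter:v1.x      # placeholder tag
    command: ["--path.rootfs=/host"]
    pid: host
    volumes:
      - /:/host:ro,rslave               # host root, read-only
    networks: [obs-net]

networks:
  obs-net: {}
```

node-exporter publishes no host port here because Prometheus reaches it over obs-net at node-exporter:9100. On Fedora with SELinux enforcing, the bind mounts may need the adjustments described under Pitfalls.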

Block B — write prometheus.yml

Scrape jobs:

  • prometheus itself (localhost:9090).
  • node-exporter (node-exporter:9100).
  • tempo (tempo:3200), the fourth job checked in the Block F smoke test.
  • miniserve (the Phase 33 server). Address depends on whether the server runs in docker or on the host:
      • If host: use host.docker.internal:8000 (Linux: add extra_hosts: host.docker.internal:host-gateway to the Prometheus service).
      • If docker: add miniserve to the same network and scrape miniserve:8000.

Scrape interval: 5 s. Prometheus's built-in default is 1 m and its example config uses 15 s, either of which is fine for production; 5 s gives faster feedback during development.
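A minimal sketch of a prometheus.yml matching the jobs above, assuming the Phase 33 server runs on the host and serves metrics at Prometheus's default /metrics path (both are assumptions; adjust the miniserve target if the server runs in docker). The tempo job is included because the Block F smoke test expects it.

```yaml
# infra/compose/prometheus.yml (sketch)
global:
  scrape_interval: 5s        # fast feedback for development

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: node-exporter
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: miniserve
    # Host-run server; needs extra_hosts on the Prometheus service under Linux.
    static_configs:
      - targets: ["host.docker.internal:8000"]

  - job_name: tempo
    # Tempo serves its own metrics on its HTTP port.
    static_configs:
      - targets: ["tempo:3200"]
```

Use spaces only for indentation; a tab anywhere in this file will keep Prometheus from starting.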

Block C — write otel-collector.yaml

Pipeline:

receivers:  [otlp]      # gRPC :4317, HTTP :4318
processors: [batch]     # buffer to reduce export load
exporters:  [otlp/tempo, debug]

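The otlp/tempo exporter sends to Tempo's OTLP/gRPC endpoint at tempo:4317. Because this stack runs without TLS, the exporter needs tls.insecure: true. The debug exporter writes spans to the collector's console log, visible via docker compose logs otel-collector (useful for the next lab).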
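Expanded into a full collector config, the pipeline above might look like the sketch below. It assumes a traces-only pipeline, which is what this lab describes. The 0.0.0.0 bind addresses and the basic verbosity level are choices made for this sketch rather than anything the lab prescribes.

```yaml
# infra/compose/otel-collector.yaml (sketch)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # listen on all container interfaces
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}                      # default batching settings

exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true             # plaintext gRPC; this stack has no TLS
  debug:
    verbosity: basic             # one summary line per exported batch

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo, debug]
```

Raising verbosity to detailed prints every span, which can be handy when checking instrumentation but is noisy under load.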

Block D — provision Grafana data sources

In infra/grafana/provisioning/datasources/all.yaml:

  • Prometheus data source: URL http://prometheus:9090.
  • Tempo data source: URL http://tempo:3200.
  • Set Prometheus as the default.

Both will appear under Connections → Data sources on first Grafana start.
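A provisioning file along these lines should cover the three requirements above. The data source names (Prometheus, Tempo) are display names chosen for this sketch, and access: proxy is assumed so that Grafana's backend, which sits on obs-net, makes the requests.

```yaml
# infra/grafana/provisioning/datasources/all.yaml (sketch)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true

  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo:3200
```

Grafana reads this directory at startup, so changes to the file take effect after a container restart.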

Block E — Justfile recipes

serve-obs:
    docker compose -f infra/compose/observability.yml up -d
    @echo "Prometheus: http://localhost:9090"
    @echo "Grafana:    http://localhost:3000 (admin/admin)"
    @echo "Tempo:      via Grafana"

stop-obs:
    docker compose -f infra/compose/observability.yml down

Block F — smoke test

  • just serve-obs.
  • Open http://localhost:9090/targets. All four scrape jobs (prometheus, node-exporter, miniserve, tempo) should be UP. If the Phase 33 server isn't running yet, miniserve alone may be DOWN; that's fine for this lab.
  • Open http://localhost:3000. Log in as admin/admin and, when prompted, change the password to something local (e.g. localdev).
  • Navigate to Connections → Data sources. Prometheus and Tempo should both be listed and pass the connection test.
  • Run a trivial PromQL query: up. It should return one series per scrape target, with value 1 for targets that are UP and 0 for targets that are DOWN.
  • just stop-obs. Verify clean shutdown.

Constraints

  • No production-grade config. No TLS, no auth on Prometheus, no Grafana SMTP. The stack is localhost-only.
  • No persistent volumes for Prometheus/Tempo data. Each restart wipes. Fine for a learning environment; the lab notes how to add volumes if Borja wants persistence across restarts.
  • No external dependencies pulled at runtime. All images pinned by digest in the compose file. Re-resolve digests with docker pull if needed; commit the digests.
  • No Loki yet. Structured logs land in stdout for now; Loki integration is a Phase 38 nice-to-have.
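If persistence across restarts is wanted later, a named volume is one way to get it. The fragment below is a sketch for Prometheus only, assuming the image keeps its TSDB under /prometheus (the path the official image uses by default); the volume name prometheus-data is hypothetical. Tempo would follow the same pattern, but the ownership of its data directory should be checked against the user the Tempo image runs as.

```yaml
# Additions to infra/compose/observability.yml (sketch)
services:
  prometheus:
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus   # TSDB survives container restarts

volumes:
  prometheus-data:
```

Note that docker compose down leaves named volumes in place; add -v to remove them as well.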

Stop conditions

Done when:

  1. infra/compose/observability.yml brings up all five services cleanly.
  2. All four scrape jobs (minus miniserve if not running) are UP.
  3. You can log in to Grafana, both data sources are listed, and the query up returns data.
  4. just stop-obs cleanly removes everything.

Pitfalls (read before debugging)

  • host.docker.internal on Linux is not automatic. You need extra_hosts: ["host.docker.internal:host-gateway"] on the service that needs to reach the host.
  • Grafana volume permissions. Grafana's container runs as UID 472. If a bind-mounted directory it must write to has the wrong owner, Grafana refuses to start. Either chown the bind-mounted directory to 472, or use a named volume.
  • Prometheus refuses to start if prometheus.yml has a syntax error. The error appears in docker compose logs prometheus. A common cause is tab indentation; YAML allows only spaces.
  • Tempo's storage config drift. Tempo 2.x's storage config differs from 1.x — copy from the current official single-binary example, not from old blog posts.
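  • SELinux on Fedora can block container access to bind mounts. Either add the :Z flag to each project bind mount, or run setenforce 0 for the dev session (and document it). Do not put :Z on the node-exporter host-root mount: relabeling / can break the host. For that one service, disabling labeling with security_opt: ["label=disable"] is the safer route.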
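For orientation on the Tempo pitfall, a minimal local-storage config in the Tempo 2.x layout looks roughly like the sketch below. Treat it as an assumption to verify against the current official single-binary example, not as a substitute for it. The /tmp/tempo paths are chosen here because they are writable without extra setup and fit the no-persistence constraint.

```yaml
# infra/compose/tempo.yaml (sketch; file name is an assumption)
server:
  http_listen_port: 3200         # HTTP API and metrics

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317 # where the collector's otlp/tempo exporter connects

storage:
  trace:
    backend: local               # local disk, no object store
    wal:
      path: /tmp/tempo/wal
    local:
      path: /tmp/tempo/blocks
```

The explicit 0.0.0.0 endpoint matters because the collector connects from another container; a receiver bound only to localhost would be unreachable from it.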

When to consult solutions/

Consult it only after your compose stack is up. The solution at solutions/00-prom-grafana-up-ref.md (written at phase open) shows a working compose file and the standard provisioning layout.


Next lab: lab/01-instrument-server.md.