Monitoring Setup

tenement exports Prometheus metrics at /metrics. This guide covers setting up monitoring and alerting.

curl http://localhost:8080/metrics

| Metric | Type | Description |
| --- | --- | --- |
| instance_count | gauge | Total running instances |
| instance_status{process,instance} | gauge | Instance health (1=healthy) |
| instance_uptime_seconds{process,instance} | gauge | Instance uptime in seconds |
| instance_restarts{process,instance} | counter | Restart count |
| instance_memory_bytes{process,instance} | gauge | Memory usage in bytes |
| instance_storage_bytes{process,instance} | gauge | Disk usage in bytes |
| instance_storage_quota_bytes{process,instance} | gauge | Storage limit in bytes |
| http_requests_total{method,path,status} | counter | Request count |
| http_request_duration_seconds{method,path} | histogram | Request latency in seconds |
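For quick checks without a Prometheus server, the endpoint's text output can be filtered directly. The sketch below assumes the standard Prometheus text exposition format (one `name{labels} value` sample per line) and label values without spaces. The helper names `metric_value` and `unhealthy_instances` are illustrative and not part of tenement.

```shell
# metric_value NAME
# Reads Prometheus text exposition on stdin and prints the value of the
# first unlabeled sample whose name is exactly NAME.
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# unhealthy_instances
# Prints the label set of every instance_status sample whose value is 0.
# Assumption: label values contain no spaces, so the sample is two fields.
unhealthy_instances() {
    awk 'index($1, "instance_status{") == 1 && $2 == 0 {
        print substr($1, length("instance_status") + 1)
    }'
}
```

Possible usage against a local server: `curl -s http://localhost:8080/metrics | metric_value instance_count`, or pipe the same output into `unhealthy_instances`.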
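
Install Prometheus:

# Ubuntu/Debian
apt install prometheus
# Or Docker (inside the container, localhost refers to the container itself,
# so the scrape target must be an address that reaches the tenement host)
docker run -d -p 9090:9090 -v /etc/prometheus:/etc/prometheus prom/prometheus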

Add to /etc/prometheus/prometheus.yml:

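scrape_configs:
  - job_name: 'tenement'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: /metrics
    scrape_interval: 15s
    # tenement's metrics carry their own `instance` label. Without this,
    # Prometheus renames it to `exported_instance` and sets `instance`
    # to the scrape target address.
    honor_labels: true

Apply the change and verify the target:

# Restart Prometheus
systemctl restart prometheus
# Check targets
curl http://localhost:9090/api/v1/targets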
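
The targets endpoint returns JSON in which each active target has a `health` field (`up`, `down`, or `unknown`). A rough sketch for counting targets that are not up, using text matching rather than real JSON parsing; `down_target_count` is a hypothetical helper, and a JSON tool such as jq would be more robust if it is installed.

```shell
# down_target_count
# Reads the JSON body of /api/v1/targets on stdin and prints how many
# targets report a health value other than "up".
down_target_count() {
    grep -oE '"health"[[:space:]]*:[[:space:]]*"[^"]*"' |
        awk '!/"up"$/ { n++ } END { print n + 0 }'
}
```

Possible usage: `curl -s http://localhost:9090/api/v1/targets | down_target_count`. A result of 0 suggests every configured target is being scraped successfully.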

Install Grafana:

# Ubuntu/Debian (requires the Grafana APT repository to be configured)
apt install grafana
# Or Docker
docker run -d -p 3000:3000 grafana/grafana

Then add Prometheus as a data source:

  1. Open Grafana (http://localhost:3000)
  2. Configuration → Data Sources → Add
  3. Select Prometheus
  4. URL: http://localhost:9090
  5. Save & Test

Create a dashboard with these panels:

Instance Count

instance_count

Instance Health

instance_status

Memory Usage by Instance

instance_memory_bytes

Storage Usage

instance_storage_bytes / instance_storage_quota_bytes * 100

Request Rate

rate(http_requests_total[5m])

Request Latency (p99)

histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))

Restart Rate

increase(instance_restarts[1h])
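
The Storage Usage expression divides each instance's usage by the quota series that carries the same labels. As a cross-check outside Grafana, the same ratio can be computed from raw /metrics output. This is a sketch that assumes the standard text exposition format and label values without spaces; `storage_percent` is a hypothetical helper name.

```shell
# storage_percent
# Reads Prometheus text exposition on stdin and prints, per label set,
# instance_storage_bytes as a percentage of instance_storage_quota_bytes.
storage_percent() {
    awk '
        {
            brace = index($1, "{")
            if (brace == 0) next        # skip comments and unlabeled samples
            name = substr($1, 1, brace - 1)
            labels = substr($1, brace)
        }
        name == "instance_storage_bytes"       { used[labels] = $2 }
        name == "instance_storage_quota_bytes" { quota[labels] = $2 }
        END {
            # Only report label sets that have a non-zero quota.
            for (l in used)
                if (quota[l] + 0 > 0)
                    printf "%s %.1f\n", l, used[l] / quota[l] * 100
        }
    '
}
```

Possible usage: `curl -s http://localhost:8080/metrics | storage_percent`. Output order is not guaranteed when there are several instances.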

Create /etc/prometheus/alerts/tenement.yml:

groups:
  - name: tenement
    rules:
      # Instance down
      - alert: InstanceDown
        expr: instance_status == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.process }}:{{ $labels.instance }} is down"
          description: "Instance has been unhealthy for more than 1 minute"
      # High restart rate
      - alert: HighRestartRate
        expr: increase(instance_restarts[1h]) > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.process }}:{{ $labels.instance }} restarting frequently"
          description: "Instance has restarted {{ $value }} times in the last hour"
      # Storage near limit
      - alert: StorageNearLimit
        expr: instance_storage_bytes / instance_storage_quota_bytes > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.process }}:{{ $labels.instance }} storage > 90%"
          description: "Storage at {{ $value | humanizePercentage }}"
      # No instances running
      - alert: NoInstancesRunning
        expr: instance_count == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "No tenement instances running"
          description: "All instances have stopped"
      # High latency
      - alert: HighLatency
        expr: histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High API latency"
          description: "p99 latency is {{ $value }}s"

Add to prometheus.yml:

rule_files:
  - /etc/prometheus/alerts/*.yml
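
A malformed rule file can cause Prometheus to reject its configuration, so it may be worth validating the rules before restarting. The sketch below wraps promtool, which is distributed with Prometheus; `validate_rules` is a hypothetical helper name, and the return code 3 for a missing tool is an arbitrary choice.

```shell
# validate_rules RULE_FILE...
# Runs "promtool check rules" on the given files.
# Returns 3 if promtool is not installed, otherwise promtool's exit status.
validate_rules() {
    if ! command -v promtool > /dev/null 2>&1; then
        echo "promtool not found in PATH" >&2
        return 3
    fi
    promtool check rules "$@"
}
```

Possible usage: `validate_rules /etc/prometheus/alerts/tenement.yml && systemctl restart prometheus`, which only restarts when the rules parse cleanly.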

For Slack, PagerDuty, or email notifications, configure Alertmanager in /etc/alertmanager/alertmanager.yml:

route:
  receiver: 'slack'
receivers:
  - name: 'slack'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/...'
        channel: '#alerts'
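
To confirm that the route reaches the receiver, one option is to post a synthetic alert to Alertmanager's v2 API. This sketch assumes Alertmanager listens on its default port 9093; the helper names and the alert name `TenementTestAlert` are made up for illustration.

```shell
# test_alert_payload NAME SEVERITY
# Prints a one-element JSON array in the shape accepted by Alertmanager's
# v2 alerts endpoint. NAME and SEVERITY must not contain quotes or backslashes.
test_alert_payload() {
    printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"Test alert, please ignore"}}]\n' "$1" "$2"
}

# send_test_alert [ALERTMANAGER_URL]
# Posts the payload to Alertmanager; the default URL is an assumption.
send_test_alert() {
    test_alert_payload "TenementTestAlert" "warning" |
        curl -sf -X POST -H 'Content-Type: application/json' \
            --data-binary @- "${1:-http://localhost:9093}/api/v2/alerts"
}
```

Running `send_test_alert` should produce a message in the configured channel after Alertmanager's grouping delay. Since no end time is set, the alert resolves on its own once Alertmanager's resolve timeout passes.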

Check instance health with the ten CLI:

# Instance health
ten ps
# Specific instance
ten health api:prod

Or over HTTP:

# Server health
curl http://localhost:8080/health
# All instances via API
curl -H "Authorization: Bearer $TOKEN" http://localhost:8080/api/instances
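
For external monitoring systems, a standalone check script (check-tenement.sh):

#!/bin/bash
# Check server is up
if ! curl -sf http://localhost:8080/health > /dev/null; then
    echo "CRITICAL: tenement server down"
    exit 2
fi
# Check instance count (match the exact metric name, not a prefix)
COUNT=$(curl -sf http://localhost:8080/metrics | awk '$1 == "instance_count" {print $2; exit}')
# An empty or non-integer value would break the numeric test below
if ! [[ "$COUNT" =~ ^[0-9]+$ ]]; then
    echo "UNKNOWN: could not read instance_count"
    exit 3
fi
if [ "$COUNT" -eq 0 ]; then
    echo "WARNING: no instances running"
    exit 1
fi
echo "OK: $COUNT instances running"
exit 0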

tenement doesn’t persist logs, so ship them to an external service. With Vector, in /etc/vector/vector.toml:

[sources.tenement_api]
type = "http_client"
endpoint = "http://localhost:8080/api/logs/stream"
headers.Authorization = "Bearer ${TENEMENT_TOKEN}"

[sinks.loki]
type = "loki"
inputs = ["tenement_api"]
endpoint = "http://loki:3100"
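
Alternatively, if instance logs are written to files under /var/log/tenement/, Promtail can tail them. In /etc/promtail/config.yml:

scrape_configs:
  - job_name: tenement
    static_configs:
      - targets:
          - localhost
        labels:
          job: tenement
          __path__: /var/log/tenement/*.log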