Metrics reporting

The metrics reporting endpoints are /system/metrics/{serviceName}, /system/metrics, and /system/metrics/breakdown. If the metric query parameter is omitted from /system/metrics/{serviceName}, the API returns all supported per-service metrics. The breakdown endpoint returns CSV when format=csv is set, and groups results according to group_by (service, user, or country). To include the list of users per service, set include_users=true (JSON output only).
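As a rough illustration of how these endpoints and query parameters combine, the sketch below builds request URLs with the Python standard library. The base URL and the service name are hypothetical placeholders, and the example assumes include_users is accepted by the breakdown endpoint; only the paths and parameter names come from the description above.

```python
from urllib.parse import quote, urlencode

# Hypothetical cluster endpoint; replace with your own OSCAR URL.
BASE_URL = "https://oscar.example.org"


def metrics_url(path, **params):
    """Join the base URL, an endpoint path, and optional query parameters."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


# No "metric" parameter: the API returns all supported per-service metrics.
service_url = metrics_url(f"/system/metrics/{quote('my-service', safe='')}")

# Breakdown as CSV, grouped by user.
csv_url = metrics_url("/system/metrics/breakdown", format="csv", group_by="user")

# Breakdown as JSON with the per-service user list
# (assumption: include_users applies to this endpoint).
users_url = metrics_url(
    "/system/metrics/breakdown", group_by="service", include_users="true"
)

print(service_url)
print(csv_url)
print(users_url)
```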

The start/end query parameters are optional. If omitted, the API defaults to the last 24 hours (end = now, start = end - 24h).
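The defaulting rule can be sketched as a small helper. The function name resolve_window is hypothetical, and the behaviour when only one of the two bounds is supplied is an assumption; the text above only specifies the case where both are omitted.

```python
from datetime import datetime, timedelta, timezone


def resolve_window(start=None, end=None, now=None):
    """Return (start, end), applying the documented 24-hour default."""
    now = now or datetime.now(timezone.utc)
    end = end or now                            # end = now
    start = start or end - timedelta(hours=24)  # start = end - 24h
    return start, end


start, end = resolve_window()
print(start.isoformat(), end.isoformat())
```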

When using OIDC Bearer tokens, metrics are limited to the services visible to the authenticated user under the usual service visibility rules (public, owned services, and restricted services where the user is listed in allowed_users). When using Basic Auth as the OSCAR admin user, metrics remain cluster-wide and include all services.
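The two authentication modes differ only in the Authorization header sent with the request. A minimal sketch of building each header follows; the helper names, token, and credentials are placeholders rather than part of the OSCAR API.

```python
import base64


def bearer_header(token):
    """Header for an OIDC Bearer token (metrics scoped to visible services)."""
    return {"Authorization": f"Bearer {token}"}


def basic_header(user, password):
    """Header for HTTP Basic Auth (cluster-wide metrics for the admin user)."""
    creds = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {creds}"}


# Placeholder credentials for illustration only.
print(bearer_header("my-oidc-token"))
print(basic_header("oscar", "secret"))
```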

Prometheus usage metrics

CPU/GPU hours are fetched from Prometheus. If PROMETHEUS_URL is not set, the service defaults to http://prometheus-server.monitoring.svc.cluster.local. You can override the default Prometheus queries via:

  • PROMETHEUS_CPU_QUERY (default uses {{service}}, {{range}}, and {{services_namespace}})
  • PROMETHEUS_GPU_QUERY (default uses {{service}}, {{range}}, and {{services_namespace}})
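To show how the placeholders might be filled in, the sketch below renders a custom CPU query template. It assumes plain textual substitution of the {{...}} tokens. The PromQL expression is a hypothetical override, not the OSCAR default, and the service name, range, and namespace values are examples.

```python
def render_query(template, **values):
    """Replace each {{name}} token in the template with its value."""
    for name, value in values.items():
        template = template.replace("{{" + name + "}}", value)
    return template


# Hypothetical value for PROMETHEUS_CPU_QUERY (NOT the built-in default).
cpu_template = (
    "sum(increase(container_cpu_usage_seconds_total{"
    'namespace="{{services_namespace}}", pod=~"{{service}}-.*"}[{{range}}])) / 3600'
)

rendered = render_query(
    cpu_template,
    service="my-service",            # example service name
    range="24h",                     # example query window
    services_namespace="oscar-svc",  # assumed services namespace
)
print(rendered)
```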

Loki request logs (durable breakdowns)

Request-based metrics (breakdowns, request counts) can be sourced from Loki for durable retention. Set LOKI_URL to enable Loki; otherwise, the system falls back to Kubernetes pod logs.

  • LOKI_URL (e.g., http://loki.monitoring.svc.cluster.local:3100)
  • LOKI_QUERY (default uses {{namespace}} and {{app}}; if you add {{service}}, prefer a regex matcher like service=~"{{service}}" so summary queries can expand to .*)
  • LOKI_EXPOSED_QUERY (LogQL query for exposed-service requests; default filters /system/services/.+/exposed)
  • LOKI_EXPOSED_NAMESPACE (default ingress-nginx)
  • LOKI_EXPOSED_APP (default ingress-nginx)
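The advice about using a regex matcher for {{service}} can be illustrated with a short sketch. It assumes textual substitution of the placeholders and that a summary query replaces {{service}} with .*; the stream selector, label values, and function name are hypothetical.

```python
def render_logql(template, namespace, app, service=None):
    """Fill in a LOKI_QUERY-style template; no service means 'match all'."""
    values = {
        "namespace": namespace,
        "app": app,
        # A summary query has no single service, so the placeholder becomes
        # ".*". This only works with a regex matcher (=~), not equality (=).
        "service": service if service else ".*",
    }
    for name, value in values.items():
        template = template.replace("{{" + name + "}}", value)
    return template


# Hypothetical custom LOKI_QUERY using a regex matcher for the service label.
template = '{namespace="{{namespace}}", app="{{app}}", service=~"{{service}}"}'

per_service = render_logql(template, "oscar", "oscar", service="my-service")
summary = render_logql(template, "oscar", "oscar")
print(per_service)
print(summary)
```

With an equality matcher such as service="{{service}}", the summary form would search for a service literally named .* and match nothing, which is why the regex form is preferred.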