In progress: this page is still being written.
Observability
Wire LLMKube into your existing Prometheus + Grafana stack. The operator ships a PodMonitor for inference pods and exposes ten custom metrics from the controller itself.
What this page will cover
- Enabling the bundled PodMonitor under monitoring.podMonitor.enabled in values.yaml.
- Switching to ServiceMonitor for clusters that prefer Service-based scrape targets.
- Controller-side metrics: reconcile timing, model download duration, InferenceService phase counts.
- Pod-side metrics from llama.cpp /metrics (always enabled via the --metrics flag).
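
As a minimal sketch of the first item above, the fragment below sets the one key this page names, monitoring.podMonitor.enabled, in values.yaml. It assumes the PodMonitor custom resource definition from the Prometheus Operator is already installed in the cluster. Any other keys under monitoring are not documented here yet and are deliberately left out.

```yaml
# values.yaml (fragment)
# Only the key named on this page is shown; treat anything beyond it as undocumented.
monitoring:
  podMonitor:
    # Create the bundled PodMonitor so Prometheus scrapes the inference pods.
    # Assumption: the Prometheus Operator CRDs (monitoring.coreos.com) exist in the cluster.
    enabled: true
```

The same setting can likely be passed on the command line with Helm's standard --set monitoring.podMonitor.enabled=true, since --set accepts any dotted values path.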