In progress: this page is being written.

CLI reference

The llmkube CLI is a thin client over the Kubernetes API: it knows how to construct Model and InferenceService manifests from a small set of flags.
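As a rough illustration of what the client generates, a deploy call might expand into manifests of the following shape. This is a sketch only: the API group, version, and every field name below are assumptions for illustration, not the actual CRD schema.

    # Hypothetical manifests; real field names may differ.
    apiVersion: llmkube.dev/v1alpha1    # assumed API group/version
    kind: Model
    metadata:
      name: my-model                    # placeholder name
    spec:
      sourceURL: https://example.com/model.Q4_K_M.gguf  # assumed field for a custom GGUF source
    ---
    apiVersion: llmkube.dev/v1alpha1    # assumed API group/version
    kind: InferenceService
    metadata:
      name: my-model
    spec:
      modelRef: my-model                # assumed reference to the Model above
      replicas: 1                       # assumed replica-count field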

What this page will cover

  • llmkube deploy: deploy a catalog model or a custom GGUF source URL.
  • llmkube catalog list / info: browse the bundled model catalog.
  • llmkube status: check phase, replicas, and endpoint for a deployed InferenceService.
  • llmkube version: print client and server build info.
Until this page is complete, the CLI section of the project README is the authoritative reference. Example invocations are sketched below.
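A minimal sketch of how the commands listed above might be invoked. The subcommand names come from the list; the positional model-name argument and the --source flag are illustrative assumptions, not confirmed syntax.

    # Browse the bundled model catalog.
    llmkube catalog list
    llmkube catalog info my-model       # placeholder model name

    # Deploy a catalog model, or a custom GGUF source URL
    # (--source is an assumed flag, shown for illustration).
    llmkube deploy my-model
    llmkube deploy my-model --source https://example.com/model.gguf

    # Check phase, replicas, and endpoint for a deployed InferenceService.
    llmkube status my-model

    # Print client and server build info.
    llmkube version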