bus operator inference
Inference Operations
bus operator inference controls AI inference runtime setup through a
provider-neutral command surface. It handles runtime installation, model
availability, status checks, and readiness verification for a selected node.
Concrete providers plug in behind provider-specific integrations; Ollama, for
example, is implemented by bus-integration-ollama.
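All subcommands share a common shape; the synopsis below is inferred from the
examples later in this section:

bus operator inference <install|status|verify> --node <node> --provider <provider>
bus operator inference model ensure --node <node> --provider <provider> --model <model>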
Run these commands from an operator workstation or bootstrap host that has the
Bus deployment inventory and can reach the selected node through the configured
node/SSH path. The --node value comes from the deployment inventory or cloud
status output. The selected provider must be available through
bus-integration-inference. For Ollama, install bus-integration-ollama and make
sure the target node has the OS permissions and network access needed to
install the runtime and fetch models: in practice, root or sudo access on the
inference node and outbound HTTPS to the configured model source.
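As a preflight sketch, assuming a node named gpu with an SSH alias of the same
name (the registry URL is illustrative; Ollama fetches from registry.ollama.ai
by default, but your deployment may configure a different model source):

# Confirm the node itself is healthy before touching the inference runtime
bus operator node verify --id gpu

# Check for passwordless sudo and outbound HTTPS from the node itself
ssh gpu 'sudo -n true && curl -sI https://registry.ollama.ai >/dev/null && echo preflight ok'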
bus operator inference install --node gpu --provider ollama
bus operator inference model ensure --node gpu --provider ollama --model llama3.2:3b
bus operator inference status --node gpu --provider ollama
bus operator inference verify --node gpu --provider ollama
install performs the runtime install and configure actions on the node.
model ensure performs an idempotent model availability action: it fetches the
model if it is missing and is a no-op if it is already present. status
returns the provider runtime status for the node. verify runs readiness
checks. If readiness fails, run bus operator node verify --id gpu first to
rule out node-level problems, then check the provider integration
diagnostics, such as bus-integration-ollama --self-test.
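A typical recovery pass after a failed readiness check, using only the
commands named above:

# Rule out node-level problems first
bus operator node verify --id gpu

# Then run the provider integration's own diagnostics
bus-integration-ollama --self-test

# Finally, re-check inference readiness
bus operator inference verify --node gpu --provider ollama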
Use this command when operating model-serving hosts as part of a Bus deployment.
In a running Bus system, bus-api-provider-inference exposes the matching
internal API surface and bus-integration-inference owns the shared event
contract.