The same connection string configures both the LLM formatter (--llm-format-model) and the navigator (--navigator-type). It selects an LLM backend and the model that runs on it. This is a Pro-tier capability.
Connection string format
protocol://[key@][host]/model[?options]
- protocol picks the backend (see below).
- key is the API key for hosted providers, sent as key@.
- host overrides the default endpoint, for self-hosted or proxied APIs.
- model is the model name, or a file path for local.
- options are query parameters that tune the request.
A bare protocol with no URL is also valid and uses that backend’s defaults, e.g. local.
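To make the format concrete, the sketch below assembles a connection string from its parts. The conn_string helper is hypothetical and not part of zshot, and the sk-example key, the ollama.example.internal host, and the my-model name are placeholders chosen for illustration.

```shell
# Hypothetical helper (not part of zshot): build
#   protocol://[key@][host]/model[?options]
# from five positional arguments. Pass "" for any part that is unused.
conn_string() {
  protocol=$1
  key=$2
  host=$3
  model=$4
  options=$5

  url="${protocol}://"
  if [ -n "$key" ]; then
    url="${url}${key}@"          # key is sent as key@
  fi
  url="${url}${host}/${model}"   # host may be empty; the slash is always present
  if [ -n "$options" ]; then
    url="${url}?${options}"      # options follow as a query string
  fi
  printf '%s\n' "$url"
}

# Hosted provider, default endpoint (placeholder key).
conn_string anthropic "sk-example" "" "claude-haiku-4-5" ""
# prints anthropic://sk-example@/claude-haiku-4-5

# Local GGUF file, no key and no host.
conn_string local "" "" "path/to/model.gguf" "gpu_layers=1"
# prints local:///path/to/model.gguf?gpu_layers=1

# Custom host (placeholder host and model name).
conn_string ollama "" "ollama.example.internal" "my-model" ""
# prints ollama://ollama.example.internal/my-model
```

The first two results match the connection strings used in the examples on this page. With key and host both empty, the three slashes in the local form fall out of the format naturally.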
Providers
The provider set is identical for the formatter and the navigator.
| Protocol | Backend | Key |
|---|---|---|
| local | llama-cpp, running a GGUF model file on the host | none |
| ollama | Ollama, local or remote | none |
| anthropic | Anthropic Claude | required |
| openai | OpenAI | required |
| google | Google Gemini | required |
| mistral | Mistral | required |
| deepseek | DeepSeek | required |
| groq | Groq | required |
| huggingface | HuggingFace Inference | required |
Query options
| Option | Applies to | Effect |
|---|---|---|
| max_tokens | all | Cap on the completion length. |
| temperature | all | Sampling temperature. Omitted unless set, so the provider’s own default applies. |
| top_p | all | Nucleus sampling cutoff. Omitted unless set, so the provider’s own default applies. |
| vision | all | Force vision input on or off. |
| insecure | hosted | Use http instead of https for a custom host. |
| gpu_layers | local | Model layers to offload to the GPU. |
| ctx_size | local | Context window size. |
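Several options can presumably be combined in one connection string. The sketch below assumes they are joined with "&", as in an ordinary URL query string; this page does not state the separator explicitly. The ctx_size value is illustrative, not a recommended setting.

```shell
# Assumption: options are joined with "&", like standard URL query parameters.
# ctx_size=4096 is an illustrative value only.
MODEL_PATH="/path/to/model.gguf"
NAVIGATOR="local://${MODEL_PATH}?gpu_layers=1&ctx_size=4096"

# Keep the string quoted. Unquoted, the shell would treat "&" as
# "run in the background" and could treat "?" as a glob character.
printf '%s\n' "$NAVIGATOR"

# It can then be passed exactly like the single-option form:
#   zshot --navigate "Click the News link" \
#     --navigator-type "$NAVIGATOR" \
#     https://example.com
```

Holding the connection string in a variable keeps the quoting in one place, which matters once the string contains shell metacharacters.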
max_tokens caps the model’s completion length. When omitted, most backends
send no cap and the provider applies its own high default. Anthropic is the
exception: its Messages API requires the field, so zshot supplies a default of
4096, generous enough that ordinary page summaries finish cleanly. [1]
Raise it with ?max_tokens=8192 for very long extractions.
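As a sketch of raising the cap, the snippet below appends ?max_tokens=8192 to the Anthropic connection string used elsewhere on this page. The sk-placeholder fallback, the prompt text, and the command -v guard are additions for illustration; the guard only keeps the sketch runnable on a machine where zshot is not installed.

```shell
# ANTHROPIC_API_KEY is expected in the environment; the fallback value is a
# placeholder so the sketch runs without a real key.
API_KEY="${ANTHROPIC_API_KEY:-sk-placeholder}"

# Same model as the hosted example, with the completion cap raised to 8192.
FORMATTER="anthropic://${API_KEY}@/claude-haiku-4-5?max_tokens=8192"

# Only invoke zshot if it is actually on PATH.
if command -v zshot >/dev/null 2>&1; then
  zshot -t llm_json \
    --llm-format-model "$FORMATTER" \
    --llm-format-prompt "Extract every heading and paragraph on this page." \
    https://example.com -f out.json
fi
```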
Examples
A hosted Anthropic model:
zshot -t llm_json \
  --llm-format-model "anthropic://$ANTHROPIC_API_KEY@/claude-haiku-4-5" \
  --llm-format-prompt "Summarize this page." \
  https://example.com -f out.json
A local GGUF model driving the navigator:
zshot --navigate "Click the News link" \
  --navigator-type "local:///path/to/model.gguf?gpu_layers=1" \
  https://example.com
[1] When a completion does hit the cap, zshot reports “LLM output truncated at the N-token completion limit” and suggests a higher value, rather than a confusing parse error.