Operational Provider Recipes¶
This page is the practical provider setup guide for Hypercontext. It is intentionally operational: choose a backend, set the environment variables, run the example script, and wire the same provider into your own agent code.
Quick Decision Table¶
| Provider | Use It When | Install / Requirements | Main Env Vars |
|---|---|---|---|
| `mock` | You want deterministic offline testing | None | `HYPERCONTEXT_PROVIDER=mock` |
| `anthropic` | You want Claude in production or local testing | `pip install anthropic` | `ANTHROPIC_API_KEY`, `HYPERCONTEXT_MODEL`, optional `HYPERCONTEXT_BASE_URL` |
| `openai` | You want OpenAI models | `pip install openai` (`tiktoken` is optional for better token counts) | `OPENAI_API_KEY`, `HYPERCONTEXT_MODEL`, optional `HYPERCONTEXT_BASE_URL`, `HYPERCONTEXT_ORGANIZATION` |
| `ollama` | You want a local Ollama server | `requests` plus a running Ollama server | `OLLAMA_BASE_URL`, `HYPERCONTEXT_MODEL` |
| `openai_compatible` | You have a server that speaks OpenAI chat completions | `pip install openai` | `OPENAI_COMPATIBLE_BASE_URL`, `HYPERCONTEXT_MODEL` |
| `local` | You want on-device transformers inference | `pip install transformers torch` | `LOCAL_MODEL`, `LOCAL_DEVICE`, `LOCAL_CACHE_DIR` |
For the complete Ollama workflow, see the Ollama guide.
1. Environment Contract¶
Hypercontext reads core settings from the process environment. This page and the provider example scripts use the same contract, so you can test the whole stack without guessing which variable matters.
Core `HYPERCONTEXT_*` keys:

`HYPERCONTEXT_PROVIDER`, `HYPERCONTEXT_MODEL`, `HYPERCONTEXT_API_KEY`, `HYPERCONTEXT_BASE_URL`, `HYPERCONTEXT_ORGANIZATION`, `HYPERCONTEXT_TEMPERATURE`, `HYPERCONTEXT_MAX_TOKENS`, `HYPERCONTEXT_OUTPUT_DIR`, `HYPERCONTEXT_TOKEN_BUDGET`, `HYPERCONTEXT_MAX_GENERATIONS`, `HYPERCONTEXT_PARENT_SELECTION`, `HYPERCONTEXT_CAVEMAN_MODE`, `HYPERCONTEXT_DOCKER_IMAGE`, `HYPERCONTEXT_SANDBOX_ENABLED`, `HYPERCONTEXT_CONVERGENCE_CEILING`
Provider-native variables that the example scripts also understand:
`ANTHROPIC_API_KEY`, `ANTHROPIC_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_BASE_URL`, `OPENAI_ORG_ID`, `OLLAMA_BASE_URL`, `OPENAI_COMPATIBLE_BASE_URL`, `LOCAL_MODEL`, `LOCAL_DEVICE`, `LOCAL_CACHE_DIR`, `LOCAL_LOAD_IN_8BIT`
Named Provider Presets¶
Hypercontext also supports named provider presets through `Settings.provider_presets`. This lets you keep several provider configurations in one config file and select them by name at runtime.
Example YAML:
```yaml
provider_name: claude-test
provider_presets:
  claude-test:
    provider: anthropic
    model: claude-sonnet-4-20250514
    api_key: ${ANTHROPIC_API_KEY}
    base_url: https://api.anthropic.com
  local-mock:
    provider: mock
    model: demo
    extra_kwargs:
      response_delay: 0.0
```
The repository includes the same pattern in `config.yaml`.
Hypercontext expands ${VAR} values from the process environment when loading
YAML, so you can keep secrets out of the config file.
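The expansion step can be sketched with the standard library (a minimal illustration of the `${VAR}` pattern only; this is not Hypercontext's actual loader, which may treat missing variables differently):

```python
import os
import re

def expand_env(value: str) -> str:
    # Replace each ${VAR} reference with the value from the process
    # environment, falling back to an empty string when VAR is unset.
    return re.sub(
        r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}",
        lambda m: os.environ.get(m.group(1), ""),
        value,
    )

os.environ["ANTHROPIC_API_KEY"] = "sk-demo"
print(expand_env("api_key: ${ANTHROPIC_API_KEY}"))  # api_key: sk-demo
```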
Load and use it from Python:
```python
from hypercontext.settings import Settings
from hypercontext import LLMClient

settings = Settings.from_yaml("config.yaml")
provider = settings.create_provider()
client = LLMClient.from_settings(settings)
```
Use `settings.create_provider()` when you want the concrete provider instance, and use `LLMClient.from_settings()` when you want the same runtime resolver to build the client for you.
If you are testing with Claude, the most common setup is:
```bash
HYPERCONTEXT_PROVIDER=anthropic
HYPERCONTEXT_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=your-key-here
```
If you are testing against a local server, change only the backend-specific variables and keep the same agent code.
2. Mock Provider¶
Use `mock` while you are shaping prompts, testing the CLI, or validating a workflow offline.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=mock
HYPERCONTEXT_MODEL=default
HYPERCONTEXT_API_KEY=
```
Minimal code:
```python
from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create("mock", model="mock")
print(provider.complete("hello"))
```
Best use cases:
- fast CI runs
- offline development
- deterministic example output
3. Anthropic / Claude¶
This is the most direct path for Claude-based testing.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=anthropic
HYPERCONTEXT_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=your-key-here
```
Optional overrides:
- `HYPERCONTEXT_BASE_URL` when you need a proxy or gateway endpoint
- `HYPERCONTEXT_TEMPERATURE` when you want more or less variation
- `HYPERCONTEXT_MAX_TOKENS` when you want shorter or longer responses
For the Anthropic Python SDK, use the API root URL:
```bash
HYPERCONTEXT_BASE_URL=https://api.anthropic.com
ANTHROPIC_BASE_URL=https://api.anthropic.com
```
If you copied an OpenAI-style /v1 URL here, the provider examples will strip
that suffix automatically before creating the Anthropic client.
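If you want the same normalization in your own setup code, the suffix-stripping can be sketched as follows (`anthropic_base_url` is a hypothetical helper; the example scripts' normalization may be more thorough):

```python
def anthropic_base_url(url: str) -> str:
    # Drop a trailing slash, then drop an OpenAI-style /v1 suffix,
    # leaving the API root the Anthropic SDK expects.
    url = url.rstrip("/")
    if url.endswith("/v1"):
        url = url[: -len("/v1")]
    return url

print(anthropic_base_url("https://api.anthropic.com/v1/"))
# https://api.anthropic.com
```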
Python code:
```python
import os

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.getenv("ANTHROPIC_API_KEY"),
)
client = LLMClient(provider=provider)
text, history, metadata = client.complete("Summarize Hypercontext in one sentence.")
```
Use this backend when:
- you want Claude behavior in the same code path you will ship
- you are testing prompt quality and response format
- you want a provider that speaks native `tool_use` blocks
4. OpenAI¶
Use this when you want OpenAI Chat Completions semantics.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=openai
HYPERCONTEXT_MODEL=gpt-4o
OPENAI_API_KEY=your-key-here
OPENAI_ORG_ID=your-org-id
```
Python code:
```python
import os

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "openai",
    model="gpt-4o",
    api_key=os.getenv("OPENAI_API_KEY"),
    organization=os.getenv("OPENAI_ORG_ID", ""),
)
client = LLMClient(provider=provider)
response = client.chat([{"role": "user", "content": "List three use cases for Hypercontext."}])
print(response.content)
```
Use this backend when:
- your project already uses OpenAI models
- you want the OpenAI tool-calling format
- you need `tiktoken`-based token counting
5. Ollama¶
Use Ollama when you want a local server with a familiar OpenAI-like workflow.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=ollama
HYPERCONTEXT_MODEL=llama3
OLLAMA_BASE_URL=http://localhost:11434
```
Python code:
```python
from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "ollama",
    model="llama3",
    base_url="http://localhost:11434",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Explain how to run Hypercontext providers.")
print(text)
```
Use this backend when:
- you want a local model server
- you need offline or private inference
- you want to keep the same provider-shaped API
For a full local-model walkthrough across the CLI, TUI, MCP, and agent stack, see the Ollama guide.
6. OpenAI-Compatible Endpoints¶
Use this for servers that expose the OpenAI chat completions API, such as vLLM, LM Studio, LiteLLM, or a self-hosted gateway.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=openai_compatible
HYPERCONTEXT_MODEL=default
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8080/v1
```
Python code:
```python
from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "openai_compatible",
    model="default",
    base_url="http://localhost:8080/v1",
    api_key="dummy-key",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Show me the provider workflow.")
print(text)
```
Use this backend when:
- your team already exposes an OpenAI-compatible gateway
- you want to swap providers without changing agent code
- you need a provider wrapper with tool-calling compatibility
7. Local Decoder¶
Use the local decoder when you want fully local inference with `transformers` and `torch`.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=local
HYPERCONTEXT_MODEL=google/flan-t5-base
LOCAL_DEVICE=cpu
LOCAL_CACHE_DIR=
```
Python code:
```python
from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "local",
    model="google/flan-t5-base",
    device="cpu",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Compress this provider guide into one paragraph.")
print(text)
```
Use this backend when:
- you want an offline summariser
- you are exercising caveman-mode or local-only workflows
- you are fine with slower first-run model downloads
8. Using Providers In Agents¶
The practical pattern is:
- Resolve the provider from the environment. The provider examples will auto-load a repository-root `.env` file if one is present.
- Create an `LLMClient` around that provider.
- Adapt `LLMClient.chat()` or `LLMClient.complete()` to the agent class.
- Reuse the same provider in your task loop, CLI helper, or benchmark code.
The repository's `examples/python/provider_agent_workflow.py` demonstrates this flow end to end.
```python
import json

from hypercontext import LLMClient, TaskAgent
from hypercontext.agents.base import BaseAgentConfig
from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create("anthropic", model="claude-sonnet-4-20250514")
client = LLMClient(provider=provider)

def llm_fn(*, model, messages, temperature=0.0, max_tokens=8192):
    response = client.chat(
        messages,
        model=model,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return json.dumps({"prediction": response.content})

agent = TaskAgent(
    model=provider.model,
    config=BaseAgentConfig(model=provider.model, temperature=0.0, max_tokens=512),
)
agent._llm_fn = llm_fn  # type: ignore[attr-defined]

prediction, history = agent.forward(
    {
        "domain": "provider_workflow",
        "task": "Return a JSON object with a 'prediction' key.",
    }
)
print(prediction)
```
If you need `MetaAgent`, reuse the same provider construction, but be aware that `MetaAgent` expects tool-call-friendly responses and is therefore a little stricter about the prompt shape.
The provider workflow example also strips `TaskAgent`-only metadata from the message history before sending it to Anthropic, because the Anthropic SDK expects plain `role` and `content` message fields.
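The cleanup step can be sketched as follows (`to_plain_messages` and the extra metadata keys shown are illustrative; the real example script may filter additional fields):

```python
def to_plain_messages(history):
    # Keep only the role and content fields that the Anthropic SDK
    # accepts, dropping any agent-side bookkeeping keys.
    return [{"role": m["role"], "content": m["content"]} for m in history]

history = [
    {"role": "user", "content": "hello", "agent_step": 1, "domain": "demo"},
    {"role": "assistant", "content": "hi", "tokens_used": 12},
]
print(to_plain_messages(history))
```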
9. Example Scripts¶
Run these scripts directly from the repository root:
```bash
python examples/python/provider_catalog.py
python examples/python/provider_agent_workflow.py
python examples/python/provider_agent_workflow_compact.py
```
What they show:
- `provider_catalog.py` prints the registered providers, the ones detected in your environment, and the exact config the examples will use. The availability list means the SDK is installed; it does not guarantee that your API key or backend endpoint is valid yet.
- `provider_agent_workflow.py` runs a direct `LLMClient` call and then runs a `TaskAgent` using the same provider.
- `provider_agent_workflow_compact.py` does the same thing with much shorter output and normalizes the final confirmation to `Provider is working correctly.`, so it is the best smoke test when you want a quick confirmation that the configured provider is working.
These scripts auto-load `.env` from the repository root if the file exists.
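The auto-load behavior can be approximated with a few lines of standard-library code (a sketch only; the scripts' actual loader may handle quoting and comments differently):

```python
import os
import tempfile

def load_dotenv_file(path: str) -> None:
    # Populate os.environ from KEY=VALUE lines, skipping blanks and
    # comments, without overriding variables that are already set.
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a throwaway .env file.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write("HYPERCONTEXT_PROVIDER=mock\n# comment line\n")
    path = tmp.name

os.environ.pop("HYPERCONTEXT_PROVIDER", None)  # make the demo deterministic
load_dotenv_file(path)
print(os.environ["HYPERCONTEXT_PROVIDER"])  # mock
```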
10. Troubleshooting¶
- If the provider is not detected, run `python -m hypercontext providers` and check whether the relevant SDK is installed.
- If Claude or OpenAI authentication fails, confirm that the API key is present in the process environment before you start Python.
- If `openai_compatible` fails, make sure the endpoint includes the `/v1` prefix expected by OpenAI-style clients.
- If `local` fails, confirm that both `transformers` and `torch` are installed and that the model name is reachable from Hugging Face.
- If you want a guaranteed offline run, switch back to `HYPERCONTEXT_PROVIDER=mock`.
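A small preflight script can catch the missing-variable cases before you start an agent run. This is a stdlib-only sketch; the `REQUIRED_VARS` mapping below is illustrative, based on the recipes in this guide, not an authoritative requirements list:

```python
import os

# Minimal per-backend requirements, mirroring the recipes above.
REQUIRED_VARS = {
    "mock": [],
    "anthropic": ["ANTHROPIC_API_KEY", "HYPERCONTEXT_MODEL"],
    "openai": ["OPENAI_API_KEY", "HYPERCONTEXT_MODEL"],
    "ollama": ["OLLAMA_BASE_URL", "HYPERCONTEXT_MODEL"],
    "openai_compatible": ["OPENAI_COMPATIBLE_BASE_URL", "HYPERCONTEXT_MODEL"],
    "local": ["LOCAL_MODEL"],
}

def missing_vars(provider: str) -> list:
    # Return the required variables that are absent or empty.
    return [v for v in REQUIRED_VARS.get(provider, []) if not os.getenv(v)]

provider = os.getenv("HYPERCONTEXT_PROVIDER", "mock")
problems = missing_vars(provider)
print(f"{provider}: {'ok' if not problems else 'missing ' + ', '.join(problems)}")
```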