Operational Provider Recipes¶
This page is the practical provider setup guide for Hypercontext. It is intentionally operational: choose a backend, set the environment variables, run the example script, and wire the same provider into your own agent code.
Quick Decision Table¶
| Provider | Use It When | Install / Requirements | Main Env Vars |
|---|---|---|---|
| `mock` | You want deterministic offline testing | None | `HYPERCONTEXT_PROVIDER=mock` |
| `anthropic` | You want Claude in production or local testing | `pip install anthropic` | `ANTHROPIC_API_KEY`, `HYPERCONTEXT_MODEL`, optional `HYPERCONTEXT_BASE_URL` |
| `openai` | You want OpenAI models | `pip install openai` (`tiktoken` is optional for better token counts) | `OPENAI_API_KEY`, `HYPERCONTEXT_MODEL`, optional `HYPERCONTEXT_BASE_URL`, `HYPERCONTEXT_ORGANIZATION` |
| `ollama` | You want a local Ollama server | `requests` plus a running Ollama server | `OLLAMA_BASE_URL`, `HYPERCONTEXT_MODEL` |
| `openai_compatible` | You have a server that speaks OpenAI chat completions | `pip install openai` | `OPENAI_COMPATIBLE_BASE_URL`, `HYPERCONTEXT_MODEL` |
| `local` | You want on-device transformers inference | `pip install transformers torch` | `LOCAL_MODEL`, `LOCAL_DEVICE`, `LOCAL_CACHE_DIR` |
For the complete Ollama workflow, see the Ollama guide.
1. Environment Contract¶
Hypercontext reads core settings from the process environment. This page and the provider example scripts use the same contract, so you can test the whole stack without guessing which variable matters.
Core `HYPERCONTEXT_*` keys:

`HYPERCONTEXT_PROVIDER`, `HYPERCONTEXT_MODEL`, `HYPERCONTEXT_API_KEY`, `HYPERCONTEXT_BASE_URL`, `HYPERCONTEXT_ORGANIZATION`, `HYPERCONTEXT_TEMPERATURE`, `HYPERCONTEXT_MAX_TOKENS`, `HYPERCONTEXT_OUTPUT_DIR`, `HYPERCONTEXT_TOKEN_BUDGET`, `HYPERCONTEXT_MAX_GENERATIONS`, `HYPERCONTEXT_PARENT_SELECTION`, `HYPERCONTEXT_CAVEMAN_MODE`, `HYPERCONTEXT_DOCKER_IMAGE`, `HYPERCONTEXT_SANDBOX_ENABLED`, `HYPERCONTEXT_CONVERGENCE_CEILING`
Provider-native variables that the example scripts also understand:
`ANTHROPIC_API_KEY`, `ANTHROPIC_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_BASE_URL`, `OPENAI_ORG_ID`, `OLLAMA_BASE_URL`, `OPENAI_COMPATIBLE_BASE_URL`, `LOCAL_MODEL`, `LOCAL_DEVICE`, `LOCAL_CACHE_DIR`, `LOCAL_LOAD_IN_8BIT`
Named Provider Presets¶
Hypercontext also supports named provider presets through `Settings.provider_presets`. This lets you keep several provider configurations in one config file and select them by name at runtime.
Example YAML:
```yaml
provider_name: claude-test
provider_presets:
  claude-test:
    provider: anthropic
    model: claude-sonnet-4-20250514
    api_key: ${ANTHROPIC_API_KEY}
    base_url: https://api.anthropic.com
  local-mock:
    provider: mock
    model: demo
    extra_kwargs:
      response_delay: 0.0
```
The repository includes the same pattern in `config.yaml`.
Hypercontext expands ${VAR} values from the process environment when loading
YAML, so you can keep secrets out of the config file.
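The expansion step can be sketched with the standard library (a minimal illustration of the `${VAR}` pattern only; this is not Hypercontext's actual loader, which may treat missing variables differently):

```python
import os
import re

def expand_env(value: str) -> str:
    # Replace each ${VAR} reference with the value from the process
    # environment, falling back to an empty string when VAR is unset.
    return re.sub(
        r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}",
        lambda m: os.environ.get(m.group(1), ""),
        value,
    )

os.environ["ANTHROPIC_API_KEY"] = "sk-demo"
print(expand_env("api_key: ${ANTHROPIC_API_KEY}"))  # api_key: sk-demo
```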
Load and use it from Python:
```python
from hypercontext.settings import Settings
from hypercontext import LLMClient

settings = Settings.from_yaml("config.yaml")
provider = settings.create_provider()
client = LLMClient.from_settings(settings)
```
Use `settings.create_provider()` when you want the concrete provider instance, and use `LLMClient.from_settings()` when you want the same runtime resolver to build the client for you.
If you are testing with Claude, the most common setup is:
```bash
HYPERCONTEXT_PROVIDER=anthropic
HYPERCONTEXT_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=your-key-here
```
If you are testing against a local server, change only the backend-specific variables and keep the same agent code.
2. Mock Provider¶
Use `mock` while you are shaping prompts, testing the CLI, or validating a workflow offline.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=mock
HYPERCONTEXT_MODEL=default
HYPERCONTEXT_API_KEY=
```
Minimal code:
```python
from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create("mock", model="mock")
print(provider.complete("hello"))
```
Best use cases:
- fast CI runs
- offline development
- deterministic example output
3. Anthropic / Claude¶
This is the most direct path for Claude-based testing.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=anthropic
HYPERCONTEXT_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=your-key-here
```
Optional overrides:
- `HYPERCONTEXT_BASE_URL` when you need a proxy or gateway endpoint
- `HYPERCONTEXT_TEMPERATURE` when you want more or less variation
- `HYPERCONTEXT_MAX_TOKENS` when you want shorter or longer responses
For the Anthropic Python SDK, use the API root URL:
```bash
HYPERCONTEXT_BASE_URL=https://api.anthropic.com
ANTHROPIC_BASE_URL=https://api.anthropic.com
```
If you copied an OpenAI-style /v1 URL here, the provider examples will strip
that suffix automatically before creating the Anthropic client.
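If you want the same normalization in your own setup code, the suffix-stripping can be sketched as follows (`anthropic_base_url` is a hypothetical helper; the example scripts' normalization may be more thorough):

```python
def anthropic_base_url(url: str) -> str:
    # Drop a trailing slash, then drop an OpenAI-style /v1 suffix,
    # leaving the API root the Anthropic SDK expects.
    url = url.rstrip("/")
    if url.endswith("/v1"):
        url = url[: -len("/v1")]
    return url

print(anthropic_base_url("https://api.anthropic.com/v1/"))
# https://api.anthropic.com
```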
Python code:
```python
import os

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.getenv("ANTHROPIC_API_KEY"),
)
client = LLMClient(provider=provider)
text, history, metadata = client.complete("Summarize Hypercontext in one sentence.")
```
Use this backend when:
- you want Claude behavior in the same code path you will ship
- you are testing prompt quality and response format
- you want a provider that speaks native `tool_use` blocks
4. OpenAI¶
Use this when you want OpenAI Chat Completions semantics.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=openai
HYPERCONTEXT_MODEL=gpt-4o
OPENAI_API_KEY=your-key-here
OPENAI_ORG_ID=your-org-id
```
Python code:
```python
import os

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "openai",
    model="gpt-4o",
    api_key=os.getenv("OPENAI_API_KEY"),
    organization=os.getenv("OPENAI_ORG_ID", ""),
)
client = LLMClient(provider=provider)
response = client.chat([{"role": "user", "content": "List three use cases for Hypercontext."}])
print(response.content)
```
Use this backend when:
- your project already uses OpenAI models
- you want the OpenAI tool-calling format
- you need `tiktoken`-based token counting
5. Ollama¶
Use Ollama when you want a local server with a familiar OpenAI-like workflow.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=ollama
HYPERCONTEXT_MODEL=llama3
OLLAMA_BASE_URL=http://localhost:11434
```
Python code:
```python
from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "ollama",
    model="llama3",
    base_url="http://localhost:11434",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Explain how to run Hypercontext providers.")
print(text)
```
Use this backend when:
- you want a local model server
- you need offline or private inference
- you want to keep the same provider-shaped API
For a full local-model walkthrough across the CLI, TUI, MCP, and agent stack, see the Ollama guide.
6. OpenAI-Compatible Endpoints¶
Use this for servers that expose the OpenAI chat completions API, such as vLLM, LM Studio, LiteLLM, or a self-hosted gateway.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=openai_compatible
HYPERCONTEXT_MODEL=default
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8080/v1
```
Python code:
```python
from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "openai_compatible",
    model="default",
    base_url="http://localhost:8080/v1",
    api_key="dummy-key",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Show me the provider workflow.")
print(text)
```
Use this backend when:
- your team already exposes an OpenAI-compatible gateway
- you want to swap providers without changing agent code
- you need a provider wrapper with tool-calling compatibility
7. Local Decoder¶
Use the local decoder when you want fully local inference with `transformers` and `torch`.
.env recipe:
```bash
HYPERCONTEXT_PROVIDER=local
HYPERCONTEXT_MODEL=google/flan-t5-base
LOCAL_DEVICE=cpu
LOCAL_CACHE_DIR=
```
Python code:
```python
from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "local",
    model="google/flan-t5-base",
    device="cpu",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Compress this provider guide into one paragraph.")
print(text)
```
Use this backend when:
- you want an offline summariser
- you are exercising caveman-mode or local-only workflows
- you are fine with slower first-run model downloads
8. Using Providers In Agents¶
The practical pattern is:
- Resolve the provider from the environment. The provider examples will auto-load a repository-root `.env` file if one is present.
- Create an `LLMClient` around that provider.
- Adapt `LLMClient.chat()` or `LLMClient.complete()` to the agent class.
- Reuse the same provider in your task loop, CLI helper, or benchmark code.
The repository's `examples/python/provider_agent_workflow.py` demonstrates this flow end to end.
```python
import json

from hypercontext import LLMClient, TaskAgent
from hypercontext.agents.base import BaseAgentConfig
from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create("anthropic", model="claude-sonnet-4-20250514")
client = LLMClient(provider=provider)

def llm_fn(*, model, messages, temperature=0.0, max_tokens=8192):
    response = client.chat(
        messages,
        model=model,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return json.dumps({"prediction": response.content})

agent = TaskAgent(
    model=provider.model,
    config=BaseAgentConfig(model=provider.model, temperature=0.0, max_tokens=512),
)
agent._llm_fn = llm_fn  # type: ignore[attr-defined]

prediction, history = agent.forward(
    {
        "domain": "provider_workflow",
        "task": "Return a JSON object with a 'prediction' key.",
    }
)
print(prediction)
```
If you need `MetaAgent`, reuse the same provider construction, but be aware that `MetaAgent` expects tool-call-friendly responses and is therefore a little stricter about the prompt shape.
The provider workflow example also strips `TaskAgent`-only metadata from the message history before sending it to Anthropic, because the Anthropic SDK expects plain `role` and `content` message fields.
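The cleanup step can be sketched as follows (`to_plain_messages` and the extra metadata keys shown are illustrative; the real example script may filter additional fields):

```python
def to_plain_messages(history):
    # Keep only the role and content fields that the Anthropic SDK
    # accepts, dropping any agent-side bookkeeping keys.
    return [{"role": m["role"], "content": m["content"]} for m in history]

history = [
    {"role": "user", "content": "hello", "agent_step": 1, "domain": "demo"},
    {"role": "assistant", "content": "hi", "tokens_used": 12},
]
print(to_plain_messages(history))
```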
9. Example Scripts¶
Run these scripts directly from the repository root:
```bash
python examples/python/provider_catalog.py
python examples/python/provider_agent_workflow.py
python examples/python/provider_agent_workflow_compact.py
```
What they show:
- `provider_catalog.py` prints the registered providers, the ones detected in your environment, and the exact config the examples will use. The availability list means the SDK is installed; it does not guarantee that your API key or backend endpoint is valid yet.
- `provider_agent_workflow.py` runs a direct `LLMClient` call and then runs a `TaskAgent` using the same provider.
- `provider_agent_workflow_compact.py` does the same thing with much shorter output and normalizes the final confirmation to `Provider is working correctly.`, so it is the best smoke test when you want a quick confirmation that the configured provider is working.
These scripts auto-load `.env` from the repository root if the file exists.
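The auto-load behavior can be approximated with a few lines of standard-library code (a sketch only; the scripts' actual loader may handle quoting and comments differently):

```python
import os
import tempfile

def load_dotenv_file(path: str) -> None:
    # Populate os.environ from KEY=VALUE lines, skipping blanks and
    # comments, without overriding variables that are already set.
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a throwaway .env file.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write("HYPERCONTEXT_PROVIDER=mock\n# comment line\n")
    path = tmp.name

os.environ.pop("HYPERCONTEXT_PROVIDER", None)  # make the demo deterministic
load_dotenv_file(path)
print(os.environ["HYPERCONTEXT_PROVIDER"])  # mock
```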
10. Troubleshooting¶
- If the provider is not detected, run `python -m hypercontext providers` and check whether the relevant SDK is installed.
- If Claude or OpenAI authentication fails, confirm that the API key is present in the process environment before you start Python.
- If `openai_compatible` fails, make sure the endpoint includes the `/v1` prefix expected by OpenAI-style clients.
- If `local` fails, confirm that both `transformers` and `torch` are installed and that the model name is reachable from Hugging Face.
- If you want a guaranteed offline run, switch back to `HYPERCONTEXT_PROVIDER=mock`.
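A small preflight script can catch the missing-variable cases before you start an agent run. This is a stdlib-only sketch; the `REQUIRED_VARS` mapping below is illustrative, based on the recipes in this guide, not an authoritative requirements list:

```python
import os

# Minimal per-backend requirements, mirroring the recipes above.
REQUIRED_VARS = {
    "mock": [],
    "anthropic": ["ANTHROPIC_API_KEY", "HYPERCONTEXT_MODEL"],
    "openai": ["OPENAI_API_KEY", "HYPERCONTEXT_MODEL"],
    "ollama": ["OLLAMA_BASE_URL", "HYPERCONTEXT_MODEL"],
    "openai_compatible": ["OPENAI_COMPATIBLE_BASE_URL", "HYPERCONTEXT_MODEL"],
    "local": ["LOCAL_MODEL"],
}

def missing_vars(provider: str) -> list:
    # Return the required variables that are absent or empty.
    return [v for v in REQUIRED_VARS.get(provider, []) if not os.getenv(v)]

provider = os.getenv("HYPERCONTEXT_PROVIDER", "mock")
problems = missing_vars(provider)
print(f"{provider}: {'ok' if not problems else 'missing ' + ', '.join(problems)}")
```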