Operational Provider Recipes

This page is the practical provider setup guide for Hypercontext. It is intentionally operational: choose a backend, set the environment variables, run the example script, and wire the same provider into your own agent code.

Quick Decision Table

| Provider | Use It When | Install / Requirements | Main Env Vars |
| --- | --- | --- | --- |
| mock | You want deterministic offline testing | None | HYPERCONTEXT_PROVIDER=mock |
| anthropic | You want Claude in production or local testing | pip install anthropic | ANTHROPIC_API_KEY, HYPERCONTEXT_MODEL, optional HYPERCONTEXT_BASE_URL |
| openai | You want OpenAI models | pip install openai (tiktoken is optional for better token counts) | OPENAI_API_KEY, HYPERCONTEXT_MODEL, optional HYPERCONTEXT_BASE_URL, HYPERCONTEXT_ORGANIZATION |
| ollama | You want a local Ollama server | requests plus a running Ollama server | OLLAMA_BASE_URL, HYPERCONTEXT_MODEL |
| openai_compatible | You have a server that speaks OpenAI chat-completions | pip install openai | OPENAI_COMPATIBLE_BASE_URL, HYPERCONTEXT_MODEL |
| local | You want on-device transformers inference | pip install transformers torch | LOCAL_MODEL, LOCAL_DEVICE, LOCAL_CACHE_DIR |

For the complete Ollama workflow, see the Ollama guide.

1. Environment Contract

Hypercontext reads core settings from the process environment. The provider page and the provider examples use the same contract so you can test the whole stack without guessing which variable matters.

Core HYPERCONTEXT_* keys:

  • HYPERCONTEXT_PROVIDER
  • HYPERCONTEXT_MODEL
  • HYPERCONTEXT_API_KEY
  • HYPERCONTEXT_BASE_URL
  • HYPERCONTEXT_ORGANIZATION
  • HYPERCONTEXT_TEMPERATURE
  • HYPERCONTEXT_MAX_TOKENS
  • HYPERCONTEXT_OUTPUT_DIR
  • HYPERCONTEXT_TOKEN_BUDGET
  • HYPERCONTEXT_MAX_GENERATIONS
  • HYPERCONTEXT_PARENT_SELECTION
  • HYPERCONTEXT_CAVEMAN_MODE
  • HYPERCONTEXT_DOCKER_IMAGE
  • HYPERCONTEXT_SANDBOX_ENABLED
  • HYPERCONTEXT_CONVERGENCE_CEILING

Provider-native variables that the example scripts also understand:

  • ANTHROPIC_API_KEY
  • ANTHROPIC_BASE_URL
  • OPENAI_API_KEY
  • OPENAI_BASE_URL
  • OPENAI_ORG_ID
  • OLLAMA_BASE_URL
  • OPENAI_COMPATIBLE_BASE_URL
  • LOCAL_MODEL
  • LOCAL_DEVICE
  • LOCAL_CACHE_DIR
  • LOCAL_LOAD_IN_8BIT

Named Provider Presets

Hypercontext also supports named provider presets through Settings.provider_presets. This lets you keep several provider configurations in one config file and select them by name at runtime.

Example YAML:

provider_name: claude-test
provider_presets:
  claude-test:
    provider: anthropic
    model: claude-sonnet-4-20250514
    api_key: ${ANTHROPIC_API_KEY}
    base_url: https://api.anthropic.com
  local-mock:
    provider: mock
    model: demo
    extra_kwargs:
      response_delay: 0.0

The repository includes the same pattern in config.yaml. Hypercontext expands ${VAR} values from the process environment when loading YAML, so you can keep secrets out of the config file.
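The ${VAR} expansion behaves like ordinary shell-style substitution. A minimal sketch of the idea (not Hypercontext's actual loader):

```python
import os
import re

def expand_env(value: str, env=os.environ) -> str:
    # Replace each ${VAR} with its value from the environment.
    # Unknown variables expand to an empty string in this sketch;
    # the real loader may behave differently.
    return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), ""), value)

print(expand_env("api_key: ${ANTHROPIC_API_KEY}", {"ANTHROPIC_API_KEY": "sk-test"}))
```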

Load and use it from Python:

from hypercontext.settings import Settings
from hypercontext import LLMClient

settings = Settings.from_yaml("config.yaml")
provider = settings.create_provider()
client = LLMClient.from_settings(settings)

Use settings.create_provider() when you want the concrete provider instance, and use LLMClient.from_settings() when you want the same runtime resolver to build the client for you.

If you are testing with Claude, the most common setup is:

HYPERCONTEXT_PROVIDER=anthropic
HYPERCONTEXT_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=your-key-here

If you are testing against a local server, change only the backend-specific variables and keep the same agent code.

2. Mock Provider

Use mock while you are shaping prompts, testing the CLI, or validating a workflow offline.

.env recipe:

HYPERCONTEXT_PROVIDER=mock
HYPERCONTEXT_MODEL=default
HYPERCONTEXT_API_KEY=

Minimal code:

from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create("mock", model="mock")
print(provider.complete("hello"))

Best use cases:

  • fast CI runs
  • offline development
  • deterministic example output
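To see why determinism matters for CI, here is the idea of a mock provider in miniature. This is an illustrative stand-in, not Hypercontext's actual mock class:

```python
class FakeProvider:
    """Tiny deterministic provider: same prompt, same answer, no network."""

    def __init__(self, model: str = "mock"):
        self.model = model

    def complete(self, prompt: str) -> str:
        # Derive the reply from the prompt alone, so tests can
        # assert on exact output without any API key or server.
        return f"[mock:{self.model}] {prompt}"

provider = FakeProvider()
assert provider.complete("hello") == provider.complete("hello")
print(provider.complete("hello"))
```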

3. Anthropic / Claude

This is the most direct path for Claude-based testing.

.env recipe:

HYPERCONTEXT_PROVIDER=anthropic
HYPERCONTEXT_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=your-key-here

Optional overrides:

  • HYPERCONTEXT_BASE_URL when you need a proxy or gateway endpoint
  • HYPERCONTEXT_TEMPERATURE when you want more or less variation
  • HYPERCONTEXT_MAX_TOKENS when you want shorter or longer responses

For the Anthropic Python SDK, use the API root URL:

HYPERCONTEXT_BASE_URL=https://api.anthropic.com
ANTHROPIC_BASE_URL=https://api.anthropic.com

If you copied an OpenAI-style /v1 URL here, the provider examples will strip that suffix automatically before creating the Anthropic client.
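That stripping amounts to dropping a trailing /v1 path segment before handing the URL to the SDK; roughly:

```python
def normalize_anthropic_base_url(url: str) -> str:
    # The Anthropic SDK wants the API root, not an OpenAI-style /v1
    # endpoint, so drop a trailing /v1 segment if one is present.
    # Sketch of the behaviour described above, not the actual code.
    url = url.rstrip("/")
    if url.endswith("/v1"):
        url = url[: -len("/v1")]
    return url

print(normalize_anthropic_base_url("https://api.anthropic.com/v1/"))
```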

Python code:

import os

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "anthropic",
    model="claude-sonnet-4-20250514",
    api_key=os.getenv("ANTHROPIC_API_KEY"),
)
client = LLMClient(provider=provider)
text, history, metadata = client.complete("Summarize Hypercontext in one sentence.")

Use this backend when:

  • you want Claude behavior in the same code path you will ship
  • you are testing prompt quality and response format
  • you want a provider that speaks native tool_use blocks

4. OpenAI

Use this when you want OpenAI Chat Completions semantics.

.env recipe:

HYPERCONTEXT_PROVIDER=openai
HYPERCONTEXT_MODEL=gpt-4o
OPENAI_API_KEY=your-key-here
OPENAI_ORG_ID=your-org-id

Python code:

import os

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "openai",
    model="gpt-4o",
    api_key=os.getenv("OPENAI_API_KEY"),
    organization=os.getenv("OPENAI_ORG_ID", ""),
)
client = LLMClient(provider=provider)
response = client.chat([{"role": "user", "content": "List three use cases for Hypercontext."}])
print(response.content)

Use this backend when:

  • your project already uses OpenAI models
  • you want the OpenAI tool-calling format
  • you need tiktoken-based token counting
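Because tiktoken is optional, token counting degrades gracefully when it is missing. A sketch of that kind of fallback (the ~4 characters per token rule is a common heuristic for English text, not Hypercontext's exact formula):

```python
def estimate_tokens(text: str, model: str = "gpt-4o") -> int:
    try:
        import tiktoken  # optional: exact counts when installed
        return len(tiktoken.encoding_for_model(model).encode(text))
    except Exception:
        # tiktoken missing (or model unknown to it): fall back to a
        # rough estimate of ~4 characters per token.
        return max(1, len(text) // 4)

print(estimate_tokens("List three use cases for Hypercontext."))
```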

5. Ollama

Use Ollama when you want a local server with a familiar OpenAI-like workflow.

.env recipe:

HYPERCONTEXT_PROVIDER=ollama
HYPERCONTEXT_MODEL=llama3
OLLAMA_BASE_URL=http://localhost:11434

Python code:

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "ollama",
    model="llama3",
    base_url="http://localhost:11434",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Explain how to run Hypercontext providers.")
print(text)

Use this backend when:

  • you want a local model server
  • you need offline or private inference
  • you want to keep the same provider-shaped API

For a full local-model walkthrough across the CLI, TUI, MCP, and agent stack, see the Ollama guide.

6. OpenAI-Compatible Endpoints

Use this for servers that expose the OpenAI chat completions API, such as vLLM, LM Studio, LiteLLM, or a self-hosted gateway.

.env recipe:

HYPERCONTEXT_PROVIDER=openai_compatible
HYPERCONTEXT_MODEL=default
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8080/v1

Python code:

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "openai_compatible",
    model="default",
    base_url="http://localhost:8080/v1",
    api_key="dummy-key",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Show me the provider workflow.")
print(text)

Use this backend when:

  • your team already exposes an OpenAI-compatible gateway
  • you want to swap providers without changing agent code
  • you need a provider wrapper with tool-calling compatibility
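For orientation, the request body these servers accept is the standard OpenAI chat-completions shape, POSTed to {base_url}/chat/completions. A minimal payload looks like:

```python
import json

# Standard OpenAI-style chat-completions request body that an
# openai_compatible server is expected to accept.
payload = {
    "model": "default",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Show me the provider workflow."},
    ],
    "temperature": 0.0,
    "max_tokens": 256,
}
print(json.dumps(payload, indent=2))
```

If a curl request with this body against your endpoint works, the provider wrapper will work too.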

7. Local Decoder

Use the local decoder when you want fully local inference with transformers and torch.

.env recipe:

HYPERCONTEXT_PROVIDER=local
HYPERCONTEXT_MODEL=google/flan-t5-base
LOCAL_DEVICE=cpu
LOCAL_CACHE_DIR=

Python code:

from hypercontext.providers import ProviderRegistry
from hypercontext import LLMClient

registry = ProviderRegistry.instance()
provider = registry.create(
    "local",
    model="google/flan-t5-base",
    device="cpu",
)
client = LLMClient(provider=provider)
text, _, _ = client.complete("Compress this provider guide into one paragraph.")
print(text)

Use this backend when:

  • you want an offline summariser
  • you are exercising caveman-mode or local-only workflows
  • you are fine with slower first-run model downloads

8. Using Providers In Agents

The practical pattern is:

  1. Resolve the provider from the environment. The provider examples will auto-load a repository-root .env file if one is present.
  2. Create an LLMClient around that provider.
  3. Adapt LLMClient.chat() or LLMClient.complete() to the agent class.
  4. Reuse the same provider in your task loop, CLI helper, or benchmark code.

The repository's examples/python/provider_agent_workflow.py demonstrates this flow end to end.

import json

from hypercontext import LLMClient, TaskAgent
from hypercontext.agents.base import BaseAgentConfig
from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create("anthropic", model="claude-sonnet-4-20250514")
client = LLMClient(provider=provider)

def llm_fn(*, model, messages, temperature=0.0, max_tokens=8192):
    response = client.chat(
        messages,
        model=model,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return json.dumps({"prediction": response.content})

agent = TaskAgent(
    model=provider.model,
    config=BaseAgentConfig(model=provider.model, temperature=0.0, max_tokens=512),
)
agent._llm_fn = llm_fn  # type: ignore[attr-defined]

prediction, history = agent.forward(
    {
        "domain": "provider_workflow",
        "task": "Return a JSON object with a 'prediction' key.",
    }
)
print(prediction)

If you need MetaAgent, reuse the same provider construction, but be aware that MetaAgent expects tool-call-friendly responses and is therefore a little stricter about the prompt shape.

The provider workflow example also strips TaskAgent-only metadata from the message history before sending it to Anthropic, because the Anthropic SDK expects plain role and content message fields.
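That stripping step amounts to projecting each history entry down to the two fields the SDK understands. In sketch form (the metadata key names below are hypothetical, and this is not the example script's exact code):

```python
def to_plain_messages(history: list[dict]) -> list[dict]:
    # Keep only the role and content fields the Anthropic SDK
    # accepts; drop any agent-only metadata keys.
    return [{"role": m["role"], "content": m["content"]} for m in history]

history = [
    {"role": "user", "content": "hi", "agent_step": 1},       # "agent_step" is a made-up metadata key
    {"role": "assistant", "content": "hello", "agent_step": 2},
]
print(to_plain_messages(history))
```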

9. Example Scripts

Run these scripts directly from the repository root:

python examples/python/provider_catalog.py
python examples/python/provider_agent_workflow.py
python examples/python/provider_agent_workflow_compact.py

What they show:

  • provider_catalog.py prints the registered providers, the ones detected in your environment, and the exact config the examples will use. The availability list means the SDK is installed; it does not guarantee that your API key or backend endpoint is valid yet.
  • provider_agent_workflow.py runs a direct LLMClient call and then runs a TaskAgent using the same provider.
  • provider_agent_workflow_compact.py does the same thing with much shorter output and normalizes the final confirmation line to "Provider is working correctly." This makes it the best smoke test when you want a quick confirmation that the configured provider is working.

These scripts auto-load .env from the repository root if the file exists.
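The auto-load step is ordinary dotenv behaviour: parse KEY=VALUE lines and set only variables that are not already in the environment, so explicit exports win. A minimal sketch (the example scripts may use a different loader):

```python
import os

def load_dotenv_lines(lines, env=None):
    # Parse KEY=VALUE lines, skipping comments and blanks.
    # setdefault means existing environment values win over .env values.
    env = os.environ if env is None else env
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env.setdefault(key.strip(), value.strip())
    return env

env = {}
load_dotenv_lines(["# local overrides", "HYPERCONTEXT_PROVIDER=mock", ""], env)
print(env)
```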

10. Troubleshooting

  • If the provider is not detected, run python -m hypercontext providers and check whether the relevant SDK is installed.
  • If Claude or OpenAI authentication fails, confirm that the API key is present in the process environment before you start Python.
  • If openai_compatible fails, make sure the base URL ends with the /v1 path suffix expected by OpenAI-style clients.
  • If local fails, confirm that both transformers and torch are installed and that the model name is reachable from Hugging Face.
  • If you want a guaranteed offline run, switch back to HYPERCONTEXT_PROVIDER=mock.