Ollama Guide¶
This page is the dedicated guide for running Hypercontext with Ollama. Ollama is a good fit when you want local inference, private data handling, and a model server that works with the full Hypercontext stack.
What Ollama Gives You¶
With Ollama you can use Hypercontext for:
- direct provider-backed completions
- `LLMClient` chat/completion flows
- `TaskAgent` task workflows
- `MetaAgent` repository and tool workflows
- the `run`, `evaluate`, `archive`, `benchmark`, `mcp`, `serve`, `ui`, and `tui` commands
- local or offline experimentation without sending prompts to a cloud API
Hypercontext talks to Ollama through its REST API, so any chat-capable model that works with Ollama can be used with the provider layer. In practice, that means you can keep the same Hypercontext code and switch the model name from `llama3` to another installed Ollama model without changing your app logic.
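To make the REST relationship concrete, here is a rough sketch of the kind of request a provider sends to Ollama's chat endpoint. This illustrates Ollama's documented `/api/chat` interface, not Hypercontext's actual internals; the `build_chat_request` helper is hypothetical.

```python
# Ollama's chat endpoint accepts a JSON body with a model name and a list
# of chat messages; "stream": False requests a single, complete response.
def build_chat_request(model, messages, base_url="http://localhost:11434"):
    return {
        "url": f"{base_url}/api/chat",
        "payload": {"model": model, "messages": messages, "stream": False},
    }

req = build_chat_request("llama3", [{"role": "user", "content": "Hello"}])
# In practice you would send this with: requests.post(req["url"], json=req["payload"])
print(req["url"])
```

Because only the `model` field changes, swapping `llama3` for another installed model leaves the rest of the request untouched.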
Prerequisites¶
- Hypercontext installed as a Python package
- Ollama installed and running locally
- At least one model pulled into Ollama
- `requests` available in the Python environment
Start Ollama if it is not already running:

```bash
ollama serve
```

Pull a model:

```bash
ollama pull llama3
```

List installed models:

```bash
ollama list
```
Environment Variables¶
The minimal Ollama configuration is:
```bash
HYPERCONTEXT_PROVIDER=ollama
HYPERCONTEXT_MODEL=llama3
OLLAMA_BASE_URL=http://localhost:11434
```
Optional settings:
- `HYPERCONTEXT_TEMPERATURE` for creativity vs. determinism
- `HYPERCONTEXT_MAX_TOKENS` for longer or shorter responses
- `HYPERCONTEXT_OUTPUT_DIR` for run artifacts
- `HYPERCONTEXT_MAX_GENERATIONS` for evolution runs
- `HYPERCONTEXT_PARENT_SELECTION` for archive-driven parent choice
You can keep these in .env and let the provider examples auto-load them, or
export them in your shell.
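If you want to read these variables yourself in a custom script, a minimal sketch using only the standard library looks like this. The variable names match the list above; the `load_ollama_settings` helper and its default values are illustrative, not part of Hypercontext's API.

```python
import os

# Read the Ollama-related settings with illustrative fallbacks.
# Passing a mapping instead of os.environ makes the helper easy to test.
def load_ollama_settings(env=os.environ):
    return {
        "provider": env.get("HYPERCONTEXT_PROVIDER", "ollama"),
        "model": env.get("HYPERCONTEXT_MODEL", "llama3"),
        "base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "temperature": float(env.get("HYPERCONTEXT_TEMPERATURE", "0.0")),
    }

settings = load_ollama_settings({})  # empty mapping -> all defaults
print(settings["model"])
```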
Use The Provider Directly¶
The provider registry includes Ollama as a first-class backend.
```python
from hypercontext import LLMClient
from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create(
    "ollama",
    model="llama3",
    base_url="http://localhost:11434",
)
client = LLMClient(provider=provider)

text, history, metadata = client.complete(
    "Explain how Hypercontext uses provider backends."
)
print(text)
```
If you prefer chat-style calls:
```python
response = client.chat([
    {"role": "user", "content": "List three Hypercontext commands that help with local workflows."}
])
print(response.content)
```
Use Ollama With Agents¶
Hypercontext agents can use Ollama through the same client/provider layer.
TaskAgent¶
Use TaskAgent when you want a structured single-task wrapper:
```python
import json

from hypercontext import LLMClient, TaskAgent
from hypercontext.agents.base import BaseAgentConfig
from hypercontext.providers import ProviderRegistry

registry = ProviderRegistry.instance()
provider = registry.create("ollama", model="llama3", base_url="http://localhost:11434")
client = LLMClient(provider=provider)

def llm_fn(*, model, messages, temperature=0.0, max_tokens=512):
    response = client.chat(
        messages,
        model=model,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return json.dumps({"prediction": response.content})

agent = TaskAgent(
    model="llama3",
    config=BaseAgentConfig(
        model="llama3",
        temperature=0.0,
        max_tokens=512,
        system_prompt="Return a concise answer.",
    ),
)
agent._llm_fn = llm_fn

prediction, history = agent.forward(
    {"domain": "docs", "task": "Explain why Ollama is useful for Hypercontext."}
)
print(prediction)
```
HyperContext¶
Use the top-level orchestrator when you want the evolution loop:
```python
from hypercontext import HyperContext

hc = HyperContext(output_dir="./hypercontext_output")
print(hc.run(max_generations=3))
```
The same code works with Ollama as long as your environment points at the Ollama server and model you want to use.
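One way to make sure the orchestrator picks up Ollama is to set the variables in-process before constructing it. This is just a sketch; exporting them in your shell or keeping them in `.env` works equally well. The variable names are the ones documented above.

```python
import os

# Point the environment at the local Ollama server before constructing
# the orchestrator. setdefault leaves any values you exported untouched.
os.environ.setdefault("HYPERCONTEXT_PROVIDER", "ollama")
os.environ.setdefault("HYPERCONTEXT_MODEL", "llama3")
os.environ.setdefault("OLLAMA_BASE_URL", "http://localhost:11434")
```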
Use Ollama With The CLI¶
Once the environment variables are set, the rest of Hypercontext works with the same provider:
```bash
python -m hypercontext providers
python -m hypercontext run --generations 3 --output-dir ./runs/ollama-demo --workdir .
python -m hypercontext evaluate path/to/code.py --domains memory --workdir .
python -m hypercontext archive list
python -m hypercontext benchmark --domains search --generations 2 --workdir .
python -m hypercontext tui --workdir .
python -m hypercontext mcp --workdir .
python -m hypercontext serve --port 8080 --workdir .
python -m hypercontext ui --port 3000 --workdir .
```
Suggested Workflow¶
- Start Ollama with `ollama serve`.
- Pull the model you want.
- Export `HYPERCONTEXT_PROVIDER=ollama` and `HYPERCONTEXT_MODEL=<model>`.
- Run `python -m hypercontext providers` to confirm Ollama is available.
- Use the CLI, TUI, browser dashboard, or MCP daemon with the same settings.
Model Compatibility¶
Ollama supports many local models. Hypercontext works best with models that can answer clearly in chat format and follow instructions well.
Good candidates include:
- `llama3`
- `mistral`
- `qwen2`
- `gemma`
- other Ollama-installed chat-capable models
If a model is weak at instruction following, Hypercontext still runs, but provider-backed workflows, benchmarks, and agent tasks may produce noisier results. For the smoothest experience, start with a chat-oriented model.
Named Provider Presets¶
You can also keep Ollama in a named preset alongside other providers:
```yaml
provider_name: local-ollama

provider_presets:
  local-ollama:
    provider: ollama
    model: llama3
    base_url: http://localhost:11434
  remote-claude:
    provider: anthropic
    model: claude-sonnet-4-20250514
    api_key: ${ANTHROPIC_API_KEY}
    base_url: https://api.anthropic.com
```
This is useful when you want to switch between local and cloud providers without changing your agent code.
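To illustrate how such a preset might be consumed, here is a small sketch that mirrors the YAML above as a Python dict and expands the `${ANTHROPIC_API_KEY}` placeholder. The `resolve_preset` helper is hypothetical, not Hypercontext's own loader.

```python
import string

# Presets mirroring the YAML above. ${ANTHROPIC_API_KEY} is expanded
# from a variables mapping (in practice this would come from os.environ).
PRESETS = {
    "local-ollama": {
        "provider": "ollama",
        "model": "llama3",
        "base_url": "http://localhost:11434",
    },
    "remote-claude": {
        "provider": "anthropic",
        "model": "claude-sonnet-4-20250514",
        "api_key": "${ANTHROPIC_API_KEY}",
        "base_url": "https://api.anthropic.com",
    },
}

def resolve_preset(name, variables):
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return {
        key: string.Template(value).safe_substitute(variables)
        for key, value in PRESETS[name].items()
    }

preset = resolve_preset("remote-claude", {"ANTHROPIC_API_KEY": "sk-test"})
print(preset["api_key"])
```

Switching between local and cloud providers then becomes a single lookup by preset name.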
Troubleshooting¶
If Ollama is not working:
- Confirm `ollama serve` is running
- Confirm the model name matches `ollama list`
- Confirm `OLLAMA_BASE_URL` points at the correct host and port
- Confirm the Python environment has `requests` installed
- Run `python -m hypercontext providers` to confirm Ollama is detected
- Start with a small prompt before using a larger agent workflow
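Before digging into logs, a quick programmatic check can confirm the server is reachable. The sketch below probes Ollama's `/api/tags` endpoint (its model-listing route); the injected `fetch` callable is an assumption made here so the logic can be exercised without a live server.

```python
# Return True if GET {base_url}/api/tags answers with HTTP 200.
# Pass requests.get as the fetcher against a real server.
def ollama_is_reachable(base_url, fetch):
    try:
        return fetch(f"{base_url}/api/tags", timeout=2).status_code == 200
    except Exception:
        return False

# In practice:
#   import requests
#   ollama_is_reachable("http://localhost:11434", requests.get)
```

If this returns `False`, work back through the checklist above before trying a full agent workflow.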