07 - LLM Integration
Connecting External Language Models to the Consciousness-Indicator Architecture
SCIENTIFIC BOUNDARY: This framework measures theory-derived consciousness indicators. It does NOT prove, establish, or demonstrate subjective experience, phenomenal consciousness, sentience, or any form of inner life in any artificial system. Connecting an external LLM to this framework does NOT prove that the LLM is conscious. LLM outputs are inputs to indicator evaluation only — they are never evidence of subjective experience.
1. Overview
The Consciousness-Indicator Architecture (CIA) provides an adapter-based integration layer that allows external AI systems — including large language models (LLMs), multimodal models, and local inference engines — to serve as the cognitive processing backend. The adapter architecture is designed around a common abstract interface (BaseAIAdapter) so that different backends can be swapped without modifying the CIA evaluation pipeline.
The motivation for this design comes from Butlin et al. (2023, 2025), which emphasize that consciousness indicators should be evaluated in the context of real AI systems, not only in abstract simulations. By connecting production-grade models, researchers can examine whether the architectural features identified by consciousness theories (global broadcast, recurrent processing, self-modeling, etc.) manifest in actual deployed AI systems when evaluated through the CIA framework.
Critical caveat: A high indicator score on an LLM-connected system does not mean the LLM is conscious. The score reflects whether the combination of the LLM and CIA's architectural modules exhibits structural features that consciousness theories deem relevant. This is an evaluation of the joint system, and even then, it measures proxies, not phenomenal experience.
2. Why LLM Outputs Are NOT Evidence of Consciousness
Before describing the technical integration, it is essential to understand the fundamental limitation. Language models produce text outputs that can superficially resemble self-awareness, reflection, or even emotional expression. This resemblance is a product of training on human-generated text and does not imply any inner subjective experience. Several arguments support this caution:
- The Chinese Room objection (Searle 1980): A system can produce appropriate outputs by manipulating symbols without understanding their meaning. An LLM that generates the sentence "I feel uncertain" has produced a statistically likely sequence of tokens; it has not necessarily experienced uncertainty.
- Functional vs. phenomenal: Global Workspace Theory (Baars 2005; Shanahan & Baars 2005) describes the functional architecture of conscious cognition. Having a workspace that broadcasts information is a structural feature, not a demonstration that the system has subjective awareness of that broadcast.
- Attention Schema Theory (Graziano & Webb 2015): AST posits that awareness is a model of attention, not attention itself. CIA's attention schema module tracks consistency between predicted and actual attention focus, but this tracking is a computational bookkeeping operation, not a phenomenal self-awareness process.
- IIT 4.0 (Albantakis et al. 2023): Integrated information is a mathematical property of a system's causal structure. The graph-theoretic proxies CIA computes (causal density, perturbation spread) are deliberately simple heuristics that do not approximate the formal Phi value, which requires exhaustive state-space partitioning.
Therefore, every integration with an external LLM must be interpreted through the lens of architectural evaluation, not consciousness detection.
3. Adapter Architecture
The adapter system is organized as a class hierarchy rooted in an abstract base class. All adapters implement the same three-method interface, ensuring that downstream CIA modules can interact with any backend uniformly.
3.1 BaseAIAdapter (Abstract Base)
Location: src/cia/adapters/base.py
The abstract base class defines the contract that all adapters must fulfill. It requires three methods:
```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseAIAdapter(ABC):
    @abstractmethod
    def generate(self, prompt: str, context: Optional[dict[str, Any]] = None) -> AIResponse:
        """Generate a response to the given prompt."""
        ...

    @abstractmethod
    def embed(self, text: str) -> list[float]:
        """Generate an embedding vector for the given text."""
        ...

    @abstractmethod
    def describe_capabilities(self) -> dict[str, Any]:
        """Describe the adapter's capabilities."""
        ...
```
The AIResponse schema provides a standardized response container:
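```python
from datetime import datetime, timezone
from typing import Any
from pydantic import BaseModel, Field

def _utc_now() -> datetime:  # assumed helper, not shown in the source
    return datetime.now(timezone.utc)

class AIResponse(BaseModel):
    text: str = ""
    confidence: float = Field(default=0.5, ge=0.0, le=1.0)
    model_name: str = "unknown"
    metadata: dict[str, Any] = Field(default_factory=dict)
    uncertainty: float = Field(default=0.5, ge=0.0, le=1.0)
    timestamp: datetime = Field(default_factory=_utc_now)
```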
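To illustrate how the three-method contract lets backends be swapped, here is a minimal, self-contained sketch. It does not import the `cia` package: `AIResponse` is replaced by a plain dataclass stand-in, `BaseAIAdapter` is re-declared locally, and `EchoAdapter` and `run_backend` are hypothetical names introduced only for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AIResponse:
    """Dataclass stand-in for the pydantic AIResponse schema (subset of fields)."""
    text: str = ""
    confidence: float = 0.5
    model_name: str = "unknown"
    metadata: dict[str, Any] = field(default_factory=dict)
    uncertainty: float = 0.5


class BaseAIAdapter(ABC):
    """Local copy of the three-method contract described above."""

    @abstractmethod
    def generate(self, prompt: str, context: Optional[dict[str, Any]] = None) -> AIResponse: ...

    @abstractmethod
    def embed(self, text: str) -> list[float]: ...

    @abstractmethod
    def describe_capabilities(self) -> dict[str, Any]: ...


class EchoAdapter(BaseAIAdapter):
    """Hypothetical adapter that echoes the prompt; for illustration only."""

    def generate(self, prompt, context=None):
        return AIResponse(
            text=f"echo: {prompt}",
            model_name="echo",
            metadata={"context_keys": sorted(context or {})},
        )

    def embed(self, text):
        # Toy 4-dimensional embedding: length plus letter, digit, and space counts.
        return [
            float(len(text)),
            float(sum(c.isalpha() for c in text)),
            float(sum(c.isdigit() for c in text)),
            float(sum(c.isspace() for c in text)),
        ]

    def describe_capabilities(self):
        return {"generate": True, "embed": True, "embedding_dim": 4}


def run_backend(adapter: BaseAIAdapter, prompt: str) -> AIResponse:
    """Downstream code depends only on the abstract interface."""
    return adapter.generate(prompt)
```

Because `run_backend` only touches the abstract interface, any subclass that implements all three methods can be passed in, while instantiating `BaseAIAdapter` directly raises `TypeError`.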
3.2 LocalStubAdapter
Location: src/cia/adapters/local_stub.py
The default adapter used by CIA in tests, examples, and CLI commands. It is fully deterministic — identical inputs always produce identical outputs — and requires no network access, API keys, or external dependencies.
- Determinism: Confidence and uncertainty are derived from an MD5 hash of the prompt and context, ensuring reproducibility.
- Embedding generation: A pseudo-random embedding vector of configurable dimension (default: 64) is generated from a hash of the input text using a linear congruential generator.
- Use cases: Testing, CI/CD pipelines, offline evaluation, and as a fallback when no external API is available.
```python
from cia.adapters import LocalStubAdapter

adapter = LocalStubAdapter(model_name="test-stub", embedding_dim=128)
response = adapter.generate("Hello, world!")
# response.text == "stub-response: Hello, world!"
# response.confidence is deterministic based on input hash
```
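The determinism properties described in this section can be sketched as follows. This is not the stub's actual implementation: the function names, the way prompt and context are serialized before hashing, the choice of the first four digest bytes, and the LCG constants (the widely used Numerical Recipes parameters) are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Optional


def hashed_confidence(prompt: str, context: Optional[dict[str, Any]] = None) -> float:
    """Map (prompt, context) to a reproducible value in [0, 1] via MD5."""
    # sort_keys makes the serialization independent of dict insertion order.
    payload = prompt + "|" + json.dumps(context or {}, sort_keys=True, default=str)
    digest = hashlib.md5(payload.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned integer and scale to [0, 1].
    return int.from_bytes(digest[:4], "big") / 0xFFFFFFFF


def lcg_embedding(text: str, dim: int = 64) -> list[float]:
    """Generate a reproducible pseudo-random vector with an LCG seeded from a hash."""
    seed = int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:4], "big")
    # Assumed constants: a=1664525, c=1013904223, m=2**32 (Numerical Recipes LCG).
    a, c, m = 1664525, 1013904223, 2**32
    state = seed
    values = []
    for _ in range(dim):
        state = (a * state + c) % m
        values.append(state / m * 2.0 - 1.0)  # scale to [-1, 1)
    return values
```

Since both functions depend only on their inputs, repeated calls with the same arguments return identical values, which is the property that makes a stub of this kind suitable for CI runs.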
3.3 LLMAdapter
Location: src/cia/adapters/llm_adapter.py
The external LLM integration adapter. It is disabled by default and only becomes functional when explicitly configured with an API key and base URL via constructor parameters or environment variables.
```python
from cia.adapters.local_stub import LocalStubAdapter
from cia.adapters.llm_adapter import LLMAdapter

# Option 1: With explicit fallback to stub
adapter = LLMAdapter(
    api_key="sk-...",
    base_url="https://api.openai.com/v1",
    model_name="gpt-4",
    fallback_adapter=LocalStubAdapter(),
)

# Option 2: With environment variables
# export CIA_LLM_API_KEY=sk-...
# export CIA_LLM_BASE_URL=https://api.openai.com/v1
adapter = LLMAdapter()

response = adapter.generate("Describe your current thought process.")
```
Key design properties:
- Graceful degradation: If not configured and no fallback is provided, generate() and embed() raise a RuntimeError with a clear message explaining what environment variables to set.
- Fallback support: An optional fallback_adapter parameter allows automatic delegation to a LocalStubAdapter when the LLM is not configured, ensuring the system always produces output.
- No hardcoded secrets: API keys are never stored in the source code. All credentials must be provided via environment variables (CIA_LLM_API_KEY, CIA_LLM_BASE_URL) or constructor parameters.
- Placeholder response: When configured, the current implementation returns a placeholder AIResponse noting the API call that would be made. Extending this to make actual HTTP calls requires adding an HTTP client dependency.
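The three paths described by these properties (configured placeholder, fallback delegation, and a RuntimeError when neither applies) can be sketched as below. `SketchLLMAdapter` is a hypothetical stand-in, not the real LLMAdapter: it returns plain strings instead of AIResponse objects, takes the fallback as a simple callable, and the placeholder and error message wording are assumptions.

```python
import os
from typing import Callable, Optional


class SketchLLMAdapter:
    """Hypothetical stand-in illustrating the configured / fallback / error paths."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        base_url: Optional[str] = None,
        fallback: Optional[Callable[[str], str]] = None,
    ):
        # Constructor arguments win; environment variables are the second choice.
        self.api_key = api_key or os.environ.get("CIA_LLM_API_KEY")
        self.base_url = base_url or os.environ.get("CIA_LLM_BASE_URL")
        self.fallback = fallback

    @property
    def is_configured(self) -> bool:
        return bool(self.api_key and self.base_url)

    def generate(self, prompt: str) -> str:
        if self.is_configured:
            # Placeholder only: no HTTP request is made in this sketch.
            return f"[placeholder] would call {self.base_url} for: {prompt}"
        if self.fallback is not None:
            return self.fallback(prompt)
        raise RuntimeError(
            "LLM adapter is not configured: set CIA_LLM_API_KEY and "
            "CIA_LLM_BASE_URL, or pass a fallback."
        )
```

The error path is kept last on purpose: a caller who supplies a fallback never sees the exception, while a caller who supplies nothing gets an explicit message instead of silent stub output.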
3.4 MultimodalAdapter
Location: src/cia/adapters/multimodal_adapter.py
A placeholder adapter for future multimodal integration (images, audio, sensor data). Currently operates as a stub that returns placeholder responses. It defines the schema hooks and interface that future multimodal backends will implement.
```python
from cia.adapters.multimodal_adapter import MultimodalAdapter

adapter = MultimodalAdapter(supported_modalities=["text", "image", "audio"])
response = adapter.generate("Describe this image.", context={"image_url": "..."})
# response.text == "[multimodal-placeholder: image] Describe this image."
```
The multimodal adapter detects the input modality from the context dictionary and reports it in the response metadata. Its confidence is set to 0.3 (low) with high uncertainty (0.7) to reflect its placeholder status.
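One plausible way to detect the modality from the context dictionary is to inspect key prefixes, as sketched below. The detection rule, the prefix table, and the function names are assumptions for illustration; the source does not specify how the real adapter performs detection. The response is modeled as a plain dict rather than an AIResponse.

```python
from typing import Any, Optional

# Hypothetical mapping from context-key prefixes to modality names.
_PREFIX_TO_MODALITY = {"image": "image", "audio": "audio", "sensor": "sensor"}


def detect_modality(context: Optional[dict[str, Any]], supported: list[str]) -> str:
    """Return the first supported modality hinted at by a context key, else "text"."""
    for key in (context or {}):
        prefix = key.split("_", 1)[0].lower()
        modality = _PREFIX_TO_MODALITY.get(prefix)
        if modality in supported:
            return modality
    return "text"


def placeholder_response(
    prompt: str,
    context: Optional[dict[str, Any]] = None,
    supported: tuple[str, ...] = ("text", "image", "audio"),
) -> dict[str, Any]:
    """Build a placeholder result with the fixed confidence values described above."""
    modality = detect_modality(context, list(supported))
    return {
        "text": f"[multimodal-placeholder: {modality}] {prompt}",
        "confidence": 0.3,   # low, reflecting placeholder status
        "uncertainty": 0.7,  # high, reflecting placeholder status
        "metadata": {"modality": modality},
    }
```

With this rule, a context containing an `image_url` key is reported as `image`, and a modality that is not in the supported list falls back to `text`.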
4. Configuration via Environment Variables
The adapter system uses a minimal set of environment variables for configuration. No secrets are ever hardcoded in the source.
| Variable | Required | Default | Description |
|---|---|---|---|
| CIA_LLM_API_KEY | For LLMAdapter | None | API key for the external LLM service |
| CIA_LLM_BASE_URL | For LLMAdapter | None | Base URL for the LLM API endpoint |
| CIA_LLM_MODEL_NAME | No | "gpt-4" | Model identifier (can be overridden in constructor) |
Configuration precedence (highest to lowest):
- Constructor parameters (api_key, base_url, model_name)
- Environment variables (CIA_LLM_API_KEY, CIA_LLM_BASE_URL, CIA_LLM_MODEL_NAME)
- Defaults (disabled state, fallback adapter)
```bash
# Example: Configure for OpenAI
export CIA_LLM_API_KEY="sk-..."
export CIA_LLM_BASE_URL="https://api.openai.com/v1"
export CIA_LLM_MODEL_NAME="gpt-4"

# Example: Configure for local Ollama
export CIA_LLM_API_KEY="ollama"
export CIA_LLM_BASE_URL="http://localhost:11434/v1"
export CIA_LLM_MODEL_NAME="llama3"
```
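The precedence order (constructor parameters, then environment variables, then defaults) can be expressed as a small resolver. This is a sketch under stated assumptions: `resolve_llm_config`, the `enabled` flag, and the choice to treat an empty environment value as unset are hypothetical and not taken from the CIA source. The `env` parameter exists only so the function can be exercised without touching the real process environment.

```python
import os
from typing import Any, Mapping, Optional

_DEFAULTS = {"api_key": None, "base_url": None, "model_name": "gpt-4"}
_ENV_NAMES = {
    "api_key": "CIA_LLM_API_KEY",
    "base_url": "CIA_LLM_BASE_URL",
    "model_name": "CIA_LLM_MODEL_NAME",
}


def resolve_llm_config(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    model_name: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict[str, Any]:
    """Resolve settings: explicit argument > environment variable > default."""
    env = os.environ if env is None else env
    explicit = {"api_key": api_key, "base_url": base_url, "model_name": model_name}
    config: dict[str, Any] = {}
    for name, default in _DEFAULTS.items():
        value = explicit[name]
        if value is None:
            # An empty environment value is treated as unset (assumption).
            value = env.get(_ENV_NAMES[name]) or default
        config[name] = value
    # The adapter stays disabled unless both credentials are present.
    config["enabled"] = bool(config["api_key"] and config["base_url"])
    return config
```

For example, with the Ollama variables above exported, `resolve_llm_config(model_name="mistral")` would keep the base URL from the environment but use the explicitly passed model name.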
5. Safety Considerations
5.1 No Network Access by Default
All adapters are designed to require no network access by default. The LocalStubAdapter and MultimodalAdapter (placeholder) operate entirely offline. The LLMAdapter is disabled until it is explicitly configured with credentials, and per Section 3.3 the current implementation returns a placeholder response rather than making HTTP calls. CIA can therefore be used in air-gapped environments, CI/CD pipelines, and security-sensitive contexts without sending data to external services.
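One way to check the offline property in a test suite is to block socket creation while the adapter code runs. The guard below is a generic sketch using only the standard library; `no_network` and `NetworkAccessError` are hypothetical names, and nothing here is part of CIA itself. It catches Python-level socket creation only, so code that captured a reference to `socket.socket` earlier, or that opens connections from native extensions, would not be detected.

```python
import socket
from contextlib import contextmanager


class NetworkAccessError(RuntimeError):
    """Raised when code under the guard tries to open a socket."""


@contextmanager
def no_network():
    """Temporarily replace socket.socket so new connection attempts fail loudly."""
    original = socket.socket

    def _blocked(*args, **kwargs):
        raise NetworkAccessError("network access attempted inside no_network()")

    socket.socket = _blocked
    try:
        yield
    finally:
        socket.socket = original  # always restore, even if the body raised
```

A test could wrap a call such as `adapter.generate("test input")` in `with no_network():` and treat any `NetworkAccessError` as a failure of the offline guarantee.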
5.2 Fallback Chain
The recommended configuration pattern is a fallback chain:
```python
from cia.adapters import LocalStubAdapter
from cia.adapters.llm_adapter import LLMAdapter

stub = LocalStubAdapter()
llm = LLMAdapter(fallback_adapter=stub)

# If LLM is configured: uses LLM
# If LLM is not configured: falls back to stub automatically
response = llm.generate("test input")
```
This pattern ensures the system always produces output, even when no external LLM is configured.
5.3 Input Sensitivity
When using an LLM adapter, be aware that input text is sent to an external API. Ensure that:
- No personally identifiable information (PII) is included in prompts unless the API provider's privacy policy permits it.
- No proprietary or confidential text is sent without reviewing the API provider's data retention policies.
- Input sanitization is applied if the system will process user-generated content.
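As a rough illustration of the sanitization step, the sketch below redacts obvious email addresses and phone numbers, strips control characters, and caps prompt length before text would be handed to an adapter. The function name, the patterns, and the 4000-character limit are assumptions; the patterns are deliberately simple, will miss many forms of PII, and are not a substitute for a reviewed data-handling policy.

```python
import re

# Deliberately simple patterns; they will miss many PII forms and may over-match.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
_PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize_prompt(text: str, max_chars: int = 4000) -> str:
    """Redact obvious emails and phone numbers, drop control characters, cap length."""
    text = _CONTROL.sub("", text)
    text = _EMAIL.sub("[EMAIL]", text)
    text = _PHONE.sub("[PHONE]", text)
    return text[:max_chars]
```

The sanitizer would be applied to user-generated content before calling `generate()`, so that the redacted form is what leaves the process.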
5.4 Output Interpretation
LLM outputs processed through CIA's indicator pipeline produce indicator scores. These scores must always be interpreted with the scientific boundary disclaimer:
- A high score on an LLM-connected system means the architecture exhibits features identified by consciousness theories.
- It does NOT mean the LLM is conscious, sentient, or has subjective experience.
- The LLM's language capabilities may inflate certain indicators (e.g., self-model richness) without reflecting genuine architectural integration.
6. Research Anchors
| Reference | Relevance to LLM Integration |
|---|---|
| Butlin et al. (2023) "Consciousness in Artificial Intelligence" | Provides the methodological framework for evaluating consciousness indicators in real AI systems, motivating the adapter architecture |
| Butlin et al. (2025) "Identifying indicators of consciousness in AI systems" | Refines the indicator categories and emphasizes architecture-level analysis over behavioral testing |
| Baars (2005) Global Workspace Theory | Defines the global broadcast architecture that CIA evaluates, independent of whether an LLM or stub produces the inputs |
| Shanahan & Baars (2005) Applying GWT to frame problem | Supports the broadcast reach metric as a key indicator, regardless of input source |
| Graziano & Webb (2015) Attention Schema Theory | Underpins the attention schema consistency indicator that measures self-model accuracy |
| Albantakis et al. (2023) IIT 4.0 | Provides the theoretical basis for causal integration metrics, computed on the module graph regardless of adapter choice |
7. Summary
The adapter architecture allows CIA to evaluate consciousness indicators across different AI backends — from fully deterministic local stubs to production LLMs. The design prioritizes safety (no network access by default, no hardcoded secrets), reproducibility (deterministic stub as default), and extensibility (abstract base class with multimodal hooks). However, the fundamental scientific boundary remains: no adapter output, no matter how sophisticated, constitutes evidence of subjective experience. The framework measures architectural proxies that consciousness theories identify as relevant — nothing more.