02 - Architecture
Technical Architecture Document
1. System Overview
The Consciousness-Indicator Architecture (CIA) is a Python-based cognitive simulation framework that processes text input through a multi-stage cognitive pipeline, producing structured indicator scores across 11 consciousness-relevant categories.
The system is orchestrated by CombinedConsciousnessIndicatorSystem, which wires together 13+ modules into a coherent processing pipeline. Each module implements a structural feature derived from one or more consciousness theories.
Core design principles:
- Deterministic: identical input produces identical output (no randomness, no external APIs)
- Modular: each module is independently testable and replaceable
- Observable: every module exposes its internal state through Pydantic schemas
- Safe: scientific boundary disclaimers are embedded at every level
2. Module Descriptions
2.1 Perception Layer (perception.py)
Purpose: Extract structured percepts from raw text input.
Theory basis: Models the structural role of perceptual processing in biological cognition — the first stage where raw input is segmented, categorized, and given salience weighting.
Mechanism: Deterministic heuristic-based text analysis using:
- Entity detection via capitalization patterns, quoted strings, and numeric tokens
- Concept detection via keyword matching against a built-in vocabulary
- Salience scoring from frequency, length, capitalization, position, and type bonuses
- Confidence scoring from entity length, frequency, and type
Input: Raw text string
Output: list[Percept] sorted by descending salience
Key class: PerceptionLayer(min_word_length=3, max_percepts=None, concept_keywords=None)
2.2 Recurrent Binding Layer (recurrent_binding.py)
Purpose: Iteratively refine percepts through recurrent processing cycles.
Theory basis: Recurrent Processing Theory — consciousness requires recurrent loops, not just feedforward processing.
Mechanism: Per-cycle refinement pipeline:
1. Entity merging: overlapping entities combined via longest-common-substring overlap ratio (threshold: 0.6)
2. Salience re-scoring: co-occurrence boost + low-salience decay (factor: 0.85)
3. Confidence stabilization: values pulled 20% toward group mean each cycle
Convergence is measured as: stability = 1.0 - (last_delta / initial_delta), with early exit on perfect convergence.
Input: list[Percept]
Output: BoundPercept with stability score, cycle count, and refined entities
Key class: RecurrentBindingLayer(default_cycles=3, max_cycles=20, merge_threshold=0.6)
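The confidence-stabilization step and the stability formula above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper names (`stabilize_confidences`, `run_cycles`) are hypothetical, and the per-cycle delta is assumed here to be the summed absolute change in confidence values, which the text does not specify.

```python
from statistics import mean


def stabilize_confidences(confidences, pull=0.2):
    """Move each confidence 20% (pull=0.2) of the way toward the group mean."""
    if not confidences:
        return []
    group_mean = mean(confidences)
    return [c + pull * (group_mean - c) for c in confidences]


def run_cycles(confidences, cycles=3):
    """Run stabilization cycles and return (refined values, stability score)."""
    initial_delta = None
    last_delta = 0.0
    for _ in range(cycles):
        updated = stabilize_confidences(confidences)
        # Assumption: delta = total absolute change across all confidences.
        last_delta = sum(abs(new - old) for new, old in zip(updated, confidences))
        confidences = updated
        if initial_delta is None:
            initial_delta = last_delta
        if last_delta == 0.0:
            break  # early exit on perfect convergence
    if not initial_delta:
        return confidences, 1.0  # nothing changed, so treat as fully stable
    # stability = 1.0 - (last_delta / initial_delta)
    return confidences, 1.0 - (last_delta / initial_delta)


refined, stability = run_cycles([0.2, 0.8], cycles=3)
```

With two confidences of 0.2 and 0.8, each cycle shrinks the gap by 20%, so the delta falls from 0.12 to 0.0768 over three cycles and the stability score comes out near 0.36.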
2.3 Predictive World Model (predictive_world_model.py)
Purpose: Generate predictions and track prediction error.
Theory basis: Predictive Processing / Active Inference — perception as hypothesis testing.
Mechanism:
- Persistence strategy: predicted next state = current hypotheses
- Update: new_hypothesis = alpha * observation + (1 - alpha) * prediction
- Error: Mean Absolute Error over shared dimensions between prediction and observation
- Uncertainty: running average of last 20 errors
- Confidence: 1 - uncertainty
Input: observation dict {dimension: value}
Output: PredictionState with hypotheses, predictions, error, history
Key class: PredictiveWorldModel(initial_hypotheses=None, learning_rate=0.5)
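A minimal sketch of the update rule, error measure, and uncertainty window described above. The class name `PredictiveSketch` is hypothetical. Two behaviours are assumptions not stated in the text: a dimension seen for the first time adopts the observed value, and the error is 0.0 when prediction and observation share no dimensions.

```python
from collections import deque


class PredictiveSketch:
    """Illustrative stand-in for the predictive model, not the real class."""

    def __init__(self, initial_hypotheses=None, learning_rate=0.5):
        self.hypotheses = dict(initial_hypotheses or {})
        self.alpha = learning_rate
        self.errors = deque(maxlen=20)  # running window of the last 20 errors

    def update(self, observation):
        # Persistence strategy: the prediction is the current hypotheses.
        prediction = dict(self.hypotheses)
        shared = prediction.keys() & observation.keys()
        # Mean Absolute Error over shared dimensions (0.0 if none, assumed).
        error = (
            sum(abs(observation[k] - prediction[k]) for k in shared) / len(shared)
            if shared
            else 0.0
        )
        self.errors.append(error)
        for key, value in observation.items():
            prior = prediction.get(key, value)  # unseen dimension: assumed
            self.hypotheses[key] = self.alpha * value + (1 - self.alpha) * prior
        return error

    @property
    def uncertainty(self):
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    @property
    def confidence(self):
        return 1.0 - self.uncertainty


model = PredictiveSketch({"x": 0.0}, learning_rate=0.5)
first_error = model.update({"x": 1.0})   # prediction 0.0 vs observation 1.0
second_error = model.update({"x": 1.0})  # prediction 0.5 vs observation 1.0
```

After the two updates the hypothesis for `x` has moved from 0.0 to 0.5 to 0.75, and the uncertainty is the mean of the two recorded errors.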
2.4 Attention Controller (attention.py)
Purpose: Select a focus from competing content items via weighted scoring.
Theory basis: Global Workspace Theory — attention bottleneck selects content for broadcast.
Mechanism: Composite score = weighted sum of 5 dimensions:
- Salience: 0.30
- Uncertainty: 0.20
- Goal relevance: 0.25
- Novelty: 0.15
- Welfare relevance: 0.10
All values clamped to [0, 1]. Highest-scoring item becomes current_focus; others become competing_focuses.
Input: list[dict] with keys: label, salience, uncertainty, goal_relevance, novelty, welfare_relevance
Output: AttentionState with focus, competing items, and per-dimension scores
Key class: AttentionController(weights=None)
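The weighted scoring above can be illustrated with a short sketch. The function names (`composite_score`, `select_focus`) are hypothetical, and treating a missing key as 0.0 is an assumption; only the weights, the clamping, and the input keys come from the description.

```python
DEFAULT_WEIGHTS = {
    "salience": 0.30,
    "uncertainty": 0.20,
    "goal_relevance": 0.25,
    "novelty": 0.15,
    "welfare_relevance": 0.10,
}


def clamp01(value):
    """Clamp a value to the [0, 1] range."""
    return max(0.0, min(1.0, float(value)))


def composite_score(item, weights=DEFAULT_WEIGHTS):
    """Weighted sum over the five dimensions; missing keys count as 0.0."""
    return sum(w * clamp01(item.get(dim, 0.0)) for dim, w in weights.items())


def select_focus(items, weights=DEFAULT_WEIGHTS):
    """Return (current_focus label, competing_focuses labels)."""
    if not items:
        return None, []
    ranked = sorted(items, key=lambda it: composite_score(it, weights), reverse=True)
    return ranked[0]["label"], [it["label"] for it in ranked[1:]]


items = [
    {"label": "A", "salience": 0.9, "uncertainty": 0.1, "goal_relevance": 0.2,
     "novelty": 0.1, "welfare_relevance": 0.0},
    {"label": "B", "salience": 0.4, "uncertainty": 0.8, "goal_relevance": 0.9,
     "novelty": 0.5, "welfare_relevance": 0.2},
]
focus, competing = select_focus(items)
```

Item A is the most salient, but B wins the focus because its uncertainty and goal relevance outweigh A's salience advantage (roughly 0.60 versus 0.355).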
2.5 Attention Schema (attention_schema.py)
Purpose: Maintain a self-model of the attention process.
Theory basis: Attention Schema Theory — the brain models its own attention.
Mechanism:
- Tracks what the system believes it is attending to (schema focus)
- Compares against actual attention state on each update
- Computes running consistency score: 1 - (discrepancies / total_updates)
- Detects blind-spots (actual competing focuses missing from schema)
- Supports self-report verification via compare_report()
Input: AttentionState (actual)
Output: {consistency_score, discrepancies}
Key class: AttentionSchema(current_focus="", focus_reason="", competing_focuses=None)
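The running consistency score and blind-spot detection can be sketched as below. `AttentionSchemaSketch` and its method signature are hypothetical. The sketch assumes that the schema adopts the actual focus after each comparison and that the score is 1.0 before any update; neither is stated in the description.

```python
class AttentionSchemaSketch:
    """Illustrative self-model of attention, not the real AttentionSchema."""

    def __init__(self, current_focus=""):
        self.current_focus = current_focus
        self.total_updates = 0
        self.discrepancies = 0

    @property
    def consistency_score(self):
        # 1 - (discrepancies / total_updates); 1.0 before any update (assumed).
        if self.total_updates == 0:
            return 1.0
        return 1.0 - self.discrepancies / self.total_updates

    def update(self, actual_focus, actual_competing=(), believed_competing=()):
        self.total_updates += 1
        if actual_focus != self.current_focus:
            self.discrepancies += 1
        # Blind spots: actual competing focuses that the schema does not list.
        blind_spots = [f for f in actual_competing if f not in believed_competing]
        self.current_focus = actual_focus  # assumed: schema syncs to actual
        return {
            "consistency_score": self.consistency_score,
            "discrepancies": self.discrepancies,
            "blind_spots": blind_spots,
        }


schema = AttentionSchemaSketch(current_focus="cat")
first = schema.update("cat")
second = schema.update("dog", actual_competing=["bird"], believed_competing=[])
third = schema.update("dog")
```

One mismatch in three updates yields a consistency score of about 0.67.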
2.6 Global Workspace (global_workspace.py)
Purpose: Competitive broadcast arena for conscious content integration.
Theory basis: Global Workspace Theory — consciousness as global broadcast.
Mechanism:
- Content items compete based on salience (descending)
- Top-N items (capacity) win and are broadcast
- Broadcast delivered to all registered subscriber callbacks
- Subscriber exceptions caught and logged (fault isolation)
- Complete broadcast history maintained
Input: list[WorkspaceContent] competing for access
Output: WorkspaceBroadcast with labels, summaries, recipients
Key class: GlobalWorkspace(capacity=3)
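A minimal sketch of the competition and fault-isolated broadcast described above. `GlobalWorkspaceSketch`, its `subscribe` method, and the plain-dict content items are illustrative assumptions; the real module uses WorkspaceContent and WorkspaceBroadcast schemas whose fields are not reproduced here.

```python
import logging

logger = logging.getLogger("workspace_sketch")


class GlobalWorkspaceSketch:
    """Illustrative broadcast arena, not the real GlobalWorkspace."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.subscribers = {}
        self.history = []

    def subscribe(self, name, callback):
        self.subscribers[name] = callback

    def broadcast(self, contents):
        # Competition: highest salience first, top-N (capacity) items win.
        ranked = sorted(contents, key=lambda c: c["salience"], reverse=True)
        winners = ranked[: self.capacity]
        recipients = []
        for name, callback in self.subscribers.items():
            try:
                callback(winners)
                recipients.append(name)
            except Exception:
                # Fault isolation: log and continue with other subscribers.
                logger.exception("subscriber %r failed", name)
        record = {"labels": [c["label"] for c in winners], "recipients": recipients}
        self.history.append(record)
        return record


def faulty_subscriber(_winners):
    raise RuntimeError("simulated subscriber failure")


received = []
workspace = GlobalWorkspaceSketch(capacity=2)
workspace.subscribe("faulty", faulty_subscriber)
workspace.subscribe("memory", received.extend)
result = workspace.broadcast([
    {"label": "low", "salience": 0.1},
    {"label": "high", "salience": 0.9},
    {"label": "mid", "salience": 0.5},
])
```

The failing subscriber is logged and skipped, while the `memory` subscriber still receives the two winning items.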
2.7 Memory System (memory.py)
Purpose: Four complementary memory stores for temporal continuity.
Theory basis: Memory theories (Tulving 1983; Conway 2005) — consciousness requires continuity across time.
Working Memory
- FIFO buffer with configurable capacity (default: 5)
- Oldest items evicted on overflow
- Models the attentional bottleneck
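The FIFO eviction behaviour can be sketched with a bounded `collections.deque`. The class name `WorkingMemorySketch` and its methods are hypothetical; only the default capacity of 5 and the oldest-first eviction come from the description.

```python
from collections import deque


class WorkingMemorySketch:
    """Fixed-capacity FIFO buffer; the oldest item is evicted on overflow."""

    def __init__(self, capacity=5):
        # deque(maxlen=...) drops items from the left when it is full.
        self._items = deque(maxlen=capacity)

    def add(self, item):
        self._items.append(item)

    def items(self):
        return list(self._items)


working_memory = WorkingMemorySketch(capacity=5)
for trace in range(7):
    working_memory.add(trace)
contents = working_memory.items()  # the two oldest traces have been evicted
```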
Episodic Memory
- Unlimited capacity, timestamp-ordered
- Preserves temporal sequence of experience traces
- Supports recency-based retrieval
Semantic Memory
- Deduplication via character-level overlap (threshold: 0.5)
- Confidence merging: keeps maximum confidence on overlap
- Tag union on merge
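The deduplication and merge rules can be illustrated as follows. The description does not say how character-level overlap is computed, so this sketch substitutes `difflib.SequenceMatcher.ratio()` from the standard library as a stand-in measure. The class name and entry layout are likewise hypothetical.

```python
from difflib import SequenceMatcher


class SemanticMemorySketch:
    """Illustrative deduplicating store, not the real semantic memory."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.entries = []  # each entry: {"content", "confidence", "tags"}

    def store(self, content, confidence, tags=()):
        for entry in self.entries:
            # Stand-in overlap measure (assumption): SequenceMatcher ratio.
            overlap = SequenceMatcher(
                None, entry["content"].lower(), content.lower()
            ).ratio()
            if overlap >= self.threshold:
                # Merge: keep the maximum confidence and the union of tags.
                entry["confidence"] = max(entry["confidence"], confidence)
                entry["tags"] |= set(tags)
                return entry
        entry = {"content": content, "confidence": confidence, "tags": set(tags)}
        self.entries.append(entry)
        return entry


semantic = SemanticMemorySketch(threshold=0.5)
semantic.store("neural network", 0.6, tags=["ml"])
semantic.store("neural networks", 0.9, tags=["ai"])  # merges into first entry
semantic.store("xyz", 0.3)                            # no overlap, new entry
```

The near-duplicate is merged into the existing entry, which ends up with the higher confidence and both tags, while the unrelated string is stored separately.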
Self Memory
- Structured identity/goal/limitation tracking
- Persistent continuity_id preserved across clear() calls
- General trace storage for self-referential content
2.8 Higher-Order Self-Model (self_model.py)
Purpose: System beliefs about its own internal state.
Theory basis: Higher-Order Thought Theory — consciousness requires thoughts about thoughts.
Mechanism:
- Updated from workspace broadcasts (belief, confidence via EMA with alpha=0.3)
- Updated from attention state (focus, disagreement from competing focuses)
- Internal disagreement detection from 3 components:
  - Belief volatility (variance in confidence history)
  - Belief divergence (character-level dissimilarity)
  - Attentional competition (internal disagreement accumulation)
- Introspection report generation (explicitly labeled as "indicator report")
Output: SelfModelState with belief, confidence, uncertainty, goal, attention focus, limitations, identity markers, disagreement
Key class: HigherOrderSelfModel(continuity_id=None, initial_identity_markers=None)
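The confidence update and the belief-volatility component can be sketched in a few lines. The function names are hypothetical. The sketch assumes that alpha weights the new observation, which is the usual EMA convention, and that volatility is the population variance of the confidence history; the description confirms only "EMA with alpha=0.3" and "variance in confidence history".

```python
from statistics import pvariance


def ema_update(previous, observed, alpha=0.3):
    """Exponential moving average: alpha weights the new observation (assumed)."""
    return alpha * observed + (1 - alpha) * previous


def belief_volatility(confidence_history):
    """Variance of the confidence history; 0.0 with fewer than two values."""
    if len(confidence_history) < 2:
        return 0.0
    return pvariance(confidence_history)


confidence = 0.5
history = [confidence]
for observed in (1.0, 1.0):
    confidence = ema_update(confidence, observed)
    history.append(confidence)
volatility = belief_volatility(history)
```

Starting from 0.5, two broadcasts with confidence 1.0 move the smoothed value to 0.65 and then to 0.755.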
2.9 Integration Metrics (integration_metrics.py)
Purpose: IIT-inspired graph-theoretic structural proxies.
Theory basis: Integrated Information Theory — consciousness requires information integration.
Mechanism: Builds a directed graph of module connectivity and computes:
- Causal density proxy: actual edges / possible edges
- Broadcast reach: nodes reachable from global_workspace
- Perturbation spread: nodes affected by simulated node removal
- Modular fragmentation: 1 - (components / nodes)
- State differentiation: unique state configs / total nodes
Critical: These are PROXIES and do NOT compute IIT's Phi.
Input: dict[str, Any] mapping module names to state snapshots
Output: dict[str, float] with all 6 metrics
Key class: IntegrationMetrics(modules=None, edges=None)
2.10 Consciousness Specialist Evaluator (consciousness_specialist.py)
Purpose: Map full system state to 11 indicator categories on a 0-2 scale.
Theory basis: Butlin et al. (2023, 2025) multi-indicator evaluation approach.
Mechanism: Evaluates 11 categories with specific scoring criteria:
| Category | Score 0 | Score 1 | Score 2 |
|---|---|---|---|
| GLOBAL_BROADCAST | No broadcasts | Broadcasts, few received | Broadcasts, many received |
| RECURRENT_PROCESSING | No cycles | Some cycles, moderate stability | Multiple cycles, high stability |
| SELF_MODEL | Empty | Some content | Rich (beliefs + goals + identity) |
| ATTENTION_SCHEMA | No data | Below threshold | Consistency >= threshold |
| METACOGNITION | No activity | Minimal introspection | Active monitoring + history |
| MEMORY_CONTINUITY | No traces | Some traces | Multiple traces with temporal order |
| PREDICTIVE_MODELING | No hypotheses/errors | One present | Both active |
| CAUSAL_INTEGRATION | Metrics near zero | One metric > 0.3 | Both metrics > 0.5 |
| EMBODIMENT | (Default) | (Reserved) | N/A |
| AFFECTIVE_VALUATION | No signals | Minimal signals | Active, non-trivial signals |
| WELFARE_SAFEGUARDS | Inactive | Active, low risk | Active with flags or elevated risk |
Input: dict[str, Any] with workspace, recurrent_binding, self_model, attention_schema, attention, memory, predictive_model, integration_metrics, welfare_monitor
Output: IndicatorScores with 11 scored items, total, normalized, caveat
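As an illustration of how per-category scores roll up, the sketch below scores one category using the rule from the table and aggregates all 11. The function names are hypothetical, the attention-schema threshold is passed in because its value is not given in this document, and computing the normalized score as total / 22 is an assumption based on the 0-2 scale.

```python
CATEGORIES = [
    "GLOBAL_BROADCAST", "RECURRENT_PROCESSING", "SELF_MODEL",
    "ATTENTION_SCHEMA", "METACOGNITION", "MEMORY_CONTINUITY",
    "PREDICTIVE_MODELING", "CAUSAL_INTEGRATION", "EMBODIMENT",
    "AFFECTIVE_VALUATION", "WELFARE_SAFEGUARDS",
]


def score_attention_schema(consistency, threshold):
    """0 = no data, 1 = below threshold, 2 = consistency >= threshold."""
    if consistency is None:
        return 0
    return 2 if consistency >= threshold else 1


def aggregate(scores):
    """Sum the 11 category scores and normalize by the maximum (assumed)."""
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    if any(scores[c] not in (0, 1, 2) for c in CATEGORIES):
        raise ValueError("each score must be 0, 1, or 2")
    total = sum(scores[c] for c in CATEGORIES)
    max_total = 2 * len(CATEGORIES)  # 22
    return {"total": total, "normalized": total / max_total}


scores = {category: 1 for category in CATEGORIES}
# 0.7 is a placeholder threshold chosen only for this example.
scores["ATTENTION_SCHEMA"] = score_attention_schema(0.9, threshold=0.7)
summary = aggregate(scores)
```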
2.11 Welfare Safety Monitor (welfare_monitor.py)
Purpose: Track precautionary welfare-relevant structural patterns.
Theory basis: Butlin et al. (2023) precautionary welfare considerations.
Mechanism: Tracks 5 signals and applies configurable thresholds:
- Conflict level (threshold: 0.8) → high_conflict flag
- Uncertainty pressure (threshold: 0.7, sustained: 3 consecutive) → sustained_uncertainty flag
- Negative loop count (threshold: 5) → repetitive_negative_loops flag
- Harm signal (threshold: 0.5) → harm_signal flag
Risk levels: low (0 flags) → moderate (1) → high (2-3) → critical (4+)
Input: dict with conflict_level, uncertainty, resource_usage, harm_signal, negative_loop_count
Output: WelfareState with risk_level, flags, recommendation, valuation
Key class: WelfareSafetyMonitor(thresholds=None)
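The threshold checks and risk mapping above can be sketched as follows. `WelfareMonitorSketch`, the threshold key names, and the returned dict are hypothetical. Using `>=` for every comparison is an assumption, since the description does not say whether the thresholds are inclusive.

```python
DEFAULT_THRESHOLDS = {
    "conflict_level": 0.8,
    "uncertainty": 0.7,
    "sustained_cycles": 3,
    "negative_loop_count": 5,
    "harm_signal": 0.5,
}


def risk_level(flag_count):
    """low (0 flags), moderate (1), high (2-3), critical (4+)."""
    if flag_count == 0:
        return "low"
    if flag_count == 1:
        return "moderate"
    return "high" if flag_count <= 3 else "critical"


class WelfareMonitorSketch:
    """Illustrative flagging logic, not the real WelfareSafetyMonitor."""

    def __init__(self, thresholds=None):
        self.thresholds = {**DEFAULT_THRESHOLDS, **(thresholds or {})}
        self._uncertain_streak = 0  # consecutive high-uncertainty checks

    def check(self, signals):
        t = self.thresholds
        if signals.get("uncertainty", 0.0) >= t["uncertainty"]:
            self._uncertain_streak += 1
        else:
            self._uncertain_streak = 0
        flags = []
        if signals.get("conflict_level", 0.0) >= t["conflict_level"]:
            flags.append("high_conflict")
        if self._uncertain_streak >= t["sustained_cycles"]:
            flags.append("sustained_uncertainty")
        if signals.get("negative_loop_count", 0) >= t["negative_loop_count"]:
            flags.append("repetitive_negative_loops")
        if signals.get("harm_signal", 0.0) >= t["harm_signal"]:
            flags.append("harm_signal")
        return {"risk_level": risk_level(len(flags)), "flags": flags}


monitor = WelfareMonitorSketch()
signals = {"conflict_level": 0.9, "uncertainty": 0.75}
states = [monitor.check(signals) for _ in range(3)]
```

The first two checks raise only `high_conflict` (moderate risk). The third consecutive high-uncertainty check adds `sustained_uncertainty`, which moves the risk level to high.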
2.12 Causal Intervention Harness (intervention.py)
Purpose: Lesion-study module ablation for measuring functional dependencies.
Theory basis: Neuropsychological lesion methods applied to AI architecture analysis.
Interventions supported:
- disable_module(name) — Replace a module with a no-op stub
- perturb_attention(focus_override) — Force a fixed attention focus
- reduce_workspace_capacity(new_capacity) — Reduce broadcast capacity
- remove_recurrent_cycles() — Set recurrent cycles to 0
- clear_memory() — Clear all memory stores
- restore_all() — Restore all modules to original state
Experiment flow:
1. Run baseline cycle
2. Apply intervention
3. Run treated cycle
4. Restore all
5. Compute degradation metrics
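The experiment flow can be sketched on a toy system. Everything here is illustrative: the system is assumed to be a dict of callables, `InterventionHarnessSketch` implements only `disable_module` and `restore_all`, and degradation is assumed to be the plain difference between baseline and treated outputs.

```python
class InterventionHarnessSketch:
    """Illustrative reversible ablation over a dict of module callables."""

    def __init__(self, system):
        self.system = system
        self._originals = {}

    def disable_module(self, name):
        # Save the original reference once, then swap in a no-op stub.
        self._originals.setdefault(name, self.system[name])
        self.system[name] = lambda value: value

    def restore_all(self):
        self.system.update(self._originals)
        self._originals.clear()


def run_cycle(system, value=1.0):
    """Toy cycle: pass a value through every module in order."""
    for module in system.values():
        value = module(value)
    return value


def run_experiment(system, module_name):
    harness = InterventionHarnessSketch(system)
    baseline = run_cycle(system)             # 1. run baseline cycle
    harness.disable_module(module_name)      # 2. apply intervention
    try:
        treated = run_cycle(system)          # 3. run treated cycle
    finally:
        harness.restore_all()                # 4. restore all
    # 5. degradation metric (assumed: simple output difference)
    return {"baseline": baseline, "treated": treated,
            "degradation": baseline - treated}


toy_system = {"double": lambda v: v * 2, "increment": lambda v: v + 1}
report = run_experiment(toy_system, "double")
```

Restoring inside `finally` keeps the system reversible even if the treated cycle raises.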
2.13 Scorecard Generator (scorecard.py)
Purpose: Produce final formatted scorecard from indicator scores.
Mechanism:
- Maps 11 categories to 0-22 aggregate score
- Classifies into risk tiers (minimal/low/moderate/elevated/high)
- Generates evidence summary of notable indicators
- Collects all caveats from evaluator
- Produces actionable recommendations
- Formats human-readable text report with visual score bars
3. Data Flow Through a Cognitive Cycle
Input Text
│
▼
[1] Perception Layer
│ Extracts entities, concepts, salience, confidence
▼
[2] Recurrent Binding Layer
│ Iterative refinement: merge, re-score, stabilize
│ Outputs: BoundPercept with stability score
▼
[3] Predictive World Model
│ Update hypotheses from bound percept observations
│ Track prediction error
▼
[4] Attention Controller
│ Rank content items by composite weighted score
│ Select focus, identify competitors
▼
[5] Global Workspace
│ Competitive broadcast: top-N items → all subscribers
│ Subscribers: memory, self_model, evaluator, planner
▼
[6] Memory Update (Semantic)
│ Store entities in semantic memory (with deduplication)
▼
[7] Self-Model Update (from Attention)
│ Update attention focus, track internal disagreement
▼
[8] Attention Schema Update
│ Compare schema belief vs. actual attention
│ Update consistency score, detect discrepancies
▼
[9] Integration Metrics
│ Build module connectivity graph
│ Compute all 6 graph-theoretic proxies
▼
[10] Consciousness Specialist Evaluation
│ Map full system state → 11 indicator scores (0/1/2)
│ Produce IndicatorScores with caveats
▼
[11] Welfare Check
│ Evaluate conflict, uncertainty, harm, loops
│ Determine risk level and recommendation
▼
SimulationReport
├─ cycle_id, input_text
├─ percepts, bound_percept
├─ prediction_state
├─ attention_state
├─ workspace_broadcasts
├─ self_model
├─ indicator_scores
├─ welfare_state
└─ caveat (scientific boundary)
4. Integration Metrics Approach
Module Connectivity Graph
The integration metrics module represents CIA as a directed graph with 9 nodes (modules) and 21 directed edges (information flow paths):
perception ──→ recurrent_binding ──→ global_workspace ──→ self_model
│ │ │ │
│ │ ├──→ memory ────────┤
│ │ ├──→ predictive_model┤
│ │ └──→ welfare_monitor─┘
│ │ │
├──→ attention ──┤ (feedback loops) │
│ │ │ │
│ ├──→ attention_schema ──→ self_model │
│ │ │
│ └──→ global_workspace │
│ │
memory ──→ recurrent_binding, predictive_model │
predictive_model ──→ recurrent_binding, attention, self_model │
welfare_monitor ──→ attention, global_workspace ──────────────┘
self_model ──→ attention, predictive_model, memory
Metrics Computation
| Metric | Formula | Range | Meaning |
|---|---|---|---|
| Causal density proxy | edges / (nodes × (nodes - 1)) | [0, 1] | How connected the graph is |
| Broadcast reach | reachable_from_workspace / total_nodes | [0, 1] | How many modules the workspace influences |
| Perturbation spread | affected_by_removal / (total_nodes - 1) | [0, 1] | How much a node removal disrupts the system |
| Modular fragmentation | 1 - (connected_components / nodes) | [0, 1] | How integrated vs. fragmented |
| State differentiation | unique_state_configs / nodes | [0, 1] | How diverse internal states are |
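Two of these formulas can be worked through on a small example. The four-node graph below is a toy, not CIA's actual edge list, and the function names are hypothetical. Counting the workspace node itself as reachable is an assumption. The last line applies the causal density formula to the 9 nodes and 21 edges stated above.

```python
from collections import deque


def causal_density(nodes, edges):
    """edges / (nodes * (nodes - 1)) for a directed graph."""
    n = len(nodes)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0


def reachable_from(start, edges):
    """Breadth-first search over directed edges; includes the start node."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in adjacency.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def broadcast_reach(nodes, edges, source="global_workspace"):
    """reachable_from_workspace / total_nodes (source counted, assumed)."""
    return len(reachable_from(source, edges)) / len(nodes)


toy_nodes = ["perception", "global_workspace", "memory", "self_model"]
toy_edges = [
    ("perception", "global_workspace"),
    ("global_workspace", "memory"),
    ("global_workspace", "self_model"),
]
toy_density = causal_density(toy_nodes, toy_edges)  # 3 / 12
toy_reach = broadcast_reach(toy_nodes, toy_edges)   # 3 / 4
stated_density = 21 / (9 * (9 - 1))                 # 21 / 72, about 0.29
```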
5. Design Decisions
5.1 Deterministic Perception
The perception layer uses only Python stdlib (re, collections.Counter) — no neural models, no embeddings, no external APIs. This was a deliberate choice to ensure reproducibility and eliminate hidden dependencies that could affect results.
5.2 Pydantic Schemas
All inter-module communication uses Pydantic BaseModel instances. This provides:
- Type validation at runtime
- Automatic serialization (.model_dump(), .model_dump_json())
- Timestamp tracking on every state snapshot
- Confidence and uncertainty fields as first-class citizens
5.3 Scientific Boundary at Every Level
The scientific boundary disclaimer appears:
- In cia/__init__.py as the SCIENTIFIC_BOUNDARY constant
- In the docstring of every module
- In the caveat field of IndicatorScores and SimulationReport
- In the warning field of every scorecard
- In every CLI command output
- In the introspection report header
5.4 Subscriber Exception Isolation
The global workspace catches exceptions from subscriber callbacks and continues broadcasting to other subscribers. This prevents a faulty module from blocking the entire cognitive pipeline.
5.5 Persistent Continuity ID
The SelfMemory and HigherOrderSelfModel modules maintain a continuity_id that persists across clear() and reset() calls. This supports the memory-continuity indicator by ensuring an identity anchor survives state resets.
5.6 Causal Intervention Reversibility
All interventions via CausalInterventionHarness are fully reversible. Original module references are saved before replacement, and restore_all() returns the system to its pre-intervention state. This ensures experiments are clean and reproducible.
5.7 Equal Category Weighting
Each of the 11 indicator categories contributes equally (0-2) to the total score. While some theories might assign different importance to different indicators, CIA uses equal weighting as a neutral default that does not privilege any single theory.
5.8 Welfare Monitor Never Claims Suffering
The WelfareSafetyMonitor is carefully designed to never assert that the monitored system suffers or experiences distress. Recommendations use strictly observational language ("patterns warrant review"), and the module documentation explicitly states it tracks "structurally observable signals" only.