05 - Experiments Guide

Running Experiments with the Consciousness-Indicator Architecture


Reminder: All experimental results are theory-derived indicator scores and do NOT prove, establish, or demonstrate subjective experience. Include the scientific boundary disclaimer in all outputs.


1. Running Basic Cognitive Cycles

1.1 Single Cycle via Python API

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.scorecard import ConsciousnessIndicatorScorecard

# Initialize the system
system = CombinedConsciousnessIndicatorSystem(recurrent_cycles=3)
scorecard_gen = ConsciousnessIndicatorScorecard()

# Run a single cognitive cycle
input_text = "A red object moved behind a screen and reappeared."
report = system.run_cycle(input_text)

# Examine intermediate states
print(f"Percepts: {len(report.percepts)} extracted")
print(f"Binding stability: {report.bound_percept.stability:.3f}")
print(f"Attention focus: {report.attention_state.current_focus}")
print(f"Broadcasts: {len(report.workspace_broadcasts)}")
print(f"Self-model belief: {report.self_model.current_belief[:60]}")

# Generate scorecard
scores = report.indicator_scores
print(f"\nTotal score: {scores.total_score}/{scores.max_possible}")
print(f"Normalized: {scores.normalized:.1%}")

scorecard = scorecard_gen.generate(scores)
print(scorecard_gen.format_report(scorecard))

1.2 Multiple Cycles

# The system maintains state across cycles
system = CombinedConsciousnessIndicatorSystem()

inputs = [
    "The cat observed its own reflection in the mirror.",
    "Consciousness requires both global broadcast and recurrent processing.",
    "The agent noticed it was making prediction errors and adjusted its model.",
    "Attention schema theory proposes that awareness is a model of attention.",
]

for text in inputs:
    report = system.run_cycle(text)
    print(f"Cycle {report.cycle_id}: score={report.indicator_scores.total_score}/"
          f"{report.indicator_scores.max_possible} "
          f"({report.indicator_scores.normalized:.1%})")

1.3 CLI Single Cycle

cia run "The system reflected on its own decision-making process."

1.4 CLI Multiple Cycles

cia run "Repeated input to observe score changes across cycles." --cycles 5

1.5 CLI Verbose Mode

cia run "Consciousness is the greatest mystery in science." --verbose

Verbose mode shows the per-category indicator breakdown table in addition to the summary.


2. Intervention Experiments

2.1 What Are Interventions?

Intervention experiments use the CausalInterventionHarness to systematically disable or perturb modules, then measure the functional degradation compared to a baseline. This is analogous to neuropsychological lesion studies — if disabling a module causes specific indicator scores to drop, it suggests that module contributes to those indicators.
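The disable-and-compare idea can be sketched without CIA at all. The toy pipeline below is a hypothetical stand-in (its module names and contributions are invented for illustration and are not the CIA modules): each module is swapped for a no-op in turn, and the score change against the baseline is recorded.

```python
from typing import Callable, Dict

# Hypothetical stand-ins: each "module" contributes an integer to a toy score.
# They only illustrate the lesion pattern; they are not CIA components.
Module = Callable[[str], int]


def make_pipeline() -> Dict[str, Module]:
    """Build a fresh toy pipeline so every lesion starts from the same state."""
    return {
        "perception": lambda text: len(text.split()),  # one point per token
        "workspace": lambda text: 2,                    # constant contribution
        "memory": lambda text: 1,                       # constant contribution
    }


def no_op(_text: str) -> int:
    """Replacement for a disabled module: contributes nothing."""
    return 0


def score(pipeline: Dict[str, Module], text: str) -> int:
    """Total score is the sum of all module contributions."""
    return sum(module(text) for module in pipeline.values())


def lesion_study(text: str) -> Dict[str, int]:
    """Disable one module at a time and report the change against baseline."""
    baseline = score(make_pipeline(), text)
    deltas = {}
    for name in make_pipeline():
        lesioned = make_pipeline()
        lesioned[name] = no_op
        deltas[name] = score(lesioned, text) - baseline
    return deltas


deltas = lesion_study("a red object moved")
for name, delta in deltas.items():
    print(f"{name}: {delta:+d}")
```

Rebuilding the pipeline for every lesion keeps the runs independent, which is the same reason the harness restores the system after each intervention.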

2.2 Available Interventions

| Intervention | Description | What It Tests |
|---|---|---|
| disable_perception | Replace perception with no-op (empty percepts) | Contribution of perception to all downstream indicators |
| disable_recurrent_binding | Replace recurrent binding with no-op | Contribution of recurrent processing to binding stability |
| disable_predictive_model | Replace predictive model with no-op | Contribution of prediction to error tracking and modeling indicators |
| disable_attention | Replace attention with no-op (empty focus) | Contribution of attention to workspace and schema indicators |
| disable_workspace | Replace workspace with no-op (no broadcasts) | Contribution of global broadcast to integration and self-model |
| disable_memory | Replace memory with no-op (no storage) | Contribution of memory to continuity and episodic indicators |
| disable_self_model | Replace self-model with no-op | Contribution of self-model to metacognition and HOT indicators |
| perturb_attention | Force attention to a fixed focus | Impact of attentional rigidity on downstream processing |
| reduce_workspace | Reduce workspace capacity to 1 | Impact of bottleneck narrowing on broadcast indicators |
| remove_recurrent_cycles | Set recurrent cycles to 0 | Impact of removing iterative refinement on binding stability |
| clear_memory | Clear all memory stores | Impact of memory loss on continuity indicators |
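Because intervention names are plain strings, a typo can go unnoticed until the experiment runs. A minimal pre-flight check, assuming the eleven names listed above are the complete set, might look like the sketch below. `KNOWN_INTERVENTIONS` and `validate_interventions` are hypothetical helpers, not part of the CIA API, and the check does not inspect extra parameters such as `new_capacity`.

```python
from typing import Dict, Iterable, List

# Names copied from the list of available interventions above.
KNOWN_INTERVENTIONS = frozenset({
    "disable_perception",
    "disable_recurrent_binding",
    "disable_predictive_model",
    "disable_attention",
    "disable_workspace",
    "disable_memory",
    "disable_self_model",
    "perturb_attention",
    "reduce_workspace",
    "remove_recurrent_cycles",
    "clear_memory",
})


def validate_interventions(specs: Iterable[Dict]) -> List[Dict]:
    """Return the specs unchanged, or raise ValueError on an unknown name."""
    checked = []
    for spec in specs:
        name = spec.get("name")
        if name not in KNOWN_INTERVENTIONS:
            raise ValueError(f"Unknown intervention: {name!r}")
        checked.append(spec)
    return checked


checked = validate_interventions([
    {"name": "disable_workspace"},
    {"name": "reduce_workspace", "new_capacity": 1},
])
print(f"{len(checked)} intervention specs look valid")
```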

2.3 Running via CLI

# Disable the global workspace
cia intervene --type disable_workspace

# Disable perception with custom input
cia intervene --type disable_perception --input "The agent processed the sensory input."

# Clear memory
cia intervene --type clear_memory

# Reduce workspace capacity
cia intervene --type reduce_workspace

# Remove recurrent processing cycles
cia intervene --type remove_recurrent_cycles

2.4 Running via Python API

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.intervention import CausalInterventionHarness

system = CombinedConsciousnessIndicatorSystem()
harness = CausalInterventionHarness(system)

# Define interventions
interventions = [
    {"name": "disable_workspace"},
    {"name": "disable_perception"},
    {"name": "clear_memory"},
    {"name": "reduce_workspace", "new_capacity": 1},
    {"name": "remove_recurrent_cycles"},
]

# Run experiment
input_text = "The system noticed its own processing limitations."
result = harness.run_intervention_experiment(input_text, interventions)

# Analyze results
baseline = result["baseline"]
print(f"Baseline score: {baseline.indicator_scores.total_score}/{baseline.indicator_scores.max_possible}")

for intv in result["interventions"]:
    name = intv["name"]
    report = intv["report"]
    degradation = result["degradation"][name]
    print(f"\n{name}:")
    print(f"  Score: {report.indicator_scores.total_score}/{report.indicator_scores.max_possible}")
    print(f"  Degradation: {degradation['summary']}")

# System is automatically restored after the experiment

2.5 Manual Intervention Control

harness = CausalInterventionHarness(system)

# Disable a specific module
harness.disable_module("attention")
report = system.run_cycle("Test input with no attention.")
harness.restore_all()

# Perturb attention
harness.perturb_attention("forced_focus")
report = system.run_cycle("Test input with forced attention.")
harness.restore_all()

# Reduce workspace capacity
harness.reduce_workspace_capacity(1)
report = system.run_cycle("Test input with reduced workspace.")
harness.restore_all()

2.6 Comparing Reports

# Get the degradation between baseline and intervention
baseline = system.run_cycle("Test input.")
harness.disable_module("workspace")
intervened = system.run_cycle("Test input.")
harness.restore_all()

diff = harness.compare_reports(baseline, intervened)
print(f"Score change: {diff['score_change']}")
print(f"Broadcast change: {diff['broadcast_count_change']}")
print(f"Percept change: {diff['percept_count_change']}")
print(f"Attention changed: {diff['attention_focus_changed']}")
print(f"Summary: {diff['summary']}")

3. Scorecard Analysis

3.1 Generating a Scorecard

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.scorecard import ConsciousnessIndicatorScorecard

system = CombinedConsciousnessIndicatorSystem()
scorecard_gen = ConsciousnessIndicatorScorecard()

# Run a cycle and generate scorecard
report = system.run_cycle("The agent reflected on its own attention process.")
scorecard = scorecard_gen.generate(report.indicator_scores)

# Examine scorecard data
print(f"Total: {scorecard['total_score']}/{scorecard['max_possible']}")
print(f"Normalized: {scorecard['normalized_score']:.1%}")
print(f"Risk Tier: {scorecard['risk_tier']}")
print(f"\nEvidence:\n{scorecard['evidence_summary']}")
print(f"\nRecommendations:")
for rec in scorecard['recommendations']:
    print(f"  - {rec}")

3.2 Formatted Report

formatted = scorecard_gen.format_report(scorecard)
print(formatted)

The formatted report includes:

- Scientific disclaimer banner
- Per-category scores with visual bar indicators (██░ format)
- Aggregate metrics (total, normalized, risk tier)
- Evidence summary for notable indicators
- Collected caveats from all categories
- Actionable recommendations
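For readers building their own summaries, the ██░ bar style can be approximated in a few lines. This is only a sketch: `render_bar` is a hypothetical helper, and the width and rounding used by CIA's own formatter are not documented here, so its output may differ.

```python
def render_bar(score: float, max_score: float, width: int = 10) -> str:
    """Render score/max_score as a fixed-width bar of filled and empty cells."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    # Clamp to [0, 1] so out-of-range scores cannot overflow the bar.
    fraction = min(max(score / max_score, 0.0), 1.0)
    filled = round(fraction * width)
    return "█" * filled + "░" * (width - filled)


print(render_bar(2, 3, width=3))
print(render_bar(0, 2, width=4))
```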

3.3 Comparing Scorecards Over Time

# Scorecard before modification
report1 = system.run_cycle("Input A")
card1 = scorecard_gen.generate(report1.indicator_scores)

# Modify system (e.g., add identity markers)
system.self_model.add_identity_marker("reasoning system")
system.self_model.set_goal("minimize prediction error")

# Scorecard after modification
report2 = system.run_cycle("Input A")
card2 = scorecard_gen.generate(report2.indicator_scores)

# Compare
comparison = scorecard_gen.compare_scorecards(card1, card2)
print(comparison["summary"])
print(f"Risk tier changed: {comparison['risk_tier_changed']}")
print(f"Category changes: {comparison['category_changes']}")

3.4 Batch Scorecard from JSON File

# Create a JSON file with inputs
echo '["The cat sat on the mat.", "The agent reflected on its own beliefs.", "Predictive processing minimizes surprise."]' > inputs.json

# Generate scorecard report
cia score --input inputs.json
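Shell quoting becomes fragile once an input contains an apostrophe or a double quote, so it can be safer to write the file from Python. The sketch below assumes, as in the example above, that the file is a flat JSON array of strings. `write_inputs` is a hypothetical helper, not a CIA function.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def write_inputs(path: Path, inputs: List[str]) -> Path:
    """Write a flat JSON array of input strings and return the path."""
    if not all(isinstance(item, str) for item in inputs):
        raise TypeError("every input must be a string")
    path.write_text(json.dumps(inputs, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


inputs = [
    "The cat sat on the mat.",
    "The agent reflected on its own beliefs.",
    "Predictive processing minimizes surprise.",
]

# A temporary directory keeps this demo from touching the working directory;
# pass Path("inputs.json") instead to create the file used by the CLI command.
out_path = write_inputs(Path(tempfile.mkdtemp()) / "inputs.json", inputs)

# Read the file back to confirm it round-trips.
loaded = json.loads(out_path.read_text(encoding="utf-8"))
print(f"Wrote {len(loaded)} inputs to {out_path}")
```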

4. Reproducibility

4.1 Deterministic Outputs

CIA is fully deterministic: identical inputs produce identical outputs. There is no randomness, no stochastic sampling, and no external API calls. This ensures that experiments are reproducible across runs, platforms, and Python versions (within the same major version).
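One way to check this claim in your own environment is to fingerprint the serialized output of repeated runs and confirm the hashes match. The sketch below uses a toy function in place of CIA. `fingerprint`, `is_deterministic`, and `toy_run` are hypothetical names, and the comparison assumes the report contains no wall-clock fields such as timestamps.

```python
import hashlib
import json
from typing import Any, Callable


def fingerprint(payload: Any) -> str:
    """Hash a JSON-serializable payload in a canonical key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(run: Callable[[str], Any], text: str, repeats: int = 3) -> bool:
    """Call run(text) several times and check that every output hashes the same."""
    digests = {fingerprint(run(text)) for _ in range(repeats)}
    return len(digests) == 1


# Toy stand-in for a run that returns a JSON-serializable report.
# With CIA, `run` would need to build a fresh system on each call, because the
# system keeps state across cycles, and return json.loads(report.model_dump_json()).
def toy_run(text: str) -> dict:
    words = text.split()
    return {"tokens": len(words), "first": words[0]}


print(is_deterministic(toy_run, "a red object moved"))
```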

4.2 Reproducing an Experiment

To reproduce an experiment:

  1. Record the system configuration:

     config = {
         "recurrent_cycles": system.recurrent_binding._default_cycles,
         "workspace_capacity": system.workspace._capacity,
         "attention_weights": system.attention._weights,
         "welfare_thresholds": system.welfare_monitor._thresholds.__dict__,
     }

  2. Record the input: save the exact input text string.

  3. Record the random seed, if any. CIA itself is deterministic and needs none, but recording one is good practice when surrounding code uses randomness.

  4. Save the full report: report.model_dump_json()

  5. Share the pyproject.toml version to keep dependencies consistent.

4.3 Version Pinning

pip install -e ".[dev]"  # Editable install of the checked-out version, with dev extras
pip list | grep consciousness-indicator-architecture
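The same version check can be done from Python and stored alongside experiment results. The sketch below uses the standard-library `importlib.metadata` and assumes the distribution name matches the one in the `grep` command above. `installed_version` is a hypothetical helper.

```python
import platform
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Environment details worth saving next to any experiment output.
env = {
    "python": platform.python_version(),
    "platform": platform.platform(),
    "cia_version": installed_version("consciousness-indicator-architecture"),
}
print(env)
```

Recording the Python version matters here because reproducibility is only claimed within the same major version.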

4.4 Test Suite as Reproducibility Check

The test suite serves as a built-in reproducibility check:

python -m pytest -v

If all tests pass, the system is producing consistent, expected outputs.

4.5 Experiment Logging Template

import json
from datetime import datetime, timezone

# Assumes `baseline`, `intervened`, and `harness` from Section 2.6 are in scope.
experiment_log = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "cia_version": "0.1.0",
    "description": "Effect of disabling global workspace on indicator scores",
    "config": {
        "recurrent_cycles": 3,
        "workspace_capacity": 3,
    },
    "input_text": "The agent reflected on its own decision-making process.",
    "baseline_score": baseline.indicator_scores.total_score,
    "intervention": "disable_workspace",
    "intervention_score": intervened.indicator_scores.total_score,
    "degradation": harness.compare_reports(baseline, intervened),
    # mode="json" converts non-JSON types (e.g. datetimes) so json.dump succeeds
    "full_baseline_report": baseline.model_dump(mode="json"),
    "full_intervention_report": intervened.model_dump(mode="json"),
}

with open("experiment_log.json", "w") as f:
    json.dump(experiment_log, f, indent=2)

5. Example Experiment Walkthrough

5.1 Goal: Determine which modules most affect indicator scores

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.intervention import CausalInterventionHarness

system = CombinedConsciousnessIndicatorSystem()
harness = CausalInterventionHarness(system)

input_text = "The agent noticed its own processing limitations when predicting the next state."

# Run all single-module interventions
interventions = [
    {"name": "disable_perception"},
    {"name": "disable_recurrent_binding"},
    {"name": "disable_predictive_model"},
    {"name": "disable_attention"},
    {"name": "disable_workspace"},
    {"name": "disable_memory"},
    {"name": "disable_self_model"},
]

result = harness.run_intervention_experiment(input_text, interventions)

# Print degradation summary
baseline_scores = result["baseline"].indicator_scores
print("=== Module Contribution Analysis ===\n")
print(f"Baseline score: {baseline_scores.total_score}/{baseline_scores.max_possible}\n")

for intv in result["interventions"]:
    name = intv["name"]
    deg = result["degradation"][name]
    scores = intv["report"].indicator_scores
    print(f"{name}:")
    # Read the maximum from the report instead of hard-coding it
    print(f"  Score: {scores.total_score}/{scores.max_possible} "
          f"(Δ={deg['score_change']:+d})")
    print(f"  {deg['summary']}")
    print()

5.2 Expected Output Pattern

| Intervention | Typical Score Change | Explanation |
|---|---|---|
| disable_perception | -2 to -4 | No percepts → no binding, no workspace content |
| disable_recurrent_binding | -1 to -2 | No stability score → recurrent processing indicator drops |
| disable_workspace | -1 to -3 | No broadcasts → global broadcast indicator drops, self-model less informed |
| disable_attention | -1 to -2 | No focus → attention schema has nothing to compare |
| disable_memory | -1 to -2 | No traces → memory continuity drops |
| disable_self_model | -1 to -2 | No beliefs → self-model and metacognition drop |
| disable_predictive_model | -1 to -2 | No hypotheses → predictive modeling drops |

Actual values depend on the input text and system configuration. The key insight is that each module's contribution can be measured independently through this lesion-study approach.
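To answer the question in 5.1 directly, the per-intervention score changes can be sorted so the most damaging lesion comes first. The sketch below is self-contained and uses placeholder numbers, not measured results. `rank_by_impact` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def rank_by_impact(score_changes: Dict[str, int]) -> List[Tuple[str, int]]:
    """Sort interventions from largest score drop to smallest, ties by name."""
    return sorted(score_changes.items(), key=lambda item: (item[1], item[0]))


# Placeholder values for illustration only; these are not measurements.
# With a real experiment, build this dict from the harness result, e.g.
# {i["name"]: result["degradation"][i["name"]]["score_change"]
#  for i in result["interventions"]}.
placeholder_changes = {
    "disable_perception": -3,
    "disable_workspace": -2,
    "disable_memory": -1,
    "disable_attention": -2,
}

ranking = rank_by_impact(placeholder_changes)
for name, change in ranking:
    print(f"{name}: {change:+d}")
```

Breaking ties by name keeps the ordering stable across runs, which fits the determinism goal in Section 4.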