05 - Experiments Guide

Running Experiments with the Consciousness-Indicator Architecture

Reminder: All experimental results are theory-derived indicator scores and do NOT prove, establish, or demonstrate subjective experience. Include the scientific boundary disclaimer in all outputs.

1. Running Basic Cognitive Cycles

1.1 Single Cycle via Python API

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.scorecard import ConsciousnessIndicatorScorecard

# Initialize the system
system = CombinedConsciousnessIndicatorSystem(recurrent_cycles=3)
scorecard_gen = ConsciousnessIndicatorScorecard()

# Run a single cognitive cycle
input_text = "A red object moved behind a screen and reappeared."
report = system.run_cycle(input_text)

# Examine intermediate states
print(f"Percepts: {len(report.percepts)} extracted")
print(f"Binding stability: {report.bound_percept.stability:.3f}")
print(f"Attention focus: {report.attention_state.current_focus}")
print(f"Broadcasts: {len(report.workspace_broadcasts)}")
print(f"Self-model belief: {report.self_model.current_belief[:60]}")

# Generate scorecard
scores = report.indicator_scores
print(f"\nTotal score: {scores.total_score}/{scores.max_possible}")
print(f"Normalized: {scores.normalized:.1%}")

scorecard = scorecard_gen.generate(scores)
print(scorecard_gen.format_report(scorecard))

1.2 Multiple Cycles

# The system maintains state across cycles
system = CombinedConsciousnessIndicatorSystem()

inputs = [
    "The cat observed its own reflection in the mirror.",
    "Consciousness requires both global broadcast and recurrent processing.",
    "The agent noticed it was making prediction errors and adjusted its model.",
    "Attention schema theory proposes that awareness is a model of attention.",
]

for text in inputs:
    report = system.run_cycle(text)
    print(f"Cycle {report.cycle_id}: score={report.indicator_scores.total_score}/"
          f"{report.indicator_scores.max_possible} "
          f"({report.indicator_scores.normalized:.1%})")
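Because state persists across cycles, the change between consecutive cycles is often more informative than any single score. The sketch below is a minimal helper for that; `score_deltas` is a hypothetical name, and the numbers are illustrative placeholders rather than real CIA output. In practice the list would be filled with `report.indicator_scores.normalized` inside the loop above.

```python
from typing import List


def score_deltas(normalized_scores: List[float]) -> List[float]:
    """Return the change in normalized score between consecutive cycles."""
    return [
        later - earlier
        for earlier, later in zip(normalized_scores, normalized_scores[1:])
    ]


# Illustrative values only; collect real ones from report.indicator_scores.normalized.
collected = [0.50, 0.55, 0.55, 0.60]

for cycle, delta in enumerate(score_deltas(collected), start=2):
    print(f"Cycle {cycle}: change vs previous cycle = {delta:+.2f}")
```

A list of N scores yields N-1 deltas, so a single cycle produces an empty list.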

1.3 CLI Single Cycle

cia run "The system reflected on its own decision-making process."

1.4 CLI Multiple Cycles

cia run "Repeated input to observe score changes across cycles." --cycles 5

1.5 CLI Verbose Mode

cia run "Consciousness is the greatest mystery in science." --verbose

Verbose mode shows the per-category indicator breakdown table in addition to the summary.

2. Intervention Experiments

2.1 What Are Interventions?

Intervention experiments use the CausalInterventionHarness to systematically disable or perturb modules, then measure the functional degradation relative to a baseline run. The approach is analogous to neuropsychological lesion studies: if disabling a module causes specific indicator scores to drop, that suggests the module contributes to those indicators.
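The lesion-study idea can be illustrated with a toy pipeline that has nothing to do with CIA's internals. In this sketch every name (`perceive`, `attend`, `no_op`, `toy_score`, `lesion`) is hypothetical: each stage is swapped for a no-op in turn, and the score is compared with the baseline. It is an illustration of the technique, not the harness implementation.

```python
from typing import Callable, Dict, List

Stage = Callable[[List[str]], List[str]]


def perceive(tokens: List[str]) -> List[str]:
    """Toy perception: normalize tokens to lowercase."""
    return [t.lower() for t in tokens]


def attend(tokens: List[str]) -> List[str]:
    """Toy attention: keep only tokens longer than three characters."""
    return [t for t in tokens if len(t) > 3]


def no_op(tokens: List[str]) -> List[str]:
    """Lesioned stage: produces nothing, whatever it receives."""
    return []


def toy_score(pipeline: Dict[str, Stage], text: str) -> int:
    """Award one point for each stage that produces non-empty output."""
    tokens = text.split()
    score = 0
    for stage in pipeline.values():
        tokens = stage(tokens)
        if tokens:
            score += 1
    return score


def lesion(pipeline: Dict[str, Stage], name: str) -> Dict[str, Stage]:
    """Return a copy of the pipeline with one stage replaced by a no-op."""
    lesioned = dict(pipeline)
    lesioned[name] = no_op
    return lesioned


PIPELINE: Dict[str, Stage] = {"perception": perceive, "attention": attend}
text = "The agent noticed its own limits"

baseline = toy_score(PIPELINE, text)
for name in PIPELINE:
    change = toy_score(lesion(PIPELINE, name), text) - baseline
    print(f"disable {name}: score change {change:+d}")
```

Lesioning the upstream stage removes more points than lesioning the downstream one, because every later stage is starved of input. Copying the pipeline rather than mutating it keeps the baseline intact without a restore step.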

2.2 Available Interventions

| Intervention | Description | What It Tests |
| --- | --- | --- |
| `disable_perception` | Replace perception with no-op (empty percepts) | Contribution of perception to all downstream indicators |
| `disable_recurrent_binding` | Replace recurrent binding with no-op | Contribution of recurrent processing to binding stability |
| `disable_predictive_model` | Replace predictive model with no-op | Contribution of prediction to error tracking and modeling indicators |
| `disable_attention` | Replace attention with no-op (empty focus) | Contribution of attention to workspace and schema indicators |
| `disable_workspace` | Replace workspace with no-op (no broadcasts) | Contribution of global broadcast to integration and self-model |
| `disable_memory` | Replace memory with no-op (no storage) | Contribution of memory to continuity and episodic indicators |
| `disable_self_model` | Replace self-model with no-op | Contribution of self-model to metacognition and HOT indicators |
| `perturb_attention` | Force attention to a fixed focus | Impact of attentional rigidity on downstream processing |
| `reduce_workspace` | Reduce workspace capacity to 1 | Impact of bottleneck narrowing on broadcast indicators |
| `remove_recurrent_cycles` | Set recurrent cycles to 0 | Impact of removing iterative refinement on binding stability |
| `clear_memory` | Clear all memory stores | Impact of memory loss on continuity indicators |

2.3 Running via CLI

# Disable the global workspace
cia intervene --type disable_workspace

# Disable perception with custom input
cia intervene --type disable_perception --input "The agent processed the sensory input."

# Clear memory
cia intervene --type clear_memory

# Reduce workspace capacity
cia intervene --type reduce_workspace

# Remove recurrent processing cycles
cia intervene --type remove_recurrent_cycles

2.4 Running via Python API

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.intervention import CausalInterventionHarness

system = CombinedConsciousnessIndicatorSystem()
harness = CausalInterventionHarness(system)

# Define interventions
interventions = [
    {"name": "disable_workspace"},
    {"name": "disable_perception"},
    {"name": "clear_memory"},
    {"name": "reduce_workspace", "new_capacity": 1},
    {"name": "remove_recurrent_cycles"},
]

# Run experiment
input_text = "The system noticed its own processing limitations."
result = harness.run_intervention_experiment(input_text, interventions)

# Analyze results
baseline = result["baseline"]
print(f"Baseline score: {baseline.indicator_scores.total_score}/{baseline.indicator_scores.max_possible}")

for intv in result["interventions"]:
    name = intv["name"]
    report = intv["report"]
    degradation = result["degradation"][name]
    print(f"\n{name}:")
    print(f"  Score: {report.indicator_scores.total_score}/{report.indicator_scores.max_possible}")
    print(f"  Degradation: {degradation['summary']}")

# System is automatically restored after the experiment

2.5 Manual Intervention Control

harness = CausalInterventionHarness(system)

# Disable a specific module
harness.disable_module("attention")
report = system.run_cycle("Test input with no attention.")
harness.restore_all()

# Perturb attention
harness.perturb_attention("forced_focus")
report = system.run_cycle("Test input with forced attention.")
harness.restore_all()

# Reduce workspace capacity
harness.reduce_workspace_capacity(1)
report = system.run_cycle("Test input with reduced workspace.")
harness.restore_all()

2.6 Comparing Reports

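# Get the degradation between baseline and intervention.
# The system keeps state across cycles, so the second run may also reflect
# state carried over from the baseline run, not the intervention alone.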
baseline = system.run_cycle("Test input.")
harness.disable_module("workspace")
intervened = system.run_cycle("Test input.")
harness.restore_all()

diff = harness.compare_reports(baseline, intervened)
print(f"Score change: {diff['score_change']}")
print(f"Broadcast change: {diff['broadcast_count_change']}")
print(f"Percept change: {diff['percept_count_change']}")
print(f"Attention changed: {diff['attention_focus_changed']}")
print(f"Summary: {diff['summary']}")

3. Scorecard Analysis

3.1 Generating a Scorecard

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.scorecard import ConsciousnessIndicatorScorecard

system = CombinedConsciousnessIndicatorSystem()
scorecard_gen = ConsciousnessIndicatorScorecard()

# Run a cycle and generate scorecard
report = system.run_cycle("The agent reflected on its own attention process.")
scorecard = scorecard_gen.generate(report.indicator_scores)

# Examine scorecard data
print(f"Total: {scorecard['total_score']}/{scorecard['max_possible']}")
print(f"Normalized: {scorecard['normalized_score']:.1%}")
print(f"Risk Tier: {scorecard['risk_tier']}")
print(f"\nEvidence:\n{scorecard['evidence_summary']}")
print(f"\nRecommendations:")
for rec in scorecard['recommendations']:
    print(f"  - {rec}")

3.2 Formatted Report

formatted = scorecard_gen.format_report(scorecard)
print(formatted)

The formatted report includes:

- Scientific disclaimer banner
- Per-category scores with visual bar indicators (██░ format)
- Aggregate metrics (total, normalized, risk tier)
- Evidence summary for notable indicators
- Collected caveats from all categories
- Actionable recommendations
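To make the ██░ bar format concrete, here is a minimal sketch of how such a bar could be drawn from a score and its maximum. `render_bar` is a hypothetical helper written for illustration; how `format_report` actually builds its bars, and what width it uses, is not shown in this guide.

```python
def render_bar(score: float, max_score: float, width: int = 10) -> str:
    """Render a score as filled (█) and empty (░) cells of a fixed width."""
    if max_score <= 0 or width <= 0:
        raise ValueError("max_score and width must be positive")
    # Clamp to [0, 1] so out-of-range scores still give a well-formed bar.
    fraction = min(max(score / max_score, 0.0), 1.0)
    filled = round(fraction * width)
    return "█" * filled + "░" * (width - filled)


# Illustrative values only, not real CIA output.
print(render_bar(2, 3, width=3))
print(render_bar(7, 22, width=22))
```

Clamping keeps the bar length constant, which matters when several categories are printed in aligned rows.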

3.3 Comparing Scorecards Over Time

# Scorecard before modification
report1 = system.run_cycle("Input A")
card1 = scorecard_gen.generate(report1.indicator_scores)

# Modify system (e.g., add identity markers)
system.self_model.add_identity_marker("reasoning system")
system.self_model.set_goal("minimize prediction error")

# Scorecard after modification
report2 = system.run_cycle("Input A")
card2 = scorecard_gen.generate(report2.indicator_scores)

# Compare
comparison = scorecard_gen.compare_scorecards(card1, card2)
print(comparison["summary"])
print(f"Risk tier changed: {comparison['risk_tier_changed']}")
print(f"Category changes: {comparison['category_changes']}")

3.4 Batch Scorecard from JSON File

# Create a JSON file with inputs
echo '["The cat sat on the mat.", "The agent reflected on its own beliefs.", "Predictive processing minimizes surprise."]' > inputs.json

# Generate scorecard report
cia score --input inputs.json
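When the input list is generated by a script rather than typed by hand, it can help to check its shape before handing it to the CLI. The sketch below assumes, based only on the example above, that the file is a JSON array of non-empty strings; `validate_inputs` is a hypothetical helper and is not part of CIA.

```python
import json
from typing import Any, List


def validate_inputs(payload: Any) -> List[str]:
    """Check that the parsed JSON is a list of non-empty strings."""
    if not isinstance(payload, list):
        raise ValueError("expected a JSON array at the top level")
    for index, item in enumerate(payload):
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"entry {index} must be a non-empty string")
    return payload


# Same three inputs as the shell example, serialized with the json module
# so that quoting and escaping are handled automatically.
raw = json.dumps(
    [
        "The cat sat on the mat.",
        "The agent reflected on its own beliefs.",
        "Predictive processing minimizes surprise.",
    ],
    indent=2,
)

inputs = validate_inputs(json.loads(raw))
print(f"{len(inputs)} inputs ready to write to inputs.json")
```

Writing `raw` to `inputs.json` then gives the same file as the `echo` command, without shell quoting pitfalls.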

4. Reproducibility

4.1 Deterministic Outputs

CIA is fully deterministic: identical inputs, applied to a system with the same configuration and prior state, produce identical outputs. There is no randomness, no stochastic sampling, and no external API calls. This makes experiments reproducible across runs, platforms, and Python versions (within the same major version).
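One way to spot-check determinism is to run the same input several times and compare a hash of each serialized result. The sketch below uses a stand-in function in place of a real cycle; `fingerprint`, `is_deterministic`, and `stand_in_run` are hypothetical names. With CIA, `run` would need to build a fresh system on every call, since state persists across cycles, and any per-run fields such as timestamps (if the report has them) would have to be excluded before hashing.

```python
import hashlib
import json
from typing import Any, Callable


def fingerprint(result: Any) -> str:
    """Hash a JSON-serializable result using a canonical encoding."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(run: Callable[[str], Any], text: str, repeats: int = 3) -> bool:
    """Return True if every repeated run yields the same fingerprint."""
    digests = {fingerprint(run(text)) for _ in range(repeats)}
    return len(digests) == 1


def stand_in_run(text: str) -> dict:
    """Placeholder for a real cycle; replace with a fresh system plus run_cycle."""
    return {"input": text, "total_score": len(text.split())}


print(is_deterministic(stand_in_run, "The agent reflected on its own beliefs."))
```

Sorting keys before hashing means two results that differ only in dictionary ordering still compare equal.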

4.2 Reproducing an Experiment

To reproduce an experiment:

1. Record the system configuration:

       config = {
           "recurrent_cycles": system.recurrent_binding._default_cycles,
           "workspace_capacity": system.workspace._capacity,
           "attention_weights": system.attention._weights,
           "welfare_thresholds": system.welfare_monitor._thresholds.__dict__,
       }

2. Record the input: save the exact input text string.
3. Record a random seed only if your own experiment code introduces randomness; CIA itself is deterministic and does not need one.
4. Save the full report: report.model_dump_json()
5. Share the pyproject.toml version to ensure dependency consistency.
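The checklist above can be bundled into a single manifest that travels with the saved report. This is a minimal sketch: `build_manifest` is a hypothetical helper, and the config values and version string are copied from the logging template in this guide, not read from a live system.

```python
import json
import platform
from typing import Any, Dict


def build_manifest(config: Dict[str, Any], input_text: str, cia_version: str) -> Dict[str, Any]:
    """Collect what another person needs in order to rerun the experiment."""
    return {
        "cia_version": cia_version,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "config": config,
        "input_text": input_text,
    }


manifest = build_manifest(
    config={"recurrent_cycles": 3, "workspace_capacity": 3},
    input_text="The agent reflected on its own decision-making process.",
    cia_version="0.1.0",  # placeholder; use the version from pyproject.toml
)

print(json.dumps(manifest, indent=2, sort_keys=True))
```

Recording the interpreter version and platform costs nothing and makes it easier to rule them out when two runs disagree.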

4.3 Version Pinning

pip install -e ".[dev]"  # Install the checked-out project (editable) with dev extras
pip list | grep consciousness-indicator-architecture

4.4 Test Suite as Reproducibility Check

The test suite serves as a built-in reproducibility check:

python -m pytest -v

If all tests pass, the system is producing the expected outputs for every case the suite covers.

4.5 Experiment Logging Template

import json
from datetime import datetime, timezone

# Assumes `baseline`, `intervened`, and `harness` from section 2.6 are in scope.
experiment_log = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "cia_version": "0.1.0",
    "description": "Effect of disabling global workspace on indicator scores",
    "config": {
        "recurrent_cycles": 3,
        "workspace_capacity": 3,
    },
    "input_text": "The agent reflected on its own decision-making process.",
    "baseline_score": baseline.indicator_scores.total_score,
    "intervention": "disable_workspace",
    "intervention_score": intervened.indicator_scores.total_score,
    "degradation": harness.compare_reports(baseline, intervened),
    # mode="json" converts field values to JSON-safe types before json.dump
    "full_baseline_report": baseline.model_dump(mode="json"),
    "full_intervention_report": intervened.model_dump(mode="json"),
}

with open("experiment_log.json", "w") as f:
    json.dump(experiment_log, f, indent=2)

5. Example Experiment Walkthrough

5.1 Goal: Determine which modules most affect indicator scores

from cia.simulation import CombinedConsciousnessIndicatorSystem
from cia.intervention import CausalInterventionHarness

system = CombinedConsciousnessIndicatorSystem()
harness = CausalInterventionHarness(system)

input_text = "The agent noticed its own processing limitations when predicting the next state."

# Run all single-module interventions
interventions = [
    {"name": "disable_perception"},
    {"name": "disable_recurrent_binding"},
    {"name": "disable_predictive_model"},
    {"name": "disable_attention"},
    {"name": "disable_workspace"},
    {"name": "disable_memory"},
    {"name": "disable_self_model"},
]

result = harness.run_intervention_experiment(input_text, interventions)

# Print degradation summary
print("=== Module Contribution Analysis ===\n")
baseline_scores = result["baseline"].indicator_scores
print(f"Baseline score: {baseline_scores.total_score}/{baseline_scores.max_possible}\n")

for intv in result["interventions"]:
    name = intv["name"]
    deg = result["degradation"][name]
    scores = intv["report"].indicator_scores
    print(f"{name}:")
    print(f"  Score: {scores.total_score}/{scores.max_possible} "
          f"(Δ={deg['score_change']:+d})")
    print(f"  {deg['summary']}")
    print()

5.2 Expected Output Pattern

| Intervention | Typical Score Change | Explanation |
| --- | --- | --- |
| `disable_perception` | -2 to -4 | No percepts → no binding, no workspace content |
| `disable_recurrent_binding` | -1 to -2 | No stability score → recurrent processing indicator drops |
| `disable_workspace` | -1 to -3 | No broadcasts → global broadcast indicator drops, self-model less informed |
| `disable_attention` | -1 to -2 | No focus → attention schema has nothing to compare |
| `disable_memory` | -1 to -2 | No traces → memory continuity drops |
| `disable_self_model` | -1 to -2 | No beliefs → self-model and metacognition drop |
| `disable_predictive_model` | -1 to -2 | No hypotheses → predictive modeling drops |

Actual values depend on the input text and system configuration. The key insight is that each module's contribution can be measured independently through this lesion-study approach.
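To answer the walkthrough's question directly, the per-intervention score changes can be sorted so that the largest drops come first. In this sketch `rank_by_impact` is a hypothetical helper, and the sample values are placeholders chosen from the typical ranges in the table, not measured results.

```python
from typing import Dict, List, Tuple


def rank_by_impact(score_changes: Dict[str, int]) -> List[Tuple[str, int]]:
    """Sort interventions from the largest score drop to the smallest."""
    return sorted(score_changes.items(), key=lambda item: item[1])


# Placeholder values. With a real experiment, build the mapping as:
#   {name: deg["score_change"] for name, deg in result["degradation"].items()}
sample_changes = {
    "disable_memory": -1,
    "disable_perception": -3,
    "disable_workspace": -2,
}

for name, change in rank_by_impact(sample_changes):
    print(f"{name}: {change:+d}")
```

Python's sort is stable, so interventions with equal score changes keep the order in which they were run.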