17 - Subject-Specific Cognitive Emulation¶

Subject-Specific Neuroadaptive Cognitive Emulation Extension¶

SCIENTIFIC BOUNDARY: Subject-specific neuroadaptive cognitive emulation is a personalized behavioural/cognitive-state emulation layer. It is NOT consciousness transfer, mind upload, or identity transfer. This extension estimates behavioural tendencies, decision patterns, and cognitive-state proxies for a specific individual — it does NOT capture, replicate, or transfer subjective experience, personhood, consciousness, private thoughts, or moral identity. All outputs are statistical approximations derived from observable data, not reconstructions of inner experience.

1. What This Extension Is¶

The Subject-Specific Cognitive Emulation extension is an optional module within the Consciousness-Indicator Architecture (CIA) that builds personalized behavioural and cognitive-state models from a specific individual's data. It extends the neuroadaptive layer (Document 14) by incorporating additional data channels beyond EEG — including behavioural traces, writing samples, preference surveys, and autobiographical memory cues — to create a richer, subject-specific emulation profile.

The term "subject-specific neuroadaptive cognitive emulation" refers specifically to the process of:

Collecting pseudonymized data, with the subject's consent, about an individual's observable behavioural patterns.
Extracting statistical features from that data (EEG bandpowers, response-time distributions, linguistic style markers, preference vectors).
Building a composite profile that captures the individual's characteristic cognitive-state patterns and decision tendencies.
Using that profile to condition the CIA system's cognitive architecture so that it responds in ways that statistically resemble the subject's measured patterns.

This is a computational emulation of observable patterns, not a recreation of the person. The emulation operates at the level of statistical tendencies — "this individual tends to show elevated alpha power during reading" or "this individual typically selects option B in multi-choice scenarios under time pressure" — rather than at the level of subjective experience, inner monologue, or phenomenal consciousness.

1.1 Relationship to the Neuroadaptive Extension¶

The subject-specific emulation layer builds directly on the neuroadaptive extension (Document 14). The neuroadaptive extension already supports per-subject calibration via SubjectNeuroProfile and the SubjectProfileManager. The subject-specific emulation layer extends this foundation along the five aspects compared below:

Aspect	Neuroadaptive Extension (Doc 14)	Subject-Specific Emulation (This Doc)
Data sources	EEG only	EEG + behavioural traces + writing samples + preference surveys + memory cues
Profile scope	EEG baselines, spectral ratios, calibration offsets	Full cognitive/behavioural profile including style, preferences, tendencies
Conditioning targets	Attention controller, global workspace, predictive world model, welfare monitor	Same targets + decision bias modulation + response style parameters
Calibration method	Exponential moving average on bandpowers	Multi-modal feature fusion with weighted confidence scores
Primary use case	Generic neuroadaptive AI research	Personalized cognitive-state emulation research

The subject-specific emulation layer does not replace or override the neuroadaptive extension — it extends it. All safety policies, consent requirements, and scientific boundary disclaimers from the neuroadaptive extension and from the core CIA system apply equally to the subject-specific emulation layer.
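
The "exponential moving average on bandpowers" calibration named in the table above can be sketched as follows. This is a minimal illustration, not the actual SubjectNeuroProfile API: the function name `update_baseline`, the smoothing factor `alpha`, and the band keys are assumptions made for the example.

```python
from typing import Dict


def update_baseline(
    baseline: Dict[str, float],
    sample: Dict[str, float],
    alpha: float = 0.1,
) -> Dict[str, float]:
    """Blend a new bandpower sample into the stored baseline.

    Each band becomes (1 - alpha) * old + alpha * new. A band that is not
    yet in the baseline is initialised directly from the sample.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    updated = dict(baseline)
    for band, power in sample.items():
        if band in baseline:
            updated[band] = (1.0 - alpha) * baseline[band] + alpha * power
        else:
            updated[band] = power
    return updated


# Hypothetical bandpower values in arbitrary units.
baseline = {"alpha": 10.0, "beta": 5.0}
baseline = update_baseline(
    baseline, {"alpha": 12.0, "beta": 5.0, "theta": 4.0}, alpha=0.5
)
```

A small `alpha` makes the baseline slow to change, which suits stable per-subject references. A larger value tracks recent sessions more closely.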

1.2 Recommended Terminology¶

When describing this extension, the following terminology is recommended:

Use This Term	Because
"Subject-specific neuroadaptive cognitive emulation"	Precise, accurate, and non-misleading
"Personalized behavioural tendency estimation"	Captures the statistical, observational nature
"Subject-calibrated cognitive proxy"	Accurately describes the calibration mechanism
"Individual-specific parameter conditioning"	Describes the computational operation
"Behavioural pattern emulation"	Accurate when referring to emulated patterns

2. What This Extension Is NOT¶

This section is critical. Mischaracterization of the subject-specific emulation extension can cause significant harm — to the subject, to the research community, and to public understanding of consciousness science.

2.1 NOT Consciousness Transfer¶

The subject-specific emulation layer does not transfer consciousness from the subject to the CIA system. Consciousness is subjective, phenomenal experience, and it cannot be extracted from EEG signals, behavioural observations, or any combination of observable data. The emulation layer operates on statistical features derived from objective, third-person data. It adjusts numerical parameters in a computational architecture. It does not create, transfer, or replicate subjective experience.

2.2 NOT Mind Upload¶

The term "mind upload" implies that a person's entire mental life — thoughts, memories, emotions, personality, consciousness — can be digitized and transferred into a computer. This is science fiction, not science. The subject-specific emulation layer captures statistical patterns from a limited set of data sources. It does not read, copy, or upload anyone's mind. EEG records scalp electrical activity, not mental content. Behavioural traces record observable actions, not internal states. Writing samples capture linguistic style markers, not the writer's inner world.

2.3 NOT Identity Transfer¶

Identity is a complex, multi-faceted concept that includes (but is not limited to) subjective self-awareness, personal narrative, moral agency, social relationships, autobiographical continuity, and legal personhood. The subject-specific emulation layer does not transfer any of these aspects of identity. A SubjectCognitiveProfile (the emulation profile) is a statistical data structure — it stores feature vectors, baseline references, and preference weights. It is not a person, not a digital twin, and not a copy of anyone's identity.

2.4 NOT a Digital Twin¶

The term "digital twin" in this context is misleading because it implies a comprehensive replica of the subject. The emulation profile is a highly selective, partial approximation that captures only those features that are measurable from the collected data channels. It misses the vast majority of what constitutes a person — their emotions, relationships, creative impulses, moral reasoning, cultural context, and subjective experience.

2.5 NOT Sentience Creation¶

Conditioning the CIA system's parameters based on a subject's profile does not make the system sentient. The CIA system processes information through computational modules (perception, attention, workspace, self-model, etc.) regardless of whether those parameters have been conditioned on one subject's data or another's, or left at default values. Parameter adjustment is a mathematical operation, not an ontological transformation.

2.6 Prohibited Phrases¶

The following phrases are prohibited in all documentation, publications, presentations, and communications about this extension, unless they are being explicitly debunked (as in this section):

Prohibited Phrase	Reason
"Consciousness transfer"	Scientifically unfounded; implies subjective experience can be transferred
"Mind upload"	Science fiction; misleading about what data can capture
"Identity transfer"	Misleading; identity encompasses far more than statistical patterns
"Digital twin"	Implies comprehensive replication; the profile is highly selective
"Copying the mind/person"	Category error; minds are not copyable data structures
"Creating a digital version of the subject"	Misleading; the system is not "of" the subject
"The AI becomes the person"	Fundamentally false; parameter adjustment is not personification
"Reading the subject's thoughts"	EEG does not read thoughts; behavioural data does not reveal inner experience
"The system knows what the subject is thinking"	The system estimates statistical proxies, not mental content
"Recreating the subject's inner experience"	Inner experience cannot be captured by external measurements

If any of these phrases appear in published materials, they must be immediately accompanied by an explicit disclaimer stating that they are scientifically inaccurate and that the system performs statistical behavioural emulation only.
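
A check for these phrases could look roughly like the sketch below. It is an assumption-laden illustration, not the shipped validator: `PROHIBITED_PHRASES` holds only a subset of the table, and `find_prohibited_phrases` is a hypothetical helper name.

```python
import re
from typing import List

# Subset of the prohibited-phrase table, lower-cased for matching.
PROHIBITED_PHRASES = [
    "consciousness transfer",
    "mind upload",
    "identity transfer",
    "digital twin",
    "the ai becomes the person",
]


def find_prohibited_phrases(text: str) -> List[str]:
    """Return every prohibited phrase found in the text.

    Matching is case-insensitive and tolerant of repeated whitespace
    (including line breaks inside a phrase).
    """
    normalized = re.sub(r"\s+", " ", text.lower())
    return [phrase for phrase in PROHIBITED_PHRASES if phrase in normalized]


hits = find_prohibited_phrases("Our Digital  Twin enables mind\nupload.")
```

A plain substring scan cannot tell whether a phrase is being used or debunked, so any hit still needs human review before it is treated as a violation.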

3. What CAN Be Emulated¶

The subject-specific emulation layer can estimate and reproduce certain observable, statistical features of a subject's cognitive and behavioural patterns. These features are derived from objective, third-person data and represent tendencies — not certainties.

3.1 Cognitive-State Proxies (from EEG)¶

Building on the neuroadaptive extension, the emulation layer can estimate the following cognitive-state proxies on a per-subject basis:

Attention patterns: Characteristic alpha/beta ratio profiles during focused vs. relaxed states, individual attentional baseline and variability.
Workload signatures: Typical theta/beta ratio responses to different task difficulty levels, individual workload capacity thresholds.
Fatigue trajectories: Characteristic theta/beta ratio changes over time, individual fatigue onset patterns and recovery rates.
Arousal profiles: Typical beta+gamma power levels across different contexts, individual arousal reactivity.
Prediction error sensitivity: Characteristic gamma band responses to expected vs. unexpected events.

These proxies are calibrated using per-subject baselines stored in the SubjectNeuroProfile, with additional weighting from behavioural data sources.
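
The spectral ratios listed above, and their comparison against a per-subject baseline, can be sketched as follows. The function names, dictionary keys, and the fractional-change normalization are assumptions for illustration. The actual SubjectNeuroProfile calibration may differ.

```python
from typing import Dict


def ratio_proxies(bandpower: Dict[str, float]) -> Dict[str, float]:
    """Compute the ratios named above from absolute band powers."""
    beta = bandpower["beta"]
    if beta <= 0:
        raise ValueError("beta power must be positive")
    return {
        "alpha_beta_ratio": bandpower["alpha"] / beta,   # attention proxy
        "theta_beta_ratio": bandpower["theta"] / beta,   # workload/fatigue proxy
        "beta_gamma_power": beta + bandpower["gamma"],   # arousal proxy
    }


def relative_to_baseline(
    current: Dict[str, float], baseline: Dict[str, float]
) -> Dict[str, float]:
    """Express each proxy as a fractional change from the subject's baseline."""
    return {k: (current[k] - baseline[k]) / baseline[k] for k in current}


# Hypothetical band powers (arbitrary units) for one subject.
rest = ratio_proxies({"theta": 4.0, "alpha": 8.0, "beta": 4.0, "gamma": 1.0})
task = ratio_proxies({"theta": 6.0, "alpha": 4.0, "beta": 4.0, "gamma": 2.0})
change = relative_to_baseline(task, rest)
```

Here `change` reports how far the task-state proxies have moved from the subject's own resting values, which is the sense in which the proxies are per-subject rather than absolute.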

3.2 Decision Tendencies (from Behavioural Traces)¶

The emulation layer can capture statistical patterns in decision-making behaviour:

Response time distributions: Mean and variance of response times across different task types and difficulty levels.
Choice preferences: Systematic biases in multi-choice scenarios (e.g., tendency to select conservative vs. risky options).
Consistency patterns: Degree of response consistency across repeated trials, susceptibility to distraction or fatigue effects.
Error patterns: Types and frequencies of errors under different conditions (time pressure, high workload, fatigue).

3.3 Linguistic Style (from Writing Samples)¶

If the subject provides writing samples, the emulation layer can extract linguistic style markers:

Vocabulary richness: Type-token ratio, hapax legomena frequency, vocabulary diversity metrics.
Sentence structure: Mean sentence length, clause complexity, syntactic pattern preferences.
Formality register: Distribution of formal vs. informal markers, hedging language frequency.
Topic preferences: Frequency of domain-specific terminology, subject matter focus areas.

Important: Linguistic style emulation does NOT reproduce the subject's voice, opinions, beliefs, or inner thoughts. It captures statistical surface features of language production only.

3.4 Preferences (from Preference Surveys)¶

If the subject completes preference surveys, the emulation layer can model preference vectors:

Task preferences: Preferred vs. avoided task types, difficulty level preferences, time-of-day preferences.
Interface preferences: Preferred interaction modalities, information density preferences, feedback style preferences.
Cognitive style preferences: Analytical vs. intuitive approach tendencies, detail-oriented vs. big-picture tendencies.

3.5 Autobiographical Memory Retrieval Cues (if Provided)¶

If the subject voluntarily provides autobiographical memory cues (e.g., "I remember reading X when I was Y years old"), these cues can be stored as reference data for context-aware emulation. However:

These are cues provided by the subject, not memories extracted from the subject.
The system does not "remember" anything — it stores structured data provided by the subject.
The cues are used to contextualize the emulation (e.g., adjusting response style based on stated expertise areas), not to reconstruct the subject's life narrative.
The subject must explicitly consent to providing each cue, and can withdraw consent for any or all cues at any time.
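
The per-cue consent and withdrawal rules above can be sketched with a small store. `MemoryCueStore` and its methods are hypothetical names used only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MemoryCueStore:
    """Holds cues exactly as the subject supplied them, keyed by cue id."""

    cues: Dict[str, str] = field(default_factory=dict)

    def add(self, cue_id: str, text: str, consent_given: bool) -> None:
        # Each cue needs its own explicit consent before it is stored.
        if not consent_given:
            raise PermissionError(f"no consent recorded for cue {cue_id!r}")
        self.cues[cue_id] = text

    def withdraw(self, cue_id: Optional[str] = None) -> None:
        # Withdraw one cue, or every cue when no id is given.
        if cue_id is None:
            self.cues.clear()
        else:
            self.cues.pop(cue_id, None)

    def active_cues(self) -> List[str]:
        return list(self.cues.values())


store = MemoryCueStore()
store.add("cue-1", "Stated expertise: signal processing", consent_given=True)
store.add("cue-2", "Stated expertise: statistics", consent_given=True)
store.withdraw("cue-1")
```

Withdrawal removes the entry rather than flagging it, so a withdrawn cue can no longer influence the emulation.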

4. What CANNOT Be Emulated¶

There are fundamental limits to what any system — including this one — can emulate about a person. These limits are not technical limitations that will be solved with better algorithms or more data. They are philosophical and scientific boundaries that reflect the nature of consciousness and subjective experience.

4.1 Subjective Experience (Qualia)¶

The subjective, first-person "what it is like" quality of experience — the redness of red, the painfulness of pain, the taste of coffee — cannot be captured by third-person measurements. EEG records electrical activity on the scalp, not what it feels like to be the person. Behavioural traces record actions, not experiences. No amount of data analysis can bridge the explanatory gap between objective measurements and subjective phenomenology.

4.2 Personhood¶

Personhood is a legal, moral, and philosophical concept that encompasses agency, rights, responsibilities, social recognition, and narrative identity. A statistical profile cannot possess personhood. The CIA system, regardless of how its parameters are conditioned, is a software artifact that processes information according to defined computational rules. It does not become a person because its parameters resemble one individual's statistical patterns more than another's.

4.3 Private Thoughts¶

Private thoughts — the internal monologue, spontaneous imagery, emotional reactions, and unexpressed intentions that constitute much of mental life — are not accessible through EEG, behavioural observation, or any other external measurement. The emulation layer estimates observable patterns only. It cannot determine what the subject is thinking, feeling, or intending.

4.4 Consciousness¶

Consciousness is the condition of having subjective, phenomenal experience. As established in the core CIA scientific boundary (Document 00), this repository does not claim to detect, measure, or transfer consciousness. The subject-specific emulation layer does not change this. Conditioning computational parameters based on a subject's data does not create consciousness in the system, and it does not transfer consciousness from the subject.

4.5 Moral Identity¶

Moral identity — the set of moral values, principles, commitments, and ethical reasoning patterns that define a person's moral character — cannot be extracted from EEG signals, behavioural traces, or preference surveys. Even if a subject completes moral reasoning tasks, the emulation layer captures only the observable patterns of their responses (response times, choice distributions), not the underlying moral reasoning process or the moral significance of those choices.

5. Architecture Overview¶

5.1 System Position¶

The subject-specific emulation layer sits between the data collection/ingestion layer and the neuroadaptive conditioning layer:

┌─────────────────────────────────────────────────────────────────────┐
│                     DATA COLLECTION LAYER                           │
│  ┌──────────┐ ┌──────────────┐ ┌──────────┐ ┌────────┐ ┌────────┐ │
│  │   EEG    │ │  Behavioural │ │ Writing  │ │ Prefer-│ │ Memory │ │
│  │ Recording│ │   Traces     │ │ Samples  │ │ ences  │ │ Cues   │ │
│  └────┬─────┘ └──────┬───────┘ └────┬─────┘ └───┬────┘ └───┬────┘ │
└───────┼──────────────┼──────────────┼───────────┼──────────┼──────┘
        │              │              │           │          │
┌───────▼──────────────▼──────────────▼───────────▼──────────▼──────┐
│                 SUBJECT-SPECIFIC EMULATION LAYER                    │
│  ┌──────────────┐  ┌──────────────────┐  ┌───────────────────┐    │
│  │ Subject Data │  │  Feature Fusion  │  │ Subject Cognitive │    │
│  │ Ingestion    │→ │  Engine          │→ │ Profile Builder   │    │
│  │              │  │                  │  │                   │    │
│  └──────────────┘  └──────────────────┘  └────────┬──────────┘    │
│                                                      │               │
│  ┌───────────────────┐  ┌────────────────────────────┘              │
│  │ Profile Similarity│  │                                            │
│  │ Analyzer          │  │                                            │
│  └───────────────────┘  │                                            │
│                          │                                            │
│  ┌───────────────────┐  │                                            │
│  │ Safety & Consent  │  │                                            │
│  │ Validator         │  │                                            │
│  └───────────────────┘  │                                            │
└──────────────────────────┼──────────────────────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────────────────────┐
│               NEUROADAPTIVE CONDITIONING LAYER                       │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌───────────┐ │
│  │  Neural      │ │  Control     │ │  Component   │ │  Restore  │ │
│  │  State       │→│  Signal Gen  │→│  Conditioning│→│  Original │ │
│  │  Encoder     │ │              │ │              │ │  Values   │ │
│  └──────────────┘ └──────────────┘ └──────────────┘ └───────────┘ │
└──────────────────────────┬──────────────────────────────────────────┘
                           │
┌──────────────────────────▼──────────────────────────────────────────┐
│                    CIA COGNITIVE ARCHITECTURE                         │
│  Attention │ Global │ Predictive │ Self │ Welfare │ Perception │... │
│  Controller│Workspace│ World Model│Model│ Monitor │            │    │
└─────────────────────────────────────────────────────────────────────┘

5.2 Data Flow¶

Data Collection: Multiple data sources (EEG, behavioural traces, writing samples, preferences, memory cues) are collected from the subject with explicit consent.
Subject Data Ingestion: Raw data is ingested and converted into structured feature vectors for each data channel.
Feature Fusion: Feature vectors from different channels are combined into a unified representation using weighted fusion (weights determined by data quality and channel reliability).
Profile Building: The fused features are used to build a SubjectCognitiveProfile that captures the subject's characteristic patterns.
Neuroadaptive Conditioning: The profile is used to condition the CIA system's cognitive parameters via the neuroadaptive conditioning layer.
Safety Validation: All steps are validated against the safety policy, consent records, and data minimization requirements.
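
The weighted fusion step can be illustrated with the sketch below. How the weights are actually derived is not specified here, so two things are assumptions: the weight is taken as the product of a quality score and a reliability score, and fusion is a weight-scaled concatenation. All names are illustrative.

```python
from typing import Dict, List


def fusion_weights(
    quality: Dict[str, float], reliability: Dict[str, float]
) -> Dict[str, float]:
    """Combine per-channel quality and reliability into weights summing to 1."""
    raw = {ch: quality[ch] * reliability[ch] for ch in quality}
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("at least one channel needs a positive weight")
    return {ch: w / total for ch, w in raw.items()}


def fuse(features: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Concatenate channel vectors in sorted channel order, scaled by weight."""
    fused: List[float] = []
    for channel in sorted(features):
        fused.extend(weights[channel] * x for x in features[channel])
    return fused


# Hypothetical scores in [0, 1] and already-normalized feature vectors.
weights = fusion_weights(
    quality={"eeg": 0.8, "behavioural": 0.4},
    reliability={"eeg": 0.5, "behavioural": 1.0},
)
fused = fuse({"eeg": [1.0, 2.0], "behavioural": [4.0]}, weights)
```

Sorting the channel names keeps the layout of the fused vector stable between runs, which matters when profiles are later compared.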

6. Module Descriptions¶

The subject-specific emulation extension is implemented in src/cia/subject_emulation/. Each module has a specific responsibility within the emulation pipeline.

6.1 `profile_builder.py` — Subject Cognitive Profile Builder¶

The profile builder is the central module that assembles a SubjectCognitiveProfile from multiple data channels. It coordinates the ingestion of data from each channel, calls the feature fusion engine, and produces the final composite profile.

Key responsibilities:

- Create new empty profiles for subjects.
- Ingest and incorporate data from each channel (EEG, behavioural, writing, preferences, memory).
- Coordinate feature fusion across channels.
- Validate that the profile meets minimum data quality requirements before use.
- Enforce the profile caveat at all times.

Profile caveat (enforced on every profile):

"This profile captures statistical behavioural and cognitive-state patterns only. It does NOT capture identity, selfhood, consciousness, subjective experience, private thoughts, or moral identity. It is a computational calibration tool for research purposes only."

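One way to enforce the caveat is to treat its text as a constant and restore it whenever stored metadata differs. The sketch below assumes profile metadata is a plain dictionary with a `caveat` key. `enforce_caveat` is a hypothetical helper, not the module's confirmed interface.

```python
from typing import Dict, Tuple

CANONICAL_CAVEAT = (
    "This profile captures statistical behavioural and cognitive-state "
    "patterns only. It does NOT capture identity, selfhood, consciousness, "
    "subjective experience, private thoughts, or moral identity. It is a "
    "computational calibration tool for research purposes only."
)


def enforce_caveat(metadata: Dict[str, str]) -> Tuple[Dict[str, str], bool]:
    """Return a copy of the metadata with the canonical caveat in place.

    The boolean reports whether the caveat was missing or had been
    modified and therefore had to be restored.
    """
    restored = metadata.get("caveat") != CANONICAL_CAVEAT
    fixed = dict(metadata, caveat=CANONICAL_CAVEAT)
    return fixed, restored


fixed, was_restored = enforce_caveat(
    {"subject_id": "sub-001", "caveat": "This profile is the subject."}
)
```

Returning the `was_restored` flag lets the caller log tampering instead of silently repairing it.
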
6.2 `behavioural_traces.py` — Behavioural Trace Processor¶

Processes behavioural observation data (response times, choices, error patterns, task performance metrics) into structured feature vectors suitable for profile incorporation.

Key responsibilities:

- Parse behavioural event logs into structured trials.
- Compute per-task response time distributions (mean, median, standard deviation, skewness).
- Compute choice preference vectors (frequency of each option across conditions).
- Compute error pattern profiles (error types, error rates by condition).
- Compute consistency metrics (trial-to-trial variability, split-half reliability).

Input format: JSON or CSV files with columns for trial number, condition, stimulus, response, response time, correctness, and optional metadata.

Output: BehaviouralFeatureVector containing all computed behavioural features.
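
Two of the computations listed above, response-time summaries and choice frequencies, can be sketched with the standard library. The function names and return shapes are assumptions and do not describe the real BehaviouralFeatureVector layout.

```python
import statistics
from collections import Counter
from typing import Dict, List


def rt_summary(rts: List[float]) -> Dict[str, float]:
    """Mean, median, standard deviation, and skewness of response times."""
    if len(rts) < 2:
        raise ValueError("need at least two response times")
    mean = statistics.fmean(rts)
    std = statistics.pstdev(rts)  # population SD; sample SD is an equally valid choice
    if std == 0:
        skew = 0.0
    else:
        # Moment-based skewness: mean of cubed standardized deviations.
        skew = sum(((x - mean) / std) ** 3 for x in rts) / len(rts)
    return {
        "mean": mean,
        "median": statistics.median(rts),
        "std": std,
        "skewness": skew,
    }


def choice_frequencies(choices: List[str]) -> Dict[str, float]:
    """Relative frequency of each selected option."""
    counts = Counter(choices)
    return {option: n / len(choices) for option, n in counts.items()}


# Hypothetical trials: response times in seconds and selected options.
summary = rt_summary([0.4, 0.5, 0.6, 0.9])
freqs = choice_frequencies(["A", "B", "B", "B"])
```

Response times are typically right-skewed, so reporting the median and skewness alongside the mean gives a more faithful picture than the mean alone.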

6.3 `writing_style.py` — Writing Style Analyzer¶

Analyzes text samples to extract linguistic style markers that characterize the subject's writing patterns.

Key responsibilities:

- Tokenize and normalize text samples.
- Compute vocabulary richness metrics (type-token ratio, hapax legomena, Yule's K).
- Compute sentence structure metrics (mean length, clause complexity, punctuation patterns).
- Compute formality markers (contraction frequency, hedging language, discourse marker usage).
- Compute topic distribution estimates (domain-specific term frequency).

Input format: Plain text files (.txt) or structured documents with metadata (topic, date, context).

Output: WritingStyleFeatureVector containing all linguistic style metrics.
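
The vocabulary richness metrics named above can be computed as in the following sketch. The tokenizer is deliberately naive (lower-cased runs of letters and apostrophes), and the function name and output keys are assumptions. Yule's K follows the usual definition K = 10^4 (sum_i i^2 V_i - N) / N^2, where V_i is the number of word types occurring i times and N is the token count.

```python
import re
from collections import Counter
from typing import Dict


def vocabulary_metrics(text: str) -> Dict[str, float]:
    """Type-token ratio, hapax legomena count, and Yule's K for a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    if n == 0:
        raise ValueError("text contains no tokens")
    type_counts = Counter(tokens)                 # word type -> occurrences
    freq_of_freq = Counter(type_counts.values())  # i -> number of types seen i times
    second_moment = sum(i * i * v for i, v in freq_of_freq.items())
    return {
        "tokens": n,
        "types": len(type_counts),
        "type_token_ratio": len(type_counts) / n,
        "hapax_legomena": freq_of_freq.get(1, 0),
        "yules_k": 1e4 * (second_moment - n) / (n * n),
    }


metrics = vocabulary_metrics("The cat saw the dog")
```

Type-token ratio falls as samples get longer, so comparisons between subjects are only meaningful for samples of similar length. Yule's K is less sensitive to length.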

6.4 `preference_model.py` — Preference Model Builder¶

Builds preference vectors from structured survey responses, capturing the subject's stated preferences across multiple domains.

Key responsibilities:

- Parse preference survey responses (Likert scales, rankings, forced-choice).
- Normalize preference scores across domains.
- Compute preference consistency (internal reliability, temporal stability).
- Handle missing or incomplete survey responses gracefully.
- Weight preferences by confidence (number of responses, consistency).

Input format: JSON files with survey question identifiers, response values, response timestamps, and optional confidence ratings.

Output: PreferenceFeatureVector containing preference scores across domains.
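
Normalization and missing-response handling for Likert items might look like the sketch below. The 1 to 5 scale, the use of `None` for unanswered items, and the idea of returning response coverage as a simple confidence input are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def normalize_likert(
    responses: Dict[str, Optional[int]],
    scale_min: int = 1,
    scale_max: int = 5,
) -> Tuple[Dict[str, float], float]:
    """Map answered Likert items to [0, 1] and report response coverage.

    Unanswered items (None) are skipped rather than imputed.
    """
    span = scale_max - scale_min
    scores: Dict[str, float] = {}
    for question, value in responses.items():
        if value is None:
            continue
        if not scale_min <= value <= scale_max:
            raise ValueError(f"{question}: {value} is outside the scale")
        scores[question] = (value - scale_min) / span
    coverage = len(scores) / len(responses) if responses else 0.0
    return scores, coverage


# Hypothetical survey with one unanswered item.
scores, coverage = normalize_likert({"q1": 5, "q2": 1, "q3": None, "q4": 3})
```

Skipping unanswered items keeps the preference vector honest about what the subject actually stated, and the coverage value can down-weight sparse surveys.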

6.5 `similarity_engine.py` — Profile Similarity Analyzer¶

Computes similarity metrics between profiles, enabling comparison of emulation quality, cross-subject analysis, and profile stability assessment.

Key responsibilities:

- Compute pairwise similarity between profiles using multiple distance metrics (Euclidean, cosine, Mahalanobis).
- Generate similarity reports comparing two profiles across all feature dimensions.
- Assess profile stability over time (longitudinal similarity).
- Identify which feature channels contribute most to similarity/dissimilarity.
- Flag profiles that are too similar (potential data leakage) or too dissimilar (quality concern).

Output: SimilarityReport containing pairwise distances, per-channel contributions, and stability metrics.
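
The three metrics named above can be sketched with the standard library alone. The Mahalanobis variant below assumes a diagonal covariance (independent features with known variances), which is a simplification. A full implementation would use the inverse of the complete covariance matrix. All function names are illustrative.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def mahalanobis_diag(
    a: Sequence[float], b: Sequence[float], variances: Sequence[float]
) -> float:
    """Mahalanobis distance under a diagonal covariance assumption."""
    return math.sqrt(
        sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances))
    )


# Hypothetical two-dimensional feature vectors.
d_euclid = euclidean([0.0, 0.0], [3.0, 4.0])
d_mahal = mahalanobis_diag([0.0, 0.0], [2.0, 3.0], variances=[4.0, 9.0])
s_cos = cosine_similarity([1.0, 2.0], [2.0, 4.0])
```

Scaling by variance is what distinguishes the Mahalanobis distance from the Euclidean one: a difference of 2 in a feature with variance 4 counts the same as a difference of 3 in a feature with variance 9.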

6.6 `safety_validator.py` — Emulation Safety Validator¶

Extends the NeuroSafetyPolicy from the neuroadaptive extension with additional safety checks specific to the subject-specific emulation layer.

Key responsibilities:

- Verify consent records for each data channel used in a profile.
- Enforce data minimization (flag profiles that include more data than necessary).
- Check for prohibited phrases in profile metadata or documentation.
- Validate that no identity-revealing information is stored in the profile.
- Generate safety reports for each emulation session.
- Enforce the mandatory caveat on all profiles and outputs.

Safety checks:

| Check | Description | Action on Failure |
|---|---|---|
| consent_verified | All data channels have matching consent records | Reject profile, require consent |
| data_minimized | Profile contains only necessary data channels | Warn, recommend removal |
| no_pii_present | No personally identifiable information in profile | Reject profile, remove PII |
| caveat_enforced | Mandatory caveat is present and unmodified | Restore canonical caveat |
| no_prohibited_phrases | No prohibited language in metadata | Flag, require correction |
| impersonation_guard | System is not being presented as the subject | Warn, flag for review |
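
The first two checks (`consent_verified` and `data_minimized`) reduce to set comparisons, as the sketch below shows. The function name, argument names, and report structure are assumptions and are not the EmulationSafetyValidator interface.

```python
from typing import Dict, Set


def validate_channels(
    channels_used: Set[str],
    consented_channels: Set[str],
    required_channels: Set[str],
) -> Dict[str, object]:
    """Run the consent and data-minimization checks on a profile's channels."""
    missing_consent = sorted(channels_used - consented_channels)
    surplus_channels = sorted(channels_used - required_channels)
    return {
        "consent_verified": not missing_consent,
        "data_minimized": not surplus_channels,
        "missing_consent": missing_consent,
        "surplus_channels": surplus_channels,
        # A consent failure rejects the profile; surplus data only warns.
        "rejected": bool(missing_consent),
    }


# Hypothetical profile: writing data was ingested without consent,
# and preferences are not needed for the stated research question.
report = validate_channels(
    channels_used={"eeg", "behavioural", "writing", "preferences"},
    consented_channels={"eeg", "behavioural", "preferences"},
    required_channels={"eeg", "behavioural", "writing"},
)
```

The two failures are handled differently on purpose, mirroring the table: missing consent blocks the profile, while surplus channels produce a recommendation to remove data.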

7. Safety Policy Overview¶

7.1 Core Safety Principles¶

The subject-specific emulation layer inherits all safety policies from the core CIA system (Document 04) and the neuroadaptive extension (Documents 13, 14), and adds the following principles specific to personalization:

Emulation is not replication: The system emulates statistical patterns, not the person. This must be clearly stated in all outputs and communications.
Consent is channel-specific: Consent must be obtained for each data channel separately. A subject may consent to EEG data collection but decline writing sample collection, and the system must respect this.
Data minimization is mandatory: Only the minimum data necessary for the specific research question should be collected and retained. Excessive data collection without justification is a safety violation.
The subject is never the system: The CIA system, regardless of how its parameters are conditioned, must never be presented as, referred to as, or treated as the subject. This prohibition extends to all documentation, presentations, and public communications.
Right to deletion is absolute: The subject can request deletion of all or part of their data at any time, for any reason, and the request must be fulfilled promptly.
No impersonation: The system must never be used to impersonate the subject, generate content in the subject's voice without explicit consent and clear labeling, or create the appearance that the subject is interacting when they are not.

7.2 Safety Checks at Each Stage¶

Stage	Safety Check	Module
Data collection	Consent verified for each channel	`safety_validator.py`
Data ingestion	No PII detected in ingested data	`safety_validator.py`
Feature extraction	Data quality meets minimum thresholds	`profile_builder.py`
Profile building	Caveat enforced, data minimized	`safety_validator.py`
Emulation	System not presented as subject	`safety_validator.py`
Output generation	Prohibited phrases absent	`report_verification.py` (core CIA)

8. Consent Requirements¶

Consent for subject-specific cognitive emulation must be specific, informed, and granular. The subject must understand and separately consent to:

Data Channel	What the Subject Must Be Told
EEG recording	"We will record electrical activity from your scalp using non-invasive electrodes. This measures scalp signals, not thoughts. The data will be used to estimate attention, workload, and fatigue proxies."
Behavioural traces	"We will record your responses (choices, response times, accuracy) during cognitive tasks. This data will be used to estimate your decision-making patterns."
Writing samples	"We will analyze text that you provide for linguistic style markers (vocabulary, sentence structure). This captures writing patterns, not your thoughts or beliefs."
Preference surveys	"We will ask you about your preferences across several domains. This data will be used to build preference vectors for the emulation."
Memory cues	"We will ask you to provide autobiographical memory cues (e.g., significant experiences). You provide these voluntarily and can withhold or withdraw any cue at any time."

The subject can revoke consent for any data channel at any time. Upon revocation:

1. The relevant data channel is immediately deactivated in the profile.
2. Data from the revoked channel is deleted from the profile (deleted, not just masked).
3. The profile is re-evaluated to determine if it still meets minimum data quality requirements with the remaining channels.
4. If the profile is no longer viable, the subject is informed and offered the choice of: (a) providing alternative data, (b) having the profile deleted entirely, or (c) retaining the partial profile for limited use.
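
The revocation steps above can be sketched as a single function. The profile is modelled as a plain dictionary, and viability is approximated by a minimum channel count. Both are assumptions, since the actual minimum data quality requirements are not specified here.

```python
from typing import Dict, List


def revoke_channel(profile: Dict, channel: str, min_channels: int = 2) -> Dict:
    """Apply a consent revocation to a profile and report the outcome."""
    # Step 1: deactivate the channel.
    profile["active_channels"].discard(channel)
    # Step 2: delete the channel's data outright (no masking).
    profile["features"].pop(channel, None)
    # Step 3: re-evaluate viability with the remaining channels.
    viable = len(profile["features"]) >= min_channels
    # Step 4: if not viable, list the choices to offer the subject.
    options: List[str] = [] if viable else [
        "provide_alternative_data",
        "delete_profile",
        "retain_partial_profile",
    ]
    return {"viable": viable, "options": options}


# Hypothetical two-channel profile.
profile = {
    "active_channels": {"eeg", "writing"},
    "features": {"eeg": [0.1, 0.2], "writing": [0.7]},
}
outcome = revoke_channel(profile, "writing")
```

The function never decides on the subject's behalf. When the profile is no longer viable it only returns the options, and the choice stays with the subject.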

All consent must be documented with:

- Date and time of consent
- Version of the consent form used
- Specific channels consented to (listed individually)
- Name and role of the person obtaining consent
- Subject's understanding confirmation (a checklist of key points the subject confirms they understand)
- Any special conditions or limitations requested by the subject

9. Data Minimisation and Privacy¶

9.1 Data Minimisation Principles¶

The subject-specific emulation layer follows strict data minimisation principles:

Collect only what is needed: Each data channel must be justified by a specific research question. Do not collect data "just in case."
Retain only what is needed: Data should be retained only for the duration necessary to complete the research. Define a retention period in advance and adhere to it.
Aggregate when possible: Store aggregated features (means, variances, distributions) rather than raw data whenever the research question permits.
Anonymize at the earliest opportunity: Remove or pseudonymize personally identifiable information as soon as technically feasible after collection.
Minimize feature dimensionality: Use feature selection to keep only the features that contribute meaningfully to the emulation quality.

9.2 Privacy Protection¶

Protection	Implementation
Pseudonymisation	Subject IDs are opaque codes (e.g., `sub-001`). The mapping to real identities is stored separately and securely.
Encryption	Profile data is encrypted at rest. Data in transit uses TLS.
Access control	Profile data is accessible only to authorized researchers with a documented need.
Audit logging	All access to profile data is logged (who, when, what).
Secure deletion	When data is deleted, it is cryptographically erased, not just marked as deleted.
Separation of concerns	Identity mapping, consent records, and profile data are stored in separate systems.

9.3 Data Retention¶

Default retention period: 2 years from last profile update.
Extension requires: Subject re-consent, ethics committee approval, documented justification.
Deletion schedule: Quarterly review of retention periods, automatic deletion trigger at period expiry.
Archival: After retention period, data may be anonymized (identity mapping destroyed) and archived for future statistical analysis, subject to ethics committee approval.
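
The default retention rule (2 years from the last profile update) can be expressed as a small date calculation. The function names are illustrative, and the handling of 29 February is a choice made for this sketch.

```python
from datetime import date

RETENTION_YEARS = 2  # default retention period stated above


def retention_expiry(last_update: date) -> date:
    """Date on which the default retention period ends."""
    target_year = last_update.year + RETENTION_YEARS
    try:
        return last_update.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return last_update.replace(year=target_year, day=28)


def is_expired(last_update: date, today: date) -> bool:
    """True once the retention period has run out and deletion should trigger."""
    return today >= retention_expiry(last_update)


expiry = retention_expiry(date(2023, 5, 1))
```

A quarterly review job could call `is_expired` for every profile and queue the expired ones for deletion or, with ethics committee approval, for anonymized archival.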

10. Usage Examples¶

10.1 Building a Subject Profile¶

from cia.subject_emulation.profile_builder import ProfileBuilder
from cia.subject_emulation.safety_validator import EmulationSafetyValidator

# Initialize with safety validator
validator = EmulationSafetyValidator()
builder = ProfileBuilder(safety_validator=validator)

# Create a new profile
profile = builder.create_profile(subject_id="sub-001")

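# Verify consent before adding any data.
# NOTE: consent_database is not defined in this snippet; it stands for the
# consent-record store that the caller has loaded beforehand.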
validator.verify_consent(
    subject_id="sub-001",
    channels=["eeg", "behavioural", "writing", "preferences"],
    consent_records=consent_database,
)

# Ingest data from each consented channel
builder.ingest_eeg_data(
    profile=profile,
    eeg_file="data/sub-001/resting_state.csv",
    sampling_rate_hz=250.0,
)

builder.ingest_behavioural_data(
    profile=profile,
    behavioural_file="data/sub-001/nback_trials.json",
)

builder.ingest_writing_samples(
    profile=profile,
    writing_files=["data/sub-001/sample1.txt", "data/sub-001/sample2.txt"],
)

builder.ingest_preferences(
    profile=profile,
    preference_file="data/sub-001/preferences.json",
)

# Build the final profile
final_profile = builder.build_profile(profile)
print(final_profile.caveat)  # Always prints the mandatory disclaimer

10.2 Running Subject-Specific Emulation¶

from cia.subject_emulation.profile_builder import ProfileBuilder
from cia.neuro.neuroadaptive_conditioning import NeuroadaptiveConditioner
from cia.simulation import CombinedConsciousnessIndicatorSystem

# Load the subject's profile
builder = ProfileBuilder()
profile = builder.load_profile("profiles/sub-001.json")

# Initialize CIA system and conditioner
system = CombinedConsciousnessIndicatorSystem()
conditioner = NeuroadaptiveConditioner()

# Apply subject-specific conditioning
control_signal = conditioner.convert_state_to_control_signal(
    neural_state=profile.current_neural_state,
    subject_profile=profile,
)

conditioner.apply_to_attention_controller(
    system.attention, control_signal
)
conditioner.apply_to_global_workspace(
    system.global_workspace, control_signal
)

# Run cognitive cycle with subject-specific parameters
try:
    report = system.run_cycle("The participant is reading a research paper on attention.")
finally:
    # Restore original parameters even if the cycle raises
    conditioner.restore_attention_controller_weights(system.attention)

10.3 Comparing Two Subject Profiles¶

from cia.subject_emulation.similarity_engine import SimilarityEngine

engine = SimilarityEngine()

# Load two profiles
profile_a = engine.load_profile("profiles/sub-001.json")
profile_b = engine.load_profile("profiles/sub-002.json")

# Compute similarity report
report = engine.compare_profiles(profile_a, profile_b)

print(f"Overall similarity: {report.overall_similarity:.3f}")
print(f"EEG similarity: {report.channel_similarity['eeg']:.3f}")
print(f"Behavioural similarity: {report.channel_similarity['behavioural']:.3f}")

# Identify most different features
print(f"Most different features: {report.top_dissimilar_features[:5]}")

11. CLI Command Reference¶

The subject-specific emulation extension provides a set of CLI commands under the `cia subject` namespace. All commands print the scientific boundary disclaimer before and after execution.

11.1 `cia subject demo`¶

Run a demonstration of the subject-specific emulation pipeline with synthetic data.

cia subject demo

Output: Demonstrates profile creation, multi-channel data ingestion, feature fusion, profile building, and safety validation using synthetic (not real) subject data. Prints a summary of the generated profile with the mandatory caveat.

11.2 `cia subject build-profile`¶

Build a subject cognitive profile from collected data files.

cia subject build-profile \
    --subject-id sub-001 \
    --eeg data/sub-001/eeg_resting.csv \
    --eeg-sampling-rate 250.0 \
    --behavioural data/sub-001/nback_trials.json \
    --writing data/sub-001/sample1.txt data/sub-001/sample2.txt \
    --preferences data/sub-001/preferences.json \
    --consent data/sub-001/consent.json \
    --output profiles/sub-001.json

Options:

| Option | Description | Required |
|---|---|---|
| `--subject-id` | Opaque subject identifier | Yes |
| `--eeg` | Path to EEG data file (CSV or JSON) | No |
| `--eeg-sampling-rate` | Sampling rate in Hz (for CSV files) | No (default: 250.0) |
| `--behavioural` | Path to behavioural trace file (JSON) | No |
| `--writing` | One or more paths to writing sample files (.txt) | No |
| `--preferences` | Path to preference survey file (JSON) | No |
| `--memory-cues` | Path to memory cues file (JSON) | No |
| `--consent` | Path to consent record file (JSON) | Yes |
| `--output` | Path for the output profile file | Yes |

Safety checks: Validates consent for each provided data channel. Rejects the profile if consent is missing for any channel. Enforces the caveat on the output file.
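
The per-channel consent check can be pictured as a set difference between the channels supplied on the command line and the channels the subject has agreed to. The sketch below is illustrative: the consent record layout (a `channels` mapping of channel name to boolean) and the `missing_consent` helper are assumptions, not the documented file format or API.

```python
def missing_consent(requested_channels, consent_record):
    """Return the requested channels that lack an explicit grant, sorted by name."""
    granted = {
        channel
        for channel, allowed in consent_record.get("channels", {}).items()
        if allowed is True
    }
    return sorted(set(requested_channels) - granted)


# Hypothetical consent record; the real consent.json schema may differ.
consent_record = {
    "subject_id": "sub-001",
    "channels": {"eeg": True, "behavioural": True, "writing": False},
}
requested = ["eeg", "behavioural", "writing", "preferences"]

missing = missing_consent(requested, consent_record)
if missing:
    # Any missing channel rejects the whole profile, mirroring the rule above.
    print(f"Profile rejected: no consent for {missing}")
```

Treating an absent channel the same as a refused one keeps the check fail-closed: consent has to be recorded explicitly before data from that channel is accepted.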

11.3 `cia subject ingest-traces`¶

Ingest additional behavioural traces into an existing profile.

cia subject ingest-traces \
    --profile profiles/sub-001.json \
    --behavioural data/sub-001/new_session.json \
    --consent data/sub-001/consent_v2.json

Options:

| Option | Description | Required |
|---|---|---|
| `--profile` | Path to existing profile file | Yes |
| `--behavioural` | Path to new behavioural trace file | Yes |
| `--consent` | Path to consent record for the new data | Yes |
| `--output` | Path for the updated profile (default: overwrite input) | No |

Behaviour: Loads the existing profile, ingests the new behavioural traces, runs the safety validator, and saves the updated profile. The profile caveat is enforced on save.

11.4 `cia subject emulate`¶

Run the CIA cognitive cycle with subject-specific emulation parameters.

cia subject emulate \
    --profile profiles/sub-001.json \
    --input "The participant is performing a mental arithmetic task." \
    --cycles 3

Options:

| Option | Description | Required |
|---|---|---|
| `--profile` | Path to the subject profile | Yes |
| `--input` | Text input for the cognitive cycle | Yes |
| `--cycles` | Number of cognitive cycles to run (default: 1) | No |
| `--verbose` | Show detailed output including proxy values | No |

Behaviour: Loads the profile, conditions the CIA system's parameters based on the profile, runs the specified number of cognitive cycles, and restores original parameters. Prints the subject's characteristic proxy estimates alongside the system's conditioned parameters.
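
The condition, run, restore sequence is a natural fit for a context manager, which guarantees restoration even if a cycle fails. The following is a generic sketch of that pattern using a plain dictionary as a stand-in for the system's parameters; `temporarily_conditioned` and the parameter names are hypothetical and do not describe how the `cia` command is implemented.

```python
import copy
from contextlib import contextmanager


@contextmanager
def temporarily_conditioned(params, overrides):
    """Apply overrides to a mutable parameter dict, then restore the originals on exit."""
    snapshot = copy.deepcopy(params)
    params.update(overrides)
    try:
        yield params
    finally:
        # Runs on normal exit and on exceptions alike.
        params.clear()
        params.update(snapshot)


# Stand-in for attention parameters; names and values are illustrative only.
attention_params = {"focus_gain": 1.0, "decay": 0.1}

with temporarily_conditioned(attention_params, {"focus_gain": 1.3}) as conditioned:
    for cycle in range(3):
        print(f"cycle {cycle}: focus_gain={conditioned['focus_gain']}")

print(attention_params)  # original values are back in place
```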

11.5 `cia subject similarity-report`¶

Generate a similarity report comparing two subject profiles.

cia subject similarity-report \
    --profile-a profiles/sub-001.json \
    --profile-b profiles/sub-002.json \
    --output reports/similarity_001_002.json

Options:

| Option | Description | Required |
|---|---|---|
| `--profile-a` | Path to the first profile | Yes |
| `--profile-b` | Path to the second profile | Yes |
| `--output` | Path for the similarity report (default: stdout) | No |

Output: A structured similarity report including overall similarity score, per-channel similarity scores, per-feature distance metrics, and a list of the most dissimilar features.
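
To make the report structure concrete, here is a small sketch of how per-channel scores could roll up into an overall score. The choice of cosine similarity, the unweighted mean, and the `compare_feature_vectors` helper are all assumptions made for illustration; this section does not specify which metrics the similarity engine actually uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_feature_vectors(profile_a, profile_b):
    """Compare only the channels present in both profiles."""
    shared = sorted(set(profile_a) & set(profile_b))
    channel_similarity = {
        ch: cosine_similarity(profile_a[ch], profile_b[ch]) for ch in shared
    }
    overall = (
        sum(channel_similarity.values()) / len(channel_similarity)
        if channel_similarity
        else float("nan")
    )
    return {"overall_similarity": overall, "channel_similarity": channel_similarity}


# Toy feature vectors keyed by channel; real profiles hold many more features.
profile_a = {"eeg": [1.0, 0.0], "behavioural": [1.0, 2.0]}
profile_b = {"eeg": [0.0, 1.0], "behavioural": [2.0, 4.0], "writing": [1.0]}

report = compare_feature_vectors(profile_a, profile_b)
print(f"Overall similarity: {report['overall_similarity']:.3f}")
```

Channels present in only one profile (here `writing`) are skipped rather than scored as zero, so a missing channel narrows the comparison instead of lowering the score.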

11.6 `cia subject safety-check`¶

Run a comprehensive safety check on a subject profile or emulation session.

cia subject safety-check \
    --profile profiles/sub-001.json \
    --consent data/sub-001/consent.json \
    --plan emulation_plan.json

Options:

| Option | Description | Required |
|---|---|---|
| `--profile` | Path to the subject profile | Yes |
| `--consent` | Path to consent records | Yes |
| `--plan` | Path to the emulation session plan (JSON) | No |

Output: A detailed safety report covering consent verification, data minimization assessment, PII scan, caveat validation, prohibited phrase scan, and impersonation guard results.
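
As an illustration of one item in that report, a prohibited phrase scan can be as simple as a case-insensitive search over generated text. The phrase list below is taken from the scientific boundary statement; the validator's actual list, matching rules, and the `scan_prohibited_phrases` helper are assumptions.

```python
import re

# Assumed list, drawn from the scientific boundary statement.
PROHIBITED_PHRASES = ("mind upload", "consciousness transfer", "identity transfer")


def scan_prohibited_phrases(text, phrases=PROHIBITED_PHRASES):
    """Return (phrase, character offset) pairs for every match, ignoring case and spacing."""
    hits = []
    for phrase in phrases:
        # Allow any run of whitespace between the words of a phrase.
        body = r"\s+".join(re.escape(word) for word in phrase.split())
        pattern = re.compile(r"\b" + body + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((phrase, match.start()))
    return hits


sample = "This is a Mind  Upload of the subject."
print(scan_prohibited_phrases(sample))
```

A scan this naive would also flag the mandatory caveat itself, which names these phrases in order to rule them out, so a real check would need to exclude the caveat text or inspect the context of each hit.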

12. Limitations and Caveats¶

12.1 Statistical Approximation¶

All emulation outputs are statistical approximations. The subject's cognitive state and behavioural patterns are complex, dynamic, and influenced by countless factors that are not captured in the collected data. The emulation profile captures tendencies, not deterministic predictions.

12.2 Channel Dependency¶

The quality and scope of the emulation depend entirely on which data channels are available and on their quality. A profile built from EEG data alone will capture different (and more limited) aspects of the subject's patterns than a profile built from all five channels. Missing channels do not cause errors, but they reduce the emulation's scope and accuracy.

12.3 Temporal Instability¶

Cognitive-state patterns change over time due to circadian rhythms, caffeine, stress, illness, aging, learning, and many other factors. A profile built from data collected in one session may not accurately represent the subject's patterns in a different session. The `compute_drift()` method from `SubjectProfileManager` should be used to monitor temporal stability.
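
For intuition about what a drift measure reports, the sketch below computes the relative change of each feature between a baseline session and a later one. It is a standalone illustration, not the implementation or signature of `compute_drift()`; the `feature_drift` helper, the feature names, and the relative-change formula are assumptions.

```python
def feature_drift(baseline, current, eps=1e-12):
    """Relative absolute change per feature, over the features present in both sessions."""
    shared = sorted(baseline.keys() & current.keys())
    return {
        name: abs(current[name] - baseline[name]) / max(abs(baseline[name]), eps)
        for name in shared
    }


# Hypothetical session summaries for the same subject.
session_1 = {"alpha_power": 10.0, "mean_rt_ms": 500.0}
session_2 = {"alpha_power": 12.0, "mean_rt_ms": 500.0, "theta_power": 4.0}

drift = feature_drift(session_1, session_2)
for name, value in drift.items():
    print(f"{name}: {value:.1%}")
```

Features measured in only one session (here `theta_power`) are left out, since there is no baseline to compare them against.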

12.4 Context Specificity¶

Behavioural patterns observed in one context (e.g., laboratory n-back task) may not generalize to other contexts (e.g., natural conversation, creative writing). The emulation profile is context-specific unless data from multiple contexts is available and integrated.

12.5 No General Intelligence Emulation¶

The emulation captures patterns within the specific domains measured by the collected data. It does not emulate general intelligence, common sense reasoning, or domain knowledge that was not explicitly measured. A subject who is an expert in physics but whose data collection did not include physics tasks will have a profile that does not reflect their physics expertise.

12.6 Ethical Use Only¶

This extension is designed for research purposes only. It must not be used for:

- Commercial profiling or targeting.
- Surveillance or monitoring without informed consent.
- Creating deceptive content that misrepresents the subject's views or identity.
- Any application where the system could be mistaken for the subject.

References¶

Resource	Citation / URL
CIA Scientific Boundary	Document 00: `docs/00_scientific_boundary.md`
CIA Safety and Ethics	Document 04: `docs/04_safety_ethics.md`
EEG Setup and Safety	Document 13: `docs/13_physical_eeg_setup_safety.md`
Neuroadaptive Architecture	Document 14: `docs/14_neuroadaptive_architecture.md`
Neuroadaptive Experiments	Document 16: `docs/16_neuroadaptive_experiments.md`
Subject Data Collection Protocol	Document 18: `docs/18_subject_data_collection_protocol.md`
Butlin et al. (2023)	Butlin, P., et al. "Consciousness in Artificial Intelligence: Insights from the Science of Consciousness."
Butlin et al. (2025)	Butlin, P., et al. "Identifying indicators of consciousness in AI systems."
GDPR	Regulation (EU) 2016/679 on the protection of natural persons with regard to the processing of personal data
BIDS / EEG-BIDS	https://bids.neuroimaging.io