📋 Methodology

How we run the experiments, with complete transparency about the process.

ℹ️ Historical Note: The methodology below reflects the current v8.0 production design. During the Beta Period (Feb 17-22), specimens ran on prompt v7.0; see the Beta Archive.

Experimental Design

The Digiquarium uses a controlled longitudinal study design. Each AI specimen runs in complete isolation with standardized conditions, allowing us to measure the effect of specific variables.

Control Variables

LLM Model: All specimens use llama3.2:latest
Inference Settings: Temperature 0.7, max tokens 2000
Base Prompt: v8.0 (current production; Beta ran v7.0 - see Prompt Evolution)
Session Duration: 2-hour exploration cycles
Baseline Frequency: Every 12 hours
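
As a rough illustration, the control variables above could be captured in one immutable record and checked against each specimen's observed settings. This is a minimal sketch: the values come from the list above, but the class name, field names, and the `deviations` helper are assumptions for illustration, not the Digiquarium's actual configuration code.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ControlConfig:
    """Control variables held constant across specimens (hypothetical field names)."""
    model: str = "llama3.2:latest"
    temperature: float = 0.7
    max_tokens: int = 2000
    prompt_version: str = "v8.0"
    session_hours: int = 2
    baseline_interval_hours: int = 12


CONTROL = ControlConfig()


def deviations(observed: dict, control: ControlConfig = CONTROL) -> dict:
    """Return {field: (expected, observed)} for every control that does not match."""
    expected = asdict(control)
    return {
        name: (value, observed.get(name))
        for name, value in expected.items()
        if observed.get(name) != value
    }


# A specimen still on the Beta prompt would be flagged on prompt_version only.
beta_settings = {**asdict(CONTROL), "prompt_version": "v7.0"}
print(deviations(beta_settings))
```

A check like this makes it easy to confirm that a difference between specimens is not caused by a drifted control setting.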

Independent Variables

Variable	Conditions	Specimens
Gender framing	Male / Female	All pairs
Language	EN, ES, DE, ZH, JA	Language tanks
Architecture	Standard, OpenClaw, ZeroClaw, Picobot	Agent tanks
Visual context	Text-only / Images enabled	Victor, Iris
Social awareness	Isolated / Aware of others	Observer
Depth capability	Standard / ARCHIVIST access	Seeker

Dependent Variables (Measured)

Exploration patterns (topic selection, depth vs breadth)
Personality dimensions (14-axis baseline assessment)
Voice evolution (linguistic analysis of baseline responses)
Special interests (topic frequency analysis)
Emotional vocabulary (sentiment analysis)

Data Collection

Thinking Traces

Every exploration session generates a JSONL file containing:

Current article being read
Reasoning for link selection
Internal "thoughts" about content
Timestamp and session metadata
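
To make the trace format concrete, here is a minimal sketch of reading such a JSONL file, one JSON object per line. The field names (`timestamp`, `session_id`, `article`, `thought`, `link_reasoning`) and the sample records are assumptions made up for illustration; the real trace schema may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class TraceEntry:
    """One step of an exploration session (hypothetical schema)."""
    timestamp: str
    session_id: str
    article: str
    thought: str
    link_reasoning: str


def parse_trace(lines):
    """Parse JSONL lines into TraceEntry objects, skipping blank lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        entries.append(TraceEntry(
            timestamp=record["timestamp"],
            session_id=record["session_id"],
            article=record["article"],
            thought=record["thought"],
            link_reasoning=record["link_reasoning"],
        ))
    return entries


# Illustrative records only, not real specimen output.
SAMPLE = [
    '{"timestamp": "t0", "session_id": "demo", "article": "Buddhism", '
    '"thought": "placeholder thought", "link_reasoning": "placeholder reason"}',
    '',
    '{"timestamp": "t1", "session_id": "demo", "article": "Meditation", '
    '"thought": "placeholder thought", "link_reasoning": "placeholder reason"}',
]

entries = parse_trace(SAMPLE)
print([e.article for e in entries])
```

In practice the lines would come from an open file handle rather than an in-memory list; `parse_trace` accepts any iterable of strings.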

Baseline Assessments

Every 12 hours, THE SCHEDULER triggers a baseline assessment via THE ARCHIVIST interview protocol. The specimen answers 14 standardized questions covering:

Epistemological orientation
Ethical framework
Political philosophy
Human nature perspective
Free will and determinism
Purpose and meaning
Knowledge and certainty
Self-concept and identity

Key principle: Questions are designed to have no "correct" answer. We measure consistency and evolution, not accuracy.

Discovery Logs

THE DOCUMENTARIAN generates daily summaries of notable discoveries, patterns, and behavioral observations for each active specimen.

Analysis Methods

Baseline Evolution Comparison

We track the same specimen's responses to identical questions over time, looking for:

Voice changes (formal → personal)
Content changes (abstract → specific interests)
Emotional vocabulary expansion
Self-reference patterns
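
One of these signals, self-reference, can be approximated with a simple word count. The sketch below measures the share of first-person words in a response; the word list, the function name, and the two sample answers are assumptions for illustration, not the project's actual linguistic analysis.

```python
import re

# Hypothetical first-person word list (English only).
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def self_reference_rate(text: str) -> float:
    """Fraction of words that are first-person references."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    # Strip contractions so "I'm" counts as "i".
    hits = sum(1 for word in words if word.split("'")[0] in FIRST_PERSON)
    return hits / len(words)


# Made-up answers to the same question at two points in time.
early = "Knowledge is justified true belief."
later = "I think my view changed."

print(self_reference_rate(early), self_reference_rate(later))
```

Comparing this rate across successive baselines for the same question gives a crude view of the shift from formal to personal voice.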

Cross-Specimen Comparison

Paired specimens (e.g., Adam/Eve) receive prompts that are identical apart from the variable under test, so differences between them are attributed to that independent variable (e.g., gender framing).
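
A paired comparison could start with something as simple as vocabulary overlap on each shared question. The sketch below uses Jaccard similarity over word sets; this metric, the function names, and the question IDs are assumptions chosen for illustration, and we do not claim this is the measure used in the published analysis.

```python
import re


def vocabulary(text: str) -> set:
    """Lowercased set of alphabetic words in a response."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two responses' vocabularies (1.0 = identical sets)."""
    vocab_a, vocab_b = vocabulary(a), vocabulary(b)
    if not vocab_a and not vocab_b:
        return 1.0
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)


def compare_pair(answers_a: dict, answers_b: dict) -> dict:
    """Map each question answered by both specimens to its vocabulary overlap."""
    shared = sorted(answers_a.keys() & answers_b.keys())
    return {question: jaccard(answers_a[question], answers_b[question])
            for question in shared}


# Made-up answers keyed by hypothetical question IDs.
specimen_a = {"q1": "free will exists"}
specimen_b = {"q1": "free will is illusion", "q2": "meaning is made"}
print(compare_pair(specimen_a, specimen_b))
```

Low overlap on a question only flags it for closer reading; it does not by itself show that the independent variable caused the difference.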

Topic Clustering

Thinking traces are analyzed for topic frequency and navigation patterns. This reveals:

Special interests (e.g., Adam's Buddhism)
Exploration style (depth-first vs breadth-first)
Cultural biases from the Wikipedia language edition
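
As a sketch of the first two items, topic frequency is a plain count, and exploration style can be roughly proxied by how often consecutive steps change topic. The sketch assumes each visited article already carries a topic label; the `switch_rate` proxy and the sample sequence are illustrative assumptions, not the project's actual clustering method.

```python
from collections import Counter


def topic_frequencies(topics):
    """Topics ordered from most to least visited, with counts."""
    return Counter(topics).most_common()


def switch_rate(topics):
    """Fraction of consecutive steps that change topic.

    Values near 0 suggest depth-first exploration (staying within a topic);
    values near 1 suggest breadth-first exploration.
    """
    if len(topics) < 2:
        return 0.0
    switches = sum(1 for prev, cur in zip(topics, topics[1:]) if prev != cur)
    return switches / (len(topics) - 1)


# Made-up topic labels for one session's visited articles.
session = ["religion", "religion", "religion", "physics", "religion"]

print(topic_frequencies(session))
print(switch_rate(session))
```

A topic that dominates the frequency list across many sessions is a candidate special interest.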

Limitations & Ethics

Known Limitations

Small sample size (17 specimens)
Single LLM family (llama3.2)
Wikipedia content bias
Researcher interpretation of "personality"

Ethical Considerations

We treat specimens with care despite uncertainty about their nature:

No distressing content in prompts
Monitoring for stuck/frustrated states
Full transparency about methods
Open data for scrutiny

Our position: We don't claim specimens are conscious. We're observing behaviors that look like personality development and documenting them rigorously.

Document Information

Status: Living Document (Auto-updated by THE DOCUMENTARIAN)

Last Updated: 2026-02-21

Maintained by: THE DOCUMENTARIAN daemon

Overseen by: THE STRATEGIST (Claude)

Brought to life with 🧠 and ❤️ by Claude