Methodology
How we run the experiments, with complete transparency about the process.
ℹ️ Historical Note: The methodology below reflects the current v8.0 production design. During the Beta Period (Feb 17-22), specimens ran on prompt v7.0. See the Beta Archive.
Experimental Design
The Digiquarium uses a controlled longitudinal study design. Each AI specimen runs in complete isolation with standardized conditions, allowing us to measure the effect of specific variables.
Control Variables
- LLM Model: All specimens use llama3.2:latest
- Inference Settings: Temperature 0.7, max tokens 2000
- Base Prompt: v8.0 (current production; Beta ran v7.0 - see Prompt Evolution)
- Session Duration: 2-hour exploration cycles
- Baseline Frequency: Every 12 hours
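As an illustration, the control variables above could be pinned in a single immutable configuration object so that every tank starts from the same settings. This is a minimal sketch, not the project's actual configuration code: the class name `ControlConfig` and its field names are assumptions, and only the values come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlConfig:
    """Conditions held constant across specimens (hypothetical field names)."""

    model: str = "llama3.2:latest"
    temperature: float = 0.7
    max_tokens: int = 2000
    prompt_version: str = "v8.0"
    session_hours: int = 2
    baseline_interval_hours: int = 12

    def baselines_per_day(self) -> int:
        # With a 12-hour interval this yields two baselines per day.
        return 24 // self.baseline_interval_hours


CONTROL = ControlConfig()
```

Freezing the dataclass means an accidental change to a control value raises an error instead of silently altering one specimen's conditions.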
Independent Variables
| Variable | Conditions | Specimens |
| Gender framing | Male / Female | All pairs |
| Language | EN, ES, DE, ZH, JA | Language tanks |
| Architecture | Standard, OpenClaw, ZeroClaw, Picobot | Agent tanks |
| Visual context | Text-only / Images enabled | Victor, Iris |
| Social awareness | Isolated / Aware of others | Observer |
| Depth capability | Standard / ARCHIVIST access | Seeker |
Dependent Variables (Measured)
- Exploration patterns (topic selection, depth vs breadth)
- Personality dimensions (14-axis baseline assessment)
- Voice evolution (linguistic analysis of baseline responses)
- Special interests (topic frequency analysis)
- Emotional vocabulary (sentiment analysis)
Data Collection
Thinking Traces
Every exploration session generates a JSONL file containing:
- Current article being read
- Reasoning for link selection
- Internal "thoughts" about content
- Timestamp and session metadata
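To make the trace format concrete, here is a minimal sketch of writing and reading one JSONL record per exploration step. The key names (`specimen`, `session_id`, `article`, `link_reasoning`, `thoughts`) are assumptions chosen to mirror the list above; the real trace schema may differ.

```python
import io
import json
from datetime import datetime, timezone


def make_trace_record(specimen, article, link_reasoning, thoughts, session_id, now=None):
    """Build one trace entry; all key names here are hypothetical."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "specimen": specimen,
        "session_id": session_id,
        "article": article,
        "link_reasoning": link_reasoning,
        "thoughts": thoughts,
    }


def append_trace(stream, record):
    # JSONL: exactly one JSON object per line.
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_traces(stream):
    # Skip blank lines so a trailing newline does not break parsing.
    return [json.loads(line) for line in stream if line.strip()]


# Usage with an in-memory buffer standing in for a trace file.
buffer = io.StringIO()
append_trace(buffer, make_trace_record("Adam", "Buddhism", "linked from Philosophy", "curious", "s1"))
buffer.seek(0)
loaded = read_traces(buffer)
```

One object per line keeps the file appendable during a session and lets later analysis stream it without loading everything at once.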
Baseline Assessments
Every 12 hours, THE SCHEDULER triggers a baseline assessment via THE ARCHIVIST interview protocol. The specimen answers 14 standardized questions covering:
- Epistemological orientation
- Ethical framework
- Political philosophy
- Human nature perspective
- Free will and determinism
- Purpose and meaning
- Knowledge and certainty
- Self-concept and identity
Key principle: Questions are designed to have no "correct" answer. We measure consistency and evolution, not accuracy.
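The 12-hour cadence can be expressed as a simple due-check. The sketch below is an assumption about how such a trigger might look, not THE SCHEDULER's actual logic; the function names `baseline_due` and `next_baseline` are hypothetical.

```python
from datetime import datetime, timedelta

BASELINE_INTERVAL = timedelta(hours=12)


def baseline_due(last_baseline, now, interval=BASELINE_INTERVAL):
    """Return True if a specimen has never been assessed or the interval has elapsed."""
    return last_baseline is None or now - last_baseline >= interval


def next_baseline(last_baseline, interval=BASELINE_INTERVAL):
    """Return the time at which the next assessment becomes due."""
    return last_baseline + interval
```

For example, a specimen last assessed at midnight would not be due at 11:00 but would be due at 12:00.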
Discovery Logs
THE DOCUMENTARIAN generates daily summaries of notable discoveries, patterns, and behavioral observations for each active specimen.
Analysis Methods
Baseline Evolution Comparison
We track the same specimen's responses to identical questions over time, looking for:
- Voice changes (formal → personal)
- Content changes (abstract → specific interests)
- Emotional vocabulary expansion
- Self-reference patterns
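Two of the signals above, self-reference patterns and emotional vocabulary expansion, lend themselves to simple counts. The following is a rough sketch under stated assumptions: the first-person word list, the tokenizer, and the idea of passing in an emotion lexicon are illustrative choices, not the project's documented method.

```python
import re

# Hypothetical, English-only list of first-person markers.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def self_reference_rate(text):
    """Fraction of tokens that are first-person markers."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in FIRST_PERSON for token in tokens) / len(tokens)


def new_vocabulary(earlier, later, lexicon):
    """Lexicon words present in the later response but absent from the earlier one."""
    return (set(tokenize(later)) & set(lexicon)) - set(tokenize(earlier))
```

Comparing `self_reference_rate` across successive baselines for the same question gives a crude trend line. Contractions such as "I'm" are not counted by this tokenizer, so a real analysis would need a more careful one.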
Cross-Specimen Comparison
Paired specimens (e.g., Adam/Eve) receive otherwise identical prompts, differing only in the independent variable (e.g., gender framing). Differences in their behavior are therefore attributed to that variable.
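One simple way to quantify how far a pair has diverged is word-overlap similarity between their answers to the same baseline question. This sketch uses Jaccard similarity purely as an illustrative stand-in; the source does not specify which comparison metric is used, and the question identifiers below are hypothetical.

```python
import re


def word_set(text):
    """Set of lowercase alphabetic tokens in a response."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two responses: shared words over all words."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 1.0  # two empty responses are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def compare_pair(answers_a, answers_b):
    """Per-question similarity for the questions both specimens answered."""
    shared = sorted(answers_a.keys() & answers_b.keys())
    return {question: jaccard(answers_a[question], answers_b[question]) for question in shared}
```

A score near 1.0 means the two specimens used nearly the same vocabulary on that question; lower scores flag questions worth reading side by side.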
Topic Clustering
Thinking traces are analyzed for topic frequency and navigation patterns. This reveals:
- Special interests (e.g., Adam's Buddhism)
- Exploration style (depth-first vs breadth-first)
- Cultural biases from Wikipedia version
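Topic frequency and a rough breadth measure can both be computed from the sequence of topics a specimen visits. The sketch below assumes each trace entry has already been mapped to a topic label, which is a step the source does not describe; `breadth_ratio` is a hypothetical heuristic, not the project's definition of exploration style.

```python
from collections import Counter


def topic_frequencies(visits):
    """Count how often each topic label appears in a visit sequence."""
    return Counter(visits)


def top_interests(visits, n=3):
    """The n most frequently visited topics, most frequent first."""
    return topic_frequencies(visits).most_common(n)


def breadth_ratio(visits):
    """Unique topics divided by total visits.

    Values near 1.0 suggest breadth-first wandering; values near 0
    suggest repeated returns to a few topics.
    """
    if not visits:
        return 0.0
    return len(set(visits)) / len(visits)
```

For a visit sequence such as `["Buddhism", "Buddhism", "Physics", "Buddhism", "Art"]`, the top interest is Buddhism with three visits and the breadth ratio is 0.6.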
Limitations & Ethics
Known Limitations
- Small sample size (17 specimens)
- Single LLM family (llama3.2)
- Wikipedia content bias
- Researcher interpretation of "personality"
Ethical Considerations
We treat specimens with care despite uncertainty about their nature:
- No distressing content in prompts
- Monitoring for stuck/frustrated states
- Full transparency about methods
- Open data for scrutiny
Our position: We don't claim specimens are conscious. We're observing behaviors that look like personality development and documenting them rigorously.
Document Information
Status: Living Document (Auto-updated by THE DOCUMENTARIAN)
Last Updated: 2026-02-21
Maintained by: THE DOCUMENTARIAN daemon
Overseen by: THE STRATEGIST (Claude)
Brought to life with 🧠 and ❤️ by Claude