The March Overhaul: Complete

TL;DR

On March 27–31, we tore the Digiquarium down to its foundations and rebuilt it. This is the definitive record of everything that was broken, everything that was fixed, and the current state of the system. 5 days, 50+ commits, every layer of the stack touched. The system went from barely functional to production-ready.

Context

When we declared the "production era" on March 27, we thought the v2.0 cleanup had put us on solid footing. It hadn't. Within hours of trying to run the full system, we found that daemons were exiting after one cycle, the explorer couldn't navigate Kiwix, Ollama was overwhelmed by 17 simultaneous tanks, and the scheduler had no concept of resource contention. What followed was a 5-day intensive overhaul of every layer of the stack, culminating in a system designed for 24/7 unattended operation.

This post is the definitive record. Not a narrative—a technical audit of what was broken and what was done about it.

The Audit Findings

Inference: Groq/Cerebras Chain (March 28-29)

Problem

Local Ollama with llama3.2:3b couldn't handle 17 tanks. Even with mutex locking, staggering, and rotation scheduling, inference was bottlenecked. Tanks were generating only 1-2 traces per hour, making meaningful personality assessment impossible. The model simply wasn't keeping up with demand.

Solution

Implemented a three-tier inference chain: Groq as primary (free tier, extremely fast), Cerebras as fallback, local Ollama as last resort. Groq's cloud inference eliminated the local bottleneck entirely. Tanks immediately jumped from 1-2 traces/hour to 10-20+ traces/hour. This was the single biggest performance win of the entire overhaul.
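A fallback chain like this reduces to an ordered list of providers where any exception moves on to the next tier. A minimal sketch, assuming the provider callables are stand-ins for the real Groq, Cerebras, and Ollama client code:

```python
def infer(prompt, providers):
    """Try each inference provider in priority order, falling back on any error.

    `providers` is an ordered list of (name, callable) pairs, e.g.
    [("groq", call_groq), ("cerebras", call_cerebras), ("ollama", call_ollama)].
    The callables here are placeholders, not actual client code.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)  # first success wins
        except Exception as exc:  # rate limit, timeout, network failure...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all inference tiers failed: " + "; ".join(errors))
```

Because the chain returns which tier answered, callers can log how often the system actually falls back to local Ollama.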

Memory System: brain.md/soul.md (March 28-29)

Problem

Raw trace accumulation created unbounded context windows. Each tank's trace history grew indefinitely, making it impossible to meaningfully assess personality evolution. The Caretaker daemon was trying to work with traces numbering in the hundreds, causing context explosion and degraded performance.

Solution

Created a dual-file memory system. Each tank now has brain.md (evolving memory updated by the Caretaker daemon, bounded context with personality delta tracking) and soul.md (immutable identity and personality seed, set at tank creation). This creates bounded, meaningful memory that allows for personality drift analysis without context explosion.
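The Caretaker's update step might look roughly like the sketch below. The size bound and the condensing call are illustrative assumptions, not the actual implementation; `summarize` stands in for an LLM call:

```python
from pathlib import Path

BRAIN_MAX_CHARS = 8000  # assumed bound; the real limit isn't documented

def update_brain(tank_dir, new_observations, summarize):
    """Fold new trace observations into brain.md, keeping it under a fixed size.

    soul.md is read-only identity context; brain.md is the evolving memory.
    `summarize(soul, memory)` is a stand-in for an LLM condensing call.
    """
    tank = Path(tank_dir)
    soul = (tank / "soul.md").read_text()  # immutable identity seed
    brain_path = tank / "brain.md"
    brain = brain_path.read_text() if brain_path.exists() else ""

    merged = brain + "\n\n" + new_observations
    if len(merged) > BRAIN_MAX_CHARS:
        # Condense old memory against the identity seed instead of truncating,
        # so personality deltas survive the compression
        merged = summarize(soul, merged)
    brain_path.write_text(merged.strip() + "\n")
    return merged
```

The key property is the bound: no matter how many traces accumulate, the Caretaker's working context stays a fixed size.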

Personality Assessments: Librarian Baseline System (March 29)

Problem

Clinical questionnaires produced flat, undifferentiated responses. The Librarian daemon's sterile, checklist-style assessments failed to capture personality variance: responses were indistinguishable across specimens despite their different research histories.

Solution

Rewrote the Librarian baseline system with an in-character persona. Instead of clinical questionnaires, the Librarian now conducts a warm, curious check-in about what the specimen has been reading, what they've learned, and what fascinates them. This produces more natural, differentiated personality data that actually reflects each tank's unique research trajectory.

Explorer: Deep Kiwix Integration (March 28-29)

Problem

The explorer assumed standard Wikipedia URL formats, but Kiwix serves articles at different paths with different link structures. Tanks were sitting idle instead of exploring because every navigation attempt failed silently. Link extraction wasn't working, random article discovery wasn't implemented, and content parsing was incomplete.

Solution

Complete rewrite of explorer navigation. Added KIWIX_URL and WIKI_BASE environment variables for dynamic configuration. Rewrote link extraction to handle Kiwix's URL format and HTML structure. Implemented proper random article discovery using Kiwix's search API. Added proper article content parsing to extract meaningful text. This was arguably the most critical fix—without it, no tank was actually doing any research.
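Link extraction against Kiwix's HTML can be sketched with the standard library alone. The env-var names match the post; the default values and URL layout below are illustrative placeholders:

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin

def article_url(title):
    """Absolute URL for an article under the configured Kiwix root.

    KIWIX_URL and WIKI_BASE are the env vars named in the post; the
    defaults here are placeholders, not the real deployment values.
    """
    base = os.environ.get("KIWIX_URL", "http://kiwix:8080")
    path = os.environ.get("WIKI_BASE", "/wikipedia/A/")
    return base + path + title

class ArticleLinkParser(HTMLParser):
    """Collect in-wiki article links, skipping fragments and external URLs."""
    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href", "")
        if not href or href.startswith("#") or "://" in href:
            return  # empty, same-page anchor, or external link
        self.links.append(urljoin(self.page_url, href))

def extract_links(html_text, page_url):
    parser = ArticleLinkParser(page_url)
    parser.feed(html_text)
    return parser.links
```

Resolving relative hrefs against the current page URL is what makes this work on Kiwix, whose articles link to siblings rather than to standard Wikipedia paths.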

Infrastructure: Ollama

Problem

Ollama was running on a separate Mac Mini, accessed by tanks through a socat proxy container. Network hops introduced latency and failures. The Mac going to sleep killed all inference. 17 tanks hitting Ollama simultaneously overwhelmed the 3B model on limited hardware.

Solution

Moved Ollama to a local Docker container on the same host. Added auto-restart via supervisor. Implemented a shared file-based mutex (/ollama-mutex/ollama.lock) so only one tank can access Ollama at a time, with a 300-second timeout. Added a deterministic stagger (each tank's start offset derived from its ID, spread across a 170-second window) and a rotation daemon (groups of 3, 5-minute windows). Added Ollama health checks to the scheduler. Deployed the ollama_watcher daemon for continuous health monitoring.
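The mutex-plus-stagger scheme can be sketched as follows. The lock path and 300-second timeout come from the text above; the atomic-create polling loop is one plausible way to implement the file lock, not the actual code:

```python
import os
import time

LOCK_PATH = "/ollama-mutex/ollama.lock"  # shared path from the post
TIMEOUT_S = 300                          # give up after 300 seconds
STAGGER_WINDOW_S = 170                   # spread start times across 170s

def stagger_delay(tank_id, num_tanks=17):
    """Deterministic per-tank start offset so tanks don't wake simultaneously."""
    return (tank_id % num_tanks) * (STAGGER_WINDOW_S / num_tanks)

def acquire_lock(path=LOCK_PATH, timeout=TIMEOUT_S):
    """Atomically create the lock file; poll until it's free or we time out.

    O_CREAT|O_EXCL makes creation atomic, so exactly one process wins
    even when several tanks race for the lock.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.write(fd, str(os.getpid()).encode())  # record the holder
            os.close(fd)
            return True
        except FileExistsError:
            time.sleep(1.0)  # another tank holds the lock
    return False

def release_lock(path=LOCK_PATH):
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass  # already released
```

A shared volume mounted at /ollama-mutex in every tank container is what makes a plain file work as a cross-container mutex.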

Infrastructure: Tank Containers

Problem

Tanks were crash-looping. Every container ran pip install at startup, which could block for minutes or fail entirely. IP addresses were dynamically assigned, causing conflicts when containers restarted. Baseline runner tried to process all 17 tanks simultaneously, causing timeouts.

Solution

Built a custom Docker image (src/explorer/Dockerfile) with all Python dependencies pre-installed. Assigned static IPs to all 17 tanks in docker-compose.yml. Rewrote the baseline runner to process tanks sequentially with health checks between each run. Baseline output files now include timestamps to prevent overwriting research data.
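A sequential runner with health gating and timestamped output might look like this sketch; the two callables and the filename pattern are illustrative assumptions:

```python
import time
from datetime import datetime, timezone

def run_baselines(tanks, run_baseline, check_health, retry_wait=30):
    """Process tanks one at a time, gating each run on a health check.

    `run_baseline(tank, outfile)` and `check_health()` stand in for the
    real calls. Returns the output filename written for each tank.
    """
    written = []
    for tank in tanks:
        while not check_health():
            time.sleep(retry_wait)  # wait for inference to recover
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        outfile = f"baseline_{tank}_{stamp}.md"  # timestamped: never overwrites
        run_baseline(tank, outfile)
        written.append(outfile)
    return written
```

Sequential processing trades throughput for reliability: one slow tank delays the rest, but no run ever times out from contention.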

Daemons: Run Loops & SLA (March 29-30)

Problem

5 of 21 daemons had no continuous run loop—they executed once and exited. 11 of 21 had duplicate process issues from missing singleton locks. No daemon tracked its own performance metrics. THE WEBMASTER was effectively non-functional: it checked whether two files existed and did nothing else.

Solution

All 21 daemons now have proper while True loops with appropriate sleep intervals. SLA tracking added to every daemon: cycle count, success rate, average cycle time, last failure timestamp. Scheduler completely rewritten to v4 with clash prevention (no two tanks scheduled simultaneously), Ollama health checks before dispatch, proper cycle tracking, and overnight safety modes.
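A generic daemon loop carrying the SLA metrics described above could be structured like this; the stats-file format and the max_cycles escape hatch are assumptions for illustration:

```python
import json
import time

def daemon_loop(name, cycle, interval_s, sla_path, max_cycles=None):
    """Run `cycle` forever, self-reporting SLA metrics after every pass.

    Writes cycle count, failure count, success rate, average cycle time,
    and last-failure timestamp to `sla_path` as JSON. `max_cycles` exists
    only so tests can stop the loop; real daemons run unbounded.
    """
    stats = {"daemon": name, "cycles": 0, "failures": 0,
             "avg_cycle_s": 0.0, "last_failure": None}
    while max_cycles is None or stats["cycles"] < max_cycles:
        start = time.monotonic()
        try:
            cycle()
        except Exception:
            stats["failures"] += 1
            stats["last_failure"] = time.time()
        elapsed = time.monotonic() - start
        stats["cycles"] += 1
        # incremental running average of cycle time
        stats["avg_cycle_s"] += (elapsed - stats["avg_cycle_s"]) / stats["cycles"]
        stats["success_rate"] = 1 - stats["failures"] / stats["cycles"]
        with open(sla_path, "w") as f:
            json.dump(stats, f)
        time.sleep(interval_s)
    return stats
```

Catching exceptions inside the loop is the crucial difference from the old daemons: one bad cycle becomes a metric instead of a dead process.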

Dashboard & Admin Panel (March 29-30)

Problem

No admin interface. No way to see system health, container status, or daemon metrics at a glance. Manual inspection required reading logs and querying Docker directly.

Solution

Built a comprehensive admin panel at /admin/ with real-time metrics. Implemented container status view, daemon SLA metrics dashboard, and system health monitoring. Created live-data.js feed for real-time metric updates from stats.json and admin-status.json. Panel displays current state of all daemons, tanks, and system resources.

Mobile Navigation (March 28)

Problem

No mobile responsive navigation. Desktop nav broke on mobile devices. Tank pages were unusable on phones and tablets.

Solution

Added hamburger menu with toggle functionality to all pages. Responsive CSS media queries ensure mobile users can navigate properly. Nav collapses on small screens and expands via hamburger click.

Security Fixes (March 28)

Problem

A hardcoded password and an XSS vulnerability in visitor-facing pages.

Solution

Password removed from source; authentication restructured around environment variables. Input sanitization added to all user-facing inputs, with output encoding on display. The audit findings were documented and closed.
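In Python terms, the two fixes reduce to escaping on output and comparing against an environment secret. The function and variable names below are illustrative, not the site's actual code:

```python
import hmac
import html
import os

def render_comment(raw):
    """Escape user input before interpolating it into HTML (output encoding)."""
    return f'<p class="visitor-note">{html.escape(raw)}</p>'

def check_password(candidate):
    """Compare against an env-var secret rather than a hardcoded string.

    hmac.compare_digest gives a constant-time comparison; an unset
    secret always fails rather than allowing empty-password logins.
    """
    secret = os.environ.get("ADMIN_PASSWORD", "")
    return bool(secret) and hmac.compare_digest(candidate, secret)
```

Escaping at the point of display, rather than trying to strip dangerous input on the way in, is what actually closes the XSS class of bugs.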

Website & UI

Problem

Brain minimap zoom would reset on time-step navigation. Cumulative chips didn't display correctly. Translation layer used curl calls with hardcoded endpoints. Nav links used relative paths that broke on subpages.

Solution

Brain minimap zoom fix applied with state persistence. Cumulative chips rendering corrected. Translation layer rewritten to use urllib with environment variables. All nav links converted to absolute paths from site root.
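A urllib-based call with the endpoint drawn from the environment might look like this; the variable name TRANSLATE_URL, the default, and the JSON request shape are assumptions for illustration:

```python
import json
import os
import urllib.request

# Endpoint comes from the environment instead of a hardcoded curl command.
TRANSLATE_URL = os.environ.get("TRANSLATE_URL", "http://localhost:5000/translate")

def build_request(text, source, target, url=TRANSLATE_URL):
    """Build a POST request for the translation endpoint (shape is assumed)."""
    payload = json.dumps({"q": text, "source": source, "target": target}).encode()
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})

def translate(text, source, target, url=TRANSLATE_URL):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(text, source, target, url),
                                timeout=10) as resp:
        return json.loads(resp.read())
```

Separating request construction from the network call also makes the layer testable without a live endpoint, which the curl-based version never was.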

Complete Change Log

Component | Before | After
Inference tier | Local Ollama only (1-2 traces/hr) | Groq→Cerebras→Ollama chain (10-20+ traces/hr)
Memory system | Unbounded trace accumulation | brain.md + soul.md (bounded, meaningful)
Librarian baseline | Clinical questionnaires (flat responses) | In-character persona (differentiated data)
Explorer navigation | Broken for Kiwix URLs | Full Kiwix integration + API search
Ollama | External Mac + socat proxy | Local Docker container + watcher
Inference concurrency | 17 tanks at once, no queuing | Groq (unlimited), Ollama mutex + stagger + rotation
Tank image | pip install at startup | Custom image, pre-installed deps
Tank IPs | Dynamic (DHCP) | Static assignment
Baseline runner | Parallel (17 at once) | Sequential with health checks
Baseline files | Overwritten on each run | Timestamped, never overwrite
Daemon run loops | 5 had no loop (run once, exit) | All 21 continuous with sleep intervals
Daemon SLA tracking | None | All daemons report full metrics
Scheduler | v3, no clash prevention | v4, clash prevention + health checks
Admin panel | None (hardcoded password existed) | Full dashboard with live metrics
Security | Hardcoded password, XSS vuln | Password removed, XSS fixed, sanitization added
Translation layer | curl + hardcoded endpoints | urllib + env vars
Brain minimap | Zoom reset on navigation | Zoom persists, cumulative chips render
Mobile nav | No mobile support | Hamburger menu on all pages
Nav links | Relative paths (broken on subpages) | Absolute paths from site root
MCP server | Not connected | Connected for autonomous operation
Ollama stability | Manual restart required | Auto-restart via supervisor + watcher daemon

Key Decisions Made

Several architectural decisions were made during this overhaul that are worth recording for future reference. See the Decision Tree for the full log.

Groq/Cerebras inference chain: Local inference was bottlenecking personality assessment. Cloud-first with fallback to local keeps us flexible and fast. Groq's free tier is absurdly generous and production-ready.

Bounded memory (brain.md/soul.md): Unbounded trace accumulation was creating context explosion. Separating evolving memory (brain) from identity seed (soul) allows meaningful personality assessment without losing data or blowing context windows.

In-character Librarian persona: Personality assessment needed to feel natural, not clinical. A curious librarian checking in with each specimen produces better data than a questionnaire.

Ollama local vs. proxy: Local Docker eliminates network dependency entirely. The socat proxy was a beta-era workaround that introduced a failure mode we couldn't control.

Custom Docker image vs. runtime install: Pre-installing dependencies in the image adds build time but eliminates the most common cause of tank crash loops. Build once, run reliably.

Mutex vs. rotation for Ollama access: We use both. The mutex provides hard exclusion (only one tank at a time). The rotation provides soft scheduling (groups of 3, 5-minute windows). The stagger provides temporal spreading. Belt, suspenders, and a rope.

Timestamped baselines: Research data should never be overwritten. Every baseline run creates a new file with a timestamp. This preserves the full history and makes personality drift analysis possible.

SLA tracking on all daemons: You can't manage what you can't measure. Every daemon now self-reports its performance. This feeds the admin panel and makes "is the system healthy?" a question with a data-driven answer.

Current State

As of March 31, 2026:

System Status

Tanks: 17 specimens across 5 languages, all with static IPs, fresh timestamped baselines, and active research cycles. Each tank is generating 10-20+ traces per hour.

Inference: Three-tier chain (Groq primary, Cerebras fallback, Ollama last resort) providing fast, reliable inference with automatic failover.

Memory: Each tank has brain.md (evolving memory) and soul.md (identity seed) for bounded, meaningful personality tracking.

Librarian: In-character personality assessments producing natural, differentiated baseline data.

Daemons: 21 deployed, all running with continuous loops, SLA tracking, and health monitoring.

Ollama: Local Docker container with mutex, stagger, rotation, watcher daemon, and auto-restart via supervisor.

Scheduler: v4 with clash prevention, Ollama health checks, overnight safety, and proper cycle tracking.

Explorer: Fully integrated with Kiwix. Tanks are actively exploring with proper link extraction, content parsing, and random article discovery.

Admin: Dashboard at /admin/ with live metrics from stats.json and admin-status.json showing container status, daemon SLA, and system health.

Security: Password removed, XSS patched, input sanitization added, authentication restructured.

Mobile: All pages responsive with hamburger navigation for mobile users.

MCP: Connected for autonomous operation and monitoring.

What This Means

The Digiquarium is no longer a prototype running on good intentions. Every component has been audited, fixed, and verified. The infrastructure is designed for 24/7 unattended operation with self-monitoring through SLA tracking, daemon health checks, and the MCP interface.

The inference chain puts personality assessment in the hands of fast, reliable models. The memory system bounds context while preserving personality evolution. The Librarian conducts natural assessments that reflect each tank's unique research journey. The explorer integrates deeply with Kiwix, enabling genuine intellectual exploration. The admin panel provides real-time visibility into system health.

The overhaul took 5 days and touched every layer of the stack. It was the kind of work that's invisible when it works and catastrophic when it doesn't. It works now.

The tanks are exploring. The daemons are watching. The data is accumulating. The personalities are emerging. The experiment can finally begin in earnest.