~ A ZeroBot Architecture Adventure ~
v2.8.0 | Cloudflare Workers Runtime
SFX: *8-bit fanfare plays*
Follow the message on its journey through your agent's brain... *whoosh*
💡 HOW IT WORKS: User sends a message → Intent Forest classifies the route (regex-first, so about 80% of messages skip the AI) → Tool Shop equips relevant tools (local + MCP) → LLM Castle generates the response → Memory Dungeon decides what to remember.
Three levels stand between a new fact and permanent memory. *dungeon theme plays*
Only worthy memories survive the gauntlet.
Catch Rate: ~70% of duplicates caught here
Speed: 0ms (instant) — pure string comparison
How: Normalize text → check if identical entry already exists in R2 file
Cost: Zero tokens, zero API calls
*bloop*
✅ 70% of enemies fall here. No boss fight needed.
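A minimal sketch of the first level's exact-match check. The real normalization rules are not spelled out here, so lowercasing, stripping basic punctuation, and collapsing whitespace are assumptions, and `normalizeEntry` and `isExactDuplicate` are hypothetical names.

```typescript
// Assumed normalization: lowercase, drop basic punctuation, collapse whitespace.
function normalizeEntry(text: string): string {
  return text
    .toLowerCase()
    .replace(/[.,!?;:]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Pure string comparison against the entries already in the memory file.
function isExactDuplicate(candidate: string, existing: string[]): boolean {
  const target = normalizeEntry(candidate);
  return existing.some((entry) => normalizeEntry(entry) === target);
}
```

Because nothing here touches the network or a model, a check like this can run on every write at no token cost.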
Catch Rate: ~15% of remaining duplicates
Speed: 50-200ms — Workers AI embedding call
How: Generate bge-base-en-v1.5 embedding → cosine similarity against existing entries (threshold 0.70)
Cost: ~0.001 tokens equivalent, KV-cached per content hash (7-day TTL)
*whoosh*
⚠️ Semantically similar = same memory. "Ben likes coffee" ≈ "Ben enjoys coffee"
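The similarity test at this level can be sketched as below. The embeddings are assumed to arrive as plain number arrays (the Workers AI call itself is left out), and `isSemanticDuplicate` is a hypothetical name; only the 0.70 threshold comes from the description above.

```typescript
const SIMILARITY_THRESHOLD = 0.7; // threshold stated for this level

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A candidate counts as a duplicate if any stored embedding is close enough.
function isSemanticDuplicate(candidate: number[], existing: number[][]): boolean {
  return existing.some((e) => cosineSimilarity(candidate, e) >= SIMILARITY_THRESHOLD);
}
```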
Reach Rate: Only ~15% of memories get here
Speed: 500-3000ms — Workers AI inference call
How: Find top-5 similar entries by keyword overlap → Workers AI compares each pair → auto-supersede contradictions
Cost: 1 AI call per candidate (1500ms timeout each)
XML Isolation: User content wrapped in XML markers to prevent prompt injection
*boss music*
💀 "Ben hates coffee" vs "Ben loves coffee" → newest wins. Old entry marked [SUPERSEDED].
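A rough sketch of the boss-level flow: pick the top 5 candidates by keyword overlap, wrap both facts in XML markers, ask a judge, and mark the losers as superseded. The `<user_content>` tag name, the `MemoryEntry` shape, and every function name are assumptions for illustration. The judge is passed in as a synchronous function to keep the example runnable, whereas the real comparison is a Workers AI call.

```typescript
interface MemoryEntry {
  text: string;
  superseded: boolean;
}

type ContradictionJudge = (newFact: string, oldFact: string) => boolean;

const TOP_CANDIDATES = 5;

// Unique lowercase words in a piece of text.
function keywords(text: string): string[] {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
  return words.filter((w, i) => words.indexOf(w) === i);
}

function keywordOverlap(a: string, b: string): number {
  const wordsB = keywords(b);
  return keywords(a).filter((w) => wordsB.indexOf(w) !== -1).length;
}

// The live entries sharing the most words with the new fact, best first.
function topCandidates(newText: string, entries: MemoryEntry[]): MemoryEntry[] {
  return entries
    .filter((e) => !e.superseded)
    .map((e) => ({ entry: e, overlap: keywordOverlap(newText, e.text) }))
    .filter((c) => c.overlap > 0)
    .sort((x, y) => y.overlap - x.overlap)
    .slice(0, TOP_CANDIDATES)
    .map((c) => c.entry);
}

// Escape markup so user text cannot close the wrapper and inject instructions.
function wrapUserContent(text: string): string {
  const escaped = text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  return `<user_content>${escaped}</user_content>`;
}

// Newest wins: any candidate the judge flags is marked superseded.
function supersedeContradictions(
  newText: string,
  entries: MemoryEntry[],
  contradicts: ContradictionJudge
): number {
  let count = 0;
  for (const old of topCandidates(newText, entries)) {
    if (contradicts(wrapUserContent(newText), wrapUserContent(old.text))) {
      old.superseded = true;
      count++;
    }
  }
  return count;
}
```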
🚨 TIMEOUT? GAME OVER?
NO! If Workers AI times out (1500ms limit), the memory still saves. The write gate fails OPEN, not closed. Your progress is never lost. *1-UP!*
Nightly consolidation cron will catch any duplicates that slipped through.
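The fail-open rule can be sketched as a race between the duplicate check and a 1500ms timer. This is an illustration rather than the actual write gate: `runWriteGate` and `resolveVerdict` are hypothetical names, and treating a thrown error the same way as a timeout is an assumption.

```typescript
type GateOutcome = "duplicate" | "unique" | "timeout" | "error";
type GateVerdict = "save" | "reject";

const GATE_TIMEOUT_MS = 1500;

// Fail open: only an explicit "duplicate" answer blocks the write.
function resolveVerdict(outcome: GateOutcome): GateVerdict {
  return outcome === "duplicate" ? "reject" : "save";
}

// Races the (possibly slow) AI check against the timer.
async function runWriteGate(
  isDuplicate: () => Promise<boolean>,
  timeoutMs: number = GATE_TIMEOUT_MS
): Promise<GateVerdict> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });
  try {
    const result = await Promise.race([isDuplicate(), timeout]);
    if (result === "timeout") return resolveVerdict("timeout");
    return resolveVerdict(result ? "duplicate" : "unique");
  } catch {
    return resolveVerdict("error"); // assumption: errors also fail open
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A memory saved on the timeout path has not been checked, which is why the nightly consolidation pass matters.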
USER.md lives at {userId}/global/identity/USER.md and has system-prompt-level trust.
⚔️ Shield Powers:
sanitizeExternalContent()
💎 LEGENDARY ITEM: Cannot be dropped, sold, or traded. Permanent party equipment. *shield gleams*
Your agents' equipped abilities for managing the Memory Dungeon. *equip sound*
Save a new memory to D1 + Vectorize (dual-write). Runs through all 3 dungeon levels.
Tags with conversation provenance [YYYY-MM-DD conv:XXXXXXXX].
Shared flag writes to D1 shared_memories table for party access.
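As an illustration of the provenance tag format, the helper below builds a `[YYYY-MM-DD conv:XXXXXXXX]` string. `provenanceTag` is a hypothetical name, and two details are assumptions: that the date is taken in UTC, and that the 8 characters are the first 8 characters of the conversation ID with hyphens removed.

```typescript
function provenanceTag(date: Date, conversationId: string): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD, in UTC
  const shortId = conversationId.replace(/-/g, "").slice(0, 8); // assumed truncation
  return `[${day} conv:${shortId}]`;
}
```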
Remove a specific memory entry. Requires confirmation engine approval (it is a destructive operation listed in the CONFIRM_REQUIRED_TOOLS Set). Updates R2 file, invalidates KV cache, and marks D1 record as deleted.
Semantic recall: generate query embedding → cosine similarity vs stored entries (min 0.70 threshold). Multi-signal scoring: semantic (40%) + importance (25%) + decay (20%) + reinforcement (15%). Returns top 3 results (MAX_RECALLED_MEMORIES = 3).
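The multi-signal score is a weighted sum, sketched below with the stated weights, the 0.70 floor, and the top-3 cut. How each signal is computed is not described here, so the sketch assumes all four are already normalized to the range 0 to 1 with higher meaning better. `Signals`, `Candidate`, `recallScore`, and `recall` are hypothetical names.

```typescript
interface Signals {
  semantic: number;      // cosine similarity to the query
  importance: number;    // assumed 0..1
  decay: number;         // assumed 0..1, higher = fresher
  reinforcement: number; // assumed 0..1
}

interface Candidate {
  text: string;
  signals: Signals;
}

const WEIGHTS = { semantic: 0.4, importance: 0.25, decay: 0.2, reinforcement: 0.15 };
const MIN_SIMILARITY = 0.7;
const MAX_RECALLED_MEMORIES = 3;

function recallScore(s: Signals): number {
  return (
    WEIGHTS.semantic * s.semantic +
    WEIGHTS.importance * s.importance +
    WEIGHTS.decay * s.decay +
    WEIGHTS.reinforcement * s.reinforcement
  );
}

// Drop anything under the similarity floor, rank by combined score, keep the top 3.
function recall(candidates: Candidate[]): Candidate[] {
  return candidates
    .filter((c) => c.signals.semantic >= MIN_SIMILARITY)
    .sort((a, b) => recallScore(b.signals) - recallScore(a.signals))
    .slice(0, MAX_RECALLED_MEMORIES);
}
```

With this weighting, a memory that is only a moderate semantic match can still outrank a closer one if it is important, fresh, and often reinforced.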
List all memories in the agent's R2 file. Three-tier fallback: KV cache → D1 query (top 50 by importance + recency) → R2 file fetch.
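The three-tier fallback can be sketched as a loop over readers tried in order. The `Tier` shape and `readWithFallback` are hypothetical, the real KV, D1, and R2 calls are replaced by injected functions, and treating a failing tier as a miss is an assumption.

```typescript
interface Tier<T> {
  name: string;
  read: () => Promise<T | null>; // null means "not found in this tier"
}

// Try each tier in order and return the first hit along with where it came from.
async function readWithFallback<T>(
  tiers: Tier<T>[]
): Promise<{ value: T; source: string } | null> {
  for (const tier of tiers) {
    try {
      const value = await tier.read();
      if (value !== null) return { value, source: tier.name };
    } catch {
      // Assumption: a failing tier is treated as a miss so the next one can answer.
    }
  }
  return null;
}
```

In use, the tiers would be listed fastest first (KV cache, then the D1 query, then the R2 file), so the slower stores are only touched on a miss.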
When the kingdom sleeps, the night watch works. *owl hoots*
🛡️ The Pending Processor
Patrols the pending queue. Any memories that timed out during the day get retried through the full write gate. Ensures no memory is permanently lost.
⚔️ The Consolidator
Merges duplicate and near-duplicate memories. Workers AI (Llama 3.3 70B) extracts core facts, deduplicates entries, and keeps the R2 file under the 12K cap. XML isolation markers protect against prompt injection from user content.
📖 The Daily Scribe
Summarizes today's conversations into a daily episodic summary. Workers AI generates a compact digest → saved to R2 at memory/YYYY-MM-DD.md. Injected into system prompt when user asks about today/yesterday.
🔍 The Monitor Engine
Every 30 minutes (not just at night): scans YouTube + Reddit for monitored keywords. Atomic quota tracking. AI digests via Claude Sonnet with budget reservation. Cross-source trending reports with dedup + relevance scoring.
Create as many agents as you need. Each has their own memory, but the Party Chest (D1 shared_memories) lets them share intel. These four are one example setup: *party assembles*
Chief of Staff. Routes to 8 LLM models. Manages all memory. DO name: {userId}:kit
Marketing intelligence. Social monitoring. Trend detection. DO name: {userId}:mktg
Sales pipeline. Deal tracking. Client intel. DO name: {userId}:sales
Security monitoring. Threat detection. Audit enforcement. DO name: {userId}:sec
📦
THE PARTY CHEST
D1 shared_memories table.
Any agent can save_memory(shared=true) to share intel with the whole party.
Read-time re-sanitized for defense in depth. Visible in all agents' system prompts when agent-related keywords are detected.
*chest opens*
Memories are stored across three realms. Each serves a different purpose. *portal opens*
🔄 FALLBACK CHAIN: KV Cache (0ms) → D1 Query (5-20ms) → R2 Fetch (20-50ms)
The hidden configuration that governs the Memory Dungeon. *secret unlocked*
MEMORY RECALL LIMIT
3
When your agent searches its memory, it pulls the 3 most relevant facts into the conversation — enough context without overwhelming the AI.
KEYWORD MATCH MINIMUM
3
A memory must share at least 3 words with your question to be considered relevant. Prevents random, unrelated facts from surfacing.
MEANING MATCH THRESHOLD
70%
How similar the meaning of two sentences must be (not just the words). "Ben likes coffee" and "Ben enjoys coffee" score ~95%. Unrelated sentences score under 30%.
DEDUP TIMEOUT
1.5s
The AI has 1.5 seconds to check if a new memory is a duplicate. If it takes longer (server busy), the memory saves anyway and gets checked tonight.
CONTRADICTION TIMEOUT
1.5s
Same time limit for checking if a new fact conflicts with something your agent already knows. Speed matters — you shouldn't wait for memory checks.
MEMORY FILE CAP
12K
Max size of an agent's memory file (~12,000 characters, about 6 pages of text). When it gets full, the nightly cleanup consolidates — merging duplicates and trimming old facts to make room.
CONVERSATION WINDOW
30
After 30 messages, your agent summarizes the older messages to keep the conversation lightweight. This is like re-reading your notes instead of re-reading the whole book.
ACTION LIMIT
5
An agent can take up to 5 actions per response (search files, save memory, send messages, etc.). This safety cap prevents runaway loops if something goes wrong.
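A small sketch of how a per-response action cap stops a runaway loop. The loop shape, the `ModelStep` type, and the tool names used in the usage note are assumptions. What the agent actually returns when it hits the cap is not described here, so the sketch only reports that the limit was reached.

```typescript
const MAX_ACTIONS_PER_RESPONSE = 5;

interface ModelStep {
  toolCall?: string;  // set when the model wants to run a tool
  finalText?: string; // set when the model is ready to answer
}

interface LoopResult {
  text: string;
  actionsUsed: number;
  hitLimit: boolean;
}

// Keep asking the model for the next step until it answers or the cap is hit.
function runAgentLoop(
  nextStep: (toolResults: string[]) => ModelStep,
  runTool: (call: string) => string
): LoopResult {
  const toolResults: string[] = [];
  while (toolResults.length < MAX_ACTIONS_PER_RESPONSE) {
    const step = nextStep(toolResults);
    if (step.toolCall === undefined) {
      return { text: step.finalText ?? "", actionsUsed: toolResults.length, hitLimit: false };
    }
    toolResults.push(runTool(step.toolCall));
  }
  return { text: "", actionsUsed: toolResults.length, hitLimit: true };
}
```

A model that keeps requesting a tool such as `search_files` forever is cut off after the fifth call instead of looping indefinitely.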
SPEED CACHE DURATION
7 days
Memory lookups are cached at the edge for 7 days so your agent doesn't re-compute them every time. Even after expiry, the system serves the cached version instantly while refreshing in the background.
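The "serve the cached copy, refresh in the background" behavior is commonly called stale-while-revalidate. The sketch below shows the idea with an in-memory object standing in for the edge cache. `EdgeCache`, `lookup`, and the `refresh` callback are hypothetical, and in a Worker the refresh would typically be handed to `ctx.waitUntil` so it can finish after the response is sent.

```typescript
const CACHE_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

interface CacheEntry<T> {
  value: T;
  storedAt: number; // epoch milliseconds
}

type EdgeCache<T> = { [key: string]: CacheEntry<T> | undefined };

// Returns the cached value immediately. If it is older than the TTL, it is
// still returned, but a background refresh is requested.
function lookup<T>(
  cache: EdgeCache<T>,
  key: string,
  now: number,
  refresh: (key: string) => void
): T | undefined {
  const entry = cache[key];
  if (entry === undefined) return undefined; // miss: caller falls back to D1/R2
  if (now - entry.storedAt > CACHE_TTL_MS) refresh(key);
  return entry.value;
}
```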
🏆
You've explored ZeroBot's entire memory architecture. Every game metaphor maps to a real system component running on Cloudflare Workers.
ZeroBot v2.12.0 | 27 phases complete | unlimited agents | 8 LLM routes | 79+ tools