Zero Traditional Servers

Powered by Cloudflare

ZeroBot runs entirely on Cloudflare's global edge network — 330+ edge locations worldwide. No origin servers, no containers, no cold starts. Every request is handled milliseconds from the user, with AI-powered DLP and guardrails on every call.

330+ Edge Locations
0 Traditional Servers
13 Cloudflare Services
Unlimited AI Agents
8 LLM Routes

The Request Lifecycle

Every message to your agent flows through Cloudflare's edge infrastructure

👤 User → 🌐 Cloudflare Edge → Worker → 🤖 Durable Object → 🗃 D1 / 📦 R2 / KV → 🧠 AI → Response

13 Cloudflare Services. One Platform.

How ZeroBot uses each service

🏗

Agents SDK

The Foundation

The framework everything is built on. Cloudflare's Agents SDK (v0.3.10) provides the stateful agent architecture, WebSocket handling, tool orchestration, and conversation management that make ZeroBot possible. A minimal agent sketch follows the list.

  • Each agent extends Agent<Env, KitState> — the base class every agent shares, with no cap on how many you run
  • Built-in WebSocket support for real-time streaming chat
  • @callable() decorators expose RPC methods for agent-to-agent communication
  • Automatic state persistence via Durable Objects — agents remember everything
  • Tool orchestration loop: message → LLM → tool_use → result → loop until done
  • Only 2 production dependencies total: agents and @cloudflare/playwright
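
A minimal sketch of what one of these agent classes can look like, assuming the agents package described above. The KitState shape, method bodies, and Env bindings are invented for illustration, and the exact decorator and type exports can vary between SDK versions.

```typescript
import { Agent, callable, type Connection, type WSMessage } from "agents";

// Placeholder for the Worker's bindings (D1, KV, R2, AI, ...)
interface Env {}

// Illustrative state shape; the real KitState is richer
interface KitState {
  messageCount: number;
  lastTopic: string | null;
}

export class Kit extends Agent<Env, KitState> {
  // Persisted automatically by the agent's underlying Durable Object
  initialState: KitState = { messageCount: 0, lastTopic: null };

  // Invoked for every incoming WebSocket message (real-time chat)
  async onMessage(connection: Connection, message: WSMessage) {
    this.setState({ ...this.state, messageCount: this.state.messageCount + 1 });
    // ...classify intent, call the LLM, run the tool loop, stream the reply...
  }

  // Exposed over RPC so other agents (Scout, Closer, ...) can call it
  @callable()
  async summarizeDay(): Promise<string> {
    return `Handled ${this.state.messageCount} messages today.`;
  }
}
```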

Workers

Compute

The entire application runs as a Cloudflare Worker. No servers, no containers. Deployed globally across 330+ edge locations with sub-millisecond cold starts. The single entry point is sketched after the list.

  • Web UI (15 pages), API endpoints, and webhook handlers all run in a single Worker
  • Slack, Discord, and WhatsApp webhooks processed at the edge
  • All business logic, LLM routing, and tool orchestration in one deployment
  • Automatic scaling: handles 1 request or 10,000 concurrently
  • Zero traditional servers to maintain, patch, or scale
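
A rough sketch of how one Worker can serve the UI, API, and webhooks from a single fetch handler. The route paths and helper functions are illustrative stand-ins, not ZeroBot's actual code.

```typescript
// Placeholder bindings; the real Env declares D1, KV, R2, AI, Durable Objects, etc.
interface Env {}

// Hypothetical handlers standing in for the real ones
async function handleSlackWebhook(req: Request, env: Env) { return new Response("ok"); }
async function handleApi(req: Request, env: Env) { return Response.json({ ok: true }); }
async function renderUi(req: Request, env: Env) { return new Response("<!doctype html>…"); }

export default {
  async fetch(request, env, ctx): Promise<Response> {
    const { pathname } = new URL(request.url);

    if (pathname.startsWith("/webhooks/slack")) return handleSlackWebhook(request, env);
    if (pathname.startsWith("/api/")) return handleApi(request, env);

    return renderUi(request, env); // web UI pages
  },
} satisfies ExportedHandler<Env>;
```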
🤖

Durable Objects

Stateful Agents

Each AI agent is a Durable Object instance that maintains persistent state across conversations. Create as many agents as you need; per-user addressing is sketched after the list.

  • Unlimited agents — e.g. Kit (chief-of-staff), Scout (marketing), Closer (sales), Sentinel (security)
  • WebSocket connections for real-time chat streaming
  • Each user gets their own isolated DO instances (multi-tenant by design)
  • Agents coordinate via RPC calls (e.g., one agent asks another for analysis)
  • State survives between requests — agents "remember" the conversation
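
A sketch of how per-user isolation falls out of Durable Object addressing. The KIT_AGENT binding name and the id scheme are assumptions for illustration.

```typescript
interface Env {
  KIT_AGENT: DurableObjectNamespace;
}

export function getUserAgentStub(env: Env, userId: string, botId: string) {
  // Deriving the id from user + bot gives every user their own isolated instance
  const id = env.KIT_AGENT.idFromName(`${userId}:${botId}`);
  return env.KIT_AGENT.get(id); // stub used for fetch()/RPC calls into that agent
}
```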
🗃

D1

SQL Database

All structured data lives in D1 — conversations, messages, memories, skills, audit logs, experiments, and more. Tenant-isolated on every query, as the sketch after the list shows.

  • Conversations, messages, and memory entities with full-text search
  • Skills engine, notification system, and achievement tracking
  • Audit log for every security-relevant action (compliance-ready)
  • A/B testing framework with statistical analysis
  • Every query includes user_id for strict tenant isolation
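
A sketch of a tenant-isolated query. The table, columns, and DB binding name are assumptions rather than ZeroBot's actual schema.

```typescript
interface Env {
  DB: D1Database;
}

export async function listConversations(env: Env, userId: string) {
  // user_id is bound on every query so one tenant can never read another's rows
  const { results } = await env.DB
    .prepare(
      "SELECT id, title, updated_at FROM conversations WHERE user_id = ?1 ORDER BY updated_at DESC"
    )
    .bind(userId)
    .all();
  return results;
}
```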
📦

R2

Object Storage

Identity files, knowledge documents, uploaded images, and daily memory summaries — all stored in R2 with zero egress fees. Key scoping is sketched after the list.

  • Agent identity files (SOUL.md, CONTEXT.md) that define each agent's personality
  • User profile (USER.md) with system-prompt-level trust
  • Knowledge base documents and uploaded files (images, PDFs)
  • Daily memory summaries generated by nightly cron jobs
  • Paths scoped per-user: {userId}/{botId}/ for complete isolation
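
A sketch of the per-user key scoping. The FILES binding name is an assumption for illustration.

```typescript
interface Env {
  FILES: R2Bucket;
}

// Keys are prefixed {userId}/{botId}/ so each user's objects stay isolated
export async function saveIdentityFile(env: Env, userId: string, botId: string, body: string) {
  await env.FILES.put(`${userId}/${botId}/SOUL.md`, body);
}

export async function loadIdentityFile(env: Env, userId: string, botId: string) {
  const obj = await env.FILES.get(`${userId}/${botId}/SOUL.md`);
  return obj ? await obj.text() : null;
}
```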

KV

Key-Value Cache

The speed layer. Rate limiting, personality caching, embedding caches, and config storage — all with global edge reads in under 1 ms. A rate-limit sketch follows the list.

  • Rate limiting: per-token and per-user with dual-layer enforcement
  • Personality file caching with 7-day TTL and stale-while-revalidate
  • Embedding caches to avoid re-generating vectors
  • MCP tool definition caching (1-hour TTL per connection)
  • Config storage for changelog, roadmap, and feature flags
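
A sketch of a per-user rate-limit counter on KV. The CACHE binding, limit, and key format are illustrative, and the counter is approximate because KV writes are eventually consistent.

```typescript
interface Env {
  CACHE: KVNamespace;
}

export async function checkRateLimit(env: Env, userId: string, limit = 60): Promise<boolean> {
  // One counter bucket per user per minute; expires shortly after the window closes
  const key = `rate:${userId}:${Math.floor(Date.now() / 60_000)}`;
  const count = Number((await env.CACHE.get(key)) ?? "0") + 1;
  await env.CACHE.put(key, String(count), { expirationTtl: 120 });
  return count <= limit;
}

export async function getCachedPersonality(env: Env, botId: string) {
  // Personality files are written elsewhere with a 7-day TTL; reads hit the edge cache
  return env.CACHE.get(`personality:${botId}`);
}
```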
🧠

Vectorize

Vector Database

Semantic memory recall. When an agent searches its memory, Vectorize finds the most relevant entries using cosine similarity on embeddings; a recall sketch follows the list.

  • Memory embeddings stored as high-dimensional vectors
  • Cosine similarity search finds semantically related memories
  • Per-user namespace isolation prevents cross-user data leakage
  • Multi-signal scoring: semantic similarity + importance + recency + reinforcement
  • Enables "fuzzy" recall — agents find relevant context even with different wording
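
A sketch of memory recall: embed the query with Workers AI, then run a namespaced similarity search. The binding names, index layout, and exact query options are assumptions and may differ across Vectorize versions.

```typescript
interface Env {
  AI: Ai;
  MEMORY_INDEX: VectorizeIndex;
}

export async function recallMemories(env: Env, userId: string, query: string) {
  // Embed the query with the same model used when the memory was written
  const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

  // Cosine-similarity search restricted to this user's namespace
  const result = await env.MEMORY_INDEX.query(embedding.data[0], {
    topK: 5,
    namespace: userId,
  });
  return result.matches; // ids + scores, to be blended with importance/recency signals
}
```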

Workers AI

On-Edge Inference

Cloudflare's GPU fleet handles lightweight AI tasks: intent classification, memory operations, embedding generation, context compaction, and content-level DLP. An intent-classification sketch follows the list.

  • Intent classification: routes messages to the right LLM (simple vs complex)
  • Memory consolidation: deduplicates entries before storage
  • Contradiction detection: identifies conflicting memories and auto-resolves
  • Context compaction: summarizes old messages to keep context lean
  • Embedding generation (bge-base-en-v1.5) for semantic search
  • Content-level DLP: Cloudflare AI API inspects content for PII and sensitive data at the inference layer
  • Audio transcription: Deepgram Nova-3 + Whisper models for voice and audio processing
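
A sketch of the intent-classification step on Workers AI. The model choice and prompt are illustrative, not ZeroBot's actual routing logic.

```typescript
interface Env {
  AI: Ai;
}

export async function classifyIntent(env: Env, message: string): Promise<"simple" | "complex"> {
  // A small on-edge model decides whether the message needs a heavyweight LLM
  const result = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
    messages: [
      { role: "system", content: "Reply with exactly one word: simple or complex." },
      { role: "user", content: message },
    ],
  });
  return /complex/i.test(result.response ?? "") ? "complex" : "simple";
}
```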
🛡

AI Gateway

LLM Proxy

All external LLM calls (Claude, Gemini, Perplexity) route through AI Gateway for caching, logging, cost tracking, granular DLP, and guardrails. A gateway call is sketched after the list.

  • 8-way routing: Workers AI, Claude Sonnet/Haiku/Opus, Perplexity, Gemini Flash/Pro/Image
  • Response caching reduces redundant API calls and costs
  • Real-time cost tracking and usage analytics per model
  • Granular DLP: Content inspection on every LLM request/response — PII, credentials, and sensitive data intercepted before leaving the edge
  • AI Guardrails: Configurable safety rules enforce content policies, block prompt injection, and prevent harmful outputs
  • Rate limiting prevents runaway spending
  • Automatic fallback if gateway is unavailable
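
A sketch of what an Anthropic call through AI Gateway looks like. ACCOUNT_ID, GATEWAY_ID, and the model name are placeholders, not values from ZeroBot's deployment.

```typescript
const GATEWAY_URL =
  "https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/GATEWAY_ID/anthropic/v1/messages";

export async function askClaude(apiKey: string, prompt: string) {
  // The gateway sits in front of the provider: caching, cost tracking, DLP, guardrails
  const res = await fetch(GATEWAY_URL, {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-20250514", // placeholder; any routed Claude model works
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  return res.json();
}
```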
🔒

Cloudflare Access

Zero-Trust Auth

kitbot.0arc.ai sits behind Cloudflare Access for SSO and zero-trust authentication. No VPN needed — identity-based access at the edge. The bearer-token check is sketched after the list.

  • Zero-trust security: every request authenticated before reaching the Worker
  • SSO integration for seamless sign-in
  • No VPN infrastructure to maintain
  • Access policies enforced at the edge, not in application code
  • Combined with bearer token auth (SHA-256) for API-level security
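
A sketch of the API-level bearer-token check layered underneath Cloudflare Access. The secret binding name and header handling are assumptions for illustration.

```typescript
interface Env {
  API_TOKEN_SHA256: string; // hex digest of the expected bearer token
}

async function sha256Hex(input: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(input));
  return [...new Uint8Array(digest)].map((b) => b.toString(16).padStart(2, "0")).join("");
}

export async function isAuthorized(request: Request, env: Env): Promise<boolean> {
  // Cloudflare Access has already authenticated the user at the edge;
  // this adds the second, application-level check on the API token.
  const token = request.headers.get("Authorization")?.replace(/^Bearer\s+/i, "") ?? "";
  return (await sha256Hex(token)) === env.API_TOKEN_SHA256;
}
```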
🌐

Browser Rendering

Web Browsing

Your agents can browse the web using Cloudflare's headless browser API. Take screenshots, extract content, and interact with pages on demand; a screenshot tool is sketched after the list.

  • Headless Chromium powered by @cloudflare/playwright
  • Screenshot capture and content extraction from any URL
  • Used for research tasks, competitor monitoring, and link previews
  • Runs in Cloudflare's infrastructure — no browser instances to manage
  • Integrated as an agent tool: agents decide when to browse
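
A sketch of a screenshot tool built on @cloudflare/playwright. The BROWSER binding name is assumed and error handling is omitted.

```typescript
import { launch } from "@cloudflare/playwright";

interface Env {
  BROWSER: Fetcher; // Browser Rendering binding from the Worker's config
}

export async function screenshot(env: Env, url: string) {
  const browser = await launch(env.BROWSER);
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle" });
    return await page.screenshot({ fullPage: true }); // image bytes for the agent to return or store
  } finally {
    await browser.close();
  }
}
```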
🌍

Custom Domain + DNS

Edge Routing

kitbot.0arc.ai routes through Cloudflare DNS with full SSL/TLS, DDoS protection, and WAF — all managed from one dashboard.

  • Custom domain with automatic SSL certificate management
  • Cloudflare DNS for fast, reliable resolution worldwide
  • Built-in DDoS protection at no extra cost
  • Web Application Firewall (WAF) blocks malicious requests
  • Full TLS encryption from user to Worker

Cron Triggers

Scheduled Tasks

Automated background jobs: memory consolidation, morning briefings, social monitoring every 30 minutes, and pending task processing. The scheduled handler is sketched after the list.

  • Nightly memory consolidation: deduplicates and summarizes the day's conversations
  • Morning briefings: prepares a daily summary of priorities and events
  • Social media monitoring: scans Reddit and YouTube every 30 minutes
  • Pending memory processing: handles deferred memory writes
  • No external scheduler needed — runs natively on Cloudflare
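
A sketch of the scheduled handler. The cron expressions and job functions are illustrative; the real schedules would be declared in the Worker's trigger configuration.

```typescript
interface Env {}

// Hypothetical jobs standing in for the real ones
async function consolidateMemories(env: Env) { /* dedupe + summarize the day */ }
async function scanSocialFeeds(env: Env) { /* Reddit + YouTube monitoring */ }

export default {
  async scheduled(event, env, ctx): Promise<void> {
    switch (event.cron) {
      case "0 3 * * *": // nightly memory consolidation
        ctx.waitUntil(consolidateMemories(env));
        break;
      case "*/30 * * * *": // social monitoring every 30 minutes
        ctx.waitUntil(scanSocialFeeds(env));
        break;
    }
  },
} satisfies ExportedHandler<Env>;
```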