
ctx0 — Persistent Memory

Overview

Persistent Memory is a local-first, summary-primed memory system for AI agents — combining the simplicity of a flat memory log with AI-generated summaries for low-cost context loading. All data lives in a single SQLite file on the user's machine. The agent writes memories freely, searches them on demand via FTS5 full-text search, and starts each session with a concise AI-generated summary instead of thousands of tokens of raw entries.

Core insight: Writing memories should be dead simple (single INSERT). Loading memories into context should be as small as possible (AI summary, not raw entries). Searching memories should be accurate and fast (FTS5 over raw content). These three operations have different needs — optimize each independently.

┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│   Full vault:     Agent → ctx0_remember → DB write → curator agent →       │
│                   vault tree → embedding queue → DB read → vector search   │
│                   → agent                                                  │
│                   (6 hops, 2 subagents, network I/O, DB costs)             │
│                                                                             │
│   This system:    Agent → db.insert() → done                               │
│                   Session start → load summary (~300 tokens)               │
│                   Need details? → FTS5 search (<1ms)                       │
│                   (2 tools, 0 subagents, local disk only)                  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Why This Approach

This architecture was chosen after evaluating two alternatives:

| Criterion | Flat Memory Log | Git-Like Context Controller | This System (Hybrid) |
|---|---|---|---|
| Simplicity | 2 tools, ~200 LOC | 5 tools, ~400 LOC | 2 tools, ~300 LOC |
| Context cost | ~4,000 tokens (raw entries) | ~130-330 tokens (AI summary) | ~300-500 tokens (AI summary) |
| Retrieval accuracy | High (raw content + FTS5) | Lower (summaries are lossy) | High (raw content + FTS5) |
| Robustness | No external dependencies | Requires Haiku for every commit | Haiku optional, graceful fallback |
| Complexity | Simple flat list | DAG with branches, commits, merges | Simple flat list + summaries |

The flat log gives us simplicity and accurate retrieval. The git controller gives us low context cost via AI summaries. This hybrid takes both strengths and discards the DAG/branching complexity (which adds cognitive load for the agent without proportional value in a single-user, single-agent MVP).

Why Local SQLite

| Concern | Local SQLite | Remote DB (Supabase) | Flat File (JSONL) |
|---|---|---|---|
| Read latency | <1ms (indexed query) | 50-150ms (network) | <1ms (small), linear growth |
| Write latency | <1ms (INSERT) | 20-50ms (network) | <1ms (append) |
| Update/delete | UPDATE/DELETE | UPDATE/DELETE | Full file rewrite |
| Filtering | SQL WHERE + indexes | SQL WHERE + indexes | Parse all → JS filter |
| Full-text search | FTS5 (built-in, fast) | pg_trgm / tsvector | String.includes() |
| Crash safety | WAL mode, transactions | Server-side | Hope append was atomic |
| Concurrent access | WAL mode (built-in) | Server handles | Manual file locking |
| Cost | $0 | Per-read/write billing | $0 |
| Offline | Yes | No | Yes |
| Privacy | Data never leaves machine | Remote server | Data never leaves machine |
| Disk footprint | Single .db file | N/A | 3 files (log + summary + backup) |
| Human-inspectable | sqlite3 CLI / DB Browser | psql / dashboard | Text editor |
| Cross-device sync | Via cloud sync (row-level) | Built-in | Manual |
| Scales | Millions of rows | Unlimited | ~1000 lines before sluggish |

SQLite is the sweet spot: all the local-first benefits of flat files, plus real querying, indexing, atomic writes, and crash safety. One file on disk (memory.db), zero network, zero cost.

better-sqlite3 is synchronous (no async overhead), battle-tested, and used by Turso, Obsidian, 1Password, and thousands of Electron apps.

Design Principles

  1. Local-first, cloud-synced — Memory lives on the user's machine as a single SQLite file. All reads and writes are local (<1ms). Cloud sync is optional and fire-and-forget — if the user has multiple daemons, memories sync via Supabase. If offline or unauthenticated, the system works identically without sync.

  2. Summary-primed — Sessions start with a concise AI-generated summary (~300-500 tokens), not thousands of tokens of raw entries. The summary is the "big picture." Raw entries are the "source of truth."

  3. Search-accurate — When the summary isn't enough, the agent searches raw entries via FTS5. Full-text search over actual content is always more accurate than searching summaries.

  4. Zero-subagent — No curator, no librarian, no extractor. The main agent reads and writes memory directly. Summarization runs in the background, not as a blocking subagent.

  5. Save liberally, load smartly — Writing is cheap (single INSERT, <1ms). Loading is where the intelligence lives (what to summarize, when to summarize, how to prime).

  6. Graceful degradation — If AI summarization fails (network down, API error), fall back to loading raw entries with a token budget. The system never breaks — it just gets temporarily less efficient.

  7. Single-file simplicity — One memory.db file. Copy it to a new machine, back it up, inspect it with standard tools.


Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                       PERSISTENT MEMORY SYSTEM                              │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                        MAIN AGENT                                    │   │
│   │   bot0 daemon │ Claude Code │ Cursor │ any agent                    │   │
│   │                                                                      │   │
│   │   Tools:                                                             │   │
│   │   ├── memory_save(content, tags?, pin?)                             │   │
│   │   └── memory_search(query, tags?, project?)                         │   │
│   │                                                                      │   │
│   │   Auto-injected at session start:                                    │   │
│   │   └── <memory> block in system prompt (summary + pinned entries)    │   │
│   │                                                                      │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                         │                       │                           │
│            Local disk   │                       │  Haiku (background)       │
│                         ▼                       ▼                           │
│   ┌──────────────────────────┐   ┌──────────────────────────────────┐     │
│   │  LOCAL SQLITE STORAGE     │   │  AI SUMMARIZATION (async)        │     │
│   │                           │   │                                  │     │
│   │  ~/.bot0/memory.db        │   │  Triggered by:                   │     │
│   │                           │   │  • Entry count threshold (20+)   │     │
│   │  ┌─────────────────────┐ │   │  • Session end                   │     │
│   │  │ memories             │ │   │  • Session start (catch-up)      │     │
│   │  │ memories_fts (FTS5)  │ │   │  • Periodic interval (5 min)     │     │
│   │  │ memory_summaries     │ │   │  • Daily compaction cron         │     │
│   │  │ sync_meta            │ │   │  • Auto-save hooks (bus event)   │     │
│   │  └─────────────────────┘ │   │  Cost: ~$0.0001/summary (Haiku)  │     │
│   │                           │   │  Fallback: load raw if fails     │     │
│   │  WAL mode, <1ms r/w      │   └──────────────────────────────────┘     │
│   └────────────┬─────────────┘                                             │
│                │                                                            │
│                │  Cloud sync (fire-and-forget push, session-start pull)     │
│                ▼                                                            │
│   ┌──────────────────────────────────────────────────────────────────┐     │
│   │  SUPABASE: ctx0_memories (optional, multi-daemon sync)            │     │
│   │                                                                    │     │
│   │  Daemon A (laptop) ──push──▶ ctx0_memories ◀──push── Daemon B    │     │
│   │  Daemon A ◀──pull (session start)──▶ Daemon B                    │     │
│   │                                                                    │     │
│   │  • Row-level sync (append-only, nanoid PKs = no conflicts)       │     │
│   │  • Memories sync bidirectionally; summaries stay local-only       │     │
│   │  • Graceful degradation: unauthenticated → local-only mode       │     │
│   └──────────────────────────────────────────────────────────────────┘     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Database Schema

Single SQLite file at ~/.bot0/memory.db, using better-sqlite3 in synchronous mode with WAL journaling.

memories — The Raw Log

Every memory the agent saves. This is the source of truth — summaries are derived from this.

```sql
CREATE TABLE memories (
  id            TEXT PRIMARY KEY,                         -- nanoid (e.g., "m_a1b2c3")
  content       TEXT NOT NULL,                            -- The memory itself (natural language)
  tags          TEXT NOT NULL DEFAULT '[]',               -- JSON array of strings
  source        TEXT NOT NULL DEFAULT 'agent',            -- 'user' | 'agent' | 'auto'
  project       TEXT,                                     -- Project context (null = global)
  pinned        INTEGER NOT NULL DEFAULT 0,               -- 1 = always load into context
  superseded_by TEXT,                                     -- ID of newer version (soft-update)
  summarized    INTEGER NOT NULL DEFAULT 0,               -- 1 = included in a summary
  created_at    TEXT NOT NULL DEFAULT (datetime('now'))   -- ISO 8601
);

-- Loading: pinned memories (always loaded first)
CREATE INDEX idx_memories_pinned ON memories(pinned, created_at DESC)
  WHERE pinned = 1 AND superseded_by IS NULL;

-- Loading: recent active memories
CREATE INDEX idx_memories_active ON memories(created_at DESC)
  WHERE superseded_by IS NULL;

-- Loading: project-specific memories
CREATE INDEX idx_memories_project ON memories(project, created_at DESC)
  WHERE project IS NOT NULL AND superseded_by IS NULL;

-- Summarization: unsummarized entries
CREATE INDEX idx_memories_unsummarized ON memories(created_at ASC)
  WHERE summarized = 0 AND superseded_by IS NULL;

-- Search: by source
CREATE INDEX idx_memories_source ON memories(source);
```

SQLite FTS5 gives us fast, ranked full-text search over memory content — no external dependencies, no embedding pipelines.

```sql
-- FTS5 virtual table for content search
CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  content=memories,
  content_rowid=rowid
);

-- Keep FTS in sync with triggers
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;

CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.rowid, old.content);
END;

CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.rowid, old.content);
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;
```

memory_summaries — AI-Generated Summaries

Summaries are the primary artifact loaded into context at session start. Two types:

  • incremental — Summarizes a batch of recent entries (created throughout the day)
  • compacted — Rolls up multiple incremental summaries into a higher-level digest (created by daily cron)
```sql
CREATE TABLE memory_summaries (
  id           TEXT PRIMARY KEY,                      -- nanoid (e.g., "ms_x7y8z9")
  type         TEXT NOT NULL DEFAULT 'incremental',   -- 'incremental' | 'compacted'
  summary      TEXT NOT NULL,                         -- AI-generated summary text
  entry_count  INTEGER NOT NULL,                      -- How many entries/summaries were summarized
  entry_ids    TEXT NOT NULL DEFAULT '[]',            -- JSON array of memory IDs included
  period_start TEXT NOT NULL,                         -- Earliest entry timestamp
  period_end   TEXT NOT NULL,                         -- Latest entry timestamp
  created_at   TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX idx_summaries_type ON memory_summaries(type, created_at DESC);
CREATE INDEX idx_summaries_period ON memory_summaries(period_end DESC);
```

sync_meta — Cloud Sync State

Tracks sync watermarks for the cloud sync system. See Cloud Sync for details.

```sql
CREATE TABLE sync_meta (
  key   TEXT PRIMARY KEY,   -- 'last_sync_ts', 'device_id'
  value TEXT NOT NULL
);
```

Initialization

```typescript
import Database from 'better-sqlite3';
import { join } from 'path';
import { homedir } from 'os';

const DB_PATH = join(homedir(), '.bot0', 'memory.db');

function initMemoryDb(): Database.Database {
  const db = new Database(DB_PATH);

  // WAL mode for better concurrent read/write performance
  db.pragma('journal_mode = WAL');

  // Run schema creation (idempotent)
  db.exec(`
    CREATE TABLE IF NOT EXISTS memories (
      id TEXT PRIMARY KEY,
      content TEXT NOT NULL,
      tags TEXT NOT NULL DEFAULT '[]',
      source TEXT NOT NULL DEFAULT 'agent',
      project TEXT,
      pinned INTEGER NOT NULL DEFAULT 0,
      superseded_by TEXT,
      summarized INTEGER NOT NULL DEFAULT 0,
      created_at TEXT NOT NULL DEFAULT (datetime('now'))
    );

    CREATE INDEX IF NOT EXISTS idx_memories_pinned
      ON memories(pinned, created_at DESC)
      WHERE pinned = 1 AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_active
      ON memories(created_at DESC)
      WHERE superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_project
      ON memories(project, created_at DESC)
      WHERE project IS NOT NULL AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_unsummarized
      ON memories(created_at ASC)
      WHERE summarized = 0 AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_source ON memories(source);

    CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(
      content,
      content=memories,
      content_rowid=rowid
    );

    CREATE TABLE IF NOT EXISTS memory_summaries (
      id TEXT PRIMARY KEY,
      type TEXT NOT NULL DEFAULT 'incremental',
      summary TEXT NOT NULL,
      entry_count INTEGER NOT NULL,
      entry_ids TEXT NOT NULL DEFAULT '[]',
      period_start TEXT NOT NULL,
      period_end TEXT NOT NULL,
      created_at TEXT NOT NULL DEFAULT (datetime('now'))
    );

    CREATE INDEX IF NOT EXISTS idx_summaries_type
      ON memory_summaries(type, created_at DESC);
    CREATE INDEX IF NOT EXISTS idx_summaries_period
      ON memory_summaries(period_end DESC);

    CREATE TABLE IF NOT EXISTS sync_meta (
      key TEXT PRIMARY KEY,
      value TEXT NOT NULL
    );
  `);

  return db;
}
```

File on Disk

~/.bot0/
├── config.json       # Existing daemon config
└── memory.db         # SQLite database (single file)

That's it. One file.

Size estimates:

  • 100 memories ≈ 50-100 KB
  • 1,000 memories ≈ 500 KB - 1 MB
  • 10,000 memories ≈ 5-10 MB
  • FTS5 index adds ~30% overhead

SQLite handles millions of rows. We will never be the bottleneck.
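The estimates above follow from a simple size model. This sketch makes the arithmetic explicit — the ~600 bytes-per-row average is an assumption, not a measurement:

```typescript
// Back-of-envelope size model for memory.db.
const AVG_ROW_BYTES = 600; // content + tags + ids + SQLite page overhead (assumed average)
const FTS_OVERHEAD = 0.3;  // FTS5 index adds roughly 30%

const estimateDbBytes = (memories: number): number =>
  Math.round(memories * AVG_ROW_BYTES * (1 + FTS_OVERHEAD));

// estimateDbBytes(1000) → 780000 bytes ≈ 780 KB, inside the 500 KB–1 MB range above
```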


When Memories Are Saved

Memories enter the log through three channels:

1. Explicit User Request

The user says "remember this" or "save this for later."

User: "Remember that the deploy key for staging is in 1Password under 'staging-deploy'"
Agent: [calls memory_save]
→ INSERT into memories: { content: "...", source: "user", tags: ["infra", "credentials"] }

2. Agent Self-Save

The agent encounters information during work that seems worth persisting. The system prompt instructs it to recognize and save:

  • Surprising facts — Information that contradicts assumptions or is non-obvious
  • Learned patterns — "This codebase uses X pattern for Y"
  • User preferences — Implicit preferences revealed through corrections or choices
  • Environmental facts — API key locations, deployment targets, team structure
  • Decision rationale — Why a particular approach was chosen over alternatives
Agent: [reading code, discovers unconventional pattern]
Agent: [calls memory_save]
→ INSERT: { content: "bot0 wraps Drizzle in OrmClient (packages/db/orm.ts)...",
            source: "agent", tags: ["codebase", "pattern"] }

3. Auto-Save on Significant Events

Hooks subscribe to SessionStateChanged events on the daemon event bus. When a session transitions from working → completed, three hooks fire concurrently (fire-and-forget, never blocking the session):

| Event | What Gets Saved | Tags | Example |
|---|---|---|---|
| Task completion | 1-2 sentence summary of what was accomplished | session_summary | "Migrated auth from JWT to session tokens." |
| User correction | The preference/correction as a standalone fact | preference, correction | "User prefers tabs over spaces." |
| Error resolution | Error + fix pattern | debug, error_resolution | "ECONNREFUSED on 5432 → brew services start postgresql" |

All auto-saves use source: 'auto' (vs 'agent' for explicit saves). Significance filtering skips trivial conversations (< 2 tool calls AND < 3 user messages) and sessions where the agent already called memory_save 2+ times. Each hook uses Haiku for summarization with a "SKIP"/"NONE" escape hatch to avoid noise.

See Auto-Save Hooks for full architecture details.
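The significance filter described above can be sketched as a pure predicate. The `SessionStats` shape and `isSignificant` name are illustrative, not part of the daemon's API:

```typescript
// Decide whether a completed session is worth auto-saving memories for.
interface SessionStats {
  toolCalls: number;       // tool invocations during the session
  userMessages: number;    // user turns during the session
  memorySaveCalls: number; // explicit memory_save calls the agent already made
}

function isSignificant(stats: SessionStats): boolean {
  // Trivial conversation: too little activity to be worth an auto-save
  if (stats.toolCalls < 2 && stats.userMessages < 3) return false;
  // Agent already saved explicitly 2+ times — auto-saving would duplicate
  if (stats.memorySaveCalls >= 2) return false;
  return true;
}
```

The hooks would check this predicate before spending a Haiku call on summarization.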


Context Loading (Session Start)

At session start, the daemon loads a <memory> block into the system prompt. This uses summary priming — loading the AI-generated summary instead of raw entries, keeping context cost low.

Loading Algorithm

LOAD_MEMORY_CONTEXT(project):

  blocks = []

  ── Phase 1: Pinned memories (always loaded, raw content) ──────────

  pinned = SELECT id, content, tags, created_at
           FROM memories
           WHERE pinned = 1
             AND superseded_by IS NULL
           ORDER BY created_at DESC

  for entry in pinned:
    blocks.push({ section: "pinned", content: entry.content })

  ── Phase 2: Latest compacted summary (big picture) ────────────────

  compacted = SELECT summary FROM memory_summaries
              WHERE type = 'compacted'
              ORDER BY created_at DESC
              LIMIT 1

  if compacted:
    blocks.push({ section: "context", content: compacted.summary })

  ── Phase 3: Incremental summaries since last compaction ───────────

  incrementals = SELECT summary FROM memory_summaries
                 WHERE type = 'incremental'
                   AND created_at > COALESCE(
                     (SELECT MAX(created_at) FROM memory_summaries WHERE type = 'compacted'),
                     '1970-01-01'
                   )
                 ORDER BY created_at DESC

  for s in incrementals:
    blocks.push({ section: "recent", content: s.summary })

  ── Phase 4: Unsummarized entries (newest, not yet in any summary) ─

  unsummarized = SELECT id, content, created_at
                 FROM memories
                 WHERE summarized = 0
                   AND superseded_by IS NULL
                   AND pinned = 0
                 ORDER BY created_at DESC
                 LIMIT 10

  if unsummarized.length > 0:
    blocks.push({ section: "latest", content: format_entries(unsummarized) })

  RETURN format_memory_block(blocks)
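The final `format_memory_block` step can be sketched as a pure function over the collected blocks. The section-to-heading mapping mirrors the injection example in this spec; the exact names are illustrative:

```typescript
// Assemble collected blocks into the <memory> string injected into the prompt.
interface MemoryBlock {
  section: 'pinned' | 'context' | 'recent' | 'latest';
  content: string;
}

const HEADINGS: Record<MemoryBlock['section'], string> = {
  pinned: '## Pinned',
  context: '## Context',
  recent: '## Recent',
  latest: '## Latest (not yet summarized)',
};

function formatMemoryBlock(blocks: MemoryBlock[]): string {
  if (blocks.length === 0) return '';
  // Group block contents by section, preserving insertion order within each
  const bySection = new Map<MemoryBlock['section'], string[]>();
  for (const b of blocks) {
    const list = bySection.get(b.section) ?? [];
    list.push(b.content);
    bySection.set(b.section, list);
  }
  const parts: string[] = ['<memory>', 'You have persistent memory from previous sessions.'];
  for (const section of ['pinned', 'context', 'recent', 'latest'] as const) {
    const items = bySection.get(section);
    if (!items) continue;
    parts.push(HEADINGS[section], ...items);
  }
  parts.push('</memory>');
  return parts.join('\n\n');
}
```

Returning the empty string for an empty block list lets the daemon skip injection entirely on a fresh install.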

Context Window Injection

```xml
<memory>
You have persistent memory from previous sessions.

## Pinned
- Deploy key for staging is in 1Password under 'staging-deploy' [2026-01-15]
- User prefers Bun over Node for new TypeScript projects [2026-01-20]

## Context
The user works on bot0, a local-first AI agent system using pnpm workspaces with Turborepo. The daemon is the core component. Authentication uses hardware-bound device keys via Secure Enclave/TPM 2.0. The project uses Drizzle ORM for database schema management with Supabase PostgreSQL. The user prefers TypeScript strict mode and monospace terminal aesthetics.

## Recent
Last few sessions focused on designing a persistent memory system for the daemon. Evaluated flat memory log vs git-like context controller approaches. Decided on a hybrid: flat log simplicity with AI summary priming. SQLite with FTS5 for search.

## Latest (not yet summarized)
- ctx0-persistent-memory.md spec finalized with hybrid approach [2026-02-26]
- Summarization triggers: entry threshold, session end, session start catch-up, periodic interval, daily compaction cron [2026-02-26]
</memory>
```

Token Budget

| Component | Tokens | Notes |
|---|---|---|
| Header + section markers | ~30 | `<memory>` tags, section headings |
| Pinned entries (3-5) | ~100-200 | Raw content, always loaded |
| Compacted summary | ~100-200 | AI-generated, concise |
| Recent incremental summaries (1-3) | ~100-200 | Since last compaction |
| Unsummarized entries (0-10) | ~0-200 | Only if any exist |
| Total | ~300-500 | <1% of 80K context window |

Compare: the original flat log spec loaded ~4,000 tokens of raw entries. This is 8-13x more efficient.

Fallback: No Summaries Yet

If no summaries exist (new user, fresh install), fall back to loading raw entries with a token budget — identical to the original flat log approach:

if no summaries exist:
  load pinned (up to 1000 tokens)
  load recent entries (up to 2000 tokens)
  load project entries (up to 500 tokens)
  total: up to ~3500 tokens (one-time cost until first summarization runs)

This ensures the system works immediately on first install, before any summarization has occurred.
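The per-phase budgets above can be enforced with a simple greedy selection. This is a sketch under one common assumption — roughly 4 characters per token — and `takeWithinBudget` is an illustrative helper, not daemon API:

```typescript
// Rough token estimate: ~4 characters per token (assumption, not a tokenizer).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Greedily take entries (newest-first input) until the budget would overflow.
function takeWithinBudget(entries: string[], budgetTokens: number): string[] {
  const taken: string[] = [];
  let used = 0;
  for (const entry of entries) {
    const cost = estimateTokens(entry);
    if (used + cost > budgetTokens) break; // stop at the first overflowing entry
    taken.push(entry);
    used += cost;
  }
  return taken;
}
```

The fallback loader would apply this per phase: pinned entries against a 1000-token budget, recent entries against 2000, project entries against 500.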


Agent Tools

memory_save

```typescript
interface MemorySaveTool {
  name: 'memory_save';
  description: 'Save information to your persistent memory. Use when you learn something worth remembering across sessions: user preferences, project patterns, environmental facts, surprising discoveries, or anything the user explicitly asks you to remember.';
  parameters: {
    /** The memory content. Clear, concise, standalone. Max 2000 chars. */
    content: string;
    /** Tags for categorization: "preference", "codebase", "contact", "infra", "decision", "pattern", "debug" */
    tags?: string[];
    /** Pin to always load into context. Use sparingly — pinned entries consume budget every session. */
    pin?: boolean;
  };
}
```

Implementation:

```typescript
import { nanoid } from 'nanoid';

const insertStmt = db.prepare(`
  INSERT INTO memories (id, content, tags, source, project, pinned, created_at)
  VALUES (?, ?, ?, ?, ?, ?, datetime('now'))
`);

function memorySave(
  params: { content: string; tags?: string[]; pin?: boolean },
  context: { source: 'user' | 'agent' | 'auto'; project?: string },
): string {
  const { content, tags = [], pin = false } = params;

  // Enforce max content length (~500 tokens ≈ ~2000 chars)
  if (content.length > 2000) {
    return 'Memory too long. Keep under ~500 tokens (2000 chars). Be more concise.';
  }

  const id = `m_${nanoid(6)}`;
  insertStmt.run(
    id,
    content.trim(),
    JSON.stringify(tags),
    context.source,
    context.project ?? null,
    pin ? 1 : 0,
  );

  // Check if summarization should trigger (entry count threshold)
  checkSummarizationTrigger();

  return `Saved${pin ? ' (pinned)' : ''}: "${content.slice(0, 60)}..."`;
}
```

memory_search

Full-text search over the raw memory log using FTS5. Available when the pre-loaded summary isn't detailed enough.

```typescript
interface MemorySearchTool {
  name: 'memory_search';
  description: 'Search your memory for specific information. Use when the summary in your context doesn\'t contain what you need — this searches raw entries for exact details.';
  parameters: {
    /** Search query — natural language or keywords */
    query: string;
    /** Filter by tags */
    tags?: string[];
    /** Filter by project */
    project?: string;
    /** Maximum results (default 10) */
    limit?: number;
  };
}
```

Implementation:

```typescript
const ftsSearchStmt = db.prepare(`
  SELECT m.id, m.content, m.tags, m.project, m.created_at
  FROM memories m
  JOIN memories_fts fts ON m.rowid = fts.rowid
  WHERE memories_fts MATCH ?
    AND m.superseded_by IS NULL
  ORDER BY rank
  LIMIT ?
`);

const keywordSearchStmt = db.prepare(`
  SELECT id, content, tags, project, created_at
  FROM memories
  WHERE content LIKE ?
    AND superseded_by IS NULL
  ORDER BY created_at DESC
  LIMIT ?
`);

function memorySearch(params: {
  query: string;
  tags?: string[];
  project?: string;
  limit?: number;
}): string {
  const { query, tags, project, limit = 10 } = params;

  // Try FTS5 first (handles phrases, boolean operators, ranking)
  let results: MemoryRow[];
  try {
    results = ftsSearchStmt.all(query, limit) as MemoryRow[];
  } catch {
    // FTS5 syntax error (e.g., unbalanced quotes) — fall back to LIKE
    results = keywordSearchStmt.all(`%${query}%`, limit) as MemoryRow[];
  }

  // Post-filter by tags and project (small result set, fast in JS).
  // Note: both SELECTs include project so this filter can see it.
  if (tags?.length) {
    results = results.filter(r => {
      const entryTags: string[] = JSON.parse(r.tags);
      return tags.some(t => entryTags.includes(t));
    });
  }
  if (project) {
    results = results.filter(r => r.project === project);
  }

  if (results.length === 0) {
    return 'No memories found matching your search.';
  }

  return results
    .map(r => {
      const entryTags: string[] = JSON.parse(r.tags);
      return `- ${r.content} [${entryTags.join(', ')}] (${r.created_at.slice(0, 10)})`;
    })
    .join('\n');
}
```

Memory Lifecycle

Writing

┌─────────────────────────────────────────────────────────────────┐
│                    MEMORY WRITE PATH                              │
│                                                                   │
│   Any save trigger                                               │
│   → Build MemoryEntry object                                     │
│   → db.prepare(INSERT).run(...)              (<1ms, local)       │
│   → FTS5 trigger auto-updates search index   (<1ms, local)       │
│   → Check summarization trigger              (entry count check)  │
│   → Push to cloud (fire-and-forget)          (non-blocking)      │
│   → Done. <1ms locally. Cloud push is async.                     │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘

Reading (Session Start)

┌─────────────────────────────────────────────────────────────────┐
│                    MEMORY READ PATH                               │
│                                                                   │
│   New conversation starts                                        │
│   → Pull new memories from cloud (other daemons' writes)        │
│     (INSERT OR IGNORE into local SQLite, ~50-150ms)             │
│   → Check for unsummarized entries from crashed sessions         │
│     (self-healing catch-up — summarize if needed)                │
│   → Load pinned entries (raw content)                            │
│   → Load latest compacted summary                                │
│   → Load incremental summaries since last compaction             │
│   → Load any unsummarized entries (raw, newest)                  │
│   → Format into <memory> block string                            │
│   → Inject into system prompt                                    │
│   → Done. ~300-500 tokens.                                       │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘

Updating (Supersede Pattern)

Memories are never edited in place. To update, insert a new entry and mark the old one:

```typescript
const supersedeTx = db.transaction((oldId: string, newContent: string, project?: string) => {
  const old = db.prepare('SELECT * FROM memories WHERE id = ?').get(oldId);
  if (!old) return;

  const newId = `m_${nanoid(6)}`;

  // Insert replacement
  insertStmt.run(newId, newContent, old.tags, 'agent', project ?? old.project, old.pinned);

  // Mark old as superseded
  db.prepare('UPDATE memories SET superseded_by = ? WHERE id = ?').run(newId, oldId);
});
```

Using a transaction ensures atomicity — both the insert and the update happen together or not at all.


Summarization System

Summarization is the bridge between raw entries and context-efficient loading. It runs asynchronously — never blocking the conversation — and uses Haiku via the existing LLM proxy.

Two-Tier Summary Model

| Tier | Type | Contains | Created by | Lifespan |
|---|---|---|---|---|
| Incremental | incremental | Summary of a batch of recent entries | Entry threshold, session end, periodic timer | Accumulates throughout the day |
| Compacted | compacted | Summary of multiple incremental summaries | Daily cron job | Rolls up incrementals, represents longer-term knowledge |

This ensures the memory_summaries table stays bounded regardless of how long the system has been running. After 6 months of daily use, context loading still involves one compacted summary + a few recent incrementals — not hundreds of summaries.

Summarization Triggers

Five trigger points ensure memories are summarized at the right times:

1. Entry Count Threshold (Primary)

After every memory_save, check if unsummarized entries exceed a threshold. If so, trigger summarization.

```typescript
const SUMMARIZE_THRESHOLD = 20;

function checkSummarizationTrigger(): void {
  const { count } = db.prepare(`
    SELECT COUNT(*) as count FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
  `).get() as { count: number };

  if (count >= SUMMARIZE_THRESHOLD) {
    // Run async — don't block the conversation
    summarizeEntries().catch(err =>
      console.error('[memory] Summarization failed:', err)
    );
  }
}
```

This is self-regulating: heavy sessions trigger more summaries, quiet sessions don't waste API calls.

2. Session End

When a session ends, summarize any remaining unsummarized entries. Ensures nothing is left dangling.

```typescript
async function onSessionEnd(sessionId: string): Promise<void> {
  try {
    const { count } = db.prepare(`
      SELECT COUNT(*) as count FROM memories
      WHERE summarized = 0 AND superseded_by IS NULL
    `).get() as { count: number };

    if (count > 0) {
      await summarizeEntries();
      console.log(`[memory] Summarized ${count} entries at session end`);
    }
  } catch (err) {
    // Non-fatal — log but don't break session cleanup
    console.error('[memory] Session-end summarization failed:', err);
  }
}
```

3. Session Start (Self-Healing Catch-Up)

If the previous session crashed or summarization failed, session start detects orphaned unsummarized entries and catches up. This is the self-healing mechanism.

```typescript
async function onSessionStart(): Promise<void> {
  const { count } = db.prepare(`
    SELECT COUNT(*) as count FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
  `).get() as { count: number };

  // If there are unsummarized entries older than the latest summary,
  // a previous session likely crashed before summarizing
  if (count > SUMMARIZE_THRESHOLD) {
    console.log(`[memory] Catch-up: ${count} unsummarized entries from previous session`);
    await summarizeEntries();
  }
}
```

4. Periodic Interval (Long Sessions)

A timer in the daemon catches the case of very long sessions where the entry count threshold hasn't been hit but time has passed.

```typescript
const SUMMARIZE_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes

let summarizeTimer: NodeJS.Timeout | null = null;

function startPeriodicSummarization(): void {
  summarizeTimer = setInterval(async () => {
    const { count } = db.prepare(`
      SELECT COUNT(*) as count FROM memories
      WHERE summarized = 0 AND superseded_by IS NULL
    `).get() as { count: number };

    if (count > 0) {
      await summarizeEntries();
    }
  }, SUMMARIZE_INTERVAL_MS);
}

function stopPeriodicSummarization(): void {
  if (summarizeTimer) {
    clearInterval(summarizeTimer);
    summarizeTimer = null;
  }
}
```

5. Daily Compaction Cron

A daily job rolls up incremental summaries into a single compacted summary. This prevents the memory_summaries table from growing unbounded and keeps context loading fast.

```typescript
async function dailyCompaction(): Promise<void> {
  // Get all incremental summaries older than 24 hours
  const incrementals = db.prepare(`
    SELECT id, summary, entry_count, period_start, period_end
    FROM memory_summaries
    WHERE type = 'incremental'
      AND created_at < datetime('now', '-1 day')
    ORDER BY created_at ASC
  `).all() as SummaryRow[];

  if (incrementals.length < 2) return; // Nothing to compact

  // AI-summarize the incremental summaries into one compacted summary
  const compactedText = await generateCompactionSummary(
    incrementals.map(s => s.summary)
  );

  const totalEntries = incrementals.reduce((sum, s) => sum + s.entry_count, 0);
  const periodStart = incrementals[0].period_start;
  const periodEnd = incrementals[incrementals.length - 1].period_end;

  // Atomic: insert compacted + delete old incrementals
  const compactTx = db.transaction(() => {
    db.prepare(`
      INSERT INTO memory_summaries (id, type, summary, entry_count, entry_ids, period_start, period_end)
      VALUES (?, 'compacted', ?, ?, '[]', ?, ?)
    `).run(`ms_${nanoid(6)}`, compactedText, totalEntries, periodStart, periodEnd);

    const ids = incrementals.map(s => s.id);
    db.prepare(`
      DELETE FROM memory_summaries WHERE id IN (${ids.map(() => '?').join(',')})
    `).run(...ids);
  });
  compactTx();

  console.log(`[memory] Compacted ${incrementals.length} summaries into 1`);
}
```

Trigger Priority

| Priority | Trigger | When | Why |
| --- | --- | --- | --- |
| Must have | Entry count threshold | After every memory_save | Self-regulating, handles all session patterns |
| Must have | Session start catch-up | On new session | Self-healing, covers crashes and failures |
| Should have | Session end | On session cleanup | Ensures nothing is left unsummarized |
| Should have | Daily compaction cron | Once daily (e.g., 3am) | Keeps summaries bounded over months |
| Nice to have | Periodic interval | Every 5 minutes | Extra safety net for marathon sessions |
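The "must have" entry-count trigger can be sketched as a pure decision helper plus a wiring function. This is a hypothetical sketch: the threshold value (20), the function names, and the injected stand-ins are assumptions, not the actual implementation.

```typescript
// Assumed threshold: entries accumulated before an incremental summary fires.
const SUMMARIZE_THRESHOLD = 20;

// Pure decision helper so the trigger logic is testable in isolation.
function shouldSummarize(unsummarizedCount: number, threshold: number = SUMMARIZE_THRESHOLD): boolean {
  return unsummarizedCount >= threshold;
}

// Wiring sketch: invoked after every memory_save. The DB count query and the
// summarizer are passed in as stand-ins for the real implementations.
async function checkSummarizationTrigger(
  countUnsummarized: () => number,
  summarize: () => Promise<void>,
): Promise<void> {
  if (shouldSummarize(countUnsummarized())) {
    // Failures are logged, never thrown — summarization must not block saves.
    await summarize().catch(err => console.error('[memory] summarize failed:', err));
  }
}
```

Keeping the threshold comparison in its own function makes the self-regulating behavior trivial to unit-test without a database.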

AI Summarization Prompts

Incremental Summary (entries → summary)

```typescript
async function generateIncrementalSummary(
  entries: Array<{ content: string; tags: string[]; source: string }>
): Promise<string> {
  const entryText = entries
    .map((e, i) => `${i + 1}. ${e.content}`)
    .join('\n');

  const response = await haiku.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 300,
    messages: [{
      role: 'user',
      content: `Summarize these memories into a concise paragraph (2-5 sentences).
Preserve key facts, decisions, and preferences. Drop redundancies.
Write in third person present tense ("The user prefers...", "The project uses...").

Memories:
${entryText}

Summary:`,
    }],
  });

  return response.content[0].text.trim();
}
```

Compaction Summary (summaries → summary)

```typescript
async function generateCompactionSummary(
  summaries: string[]
): Promise<string> {
  const summaryText = summaries
    .map((s, i) => `${i + 1}. ${s}`)
    .join('\n\n');

  const response = await haiku.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 500,
    messages: [{
      role: 'user',
      content: `Merge these incremental memory summaries into a single comprehensive summary.
Preserve all key facts, preferences, and decisions. Remove duplicates.
Resolve any contradictions (prefer more recent information).
Write in third person present tense. Keep it concise but complete.

Summaries to merge:
${summaryText}

Merged summary:`,
    }],
  });

  return response.content[0].text.trim();
}
```

Core Summarization Function

```typescript
async function summarizeEntries(): Promise<void> {
  // Get unsummarized entries
  const entries = db.prepare(`
    SELECT id, content, tags, source, created_at
    FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
    ORDER BY created_at ASC
  `).all() as MemoryRow[];

  if (entries.length === 0) return;

  // Generate AI summary
  const summary = await generateIncrementalSummary(
    entries.map(e => ({
      content: e.content,
      tags: JSON.parse(e.tags),
      source: e.source,
    }))
  );

  const entryIds = entries.map(e => e.id);
  const periodStart = entries[0].created_at;
  const periodEnd = entries[entries.length - 1].created_at;

  // Atomic: insert summary + mark entries as summarized
  const summarizeTx = db.transaction(() => {
    db.prepare(`
      INSERT INTO memory_summaries (id, type, summary, entry_count, entry_ids, period_start, period_end)
      VALUES (?, 'incremental', ?, ?, ?, ?, ?)
    `).run(`ms_${nanoid(6)}`, summary, entries.length, JSON.stringify(entryIds), periodStart, periodEnd);

    db.prepare(`
      UPDATE memories SET summarized = 1
      WHERE id IN (${entryIds.map(() => '?').join(',')})
    `).run(...entryIds);
  });
  summarizeTx();

  console.log(`[memory] Summarized ${entries.length} entries`);
}
```

Cost Analysis

| Operation | Model | Input tokens | Output tokens | Cost |
| --- | --- | --- | --- | --- |
| Incremental summary (20 entries) | Haiku | ~1000 | ~150 | ~$0.0002 |
| Compaction (5 incrementals) | Haiku | ~1500 | ~300 | ~$0.0003 |
| Typical day (2-3 incrementals + 0-1 compaction) | Haiku | ~3500 | ~600 | ~$0.0007 |
| Heavy day (10 incrementals + 1 compaction) | Haiku | ~12000 | ~2000 | ~$0.003 |

Negligible. A heavy month of daily use costs ~$0.10.


How SQLite, FTS5, and Summarization Work Together

Each component serves a different moment in the workflow:

SESSION START
│
├─ Pull new memories from cloud (other daemons' writes)
│  → SELECT WHERE synced_at > last_sync_ts
│  → INSERT OR IGNORE into local SQLite
│  → Trigger summarization catch-up if new entries arrived
│
├─ SQLite query: load summaries + pinned entries
│  → inject ~300-500 tokens into <memory> system prompt
│  → agent starts with "big picture" context (includes other daemons' knowledge)
│
MID-CONVERSATION (on demand)
│
├─ Agent calls memory_save("user wants dark mode in all UIs")
│  → SQLite INSERT into memories (<1ms)
│  → FTS5 index auto-updated (SQLite trigger)
│  → Check entry count threshold → maybe trigger summarization
│  → Push to cloud (fire-and-forget, non-blocking)
│
├─ Agent calls memory_search("dark mode preference")
│  → FTS5 MATCH query → returns ranked raw entries (<1ms)
│  → Agent sees actual content, high fidelity
│
BACKGROUND (async, non-blocking)
│
├─ Summarization triggers (entry threshold / timer / session end)
│  → Read unsummarized entries from SQLite
│  → Send to Haiku for summary generation (~500-2000ms)
│  → INSERT summary + mark entries as summarized (atomic transaction)
│
SESSION END
│
├─ Final summarization of remaining unsummarized entries
│  → Next session loads fresh summary
│
DAILY (cron)
│
├─ Compaction: roll up incremental summaries into one compacted summary
│  → Keeps context loading bounded regardless of history length

| Moment | Engine | What it does | Token cost |
| --- | --- | --- | --- |
| Session start | SQLite (plain queries) | Load summaries + pinned entries | ~300-500 tokens |
| Agent saves | SQLite (INSERT) | Write memory, update FTS5 index | 0 tokens |
| Agent searches | FTS5 (MATCH) | Find relevant raw entries by full-text search | Only results returned |
| Summarization | SQLite + Haiku | Summarize batch of entries | 0 context tokens (background) |
| Daily compaction | SQLite + Haiku | Roll up summaries | 0 context tokens (background) |

SQLite is the storage engine — reads and writes are <1ms, crash-safe, zero network.

FTS5 is the accuracy layer — when the summary isn't enough, search raw entries with full-text ranking. The agent always has access to the original content.

Summarization is the efficiency layer — keeps context costs low by distilling raw entries into concise summaries. Runs in the background, fails gracefully, never blocks.


Latency Profile

| Operation | Latency | Notes |
| --- | --- | --- |
| Load context (summaries + pinned) | <1ms | Local SQLite, indexed |
| Format memory block | <1ms | String concatenation |
| Save memory (INSERT) | <1ms | Single row, WAL mode |
| FTS5 index update (trigger) | <1ms | Automatic |
| Full-text search (FTS5 MATCH) | <1ms | Built-in, pre-indexed |
| Update/supersede | <1ms | Transaction, by primary key |
| Incremental summarization | 500-2000ms | Only network call (Haiku) |
| Compaction | 500-2000ms | Haiku + SQLite transaction |

Compare with remote DB: Supabase takes 50-150ms per query due to network. Local SQLite is 100-1000x faster.


System Prompt Instructions

The agent's system prompt includes instructions for memory usage:

```markdown
## Memory

You have persistent memory stored locally on this machine. A summary of what
you've learned is shown in the <memory> block above.

### When to save memories

Save a memory when you encounter:

- **User requests**: "Remember that...", "Save this for later", "Don't forget..."
- **User preferences**: Implicit or explicit (coding style, tools, communication)
- **Surprising facts**: Information that contradicts assumptions or is non-obvious
- **Project patterns**: Codebase conventions, architecture decisions, deploy configs
- **People info**: Names, roles, preferences, relationships
- **Resolutions**: How a tricky problem was solved (error + fix)
- **Environment**: API endpoints, config locations, access patterns

### When NOT to save

- Transient information (temp debugging steps, one-off commands)
- Information already in the codebase (README content, code comments)
- Obvious facts any LLM would know
- Duplicate of something already in your <memory>

### How to write good memories

- Clear, standalone sentences — useful without the original conversation
- Specific: "User prefers Bun over Node for new TS projects" not "User likes Bun"
- Include the 'why' when relevant: "Chose Supabase over Firebase for pgvector support"
- Tag appropriately: preference, codebase, contact, infra, decision, pattern, debug

### Using your memories

- Check <memory> before asking the user things you might already know
- If the summary doesn't have what you need, use memory_search for exact details
- When you notice a memory is outdated, save a corrected version
```

Edge Cases & Mitigations

Memory Spam

Problem: Agent saves too many memories per conversation.

Mitigation:

  • Rate limit: max 5 saves per conversation turn (enforced in tool)
  • Dedup: FTS5 search for similar content before inserting — skip if top match has high rank
  • System prompt: "Be selective. Not every fact is worth remembering."
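The dedup check can be sketched as two small helpers. This is a hypothetical sketch: the names, the 12-term cap, and the rank threshold are assumptions. FTS5's bm25() scores are negative (more negative means a stronger match), so "high rank" here means a score at or below the threshold.

```typescript
// Assumed threshold: bm25 score at or below this counts as a near-duplicate.
const DEDUP_RANK_THRESHOLD = -10;

// Quote each term so punctuation in memory content can't be misparsed as
// FTS5 query syntax, and OR the terms for a fuzzy similarity probe.
function toFtsQuery(content: string): string {
  return content
    .split(/\W+/)
    .filter(Boolean)
    .slice(0, 12) // the first few terms are enough for a similarity probe
    .map(t => `"${t}"`)
    .join(' OR ');
}

// Decide whether to skip the insert based on the best match's bm25 score.
function isLikelyDuplicate(topMatchRank: number | null): boolean {
  if (topMatchRank === null) return false; // no existing match at all
  return topMatchRank <= DEDUP_RANK_THRESHOLD;
}
```

Escaping the probe query matters: a memory like `user prefers "dark mode" (always!)` would otherwise be parsed as FTS5 operators rather than plain terms.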

Stale Memories

Problem: Old memories become outdated but still appear in summaries.

Mitigation:

  • Supersede pattern: new saves mark old entries via superseded_by
  • Agent instruction: "When a loaded memory is wrong, save a corrected version"
  • Compaction naturally ages out stale information as summaries get re-rolled

Large Individual Memories

Problem: Single memory consumes too much budget.

Mitigation:

  • Hard limit: 2000 chars (~500 tokens) per entry, enforced at save time
  • Rejection message guides agent to be more concise
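The save-time check can be sketched as a small validator. The 2000-char limit comes from this section; the function shape and the rejection wording are assumptions.

```typescript
// Assumed limit from this document: ~500 tokens per entry.
const MAX_MEMORY_CHARS = 2000;

// Returns either success or a rejection message that steers the agent
// toward saving a shorter, more focused entry.
function validateMemoryContent(content: string): { ok: true } | { ok: false; error: string } {
  if (content.length > MAX_MEMORY_CHARS) {
    return {
      ok: false,
      error: `Memory too long (${content.length} chars, max ${MAX_MEMORY_CHARS}). ` +
        'Save a shorter, more focused entry instead.',
    };
  }
  return { ok: true };
}
```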

Summarization Failure

Problem: Haiku API is down or unreachable.

Mitigation:

  • Entries still save normally (no network needed for writes)
  • Context loading falls back to raw entries with token budget
  • Next session start runs catch-up summarization
  • System is fully functional without summaries — just slightly less context-efficient
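The raw-entry fallback can be sketched as a budget-packing loop. This is a hypothetical sketch: the 1500-token budget and the chars/4 token estimate are assumptions.

```typescript
interface Entry { content: string; created_at: string }

// Rough heuristic: ~4 characters per token for English prose.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// When no summaries exist (or Haiku is down), load the newest raw entries
// until the context budget is spent.
function loadRawFallback(entries: Entry[], tokenBudget = 1500): Entry[] {
  const sorted = [...entries].sort((a, b) => b.created_at.localeCompare(a.created_at));
  const picked: Entry[] = [];
  let used = 0;
  for (const e of sorted) {
    const cost = estimateTokens(e.content);
    if (used + cost > tokenBudget) break;
    picked.push(e);
    used += cost;
  }
  return picked;
}
```

Newest-first ordering means a summarization outage degrades recall of old memories first, while the most recent context stays intact.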

Summary Quality Drift

Problem: Compaction over many cycles loses nuance.

Mitigation:

  • Raw entries are never deleted — they remain searchable via FTS5
  • memory_search always searches raw content, not summaries
  • Agent can always find exact details even if the summary omits them

Database Corruption

Problem: Crash could corrupt the database.

Mitigation:

  • WAL mode: SQLite's Write-Ahead Logging provides automatic crash recovery
  • Transactions: summarization and supersede operations are atomic
  • PRAGMA integrity_check can be run as a health check
  • Users can back up by copying memory.db (safe to copy when no writers are active)

Cross-Project Pollution

Problem: Memories from Project A load when working on Project B.

Mitigation:

  • project column on each entry + indexed query
  • Summaries include project context naturally (Haiku includes it in the summary text)
  • memory_search supports project filtering
  • project IS NULL entries are global — useful for cross-cutting knowledge

Comparison with Other Approaches

| Aspect | This System | Flat Memory Log | Git Context Controller | Full Vault (ctx0 DB) | Claude Code Auto-Memory |
| --- | --- | --- | --- | --- | --- |
| Storage | Local SQLite + cloud sync | Local SQLite | Local SQLite | Supabase PostgreSQL | Local MEMORY.md |
| Context cost | ~300-500 tokens | ~4,000 tokens | ~130-330 tokens | On-demand retrieval | ~200 lines |
| Write latency | <1ms (local) | <1ms | <1ms | 20-50ms (network) | <1ms |
| Read latency | <1ms (local) | <1ms | <1ms | 50-150ms (network) | <1ms |
| Search | FTS5 over raw entries | FTS5 over raw entries | Summaries only (no FTS5) | pg_trgm / tsvector | String grep |
| AI cost | ~$0.001/day | $0 | ~$0.001/commit | ~$0.01/session | $0 |
| Offline | Yes (except sync + summarization) | Fully offline | Yes (except summarization) | No | Fully offline |
| Cross-device | Yes (via cloud sync) | No (manual export) | No (manual export) | Built-in | No (manual) |
| Multi-daemon | Yes (row-level sync) | No | No | Yes | No |
| Complexity | 2 tools, ~400 LOC | 2 tools, ~200 LOC | 5 tools, ~400 LOC | 8+ tables, 3 subagents | 1 file, built-in |
| Self-healing | Yes (session start catch-up) | No | Orphaned observations persist | Server-side | No |
| Scales | Millions of rows + compaction | Millions of rows | Millions of rows | Unlimited | ~200 lines |

Implementation Plan

Phase 1: Core (MVP) ✅

Goal: Agent can save and search memories from local SQLite. Context loading uses raw entries (no summarization yet).

| Task | Files | Effort |
| --- | --- | --- |
| Add better-sqlite3 dependency to daemon | packages/daemon/package.json | Trivial |
| Create MemoryEntry, SummaryEntry types | packages/daemon/src/memory/types.ts | Trivial |
| Implement SQLite init + schema creation | packages/daemon/src/memory/db.ts | Small |
| Implement memory_save tool | packages/daemon/src/tools/memory-save.ts | Small |
| Implement memory_search tool (FTS5) | packages/daemon/src/tools/memory-search.ts | Small |
| Implement raw-entry memory loader (fallback mode) | packages/daemon/src/memory/loader.ts | Medium |
| Inject <memory> block into system prompt | packages/daemon/src/agent/loop.ts | Small |
| Add memory instructions to system prompt | packages/daemon/src/agent/agents.ts | Small |
| Token estimation utility | packages/daemon/src/memory/tokens.ts | Small |

Phase 2: Summary Priming ✅

Goal: Context loading uses AI-generated summaries instead of raw entries. Summarization triggers on entry count threshold and session end.

| Task | Files | Effort |
| --- | --- | --- |
| Implement generateIncrementalSummary (Haiku call) | packages/daemon/src/memory/summarize.ts | Medium |
| Implement summarizeEntries (core function) | summarize.ts | Medium |
| Implement entry count threshold trigger | summarize.ts | Small |
| Implement session-end summarization hook | summarize.ts | Small |
| Update loader to use summary-primed loading | loader.ts | Medium |
| Implement fallback to raw entries when no summaries exist | loader.ts | Small |

Phase 3: Self-Healing + Periodic ✅

Goal: System handles crashes, long sessions, and edge cases gracefully.

| Task | Files | Effort |
| --- | --- | --- |
| Implement session-start catch-up summarization | summarize.ts | Small |
| Implement periodic interval timer (5 min) | summarize.ts | Small |
| Supersede pattern (update existing memories) | db.ts | Small |
| Dedup check before saving (FTS5 similarity) | db.ts | Small |
| Rate limiting (max 5 saves per turn) | memory-save.ts | Small |

Phase 4: Daily Compaction

Goal: System handles long-term growth. Incremental summaries are rolled up into compacted summaries daily.

Status: Not implemented. Deferred — intended to run on the centralized cloud computer. Without this, incremental summaries accumulate over weeks/months and eventually consume too much context budget. The existing loader.ts already supports consuming compacted summaries — Phase 4 only needs to produce them.

Why it matters: After 6 months of daily use, context loading should still involve one compacted summary + a few recent incrementals — not hundreds of summaries.

| Task | Files | Effort |
| --- | --- | --- |
| Implement generateCompactionSummary (Haiku call, max 500 tokens) | packages/daemon/src/memory/summarize.ts | Small |
| Implement dailyCompaction (query incrementals > 24h old, AI-merge, atomic insert + delete) | summarize.ts | Medium |
| Register daily cron trigger in daemon lifecycle (e.g., setInterval at 24h or integrate with ScheduleExecutor) | packages/daemon/src/index.ts | Small |

Key behaviors:

  • Only compacts incremental summaries older than 24 hours
  • Requires at least 2 incrementals to trigger (otherwise no-op)
  • Atomic SQLite transaction: insert compacted summary + delete old incrementals
  • Old incrementals are deleted (raw entries they summarized remain in memories table)
  • Cost: ~$0.0003 per compaction (Haiku, ~1500 input tokens, ~300 output tokens)
  • Reference implementations: dailyCompaction() and generateCompactionSummary() in the AI Summarization Prompts section

Phase 5: Auto-Save Hooks ✅

Goal: Agent automatically saves memories on significant events without explicit tool calls.

Status: Implemented. Hooks subscribe to SessionStateChanged events on the daemon event bus, fire concurrently via Promise.allSettled on session completion, and use Haiku for summarization.

| Task | Files | Status |
| --- | --- | --- |
| Create hooks module with event bus subscription | packages/daemon/src/memory/hooks.ts | Done |
| Task completion summary (Haiku + digest builder) | hooks.ts | Done |
| User correction detection (last 6 messages → Haiku) | hooks.ts | Done |
| Error resolution detection (heuristic + Haiku) | hooks.ts | Done |
| Significance filtering (tool call + message thresholds) | hooks.ts | Done |
| Dedup check (skip if agent saved 2+ memories) | hooks.ts | Done |
| Export from memory index | packages/daemon/src/memory/index.ts | Done |
| Wire into daemon lifecycle (init + cleanup) | packages/daemon/src/index.ts | Done |

Phase 6: Cloud Sync

Goal: Multiple daemons (different devices) can share the same memory via Supabase.

Status: Not implemented. Deferred — requires the ctx0 Drizzle schema (Supabase side) and proxy DB infrastructure. The sync_meta table already exists in the local SQLite schema. The existing setMemoryToolDeps pattern for signing provider/config can be extended to include getDb for the ProxiedSupabaseClient.

Note on dependency injection: Phase 6 uses a separate getDb: () => ProxiedSupabaseClient | null dep (following the todo.ts/plan.ts pattern), NOT the existing MemoryToolDeps which carries getSigningProvider/getConfig for Haiku calls. The cloud sync deps wire into the Daemon constructor alongside setTodoToolDeps and setPlanToolDeps.

Note on auto-save hook integration: Auto-saved memories (Phase 5, source: 'auto') should also be pushed to cloud. The syncMemoryToCloud() call should be added alongside the insertMemory() calls in hooks.ts, or the push can be wired at the insertMemory() level in db.ts to cover all sources.

| Task | Files | Effort |
| --- | --- | --- |
| Add ctx0_memories Drizzle table + register | packages/ctx0/src/schema/memory.ts, index.ts | Small |
| Create packages/daemon/src/memory/sync.ts with MemorySyncDeps interface | sync.ts (new) | Trivial |
| Implement syncMemoryToCloud() (fire-and-forget upsert) | sync.ts | Small |
| Implement pullMemoriesFromCloud() (watermark-based incremental pull) | sync.ts | Medium |
| Implement bootstrapFromCloud() (new device full pull, paginated) | sync.ts | Medium |
| Implement syncSupersedeToCloud() (push supersede updates) | sync.ts | Small |
| Wire push into memory_save tool (fire-and-forget .catch(() => {})) | packages/daemon/src/tools/memory-save.ts | Trivial |
| Wire push into auto-save hooks (fire-and-forget) | packages/daemon/src/memory/hooks.ts | Trivial |
| Wire pull/bootstrap into initMemoryWithSync() at session start | packages/daemon/src/index.ts | Small |
| Wire MemorySyncDeps in Daemon constructor | packages/daemon/src/index.ts | Trivial |

Auto-Save Hooks

When a session completes, lightweight hooks automatically create memories for significant events — without relying on the LLM to explicitly call memory_save. This ensures the memory system captures useful information even when the agent doesn't think to save it.

Architecture

Session completes → SessionStateChanged event (bus)
                          │
                    hooks.ts subscriber
                          │
              ┌───────────┼───────────┐
              │           │           │
              ▼           ▼           ▼
        Task Summary  Correction  Error Resolution
         (Haiku)      (Haiku)     (heuristic+Haiku)
              │           │           │
              ▼           ▼           ▼
           insertMemory(source: 'auto')
              │
              ▼
        checkSummarizationTrigger()

All three hooks run concurrently via Promise.allSettled. Each is fire-and-forget — failures are logged but never block the session or each other.
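The fan-out can be sketched as follows. The hook names come from this section; the `Hook` signature and function name are assumptions.

```typescript
type Hook = (sessionId: string) => Promise<void>;

// Run all hooks concurrently; a rejected hook is logged and never rethrown,
// so one failing hook can't block the session or the other hooks.
async function runAutoSaveHooks(sessionId: string, hooks: Hook[]): Promise<void> {
  const results = await Promise.allSettled(hooks.map(h => h(sessionId)));
  for (const r of results) {
    if (r.status === 'rejected') {
      console.error('[memory] auto-save hook failed:', r.reason);
    }
  }
}
```

`Promise.allSettled` (rather than `Promise.all`) is the key choice: it always resolves, so the subscriber itself never surfaces a rejection to the event bus.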

Why Event Bus Over Direct IPC Call

  • Decoupled — hooks know nothing about sockets or IPC
  • Universal — fires for local, remote, scheduled, and trigger sessions
  • handleRunTask in server.ts is already 600+ lines; adding more logic there increases coupling

Significance Filtering (Avoid Noise)

Before running any hooks, the system applies filters to avoid saving trivial conversations:

| Filter | Threshold | Rationale |
| --- | --- | --- |
| Tool call count | < 2 tool calls | Conversations with no tool use are typically greetings or simple Q&A |
| User message count | < 3 user messages | Combined with above (both must be below threshold to skip) |
| Agent self-saves | ≥ 2 memory_save calls | Agent was already deliberate about saving — don't duplicate |
| Haiku "SKIP"/"NONE" | Per-hook | Final safety net — Haiku decides if the content is worth remembering |
| Inflight guard | Set per session ID | Prevents duplicate processing if event fires twice |
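The first three filters in the table are pure predicates and can be sketched directly. The threshold values come from the table; the stats shape and function name are assumptions.

```typescript
interface SessionStats {
  toolCalls: number;
  userMessages: number;
  agentMemorySaves: number;
}

// Decide whether the auto-save hooks should run for a completed session.
function shouldRunHooks(s: SessionStats): boolean {
  // Skip trivial sessions: BOTH counts must be below threshold to skip.
  if (s.toolCalls < 2 && s.userMessages < 3) return false;
  // Agent was already deliberate about saving — don't duplicate its work.
  if (s.agentMemorySaves >= 2) return false;
  return true;
}
```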

Hook 1: Task Completion Summary

Builds a condensed digest of the session (first user message, tools used, error count, final assistant response, truncated to ~2000 chars) and asks Haiku for a 1-2 sentence factual summary. Haiku responds "SKIP" if trivial.

Saved as: { tags: ['session_summary'], source: 'auto' }

Hook 2: User Correction Detection

Only runs if the session had 3+ user messages (corrections require back-and-forth). Extracts the last 6 messages and asks Haiku: "Did the user correct the agent? If yes, state the preference. If no, respond 'NONE'."

Saved as: { tags: ['preference', 'correction'], source: 'auto' }

Hook 3: Error Resolution Detection

Heuristic scan: looks for tool_result messages with toolError set, followed by a later successful result from the same tool. If the pattern is found, asks Haiku to summarize the error → fix pattern in one sentence.

Saved as: { tags: ['debug', 'error_resolution'], source: 'auto' }
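The heuristic scan can be sketched as a single pass over the session's tool results. The message shape here is an assumption — the real hook inspects tool_result messages on the session transcript.

```typescript
interface ToolResult {
  tool: string;
  isError: boolean;
}

// Find tools that failed and later succeeded — the "error → fix" pattern.
function findResolvedErrors(results: ToolResult[]): string[] {
  const failed = new Set<string>();
  const resolved = new Set<string>();
  for (const r of results) {
    if (r.isError) {
      failed.add(r.tool);
    } else if (failed.has(r.tool)) {
      // Same tool succeeded after an earlier failure → error was resolved.
      resolved.add(r.tool);
    }
  }
  return [...resolved];
}
```

Only when this scan finds a match does the hook spend a Haiku call summarizing the error → fix pattern, which keeps the cost profile described below.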

Haiku Access

Uses createProxiedModel(config, signingProvider, HAIKU_MODEL_ID, 'memory-auto-save') — same pattern as memory/summarize.ts. Requires authentication (no-ops silently if not authenticated).

Cost

~$0.0005–$0.001 per session (1-2 Haiku calls, ~500 input tokens + 50-100 output tokens each). Most sessions trigger 1 call (task summary). Correction and error hooks only fire when their heuristics match.

Files

| File | Description |
| --- | --- |
| packages/daemon/src/memory/hooks.ts | Hook implementations, event bus subscription, lifecycle |
| packages/daemon/src/memory/index.ts | Exports initAutoSaveHooks, stopAutoSaveHooks, AutoSaveHookDeps |
| packages/daemon/src/index.ts | Wires initAutoSaveHooks in Daemon.initialize(), stopAutoSaveHooks in Daemon.cleanup() |

Dependency Injection

```typescript
export interface AutoSaveHookDeps {
  getSessionManager: () => SessionManager | null;
  getSigningProvider: () => SigningProvider | null;
  getConfig: () => DaemonConfig;
}
```

Wired in Daemon.initialize():

```typescript
initAutoSaveHooks({
  getSessionManager: () => this.getOrCreateSessionManager(),
  getSigningProvider: () => this.signingProvider,
  getConfig: () => loadConfig(),
});
```

Cloud Sync

When a user runs multiple daemons (e.g., laptop + desktop), each daemon should see the same memories. Cloud sync makes this work by using Supabase as a shared merge point while keeping local SQLite as the fast path.

Design Principles

  1. Local-first, cloud-optional — All reads and writes hit local SQLite (<1ms). Cloud sync is fire-and-forget. If the user isn't authenticated or the cloud is unreachable, the system works exactly as before.

  2. Append-only sync — Memories use nanoid primary keys, so two daemons will practically never produce the same ID (the collision probability is negligible). This means INSERT OR IGNORE handles all deduplication with zero conflicts.

  3. Memories sync, summaries don't — Raw memory entries are the source of truth and sync bidirectionally. Summaries are local-only — each daemon generates its own from its merged memory set. This avoids summary conflicts entirely.

  4. Timestamp watermark — Each daemon tracks when it last synced. On pull, it only fetches rows newer than its watermark. Efficient even with thousands of memories.

Architecture

┌──────────────────────────────────────────────────────────────────────────┐
│                        MULTI-DAEMON CLOUD SYNC                            │
│                                                                           │
│   DAEMON A (laptop)                    DAEMON B (desktop)                │
│   ~/.bot0/memory.db                    ~/.bot0/memory.db                 │
│                                                                           │
│   memory_save("user prefers Bun")      memory_save("deploy key in 1PW")  │
│        │                                       │                         │
│        │ 1. INSERT locally (<1ms)              │ 1. INSERT locally       │
│        │ 2. Push to cloud (fire & forget)      │ 2. Push to cloud        │
│        ▼                                       ▼                         │
│   ┌──────────────────────────────────────────────────────┐               │
│   │           Supabase: ctx0_memories                     │               │
│   │                                                       │               │
│   │  id       │ content                    │ synced_at    │               │
│   │  m_a1b2c3 │ "user prefers Bun"         │ 2026-02-27   │               │
│   │  m_x7y8z9 │ "deploy key in 1PW"        │ 2026-02-27   │               │
│   │                                                       │               │
│   │  Proxy auto-injects user_id on all queries           │               │
│   └──────────────────────────────────────────────────────┘               │
│        │                                       │                         │
│        │ 3. Pull on session start              │ 3. Pull on session start│
│        │    (WHERE synced_at > last_sync_ts)   │                         │
│        ▼                                       ▼                         │
│   memory.db now has both memories      memory.db now has both memories   │
│   → local summarization catches up     → local summarization catches up  │
│                                                                           │
└──────────────────────────────────────────────────────────────────────────┘

Supabase Table: ctx0_memories

Mirrors the local memories table with user_id for RLS. Defined as a Drizzle schema in packages/ctx0/src/schema/memory.ts:

```typescript
export const ctx0Memories = pgTable('ctx0_memories', {
  id: text('id').notNull(),
  userId: uuid('user_id').notNull().references(() => ctx0Users.id),
  content: text('content').notNull(),
  tags: jsonb('tags').notNull().default([]),
  source: text('source').notNull().default('agent'),
  project: text('project'),
  pinned: boolean('pinned').notNull().default(false),
  supersededBy: text('superseded_by'),
  createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
  syncedAt: timestamp('synced_at', { withTimezone: true }).notNull().defaultNow(),
}, (table) => ({
  pk: primaryKey({ columns: [table.id, table.userId] }),
  userSyncIdx: index('idx_ctx0_memories_user').on(table.userId, table.syncedAt),
}));
```

Equivalent SQL:

```sql
CREATE TABLE ctx0_memories (
  id TEXT NOT NULL,
  user_id UUID NOT NULL REFERENCES auth.users(id),
  content TEXT NOT NULL,
  tags JSONB NOT NULL DEFAULT '[]'::jsonb,
  source TEXT NOT NULL DEFAULT 'agent',
  project TEXT,
  pinned BOOLEAN NOT NULL DEFAULT false,
  superseded_by TEXT,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  synced_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  PRIMARY KEY (id, user_id)
);

CREATE INDEX idx_ctx0_memories_user ON ctx0_memories(user_id, synced_at DESC);
```

Dependency Injection

Follows the todo.ts pattern — the daemon injects its ProxiedSupabaseClient at startup:

```typescript
interface MemoryToolDeps {
  getDb: () => ProxiedSupabaseClient | null;
}

let _deps: MemoryToolDeps | null = null;

export function setMemoryToolDeps(deps: MemoryToolDeps): void {
  _deps = deps;
}
```

If _deps is null or getDb() returns null (user not authenticated), all sync operations silently no-op. The system is fully functional in local-only mode.

Push: Fire-and-Forget After Save

After every local memory_save, push the new row to cloud. Non-blocking, non-fatal.

```typescript
async function syncMemoryToCloud(memory: {
  id: string;
  content: string;
  tags: string[];
  source: string;
  project: string | null;
  pinned: boolean;
}): Promise<void> {
  const db = _deps?.getDb();
  if (!db) return; // Not authenticated — skip silently

  try {
    await db.from('ctx0_memories').upsert({
      id: memory.id,
      content: memory.content,
      tags: memory.tags,
      source: memory.source,
      project: memory.project,
      pinned: memory.pinned,
      synced_at: new Date().toISOString(),
    }, { onConflict: 'id,user_id' });
  } catch (err) {
    console.error('[memory] Cloud push failed:', err);
    // Non-fatal — local SQLite is the source of truth during execution
  }
}
```

Wired into the save path:

```typescript
function memorySave(params: { content: string; tags?: string[]; pin?: boolean }, context) {
  // ... existing local INSERT ...

  // Fire-and-forget cloud push
  syncMemoryToCloud({ id, content, tags, source: context.source, project, pinned: pin })
    .catch(() => {});

  return `Saved: "${content.slice(0, 60)}..."`;
}
```

Push: Supersede Updates

When a memory is superseded locally, push the update:

```typescript
async function syncSupersedeToCloud(oldId: string, newId: string): Promise<void> {
  const db = _deps?.getDb();
  if (!db) return;

  try {
    await db.from('ctx0_memories')
      .update({ superseded_by: newId, synced_at: new Date().toISOString() })
      .eq('id', oldId);
  } catch (err) {
    console.error('[memory] Cloud supersede sync failed:', err);
  }
}
```

Pull: Session Start

On session start, pull memories created by other daemons since the last sync. This is the mechanism that gives Daemon B access to Daemon A's memories.

```typescript
async function pullMemoriesFromCloud(): Promise<number> {
  const db = _deps?.getDb();
  if (!db) return 0;

  try {
    // Read last sync timestamp
    const meta = localDb.prepare(
      'SELECT value FROM sync_meta WHERE key = ?'
    ).get('last_sync_ts') as { value: string } | undefined;
    const lastSyncTs = meta?.value ?? '1970-01-01T00:00:00Z';

    // Fetch new/updated memories from cloud
    const result = await db.from('ctx0_memories')
      .select('id,content,tags,source,project,pinned,superseded_by,created_at,synced_at')
      .gt('synced_at', lastSyncTs)
      .order('synced_at', { ascending: true })
      .limit(1000);

    if (!result.data || result.data.length === 0) return 0;
    const rows = result.data as CloudMemoryRow[];

    // INSERT OR IGNORE new memories into local SQLite
    const insertOrIgnore = localDb.prepare(`
      INSERT OR IGNORE INTO memories (id, content, tags, source, project, pinned, superseded_by, created_at)
      VALUES (?, ?, ?, ?, ?, ?, ?, ?)
    `);

    // UPDATE superseded_by for existing memories (if cloud has a newer supersede)
    const updateSupersede = localDb.prepare(`
      UPDATE memories SET superseded_by = ?
      WHERE id = ? AND superseded_by IS NULL AND ? IS NOT NULL
    `);

    let newCount = 0;
    const pullTx = localDb.transaction(() => {
      for (const row of rows) {
        const changes = insertOrIgnore.run(
          row.id, row.content, JSON.stringify(row.tags), row.source,
          row.project, row.pinned ? 1 : 0, row.superseded_by, row.created_at,
        ).changes;
        if (changes > 0) newCount++;

        // Apply supersede if it exists on cloud but not locally
        if (row.superseded_by) {
          updateSupersede.run(row.superseded_by, row.id, row.superseded_by);
        }
      }

      // Update watermark
      const latestSyncedAt = rows[rows.length - 1].synced_at;
      localDb.prepare(
        'INSERT OR REPLACE INTO sync_meta (key, value) VALUES (?, ?)'
      ).run('last_sync_ts', latestSyncedAt);
    });
    pullTx();

    if (newCount > 0) {
      console.log(`[memory] Pulled ${newCount} new memories from cloud`);
      // Trigger summarization catch-up for new entries
      checkSummarizationTrigger();
    }
    return newCount;
  } catch (err) {
    console.error('[memory] Cloud pull failed:', err);
    return 0; // Non-fatal — continue with local state
  }
}
```

New Device Bootstrap

When ~/.bot0/memory.db doesn't exist but the user is authenticated, bootstrap from cloud:

```typescript
async function bootstrapFromCloud(): Promise<boolean> {
  const db = _deps?.getDb();
  if (!db) return false;

  try {
    // Pull ALL memories for this user (paginated for large sets)
    let offset = 0;
    const pageSize = 500;
    let totalPulled = 0;

    while (true) {
      const result = await db.from('ctx0_memories')
        .select('id,content,tags,source,project,pinned,superseded_by,created_at,synced_at')
        .order('created_at', { ascending: true })
        .range(offset, offset + pageSize - 1);

      if (!result.data || result.data.length === 0) break;
      const rows = result.data as CloudMemoryRow[];

      const insertStmt = localDb.prepare(`
        INSERT OR IGNORE INTO memories (id, content, tags, source, project, pinned, superseded_by, created_at)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
      `);
      const insertTx = localDb.transaction(() => {
        for (const row of rows) {
          insertStmt.run(
            row.id, row.content, JSON.stringify(row.tags), row.source,
            row.project, row.pinned ? 1 : 0, row.superseded_by, row.created_at,
          );
        }
      });
      insertTx();

      totalPulled += rows.length;
      offset += pageSize;
      if (rows.length < pageSize) break; // Last page
    }

    if (totalPulled > 0) {
      console.log(`[memory] Bootstrapped ${totalPulled} memories from cloud`);
      // Set sync watermark
      localDb.prepare(
        'INSERT OR REPLACE INTO sync_meta (key, value) VALUES (?, ?)'
      ).run('last_sync_ts', new Date().toISOString());
      // Generate summaries for bootstrapped entries
      await summarizeEntries();
    }
    return totalPulled > 0;
  } catch (err) {
    console.error('[memory] Cloud bootstrap failed:', err);
    return false;
  }
}
```

Session Lifecycle Integration

```typescript
// In session start (packages/daemon/src/session/manager.ts or memory init)
async function initMemoryWithSync(): Promise<void> {
  const isNewDb = !existsSync(DB_PATH);

  // Initialize local SQLite (creates tables if needed)
  initMemoryDb();

  if (isNewDb) {
    // New device — try bootstrapping from cloud
    await bootstrapFromCloud();
  } else {
    // Existing device — pull new memories from other daemons
    await pullMemoriesFromCloud();
  }

  // Existing session start logic: catch-up summarization, etc.
  await onSessionStart();
}
```

Conflict Resolution

| Scenario | Resolution | Why it works |
| --- | --- | --- |
| Two daemons save different memories | No conflict — different nanoid PKs | `INSERT OR IGNORE` on both sides |
| Both daemons supersede different memories | Both supersedes apply independently | Each targets a different id |
| Both daemons supersede the SAME memory | Last-write-wins on `superseded_by` | Rare, harmless — both point to valid replacements |
| Summary divergence between daemons | No conflict — summaries are local-only | Each daemon summarizes its own merged set |
| Daemon offline for days | Pulls all missed rows via timestamp watermark | `synced_at > last_sync_ts` catches everything |
| Cloud unreachable on push | Fire-and-forget, memory exists locally | Retry happens on next save or session start |
| Cloud unreachable on pull | Skip, use local state | System works fully offline |
| User not authenticated | All sync operations silently no-op | `_deps?.getDb()` returns null |
| Very large memory set (10K+) | Paginated pull with batch `INSERT OR IGNORE` | `range()` pagination, transaction batches |
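The last-write-wins rule for concurrent supersedes can be stated as a pure function over supersede events, ordered by timestamp. This is a sketch of the policy only, not daemon code; the type and function names are illustrative:

```typescript
interface SupersedeEvent {
  targetId: string;      // memory being superseded
  replacementId: string; // memory that replaces it
  at: string;            // ISO-8601 timestamp of the event
}

// Last-write-wins: for each superseded memory, keep the replacement
// from the latest event. ISO timestamps compare correctly as strings.
function resolveSupersedes(events: SupersedeEvent[]): Map<string, string> {
  const latest = new Map<string, SupersedeEvent>();
  for (const e of events) {
    const cur = latest.get(e.targetId);
    if (!cur || e.at > cur.at) latest.set(e.targetId, e);
  }
  return new Map([...latest].map(([id, e]) => [id, e.replacementId]));
}

// Two daemons supersede the same memory: the later event wins.
const resolved = resolveSupersedes([
  { targetId: 'm1', replacementId: 'm2', at: '2025-12-01T10:00:00Z' },
  { targetId: 'm1', replacementId: 'm3', at: '2025-12-01T11:00:00Z' },
]);
console.log(resolved.get('m1')); // 'm3'
```

Either outcome would be acceptable here, which is why the conflict is harmless: both `m2` and `m3` are valid, existing replacements.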

Cost Analysis

| Operation | Network calls | Latency | Notes |
| --- | --- | --- | --- |
| Push (per `memory_save`) | 1 upsert | Fire-and-forget (~20-50ms) | Non-blocking |
| Pull (per session start) | 1 select | ~50-150ms | Only fetches new rows |
| Bootstrap (new device) | N selects (paginated) | ~100-500ms per page | One-time |
| Supersede sync | 1 update | Fire-and-forget (~20-50ms) | Rare |

Supabase cost for a typical user:

  • ~50 memories/day × 30 days = 1,500 rows/month
  • Each row ≈ 200 bytes → ~300 KB/month storage
  • ~60 push operations/day + ~2 pull operations/day → negligible
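The storage estimate above is straightforward to check. A quick arithmetic sketch (the input figures are the document's assumptions, not measurements):

```typescript
const memoriesPerDay = 50;   // assumed typical usage
const days = 30;
const bytesPerRow = 200;     // assumed average row size

const rowsPerMonth = memoriesPerDay * days;            // 1,500 rows
const storageKB = (rowsPerMonth * bytesPerRow) / 1000; // 300 KB

console.log(rowsPerMonth, storageKB); // 1500 300
```

Both figures sit far below any Supabase free-tier limit, which is why the table calls the request volume negligible.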

Future Extensions

Vector Search Upgrade

When logs grow past ~1000 entries, add local vector search:

  • sqlite-vec extension (SQLite native vector similarity)
  • Generate embeddings via local model (ONNX) or API call
  • Add embedding BLOB column to memories table
  • Use for memory_search alongside FTS5
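One way to use vector results "alongside" FTS5 is to merge the two ranked id lists with reciprocal rank fusion. This is a generic sketch of that merge step, not the planned implementation; the function name and the constant `k = 60` are illustrative:

```typescript
// Merge two ranked result lists (best match first) with reciprocal
// rank fusion: each id scores 1 / (k + rank + 1) per list it appears in.
function fuseResults(ftsIds: string[], vecIds: string[], k = 60): string[] {
  const score = new Map<string, number>();
  for (const list of [ftsIds, vecIds]) {
    list.forEach((id, rank) => {
      score.set(id, (score.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...score.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// 'b' ranks first because it appears in both lists.
const merged = fuseResults(['a', 'b', 'c'], ['b', 'd']);
console.log(merged); // [ 'b', 'a', 'd', 'c' ]
```

Fusion keeps the two indexes independent: FTS5 stays authoritative for exact terms, the vector index adds semantic recall, and neither needs to know about the other's scoring.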

Memory CLI

```bash
# List recent memories
bot0 memory list --limit 20

# Search
bot0 memory search "deploy key"

# Export
bot0 memory export --format json > backup.json
bot0 memory export --format md > readable.md

# Import
bot0 memory import backup.json

# Stats
bot0 memory stats
# → 342 memories, 8 summaries, 156 KB, oldest: 2025-12-01
```

Graduation to Full Vault

When a user outgrows the flat log:

  1. Each memory → ctx0_entries with path derived from tags
  2. Tags map to folders: `preference` → `preferences/`, `contact` → `contacts/`
  3. Summaries → high-level vault entries
  4. SQLite stays as local cache, vault becomes primary
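Step 1 and 2 amount to a small tag-to-path mapping. A minimal sketch, assuming a tag-to-folder table like the one implied above (the folder names and fallback are hypothetical):

```typescript
// Illustrative tag → vault-folder mapping; not the real vault layout.
const TAG_FOLDERS: Record<string, string> = {
  preference: 'preferences/',
  contact: 'contacts/',
};

// Derive a vault path for a memory from its first mapped tag,
// falling back to a generic log folder for unmapped tags.
function vaultPathFor(tags: string[], id: string): string {
  const folder = tags.map((t) => TAG_FOLDERS[t]).find(Boolean) ?? 'log/';
  return `${folder}${id}.md`;
}

console.log(vaultPathFor(['preference'], 'abc123')); // preferences/abc123.md
console.log(vaultPathFor(['misc'], 'xyz789'));       // log/xyz789.md
```

Because the mapping is derived purely from existing tags, graduation is a one-shot batch job over the flat log rather than a schema migration.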