ctx0 — Persistent Memory
Overview
Persistent Memory is a local-first, summary-primed memory system for AI agents — combining the simplicity of a flat memory log with AI-generated summaries for low-cost context loading. All data lives in a single SQLite file on the user's machine. The agent writes memories freely, searches them on demand via FTS5 full-text search, and starts each session with a concise AI-generated summary instead of thousands of tokens of raw entries.
Core insight: Writing memories should be dead simple (single INSERT). Loading memories into context should be as small as possible (AI summary, not raw entries). Searching memories should be accurate and fast (FTS5 over raw content). These three operations have different needs — optimize each independently.
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ Full vault: Agent → ctx0_remember → DB write → curator agent → │
│ vault tree → embedding queue → DB read → vector search │
│ → agent │
│ (6 hops, 2 subagents, network I/O, DB costs) │
│ │
│ This system: Agent → db.insert() → done │
│ Session start → load summary (~300 tokens) │
│ Need details? → FTS5 search (<1ms) │
│ (2 tools, 0 subagents, local disk only) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Why This Approach
This architecture was chosen after evaluating two alternatives:
| Criterion | Flat Memory Log | Git-Like Context Controller | This System (Hybrid) |
|---|---|---|---|
| Simplicity | 2 tools, ~200 LOC | 5 tools, ~400 LOC | 2 tools, ~300 LOC |
| Context cost | ~4,000 tokens (raw entries) | ~130-330 tokens (AI summary) | ~300-500 tokens (AI summary) |
| Retrieval accuracy | High (raw content + FTS5) | Lower (summaries are lossy) | High (raw content + FTS5) |
| Robustness | No external dependencies | Requires Haiku for every commit | Haiku optional, graceful fallback |
| Complexity | Simple flat list | DAG with branches, commits, merges | Simple flat list + summaries |
The flat log gives us simplicity and accurate retrieval. The git controller gives us low context cost via AI summaries. This hybrid takes both strengths and discards the DAG/branching complexity (which adds cognitive load for the agent without proportional value in a single-user, single-agent MVP).
Why Local SQLite
| Concern | Local SQLite | Remote DB (Supabase) | Flat File (JSONL) |
|---|---|---|---|
| Read latency | <1ms (indexed query) | 50-150ms (network) | <1ms (small), linear growth |
| Write latency | <1ms (INSERT) | 20-50ms (network) | <1ms (append) |
| Update/delete | UPDATE/DELETE | UPDATE/DELETE | Full file rewrite |
| Filtering | SQL WHERE + indexes | SQL WHERE + indexes | Parse all → JS filter |
| Full-text search | FTS5 (built-in, fast) | pg_trgm / tsvector | String.includes() |
| Crash safety | WAL mode, transactions | Server-side | Hope append was atomic |
| Concurrent access | WAL mode (built-in) | Server handles | Manual file locking |
| Cost | $0 | Per-read/write billing | $0 |
| Offline | Yes | No | Yes |
| Privacy | Data never leaves machine | Remote server | Data never leaves machine |
| Disk footprint | Single .db file | N/A | 3 files (log + summary + backup) |
| Human-inspectable | sqlite3 CLI / DB Browser | psql / dashboard | Text editor |
| Cross-device sync | Via cloud sync (row-level) | Built-in | Manual |
| Scales | Millions of rows | Unlimited | ~1000 lines before sluggish |
SQLite is the sweet spot: all the local-first benefits of flat files, plus real querying, indexing, atomic writes, and crash safety. One file on disk (memory.db), zero network, zero cost.
better-sqlite3 is synchronous (no async overhead), battle-tested, and used by Turso, Obsidian, 1Password, and thousands of Electron apps.
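For illustration, a minimal sketch of what that synchronous API looks like in practice — every call returns its result directly, with no await or callbacks. The throwaway path and table here are illustrative, not part of the real schema:

```typescript
import Database from 'better-sqlite3';

// Illustrative throwaway database — the real system uses ~/.bot0/memory.db
const db = new Database('/tmp/demo.db');
db.pragma('journal_mode = WAL');

db.exec('CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)');
db.prepare('INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)').run('greeting', 'hello');

// .get() returns the row synchronously — no promise involved
const row = db.prepare('SELECT value FROM kv WHERE key = ?').get('greeting');
console.log(row); // { value: 'hello' }
```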
Related documentation:
- ctx0 System Architecture — Full vault architecture (for comparison)
- ctx0 Sessions — Session/conversation storage
- Flat Memory Log — Original flat log spec
- Git-Like Context Controller — Git-like alternative
Design Principles
- **Local-first, cloud-synced** — Memory lives on the user's machine as a single SQLite file. All reads and writes are local (<1ms). Cloud sync is optional and fire-and-forget — if the user has multiple daemons, memories sync via Supabase. If offline or unauthenticated, the system works identically without sync.
- **Summary-primed** — Sessions start with a concise AI-generated summary (~300-500 tokens), not thousands of tokens of raw entries. The summary is the "big picture." Raw entries are the "source of truth."
- **Search-accurate** — When the summary isn't enough, the agent searches raw entries via FTS5. Full-text search over actual content is always more accurate than searching summaries.
- **Zero-subagent** — No curator, no librarian, no extractor. The main agent reads and writes memory directly. Summarization runs in the background, not as a blocking subagent.
- **Save liberally, load smartly** — Writing is cheap (single INSERT, <1ms). Loading is where the intelligence lives (what to summarize, when to summarize, how to prime).
- **Graceful degradation** — If AI summarization fails (network down, API error), fall back to loading raw entries with a token budget. The system never breaks — it just gets temporarily less efficient.
- **Single-file simplicity** — One `memory.db` file. Copy it to a new machine, back it up, inspect it with standard tools.
Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ PERSISTENT MEMORY SYSTEM │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ MAIN AGENT │ │
│ │ bot0 daemon │ Claude Code │ Cursor │ any agent │ │
│ │ │ │
│ │ Tools: │ │
│ │ ├── memory_save(content, tags?, pin?) │ │
│ │ └── memory_search(query, tags?, project?) │ │
│ │ │ │
│ │ Auto-injected at session start: │ │
│ │ └── <memory> block in system prompt (summary + pinned entries) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │ │
│ Local disk │ │ Haiku (background) │
│ ▼ ▼ │
│ ┌──────────────────────────┐ ┌──────────────────────────────────┐ │
│ │ LOCAL SQLITE STORAGE │ │ AI SUMMARIZATION (async) │ │
│ │ │ │ │ │
│ │ ~/.bot0/memory.db │ │ Triggered by: │ │
│ │ │ │ • Entry count threshold (20+) │ │
│ │ ┌─────────────────────┐ │ │ • Session end │ │
│ │ │ memories │ │ │ • Session start (catch-up) │ │
│ │ │ memories_fts (FTS5) │ │ │ • Periodic interval (5 min) │ │
│ │ │ memory_summaries │ │ │ • Daily compaction cron │ │
│ │ │ sync_meta │ │ │ • Auto-save hooks (bus event) │ │
│ │ └─────────────────────┘ │ │ Cost: ~$0.0001/summary (Haiku) │ │
│ │ │ │ Fallback: load raw if fails │ │
│ │ WAL mode, <1ms r/w │ └──────────────────────────────────┘ │
│ └────────────┬─────────────┘ │
│ │ │
│ │ Cloud sync (fire-and-forget push, session-start pull) │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ SUPABASE: ctx0_memories (optional, multi-daemon sync) │ │
│ │ │ │
│ │ Daemon A (laptop) ──push──▶ ctx0_memories ◀──push── Daemon B │ │
│ │ Daemon A ◀──pull (session start)──▶ Daemon B │ │
│ │ │ │
│ │ • Row-level sync (append-only, nanoid PKs = no conflicts) │ │
│ │ • Memories sync bidirectionally; summaries stay local-only │ │
│ │ • Graceful degradation: unauthenticated → local-only mode │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Database Schema
Single SQLite file at ~/.bot0/memory.db, accessed through better-sqlite3's synchronous API with WAL journaling.
memories — The Raw Log
Every memory the agent saves. This is the source of truth — summaries are derived from this.
```sql
CREATE TABLE memories (
  id            TEXT PRIMARY KEY,                 -- nanoid (e.g., "m_a1b2c3")
  content       TEXT NOT NULL,                    -- The memory itself (natural language)
  tags          TEXT NOT NULL DEFAULT '[]',       -- JSON array of strings
  source        TEXT NOT NULL DEFAULT 'agent',    -- 'user' | 'agent' | 'auto'
  project       TEXT,                             -- Project context (null = global)
  pinned        INTEGER NOT NULL DEFAULT 0,       -- 1 = always load into context
  superseded_by TEXT,                             -- ID of newer version (soft-update)
  summarized    INTEGER NOT NULL DEFAULT 0,       -- 1 = included in a summary
  created_at    TEXT NOT NULL DEFAULT (datetime('now'))  -- ISO 8601
);

-- Loading: pinned memories (always loaded first)
CREATE INDEX idx_memories_pinned ON memories(pinned, created_at DESC)
  WHERE pinned = 1 AND superseded_by IS NULL;

-- Loading: recent active memories
CREATE INDEX idx_memories_active ON memories(created_at DESC)
  WHERE superseded_by IS NULL;

-- Loading: project-specific memories
CREATE INDEX idx_memories_project ON memories(project, created_at DESC)
  WHERE project IS NOT NULL AND superseded_by IS NULL;

-- Summarization: unsummarized entries
CREATE INDEX idx_memories_unsummarized ON memories(created_at ASC)
  WHERE summarized = 0 AND superseded_by IS NULL;

-- Search: by source
CREATE INDEX idx_memories_source ON memories(source);
```
memories_fts — Full-Text Search
SQLite FTS5 gives us fast, ranked full-text search over memory content — no external dependencies, no embedding pipelines.
```sql
-- FTS5 virtual table for content search
CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  content=memories,
  content_rowid=rowid
);

-- Keep FTS in sync with triggers
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;

CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.rowid, old.content);
END;

CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.rowid, old.content);
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;
```
memory_summaries — AI-Generated Summaries
Summaries are the primary artifact loaded into context at session start. Two types:
- incremental — Summarizes a batch of recent entries (created throughout the day)
- compacted — Rolls up multiple incremental summaries into a higher-level digest (created by daily cron)
```sql
CREATE TABLE memory_summaries (
  id           TEXT PRIMARY KEY,                    -- nanoid (e.g., "ms_x7y8z9")
  type         TEXT NOT NULL DEFAULT 'incremental', -- 'incremental' | 'compacted'
  summary      TEXT NOT NULL,                       -- AI-generated summary text
  entry_count  INTEGER NOT NULL,                    -- How many entries/summaries were summarized
  entry_ids    TEXT NOT NULL DEFAULT '[]',          -- JSON array of memory IDs included
  period_start TEXT NOT NULL,                       -- Earliest entry timestamp
  period_end   TEXT NOT NULL,                       -- Latest entry timestamp
  created_at   TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX idx_summaries_type ON memory_summaries(type, created_at DESC);
CREATE INDEX idx_summaries_period ON memory_summaries(period_end DESC);
```
sync_meta — Cloud Sync State
Tracks sync watermarks for the cloud sync system. See Cloud Sync for details.
```sql
CREATE TABLE sync_meta (
  key   TEXT PRIMARY KEY,   -- 'last_sync_ts', 'device_id'
  value TEXT NOT NULL
);
```
Initialization
```typescript
import Database from 'better-sqlite3';
import { join } from 'path';
import { homedir } from 'os';

const DB_PATH = join(homedir(), '.bot0', 'memory.db');

function initMemoryDb(): Database.Database {
  const db = new Database(DB_PATH);

  // WAL mode for better concurrent read/write performance
  db.pragma('journal_mode = WAL');

  // Run schema creation (idempotent)
  db.exec(`
    CREATE TABLE IF NOT EXISTS memories (
      id TEXT PRIMARY KEY,
      content TEXT NOT NULL,
      tags TEXT NOT NULL DEFAULT '[]',
      source TEXT NOT NULL DEFAULT 'agent',
      project TEXT,
      pinned INTEGER NOT NULL DEFAULT 0,
      superseded_by TEXT,
      summarized INTEGER NOT NULL DEFAULT 0,
      created_at TEXT NOT NULL DEFAULT (datetime('now'))
    );

    CREATE INDEX IF NOT EXISTS idx_memories_pinned ON memories(pinned, created_at DESC)
      WHERE pinned = 1 AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_active ON memories(created_at DESC)
      WHERE superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project, created_at DESC)
      WHERE project IS NOT NULL AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_unsummarized ON memories(created_at ASC)
      WHERE summarized = 0 AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_source ON memories(source);

    CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(
      content,
      content=memories,
      content_rowid=rowid
    );

    -- Keep the FTS index in sync (same triggers as the schema section above)
    CREATE TRIGGER IF NOT EXISTS memories_ai AFTER INSERT ON memories BEGIN
      INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
    END;
    CREATE TRIGGER IF NOT EXISTS memories_ad AFTER DELETE ON memories BEGIN
      INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.rowid, old.content);
    END;
    CREATE TRIGGER IF NOT EXISTS memories_au AFTER UPDATE ON memories BEGIN
      INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.rowid, old.content);
      INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
    END;

    CREATE TABLE IF NOT EXISTS memory_summaries (
      id TEXT PRIMARY KEY,
      type TEXT NOT NULL DEFAULT 'incremental',
      summary TEXT NOT NULL,
      entry_count INTEGER NOT NULL,
      entry_ids TEXT NOT NULL DEFAULT '[]',
      period_start TEXT NOT NULL,
      period_end TEXT NOT NULL,
      created_at TEXT NOT NULL DEFAULT (datetime('now'))
    );

    CREATE INDEX IF NOT EXISTS idx_summaries_type ON memory_summaries(type, created_at DESC);
    CREATE INDEX IF NOT EXISTS idx_summaries_period ON memory_summaries(period_end DESC);

    CREATE TABLE IF NOT EXISTS sync_meta (
      key TEXT PRIMARY KEY,
      value TEXT NOT NULL
    );
  `);

  return db;
}
```
File on Disk
~/.bot0/
├── config.json # Existing daemon config
└── memory.db # SQLite database (single file)
That's it. One file.
Size estimates:
- 100 memories ≈ 50-100 KB
- 1,000 memories ≈ 500 KB - 1 MB
- 10,000 memories ≈ 5-10 MB
- FTS5 index adds ~30% overhead
SQLite handles millions of rows; storage will never be the bottleneck.
When Memories Are Saved
Memories enter the log through three channels:
1. Explicit User Request
The user says "remember this" or "save this for later."
User: "Remember that the deploy key for staging is in 1Password under 'staging-deploy'"
Agent: [calls memory_save]
→ INSERT into memories: { content: "...", source: "user", tags: ["infra", "credentials"] }
2. Agent Self-Save
The agent encounters information during work that seems worth persisting. The system prompt instructs it to recognize and save:
- Surprising facts — Information that contradicts assumptions or is non-obvious
- Learned patterns — "This codebase uses X pattern for Y"
- User preferences — Implicit preferences revealed through corrections or choices
- Environmental facts — API key locations, deployment targets, team structure
- Decision rationale — Why a particular approach was chosen over alternatives
Agent: [reading code, discovers unconventional pattern]
Agent: [calls memory_save]
→ INSERT: { content: "bot0 wraps Drizzle in OrmClient (packages/db/orm.ts)...",
source: "agent", tags: ["codebase", "pattern"] }
3. Auto-Save on Significant Events
Hooks subscribe to SessionStateChanged events on the daemon event bus. When a session transitions from working → completed, three hooks fire concurrently (fire-and-forget, never blocks the session):
| Event | What Gets Saved | Tags | Example |
|---|---|---|---|
| Task completion | 1-2 sentence summary of what was accomplished | session_summary | "Migrated auth from JWT to session tokens." |
| User correction | The preference/correction as a standalone fact | preference, correction | "User prefers tabs over spaces." |
| Error resolution | Error + fix pattern | debug, error_resolution | "ECONNREFUSED on 5432 → brew services start postgresql" |
All auto-saves use source: 'auto' (vs 'agent' for explicit saves). Significance filtering skips trivial conversations (< 2 tool calls AND < 3 user messages) and sessions where the agent already called memory_save 2+ times. Each hook uses Haiku for summarization with a "SKIP"/"NONE" escape hatch to avoid noise.
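A minimal sketch of that significance filter, assuming a stats object derived from the session transcript — the field and function names here are illustrative, not the actual hooks.ts API:

```typescript
// Hypothetical shape — the real data comes from the daemon's SessionManager
interface SessionStats {
  toolCallCount: number;
  userMessageCount: number;
  agentMemorySaves: number; // explicit memory_save calls during the session
}

function shouldRunAutoSaveHooks(stats: SessionStats): boolean {
  // Trivial conversation: both counts below threshold → skip
  if (stats.toolCallCount < 2 && stats.userMessageCount < 3) return false;

  // Agent was already deliberate about saving → don't duplicate
  if (stats.agentMemorySaves >= 2) return false;

  return true;
}
```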
See Auto-Save Hooks for full architecture details.
Context Loading (Session Start)
At session start, the daemon loads a <memory> block into the system prompt. This uses summary priming — loading the AI-generated summary instead of raw entries, keeping context cost low.
Loading Algorithm
LOAD_MEMORY_CONTEXT(project):
blocks = []
── Phase 1: Pinned memories (always loaded, raw content) ──────────
pinned = SELECT id, content, tags, created_at
FROM memories
WHERE pinned = 1
AND superseded_by IS NULL
ORDER BY created_at DESC
for entry in pinned:
blocks.push({ section: "pinned", content: entry.content })
── Phase 2: Latest compacted summary (big picture) ────────────────
compacted = SELECT summary FROM memory_summaries
WHERE type = 'compacted'
ORDER BY created_at DESC
LIMIT 1
if compacted:
blocks.push({ section: "context", content: compacted.summary })
── Phase 3: Incremental summaries since last compaction ───────────
incrementals = SELECT summary FROM memory_summaries
WHERE type = 'incremental'
AND created_at > COALESCE(
(SELECT MAX(created_at) FROM memory_summaries WHERE type = 'compacted'),
'1970-01-01'
)
ORDER BY created_at DESC
for s in incrementals:
blocks.push({ section: "recent", content: s.summary })
── Phase 4: Unsummarized entries (newest, not yet in any summary) ─
unsummarized = SELECT id, content, created_at
FROM memories
WHERE summarized = 0
AND superseded_by IS NULL
AND pinned = 0
ORDER BY created_at DESC
LIMIT 10
if unsummarized.length > 0:
blocks.push({ section: "latest", content: format_entries(unsummarized) })
RETURN format_memory_block(blocks)
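As a sketch of the final step, here is one way `format_memory_block` could look in TypeScript. The section names follow the pseudocode above; the exact formatting lives in the daemon's loader:

```typescript
interface MemoryBlock {
  section: 'pinned' | 'context' | 'recent' | 'latest';
  content: string;
}

const SECTION_HEADINGS: Record<MemoryBlock['section'], string> = {
  pinned: '## Pinned',
  context: '## Context',
  recent: '## Recent',
  latest: '## Latest (not yet summarized)',
};

function formatMemoryBlock(blocks: MemoryBlock[]): string {
  const lines: string[] = ['<memory>', 'You have persistent memory from previous sessions.'];

  for (const section of ['pinned', 'context', 'recent', 'latest'] as const) {
    const matching = blocks.filter(b => b.section === section);
    if (matching.length === 0) continue;
    lines.push('', SECTION_HEADINGS[section]);
    for (const b of matching) lines.push(b.content);
  }

  lines.push('</memory>');
  return lines.join('\n');
}
```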
Context Window Injection
```
<memory>
You have persistent memory from previous sessions.

## Pinned
- Deploy key for staging is in 1Password under 'staging-deploy' [2026-01-15]
- User prefers Bun over Node for new TypeScript projects [2026-01-20]

## Context
The user works on bot0, a local-first AI agent system using pnpm workspaces
with Turborepo. The daemon is the core component. Authentication uses
hardware-bound device keys via Secure Enclave/TPM 2.0. The project uses
Drizzle ORM for database schema management with Supabase PostgreSQL. The user
prefers TypeScript strict mode and monospace terminal aesthetics.

## Recent
Last few sessions focused on designing a persistent memory system for the
daemon. Evaluated flat memory log vs git-like context controller approaches.
Decided on a hybrid: flat log simplicity with AI summary priming. SQLite with
FTS5 for search.

## Latest (not yet summarized)
- ctx0-persistent-memory.md spec finalized with hybrid approach [2026-02-26]
- Summarization triggers: entry threshold, session end, session start
  catch-up, periodic interval, daily compaction cron [2026-02-26]
</memory>
```
Token Budget
| Component | Tokens | Notes |
|---|---|---|
| Header + section markers | ~30 | <memory> tags, section headings |
| Pinned entries (3-5) | ~100-200 | Raw content, always loaded |
| Compacted summary | ~100-200 | AI-generated, concise |
| Recent incremental summaries (1-3) | ~100-200 | Since last compaction |
| Unsummarized entries (0-10) | ~0-200 | Only if any exist |
| Total | ~300-500 | <1% of 80K context window |
Compare: the original flat log spec loaded ~4,000 tokens of raw entries. This is 8-13x more efficient.
Fallback: No Summaries Yet
If no summaries exist (new user, fresh install), fall back to loading raw entries with a token budget — identical to the original flat log approach:
if no summaries exist:
load pinned (up to 1000 tokens)
load recent entries (up to 2000 tokens)
load project entries (up to 500 tokens)
total: up to ~3500 tokens (one-time cost until first summarization runs)
This ensures the system works immediately on first install, before any summarization has occurred.
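A sketch of that budgeted fallback, assuming the module-level `db` handle from `initMemoryDb` and a rough chars-to-tokens heuristic (a real tokens.ts utility may estimate differently):

```typescript
// Rough heuristic: ~4 chars per token — an assumption, not the real tokenizer
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

function loadRawEntriesWithBudget(budget: number, where: string): string[] {
  // `where` is an internal constant, never user input
  const rows = db.prepare(`
    SELECT content FROM memories
    WHERE ${where} AND superseded_by IS NULL
    ORDER BY created_at DESC
  `).all() as { content: string }[];

  const out: string[] = [];
  let used = 0;
  for (const { content } of rows) {
    const cost = estimateTokens(content);
    if (used + cost > budget) break; // stop once the budget is exhausted
    out.push(content);
    used += cost;
  }
  return out;
}

// Fallback loading with the budgets from the spec above
const pinned = loadRawEntriesWithBudget(1000, 'pinned = 1');
const recent = loadRawEntriesWithBudget(2000, 'pinned = 0');
```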
Agent Tools
memory_save
```typescript
interface MemorySaveTool {
  name: 'memory_save';
  description: 'Save information to your persistent memory. Use when you learn something worth remembering across sessions: user preferences, project patterns, environmental facts, surprising discoveries, or anything the user explicitly asks you to remember.';
  parameters: {
    /** The memory content. Clear, concise, standalone. Max 2000 chars. */
    content: string;
    /** Tags for categorization: "preference", "codebase", "contact", "infra", "decision", "pattern", "debug" */
    tags?: string[];
    /** Pin to always load into context. Use sparingly — pinned entries consume budget every session. */
    pin?: boolean;
  };
}
```
Implementation:
```typescript
import { nanoid } from 'nanoid';

const insertStmt = db.prepare(`
  INSERT INTO memories (id, content, tags, source, project, pinned, created_at)
  VALUES (?, ?, ?, ?, ?, ?, datetime('now'))
`);

function memorySave(
  params: { content: string; tags?: string[]; pin?: boolean },
  context: { source: 'user' | 'agent' | 'auto'; project?: string },
): string {
  const { content, tags = [], pin = false } = params;

  // Enforce max content length (~500 tokens ≈ ~2000 chars)
  if (content.length > 2000) {
    return 'Memory too long. Keep under ~500 tokens (2000 chars). Be more concise.';
  }

  const id = `m_${nanoid(6)}`;
  insertStmt.run(
    id,
    content.trim(),
    JSON.stringify(tags),
    context.source,
    context.project ?? null,
    pin ? 1 : 0,
  );

  // Check if summarization should trigger (entry count threshold)
  checkSummarizationTrigger();

  return `Saved${pin ? ' (pinned)' : ''}: "${content.slice(0, 60)}..."`;
}
```
memory_search
Full-text search over the raw memory log using FTS5. Available when the pre-loaded summary isn't detailed enough.
```typescript
interface MemorySearchTool {
  name: 'memory_search';
  description: 'Search your memory for specific information. Use when the summary in your context doesn\'t contain what you need — this searches raw entries for exact details.';
  parameters: {
    /** Search query — natural language or keywords */
    query: string;
    /** Filter by tags */
    tags?: string[];
    /** Filter by project */
    project?: string;
    /** Maximum results (default 10) */
    limit?: number;
  };
}
```
Implementation:
```typescript
const ftsSearchStmt = db.prepare(`
  SELECT m.id, m.content, m.tags, m.project, m.created_at
  FROM memories m
  JOIN memories_fts fts ON m.rowid = fts.rowid
  WHERE memories_fts MATCH ?
    AND m.superseded_by IS NULL
  ORDER BY rank
  LIMIT ?
`);

const keywordSearchStmt = db.prepare(`
  SELECT id, content, tags, project, created_at
  FROM memories
  WHERE content LIKE ?
    AND superseded_by IS NULL
  ORDER BY created_at DESC
  LIMIT ?
`);

function memorySearch(params: {
  query: string;
  tags?: string[];
  project?: string;
  limit?: number;
}): string {
  const { query, tags, project, limit = 10 } = params;

  // Try FTS5 first (handles phrases, boolean operators, ranking)
  let results: MemoryRow[];
  try {
    results = ftsSearchStmt.all(query, limit) as MemoryRow[];
  } catch {
    // FTS5 syntax error (e.g., unbalanced quotes) — fall back to LIKE
    results = keywordSearchStmt.all(`%${query}%`, limit) as MemoryRow[];
  }

  // Post-filter by tags and project (small result set, fast in JS)
  if (tags?.length) {
    results = results.filter(r => {
      const entryTags: string[] = JSON.parse(r.tags);
      return tags.some(t => entryTags.includes(t));
    });
  }
  if (project) {
    results = results.filter(r => r.project === project);
  }

  if (results.length === 0) {
    return 'No memories found matching your search.';
  }

  return results.map(r => {
    const entryTags: string[] = JSON.parse(r.tags);
    return `- ${r.content} [${entryTags.join(', ')}] (${r.created_at.slice(0, 10)})`;
  }).join('\n');
}
```
Memory Lifecycle
Writing
┌─────────────────────────────────────────────────────────────────┐
│ MEMORY WRITE PATH │
│ │
│ Any save trigger │
│ → Build MemoryEntry object │
│ → db.prepare(INSERT).run(...) (<1ms, local) │
│ → FTS5 trigger auto-updates search index (<1ms, local) │
│ → Check summarization trigger (entry count check) │
│ → Push to cloud (fire-and-forget) (non-blocking) │
│ → Done. <1ms locally. Cloud push is async. │
│ │
└─────────────────────────────────────────────────────────────────┘
Reading (Session Start)
┌─────────────────────────────────────────────────────────────────┐
│ MEMORY READ PATH │
│ │
│ New conversation starts │
│ → Pull new memories from cloud (other daemons' writes) │
│ (INSERT OR IGNORE into local SQLite, ~50-150ms) │
│ → Check for unsummarized entries from crashed sessions │
│ (self-healing catch-up — summarize if needed) │
│ → Load pinned entries (raw content) │
│ → Load latest compacted summary │
│ → Load incremental summaries since last compaction │
│ → Load any unsummarized entries (raw, newest) │
│ → Format into <memory> block string │
│ → Inject into system prompt │
│ → Done. ~300-500 tokens. │
│ │
└─────────────────────────────────────────────────────────────────┘
Updating (Supersede Pattern)
Memories are never edited in place. To update, insert a new entry and mark the old one:
```typescript
const supersedeTx = db.transaction((oldId: string, newContent: string, project?: string) => {
  const old = db.prepare('SELECT * FROM memories WHERE id = ?').get(oldId) as MemoryRow | undefined;
  if (!old) return;

  const newId = `m_${nanoid(6)}`;

  // Insert replacement
  insertStmt.run(newId, newContent, old.tags, 'agent', project ?? old.project, old.pinned);

  // Mark old as superseded
  db.prepare('UPDATE memories SET superseded_by = ? WHERE id = ?').run(newId, oldId);
});
```
Using a transaction ensures atomicity — both the insert and the update happen together or not at all.
Summarization System
Summarization is the bridge between raw entries and context-efficient loading. It runs asynchronously — never blocking the conversation — and uses Haiku via the existing LLM proxy.
Two-Tier Summary Model
| Tier | Type | Contains | Created by | Lifespan |
|---|---|---|---|---|
| Incremental | incremental | Summary of a batch of recent entries | Entry threshold, session end, periodic timer | Accumulates throughout the day |
| Compacted | compacted | Summary of multiple incremental summaries | Daily cron job | Rolls up incrementals, represents longer-term knowledge |
This ensures the memory_summaries table stays bounded regardless of how long the system has been running. After 6 months of daily use, context loading still involves one compacted summary + a few recent incrementals — not hundreds of summaries.
Summarization Triggers
Five trigger points ensure memories are summarized at the right times:
1. Entry Count Threshold (Primary)
After every memory_save, check if unsummarized entries exceed a threshold. If so, trigger summarization.
```typescript
const SUMMARIZE_THRESHOLD = 20;

function checkSummarizationTrigger(): void {
  const { count } = db.prepare(`
    SELECT COUNT(*) as count FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
  `).get() as { count: number };

  if (count >= SUMMARIZE_THRESHOLD) {
    // Run async — don't block the conversation
    summarizeEntries().catch(err =>
      console.error('[memory] Summarization failed:', err)
    );
  }
}
```
This is self-regulating: heavy sessions trigger more summaries, quiet sessions don't waste API calls.
2. Session End
When a session ends, summarize any remaining unsummarized entries. Ensures nothing is left dangling.
```typescript
async function onSessionEnd(sessionId: string): Promise<void> {
  try {
    const { count } = db.prepare(`
      SELECT COUNT(*) as count FROM memories
      WHERE summarized = 0 AND superseded_by IS NULL
    `).get() as { count: number };

    if (count > 0) {
      await summarizeEntries();
      console.log(`[memory] Summarized ${count} entries at session end`);
    }
  } catch (err) {
    // Non-fatal — log but don't break session cleanup
    console.error('[memory] Session-end summarization failed:', err);
  }
}
```
3. Session Start (Self-Healing Catch-Up)
If the previous session crashed or summarization failed, session start detects orphaned unsummarized entries and catches up. This is the self-healing mechanism.
```typescript
async function onSessionStart(): Promise<void> {
  const { count } = db.prepare(`
    SELECT COUNT(*) as count FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
  `).get() as { count: number };

  // If there are unsummarized entries older than the latest summary,
  // a previous session likely crashed before summarizing
  if (count > SUMMARIZE_THRESHOLD) {
    console.log(`[memory] Catch-up: ${count} unsummarized entries from previous session`);
    await summarizeEntries();
  }
}
```
4. Periodic Interval (Long Sessions)
A timer in the daemon catches the case of very long sessions where the entry count threshold hasn't been hit but time has passed.
```typescript
const SUMMARIZE_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes

let summarizeTimer: NodeJS.Timeout | null = null;

function startPeriodicSummarization(): void {
  summarizeTimer = setInterval(async () => {
    const { count } = db.prepare(`
      SELECT COUNT(*) as count FROM memories
      WHERE summarized = 0 AND superseded_by IS NULL
    `).get() as { count: number };

    if (count > 0) {
      await summarizeEntries();
    }
  }, SUMMARIZE_INTERVAL_MS);
}

function stopPeriodicSummarization(): void {
  if (summarizeTimer) {
    clearInterval(summarizeTimer);
    summarizeTimer = null;
  }
}
```
5. Daily Compaction Cron
A daily job rolls up incremental summaries into a single compacted summary. This prevents the memory_summaries table from growing unbounded and keeps context loading fast.
```typescript
async function dailyCompaction(): Promise<void> {
  // Get all incremental summaries older than 24 hours
  const incrementals = db.prepare(`
    SELECT id, summary, entry_count, period_start, period_end
    FROM memory_summaries
    WHERE type = 'incremental'
      AND created_at < datetime('now', '-1 day')
    ORDER BY created_at ASC
  `).all() as SummaryRow[];

  if (incrementals.length < 2) return; // Nothing to compact

  // AI-summarize the incremental summaries into one compacted summary
  const compactedText = await generateCompactionSummary(
    incrementals.map(s => s.summary)
  );

  const totalEntries = incrementals.reduce((sum, s) => sum + s.entry_count, 0);
  const periodStart = incrementals[0].period_start;
  const periodEnd = incrementals[incrementals.length - 1].period_end;

  // Atomic: insert compacted + delete old incrementals
  const compactTx = db.transaction(() => {
    db.prepare(`
      INSERT INTO memory_summaries (id, type, summary, entry_count, entry_ids, period_start, period_end)
      VALUES (?, 'compacted', ?, ?, '[]', ?, ?)
    `).run(`ms_${nanoid(6)}`, compactedText, totalEntries, periodStart, periodEnd);

    const ids = incrementals.map(s => s.id);
    db.prepare(`
      DELETE FROM memory_summaries WHERE id IN (${ids.map(() => '?').join(',')})
    `).run(...ids);
  });
  compactTx();

  console.log(`[memory] Compacted ${incrementals.length} summaries into 1`);
}
```
Trigger Priority
| Priority | Trigger | When | Why |
|---|---|---|---|
| Must have | Entry count threshold | After every memory_save | Self-regulating, handles all session patterns |
| Must have | Session start catch-up | On new session | Self-healing, covers crashes and failures |
| Should have | Session end | On session cleanup | Ensures nothing is left unsummarized |
| Should have | Daily compaction cron | Once daily (e.g., 3am) | Keeps summaries bounded over months |
| Nice to have | Periodic interval | Every 5 minutes | Extra safety net for marathon sessions |
AI Summarization Prompts
Incremental Summary (entries → summary)
```typescript
async function generateIncrementalSummary(
  entries: Array<{ content: string; tags: string[]; source: string }>
): Promise<string> {
  const entryText = entries
    .map((e, i) => `${i + 1}. ${e.content}`)
    .join('\n');

  const response = await haiku.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 300,
    messages: [{
      role: 'user',
      content: `Summarize these memories into a concise paragraph (2-5 sentences).
Preserve key facts, decisions, and preferences. Drop redundancies.
Write in third person present tense ("The user prefers...", "The project uses...").

Memories:
${entryText}

Summary:`,
    }],
  });

  return response.content[0].text.trim();
}
```
Compaction Summary (summaries → summary)
```typescript
async function generateCompactionSummary(
  summaries: string[]
): Promise<string> {
  const summaryText = summaries
    .map((s, i) => `${i + 1}. ${s}`)
    .join('\n\n');

  const response = await haiku.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 500,
    messages: [{
      role: 'user',
      content: `Merge these incremental memory summaries into a single comprehensive summary.
Preserve all key facts, preferences, and decisions. Remove duplicates.
Resolve any contradictions (prefer more recent information).
Write in third person present tense. Keep it concise but complete.

Summaries to merge:
${summaryText}

Merged summary:`,
    }],
  });

  return response.content[0].text.trim();
}
```
Core Summarization Function
```typescript
async function summarizeEntries(): Promise<void> {
  // Get unsummarized entries
  const entries = db.prepare(`
    SELECT id, content, tags, source, created_at
    FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
    ORDER BY created_at ASC
  `).all() as MemoryRow[];

  if (entries.length === 0) return;

  // Generate AI summary
  const summary = await generateIncrementalSummary(
    entries.map(e => ({
      content: e.content,
      tags: JSON.parse(e.tags),
      source: e.source,
    }))
  );

  const entryIds = entries.map(e => e.id);
  const periodStart = entries[0].created_at;
  const periodEnd = entries[entries.length - 1].created_at;

  // Atomic: insert summary + mark entries as summarized
  const summarizeTx = db.transaction(() => {
    db.prepare(`
      INSERT INTO memory_summaries (id, type, summary, entry_count, entry_ids, period_start, period_end)
      VALUES (?, 'incremental', ?, ?, ?, ?, ?)
    `).run(`ms_${nanoid(6)}`, summary, entries.length, JSON.stringify(entryIds), periodStart, periodEnd);

    db.prepare(`
      UPDATE memories SET summarized = 1
      WHERE id IN (${entryIds.map(() => '?').join(',')})
    `).run(...entryIds);
  });
  summarizeTx();

  console.log(`[memory] Summarized ${entries.length} entries`);
}
```
Cost Analysis
| Operation | Model | Input tokens | Output tokens | Cost |
|---|---|---|---|---|
| Incremental summary (20 entries) | Haiku | ~1000 | ~150 | ~$0.0002 |
| Compaction (5 incrementals) | Haiku | ~1500 | ~300 | ~$0.0003 |
| Typical day (2-3 incrementals + 0-1 compaction) | Haiku | ~3500 | ~600 | ~$0.0007 |
| Heavy day (10 incrementals + 1 compaction) | Haiku | ~12000 | ~2000 | ~$0.003 |
Negligible. A heavy month of daily use costs ~$0.10.
How SQLite, FTS5, and Summarization Work Together
Each component serves a different moment in the workflow:
SESSION START
│
├─ Pull new memories from cloud (other daemons' writes)
│ → SELECT WHERE synced_at > last_sync_ts
│ → INSERT OR IGNORE into local SQLite
│ → Trigger summarization catch-up if new entries arrived
│
├─ SQLite query: load summaries + pinned entries
│ → inject ~300-500 tokens into <memory> system prompt
│ → agent starts with "big picture" context (includes other daemons' knowledge)
│
MID-CONVERSATION (on demand)
│
├─ Agent calls memory_save("user wants dark mode in all UIs")
│ → SQLite INSERT into memories (<1ms)
│ → FTS5 index auto-updated (SQLite trigger)
│ → Check entry count threshold → maybe trigger summarization
│ → Push to cloud (fire-and-forget, non-blocking)
│
├─ Agent calls memory_search("dark mode preference")
│ → FTS5 MATCH query → returns ranked raw entries (<1ms)
│ → Agent sees actual content, high fidelity
│
BACKGROUND (async, non-blocking)
│
├─ Summarization triggers (entry threshold / timer / session end)
│ → Read unsummarized entries from SQLite
│ → Send to Haiku for summary generation (~500-2000ms)
│ → INSERT summary + mark entries as summarized (atomic transaction)
│
SESSION END
│
├─ Final summarization of remaining unsummarized entries
│ → Next session loads fresh summary
│
DAILY (cron)
│
├─ Compaction: roll up incremental summaries into one compacted summary
│ → Keeps context loading bounded regardless of history length
| Moment | Engine | What it does | Token cost |
|---|---|---|---|
| Session start | SQLite (plain queries) | Load summaries + pinned entries | ~300-500 tokens |
| Agent saves | SQLite (INSERT) | Write memory, update FTS5 index | 0 tokens |
| Agent searches | FTS5 (MATCH) | Find relevant raw entries by full-text search | Only results returned |
| Summarization | SQLite + Haiku | Summarize batch of entries | 0 context tokens (background) |
| Daily compaction | SQLite + Haiku | Roll up summaries | 0 context tokens (background) |
SQLite is the storage engine — reads and writes are <1ms, crash-safe, zero network.
FTS5 is the accuracy layer — when the summary isn't enough, search raw entries with full-text ranking. The agent always has access to the original content.
Summarization is the efficiency layer — keeps context costs low by distilling raw entries into concise summaries. Runs in the background, fails gracefully, never blocks.
Latency Profile
| Operation | Latency | Notes |
|---|---|---|
| Load context (summaries + pinned) | <1ms | Local SQLite, indexed |
| Format memory block | <1ms | String concatenation |
| Save memory (INSERT) | <1ms | Single row, WAL mode |
| FTS5 index update (trigger) | <1ms | Automatic |
| Full-text search (FTS5 MATCH) | <1ms | Built-in, pre-indexed |
| Update/supersede | <1ms | Transaction, by primary key |
| Incremental summarization | 500-2000ms | Only network call (Haiku) |
| Compaction | 500-2000ms | Haiku + SQLite transaction |
Compare with remote DB: Supabase takes 50-150ms per query due to network. Local SQLite is 100-1000x faster.
System Prompt Instructions
The agent's system prompt includes instructions for memory usage:
```markdown
## Memory

You have persistent memory stored locally on this machine. A summary of what
you've learned is shown in the <memory> block above.

### When to save memories

Save a memory when you encounter:
- **User requests**: "Remember that...", "Save this for later", "Don't forget..."
- **User preferences**: Implicit or explicit (coding style, tools, communication)
- **Surprising facts**: Information that contradicts assumptions or is non-obvious
- **Project patterns**: Codebase conventions, architecture decisions, deploy configs
- **People info**: Names, roles, preferences, relationships
- **Resolutions**: How a tricky problem was solved (error + fix)
- **Environment**: API endpoints, config locations, access patterns

### When NOT to save

- Transient information (temp debugging steps, one-off commands)
- Information already in the codebase (README content, code comments)
- Obvious facts any LLM would know
- Duplicate of something already in your <memory>

### How to write good memories

- Clear, standalone sentences — useful without the original conversation
- Specific: "User prefers Bun over Node for new TS projects" not "User likes Bun"
- Include the 'why' when relevant: "Chose Supabase over Firebase for pgvector support"
- Tag appropriately: preference, codebase, contact, infra, decision, pattern, debug

### Using your memories

- Check <memory> before asking the user things you might already know
- If the summary doesn't have what you need, use memory_search for exact details
- When you notice a memory is outdated, save a corrected version
```
Edge Cases & Mitigations
Memory Spam
Problem: Agent saves too many memories per conversation.
Mitigation:
- Rate limit: max 5 saves per conversation turn (enforced in tool)
- Dedup: FTS5 search for similar content before inserting — skip when the top match ranks above a similarity threshold (see the sketch after this list)
- System prompt: "Be selective. Not every fact is worth remembering."
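A sketch of that dedup check, assuming the module-level `db` handle. FTS5's bm25-based `rank` is negative, and more negative means a stronger match, so the threshold below is illustrative and would need tuning:

```typescript
// Illustrative threshold — FTS5 bm25 rank is negative; more negative = better match
const DEDUP_RANK_THRESHOLD = -8;

function isLikelyDuplicate(content: string): boolean {
  // Quote each term so arbitrary prose is a valid FTS5 query
  const query = content
    .split(/\s+/)
    .filter(Boolean)
    .map(t => `"${t.replace(/"/g, '')}"`)
    .join(' OR ');
  if (!query) return false;

  const top = db.prepare(`
    SELECT rank FROM memories_fts WHERE memories_fts MATCH ? ORDER BY rank LIMIT 1
  `).get(query) as { rank: number } | undefined;

  return top !== undefined && top.rank < DEDUP_RANK_THRESHOLD;
}
```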
Stale Memories
Problem: Old memories become outdated but still appear in summaries.
Mitigation:
- Supersede pattern: new saves mark old entries via `superseded_by`
- Agent instruction: "When a loaded memory is wrong, save a corrected version"
- Compaction naturally ages out stale information as summaries get re-rolled
Large Individual Memories
Problem: Single memory consumes too much budget.
Mitigation:
- Hard limit: 2000 chars (~500 tokens) per entry, enforced at save time
- Rejection message guides agent to be more concise
Summarization Failure
Problem: Haiku API is down or unreachable.
Mitigation:
- Entries still save normally (no network needed for writes)
- Context loading falls back to raw entries with token budget
- Next session start runs catch-up summarization
- System is fully functional without summaries — just slightly less context-efficient
Summary Quality Drift
Problem: Compaction over many cycles loses nuance.
Mitigation:
- Raw entries are never deleted — they remain searchable via FTS5
- `memory_search` always searches raw content, not summaries
- Agent can always find exact details even if the summary omits them
Database Corruption
Problem: Crash could corrupt the database.
Mitigation:
- WAL mode: SQLite's Write-Ahead Logging provides automatic crash recovery
- Transactions: summarization and supersede operations are atomic
- `PRAGMA integrity_check` can be run as a health check (see the sketch after this list)
- Users can back up by copying `memory.db` (safe to copy when no writers are active)
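Both mitigations map to small helpers with better-sqlite3 — `db.pragma('integrity_check')` for the health check, and the library's built-in online backup API for copies, which is safe even while the daemon is writing:

```typescript
import Database from 'better-sqlite3';

// Health check: PRAGMA integrity_check returns [{ integrity_check: 'ok' }] when healthy
function checkIntegrity(db: Database.Database): boolean {
  const result = db.pragma('integrity_check') as Array<{ integrity_check: string }>;
  return result.length === 1 && result[0].integrity_check === 'ok';
}

// Online backup via SQLite's incremental backup API — no need to stop writers
async function backupMemoryDb(db: Database.Database, dest: string): Promise<void> {
  await db.backup(dest);
}
```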
Cross-Project Pollution
Problem: Memories from Project A load when working on Project B.
Mitigation:
- `project` column on each entry + indexed query
- Summaries include project context naturally (Haiku includes it in the summary text)
- `memory_search` supports project filtering
- `project IS NULL` entries are global — useful for cross-cutting knowledge
Comparison with Other Approaches
| Aspect | This System | Flat Memory Log | Git Context Controller | Full Vault (ctx0 DB) | Claude Code Auto-Memory |
|---|---|---|---|---|---|
| Storage | Local SQLite + cloud sync | Local SQLite | Local SQLite | Supabase PostgreSQL | Local MEMORY.md |
| Context cost | ~300-500 tokens | ~4,000 tokens | ~130-330 tokens | On-demand retrieval | ~200 lines |
| Write latency | <1ms (local) | <1ms | <1ms | 20-50ms (network) | <1ms |
| Read latency | <1ms (local) | <1ms | <1ms | 50-150ms (network) | <1ms |
| Search | FTS5 over raw entries | FTS5 over raw entries | Summaries only (no FTS5) | pg_trgm / tsvector | String grep |
| AI cost | ~$0.001/day | $0 | ~$0.001/commit | ~$0.01/session | $0 |
| Offline | Yes (except sync + summarization) | Fully offline | Yes (except summarization) | No | Fully offline |
| Cross-device | Yes (via cloud sync) | No (manual export) | No (manual export) | Built-in | No (manual) |
| Multi-daemon | Yes (row-level sync) | No | No | Yes | No |
| Complexity | 2 tools, ~400 LOC | 2 tools, ~200 LOC | 5 tools, ~400 LOC | 8+ tables, 3 subagents | 1 file, built-in |
| Self-healing | Yes (session start catch-up) | No | Orphaned observations persist | Server-side | No |
| Scales | Millions of rows + compaction | Millions of rows | Millions of rows | Unlimited | ~200 lines |
Implementation Plan
Phase 1: Core (MVP) ✅
Goal: Agent can save and search memories from local SQLite. Context loading uses raw entries (no summarization yet).
| Task | Files | Effort |
|---|---|---|
| Add better-sqlite3 dependency to daemon | packages/daemon/package.json | Trivial |
| Create MemoryEntry, SummaryEntry types | packages/daemon/src/memory/types.ts | Trivial |
| Implement SQLite init + schema creation | packages/daemon/src/memory/db.ts | Small |
| Implement memory_save tool | packages/daemon/src/tools/memory-save.ts | Small |
| Implement memory_search tool (FTS5) | packages/daemon/src/tools/memory-search.ts | Small |
| Implement raw-entry memory loader (fallback mode) | packages/daemon/src/memory/loader.ts | Medium |
| Inject <memory> block into system prompt | packages/daemon/src/agent/loop.ts | Small |
| Add memory instructions to system prompt | packages/daemon/src/agent/agents.ts | Small |
| Token estimation utility | packages/daemon/src/memory/tokens.ts | Small |
Phase 2: Summary Priming ✅
Goal: Context loading uses AI-generated summaries instead of raw entries. Summarization triggers on entry count threshold and session end.
| Task | Files | Effort |
|---|---|---|
| Implement generateIncrementalSummary (Haiku call) | packages/daemon/src/memory/summarize.ts | Medium |
| Implement summarizeEntries (core function) | summarize.ts | Medium |
| Implement entry count threshold trigger | summarize.ts | Small |
| Implement session-end summarization hook | summarize.ts | Small |
| Update loader to use summary-primed loading | loader.ts | Medium |
| Implement fallback to raw entries when no summaries exist | loader.ts | Small |
Phase 3: Self-Healing + Periodic ✅
Goal: System handles crashes, long sessions, and edge cases gracefully.
| Task | Files | Effort |
|---|---|---|
| Implement session-start catch-up summarization | summarize.ts | Small |
| Implement periodic interval timer (5 min) | summarize.ts | Small |
| Supersede pattern (update existing memories) | db.ts | Small |
| Dedup check before saving (FTS5 similarity) | db.ts | Small |
| Rate limiting (max 5 saves per turn) | memory-save.ts | Small |
Phase 4: Daily Compaction
Goal: System handles long-term growth. Incremental summaries are rolled up into compacted summaries daily.
Status: Not implemented. Deferred — intended to run on the centralized cloud computer. Without this, incremental summaries accumulate over weeks/months and eventually consume too much context budget. The existing loader.ts already supports consuming compacted summaries — Phase 4 only needs to produce them.
Why it matters: After 6 months of daily use, context loading should still involve one compacted summary + a few recent incrementals — not hundreds of summaries.
| Task | Files | Effort |
|---|---|---|
| Implement generateCompactionSummary (Haiku call, max 500 tokens) | packages/daemon/src/memory/summarize.ts | Small |
| Implement dailyCompaction (query incrementals > 24h old, AI-merge, atomic insert + delete) | summarize.ts | Medium |
| Register daily cron trigger in daemon lifecycle (e.g., setInterval at 24h or integrate with ScheduleExecutor) | packages/daemon/src/index.ts | Small |
Key behaviors:
- Only compacts incremental summaries older than 24 hours
- Requires at least 2 incrementals to trigger (otherwise no-op)
- Atomic SQLite transaction: insert compacted summary + delete old incrementals
- Old incrementals are deleted (raw entries they summarized remain in the `memories` table)
- Cost: ~$0.0003 per compaction (Haiku, ~1500 input tokens, ~300 output tokens)
- Reference implementations: `dailyCompaction()` and `generateCompactionSummary()` in the AI Summarization Prompts section (daemon wiring sketched after this list)
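One possible wiring for the daily trigger — a plain 24-hour interval in the daemon lifecycle. This is a sketch; integrating with ScheduleExecutor would replace it:

```typescript
const COMPACTION_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24 hours

let compactionTimer: NodeJS.Timeout | null = null;

function startDailyCompaction(): void {
  compactionTimer = setInterval(() => {
    // Fire-and-forget — compaction failures are logged, never fatal
    dailyCompaction().catch(err =>
      console.error('[memory] Daily compaction failed:', err)
    );
  }, COMPACTION_INTERVAL_MS);
}

function stopDailyCompaction(): void {
  if (compactionTimer) {
    clearInterval(compactionTimer);
    compactionTimer = null;
  }
}
```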
Phase 5: Auto-Save Hooks ✅
Goal: Agent automatically saves memories on significant events without explicit tool calls.
Status: Implemented. Hooks subscribe to SessionStateChanged events on the daemon event bus, fire concurrently via Promise.allSettled on session completion, and use Haiku for summarization.
| Task | Files | Status |
|---|---|---|
| Create hooks module with event bus subscription | packages/daemon/src/memory/hooks.ts | Done |
| Task completion summary (Haiku + digest builder) | hooks.ts | Done |
| User correction detection (last 6 messages → Haiku) | hooks.ts | Done |
| Error resolution detection (heuristic + Haiku) | hooks.ts | Done |
| Significance filtering (tool call + message thresholds) | hooks.ts | Done |
| Dedup check (skip if agent saved 2+ memories) | hooks.ts | Done |
| Export from memory index | packages/daemon/src/memory/index.ts | Done |
| Wire into daemon lifecycle (init + cleanup) | packages/daemon/src/index.ts | Done |
Phase 6: Cloud Sync
Goal: Multiple daemons (different devices) can share the same memory via Supabase.
Status: Not implemented. Deferred — requires the ctx0 Drizzle schema (Supabase side) and proxy DB infrastructure. The sync_meta table already exists in the local SQLite schema. The existing setMemoryToolDeps pattern for signing provider/config can be extended to include getDb for the ProxiedSupabaseClient.
Note on dependency injection: Phase 6 uses a separate getDb: () => ProxiedSupabaseClient | null dep (following the todo.ts/plan.ts pattern), NOT the existing MemoryToolDeps which carries getSigningProvider/getConfig for Haiku calls. The cloud sync deps wire into the Daemon constructor alongside setTodoToolDeps and setPlanToolDeps.
Note on auto-save hook integration: Auto-saved memories (Phase 5, source: 'auto') should also be pushed to cloud. The syncMemoryToCloud() call should be added alongside the insertMemory() calls in hooks.ts, or the push can be wired at the insertMemory() level in db.ts to cover all sources.
| Task | Files | Effort |
|---|---|---|
| Add ctx0_memories Drizzle table + register | packages/ctx0/src/schema/memory.ts, index.ts | Small |
| Create packages/daemon/src/memory/sync.ts with MemorySyncDeps interface | sync.ts (new) | Trivial |
| Implement syncMemoryToCloud() (fire-and-forget upsert) | sync.ts | Small |
| Implement pullMemoriesFromCloud() (watermark-based incremental pull) | sync.ts | Medium |
| Implement bootstrapFromCloud() (new device full pull, paginated) | sync.ts | Medium |
| Implement syncSupersedeToCloud() (push supersede updates) | sync.ts | Small |
| Wire push into memory_save tool (fire-and-forget .catch(() => {})) | packages/daemon/src/tools/memory-save.ts | Trivial |
| Wire push into auto-save hooks (fire-and-forget) | packages/daemon/src/memory/hooks.ts | Trivial |
| Wire pull/bootstrap into initMemoryWithSync() at session start | packages/daemon/src/index.ts | Small |
| Wire MemorySyncDeps in Daemon constructor | packages/daemon/src/index.ts | Trivial |
Auto-Save Hooks
When a session completes, lightweight hooks automatically create memories for significant events — without relying on the LLM to explicitly call memory_save. This ensures the memory system captures useful information even when the agent doesn't think to save it.
Architecture
Session completes → SessionStateChanged event (bus)
│
hooks.ts subscriber
│
┌───────────┼───────────┐
│ │ │
▼ ▼ ▼
Task Summary Correction Error Resolution
(Haiku) (Haiku) (heuristic+Haiku)
│ │ │
▼ ▼ ▼
insertMemory(source: 'auto')
│
▼
checkSummarizationTrigger()
All three hooks run concurrently via Promise.allSettled. Each is fire-and-forget — failures are logged but never block the session or each other.
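A sketch of the subscriber's shape. The event name follows the text above, but the bus handle, payload fields, and per-hook function names are assumptions for illustration:

```typescript
// Assumed shapes — the real event type and bus live in the daemon
interface SessionStateChangedEvent {
  sessionId: string;
  from: string;
  to: string;
}
declare const bus: {
  on(event: 'SessionStateChanged', fn: (e: SessionStateChangedEvent) => void): void;
};
declare function taskSummaryHook(sessionId: string): Promise<void>;
declare function correctionHook(sessionId: string): Promise<void>;
declare function errorResolutionHook(sessionId: string): Promise<void>;

bus.on('SessionStateChanged', (event) => {
  if (event.from !== 'working' || event.to !== 'completed') return;

  // Fire-and-forget: hooks run concurrently; a failure in one never affects the others
  void Promise.allSettled([
    taskSummaryHook(event.sessionId),
    correctionHook(event.sessionId),
    errorResolutionHook(event.sessionId),
  ]).then((results) => {
    for (const r of results) {
      if (r.status === 'rejected') {
        console.error('[memory] Auto-save hook failed:', r.reason);
      }
    }
  });
});
```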
Why Event Bus Over Direct IPC Call
- Decoupled — hooks know nothing about sockets or IPC
- Universal — fires for local, remote, scheduled, and trigger sessions
- `handleRunTask` in server.ts is already 600+ lines; adding more logic there increases coupling
Significance Filtering (Avoid Noise)
Before running any hooks, the system applies filters to avoid saving trivial conversations:
| Filter | Threshold | Rationale |
|---|---|---|
| Tool call count | < 2 tool calls | Conversations with no tool use are typically greetings or simple Q&A |
| User message count | < 3 user messages | Combined with above (both must be below threshold to skip) |
| Agent self-saves | ≥ 2 memory_save calls | Agent was already deliberate about saving — don't duplicate |
| Haiku "SKIP"/"NONE" | Per-hook | Final safety net — Haiku decides if the content is worth remembering |
| Inflight guard | Set per session ID | Prevents duplicate processing if event fires twice |
Hook 1: Task Completion Summary
Builds a condensed digest of the session (first user message, tools used, error count, final assistant response, truncated to ~2000 chars) and asks Haiku for a 1-2 sentence factual summary. Haiku responds "SKIP" if trivial.
Saved as: { tags: ['session_summary'], source: 'auto' }
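A sketch of the digest builder under an assumed message shape — the real types live in the daemon's session model, so the field names here are illustrative:

```typescript
// Assumed message shape — the real transcript type lives in SessionManager
interface SessionMessage {
  role: 'user' | 'assistant' | 'tool_result';
  content: string;
  toolName?: string;
  toolError?: boolean;
}

function buildSessionDigest(messages: SessionMessage[]): string {
  const firstUser = messages.find(m => m.role === 'user')?.content ?? '';
  const toolsUsed = [...new Set(messages.filter(m => m.toolName).map(m => m.toolName))];
  const errorCount = messages.filter(m => m.toolError).length;
  const finalAssistant = [...messages].reverse().find(m => m.role === 'assistant')?.content ?? '';

  const digest = [
    `Request: ${firstUser}`,
    `Tools: ${toolsUsed.join(', ') || 'none'}`,
    `Errors: ${errorCount}`,
    `Outcome: ${finalAssistant}`,
  ].join('\n');

  return digest.slice(0, 2000); // truncate to ~2000 chars per the spec
}
```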
Hook 2: User Correction Detection
Only runs if the session had 3+ user messages (corrections require back-and-forth). Extracts the last 6 messages and asks Haiku: "Did the user correct the agent? If yes, state the preference. If no, respond 'NONE'."
Saved as: { tags: ['preference', 'correction'], source: 'auto' }
Hook 3: Error Resolution Detection
Heuristic scan: looks for tool_result messages with toolError set, followed by a later successful result from the same tool. If the pattern is found, asks Haiku to summarize the error → fix pattern in one sentence.
Saved as: { tags: ['debug', 'error_resolution'], source: 'auto' }
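A sketch of that heuristic scan, reusing the assumed `SessionMessage` shape from the digest sketch above:

```typescript
// Heuristic: a failed tool_result followed later by a success from the same tool
function findErrorResolutions(
  messages: SessionMessage[],
): Array<{ tool: string; error: string }> {
  const resolutions: Array<{ tool: string; error: string }> = [];
  const pendingErrors = new Map<string, string>(); // tool name → error content

  for (const m of messages) {
    if (m.role !== 'tool_result' || !m.toolName) continue;
    if (m.toolError) {
      pendingErrors.set(m.toolName, m.content);
    } else if (pendingErrors.has(m.toolName)) {
      // Same tool later succeeded → the error was resolved during the session
      resolutions.push({ tool: m.toolName, error: pendingErrors.get(m.toolName)! });
      pendingErrors.delete(m.toolName);
    }
  }
  return resolutions;
}
```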
Haiku Access
Uses createProxiedModel(config, signingProvider, HAIKU_MODEL_ID, 'memory-auto-save') — same pattern as memory/summarize.ts. Requires authentication (no-ops silently if not authenticated).
Cost
~$0.0005–$0.001 per session (1-2 Haiku calls, ~500 input tokens + 50-100 output tokens each). Most sessions trigger 1 call (task summary). Correction and error hooks only fire when their heuristics match.
Files
| File | Description |
|---|---|
| packages/daemon/src/memory/hooks.ts | Hook implementations, event bus subscription, lifecycle |
| packages/daemon/src/memory/index.ts | Exports initAutoSaveHooks, stopAutoSaveHooks, AutoSaveHookDeps |
| packages/daemon/src/index.ts | Wires initAutoSaveHooks in Daemon.initialize(), stopAutoSaveHooks in Daemon.cleanup() |
Dependency Injection
```typescript
export interface AutoSaveHookDeps {
  getSessionManager: () => SessionManager | null;
  getSigningProvider: () => SigningProvider | null;
  getConfig: () => DaemonConfig;
}
```
Wired in Daemon.initialize():
```typescript
initAutoSaveHooks({
  getSessionManager: () => this.getOrCreateSessionManager(),
  getSigningProvider: () => this.signingProvider,
  getConfig: () => loadConfig(),
});
```
Cloud Sync
When a user runs multiple daemons (e.g., laptop + desktop), each daemon should see the same memories. Cloud sync makes this work by using Supabase as a shared merge point while keeping local SQLite as the fast path.
Design Principles
- **Local-first, cloud-optional** — All reads and writes hit local SQLite (<1ms). Cloud sync is fire-and-forget. If the user isn't authenticated or the cloud is unreachable, the system works exactly as before.
- **Append-only sync** — Memories use nanoid primary keys, so two daemons can never produce the same ID. This means INSERT OR IGNORE handles all deduplication with zero conflicts.
- **Memories sync, summaries don't** — Raw memory entries are the source of truth and sync bidirectionally. Summaries are local-only — each daemon generates its own from its merged memory set. This avoids summary conflicts entirely.
- **Timestamp watermark** — Each daemon tracks when it last synced. On pull, it only fetches rows newer than its watermark. Efficient even with thousands of memories.
Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│ MULTI-DAEMON CLOUD SYNC │
│ │
│ DAEMON A (laptop) DAEMON B (desktop) │
│ ~/.bot0/memory.db ~/.bot0/memory.db │
│ │
│ memory_save("user prefers Bun") memory_save("deploy key in 1PW") │
│ │ │ │
│ │ 1. INSERT locally (<1ms) │ 1. INSERT locally │
│ │ 2. Push to cloud (fire & forget) │ 2. Push to cloud │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Supabase: ctx0_memories │ │
│ │ │ │
│ │ id │ content │ synced_at │ │
│ │ m_a1b2c3 │ "user prefers Bun" │ 2026-02-27 │ │
│ │ m_x7y8z9 │ "deploy key in 1PW" │ 2026-02-27 │ │
│ │ │ │
│ │ Proxy auto-injects user_id on all queries │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │ │
│ │ 3. Pull on session start │ 3. Pull on session start│
│ │ (WHERE synced_at > last_sync_ts) │ │
│ ▼ ▼ │
│ memory.db now has both memories memory.db now has both memories │
│ → local summarization catches up → local summarization catches up │
│ │
└──────────────────────────────────────────────────────────────────────────┘
Supabase Table: ctx0_memories
Mirrors the local memories table with user_id for RLS. Defined as a Drizzle schema in packages/ctx0/src/schema/memory.ts:
```typescript
export const ctx0Memories = pgTable('ctx0_memories', {
  id: text('id').notNull(),
  userId: uuid('user_id').notNull().references(() => ctx0Users.id),
  content: text('content').notNull(),
  tags: jsonb('tags').notNull().default([]),
  source: text('source').notNull().default('agent'),
  project: text('project'),
  pinned: boolean('pinned').notNull().default(false),
  supersededBy: text('superseded_by'),
  createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
  syncedAt: timestamp('synced_at', { withTimezone: true }).notNull().defaultNow(),
}, (table) => ({
  pk: primaryKey({ columns: [table.id, table.userId] }),
  userSyncIdx: index('idx_ctx0_memories_user').on(table.userId, table.syncedAt),
}));
```
Equivalent SQL:
```sql
CREATE TABLE ctx0_memories (
  id            TEXT NOT NULL,
  user_id       UUID NOT NULL REFERENCES auth.users(id),
  content       TEXT NOT NULL,
  tags          JSONB NOT NULL DEFAULT '[]'::jsonb,
  source        TEXT NOT NULL DEFAULT 'agent',
  project       TEXT,
  pinned        BOOLEAN NOT NULL DEFAULT false,
  superseded_by TEXT,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  synced_at     TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  PRIMARY KEY (id, user_id)
);

CREATE INDEX idx_ctx0_memories_user ON ctx0_memories(user_id, synced_at DESC);
```
Dependency Injection
Follows the todo.ts pattern — the daemon injects its ProxiedSupabaseClient at startup:
```typescript
interface MemoryToolDeps {
  getDb: () => ProxiedSupabaseClient | null;
}

let _deps: MemoryToolDeps | null = null;

export function setMemoryToolDeps(deps: MemoryToolDeps): void {
  _deps = deps;
}
```
If _deps is null or getDb() returns null (user not authenticated), all sync operations silently no-op. The system is fully functional in local-only mode.
Push: Fire-and-Forget After Save
After every local memory_save, push the new row to cloud. Non-blocking, non-fatal.
```typescript
async function syncMemoryToCloud(memory: {
  id: string;
  content: string;
  tags: string[];
  source: string;
  project: string | null;
  pinned: boolean;
}): Promise<void> {
  const db = _deps?.getDb();
  if (!db) return; // Not authenticated — skip silently

  try {
    await db.from('ctx0_memories').upsert({
      id: memory.id,
      content: memory.content,
      tags: memory.tags,
      source: memory.source,
      project: memory.project,
      pinned: memory.pinned,
      synced_at: new Date().toISOString(),
    }, { onConflict: 'id,user_id' });
  } catch (err) {
    console.error('[memory] Cloud push failed:', err);
    // Non-fatal — local SQLite is the source of truth during execution
  }
}
```
Wired into the save path:
```typescript
function memorySave(params: { content: string; tags?: string[]; pin?: boolean }, context) {
  // ... existing local INSERT ...

  // Fire-and-forget cloud push
  syncMemoryToCloud({ id, content, tags, source: context.source, project, pinned: pin })
    .catch(() => {});

  return `Saved: "${content.slice(0, 60)}..."`;
}
```
Push: Supersede Updates
When a memory is superseded locally, push the update:
async function syncSupersedeToCloud(oldId: string, newId: string): Promise<void> {
  const db = _deps?.getDb();
  if (!db) return;

  try {
    await db.from('ctx0_memories')
      .update({ superseded_by: newId, synced_at: new Date().toISOString() })
      .eq('id', oldId);
  } catch (err) {
    console.error('[memory] Cloud supersede sync failed:', err);
  }
}
Pull: Session Start
On session start, pull memories created by other daemons since the last sync. This is the mechanism that gives Daemon B access to Daemon A's memories.
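The pull function below assumes two small pieces of plumbing that aren't shown elsewhere in this section; a sketch of each (the exact shapes are assumptions, mirroring the Drizzle schema and the better-sqlite3 style of the surrounding snippets):

// Row shape returned from ctx0_memories — assumed; mirrors the schema above.
interface CloudMemoryRow {
  id: string;
  content: string;
  tags: string[];
  source: string;
  project: string | null;
  pinned: boolean;
  superseded_by: string | null;
  created_at: string; // ISO timestamp
  synced_at: string;  // ISO timestamp
}

// One-row-per-key watermark table used for last_sync_ts.
localDb.exec(`
  CREATE TABLE IF NOT EXISTS sync_meta (
    key   TEXT PRIMARY KEY,
    value TEXT NOT NULL
  )
`);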
async function pullMemoriesFromCloud(): Promise<number> {
  const db = _deps?.getDb();
  if (!db) return 0;

  try {
    // Read last sync timestamp
    const meta = localDb.prepare(
      'SELECT value FROM sync_meta WHERE key = ?'
    ).get('last_sync_ts') as { value: string } | undefined;
    const lastSyncTs = meta?.value ?? '1970-01-01T00:00:00Z';

    // Fetch new/updated memories from cloud
    const result = await db.from('ctx0_memories')
      .select('id,content,tags,source,project,pinned,superseded_by,created_at,synced_at')
      .gt('synced_at', lastSyncTs)
      .order('synced_at', { ascending: true })
      .limit(1000); // Cap per pull; the watermark advances only to the last
                    // fetched row, so a larger backlog drains over later pulls

    if (!result.data || result.data.length === 0) return 0;
    const rows = result.data as CloudMemoryRow[];

    // INSERT OR IGNORE new memories into local SQLite
    const insertOrIgnore = localDb.prepare(`
      INSERT OR IGNORE INTO memories
        (id, content, tags, source, project, pinned, superseded_by, created_at)
      VALUES (?, ?, ?, ?, ?, ?, ?, ?)
    `);

    // UPDATE superseded_by for existing memories (if cloud has a newer supersede)
    const updateSupersede = localDb.prepare(`
      UPDATE memories SET superseded_by = ?
      WHERE id = ? AND superseded_by IS NULL AND ? IS NOT NULL
    `);

    let newCount = 0;
    const pullTx = localDb.transaction(() => {
      for (const row of rows) {
        const changes = insertOrIgnore.run(
          row.id,
          row.content,
          JSON.stringify(row.tags),
          row.source,
          row.project,
          row.pinned ? 1 : 0,
          row.superseded_by,
          row.created_at,
        ).changes;
        if (changes > 0) newCount++;

        // Apply supersede if it exists on cloud but not locally
        if (row.superseded_by) {
          updateSupersede.run(row.superseded_by, row.id, row.superseded_by);
        }
      }

      // Update watermark
      const latestSyncedAt = rows[rows.length - 1].synced_at;
      localDb.prepare(
        'INSERT OR REPLACE INTO sync_meta (key, value) VALUES (?, ?)'
      ).run('last_sync_ts', latestSyncedAt);
    });
    pullTx();

    if (newCount > 0) {
      console.log(`[memory] Pulled ${newCount} new memories from cloud`);
      // Trigger summarization catch-up for new entries
      checkSummarizationTrigger();
    }
    return newCount;
  } catch (err) {
    console.error('[memory] Cloud pull failed:', err);
    return 0; // Non-fatal — continue with local state
  }
}
New Device Bootstrap
When ~/.bot0/memory.db doesn't exist but the user is authenticated, bootstrap from cloud:
async function bootstrapFromCloud(): Promise<boolean> {
  const db = _deps?.getDb();
  if (!db) return false;

  try {
    // Pull ALL memories for this user (paginated for large sets)
    let offset = 0;
    const pageSize = 500;
    let totalPulled = 0;

    const insertStmt = localDb.prepare(`
      INSERT OR IGNORE INTO memories
        (id, content, tags, source, project, pinned, superseded_by, created_at)
      VALUES (?, ?, ?, ?, ?, ?, ?, ?)
    `);

    while (true) {
      const result = await db.from('ctx0_memories')
        .select('id,content,tags,source,project,pinned,superseded_by,created_at,synced_at')
        .order('created_at', { ascending: true })
        .range(offset, offset + pageSize - 1);

      if (!result.data || result.data.length === 0) break;
      const rows = result.data as CloudMemoryRow[];

      const insertTx = localDb.transaction(() => {
        for (const row of rows) {
          insertStmt.run(
            row.id,
            row.content,
            JSON.stringify(row.tags),
            row.source,
            row.project,
            row.pinned ? 1 : 0,
            row.superseded_by,
            row.created_at,
          );
        }
      });
      insertTx();

      totalPulled += rows.length;
      offset += pageSize;
      if (rows.length < pageSize) break; // Last page
    }

    if (totalPulled > 0) {
      console.log(`[memory] Bootstrapped ${totalPulled} memories from cloud`);
      // Set sync watermark
      localDb.prepare(
        'INSERT OR REPLACE INTO sync_meta (key, value) VALUES (?, ?)'
      ).run('last_sync_ts', new Date().toISOString());
      // Generate summaries for bootstrapped entries
      await summarizeEntries();
    }
    return totalPulled > 0;
  } catch (err) {
    console.error('[memory] Cloud bootstrap failed:', err);
    return false;
  }
}
Session Lifecycle Integration
import { existsSync } from 'node:fs';

// In session start (packages/daemon/src/session/manager.ts or memory init)
async function initMemoryWithSync(): Promise<void> {
  const isNewDb = !existsSync(DB_PATH);

  // Initialize local SQLite (creates tables if needed)
  initMemoryDb();

  if (isNewDb) {
    // New device — try bootstrapping from cloud
    await bootstrapFromCloud();
  } else {
    // Existing device — pull new memories from other daemons
    await pullMemoriesFromCloud();
  }

  // Existing session start logic: catch-up summarization, etc.
  await onSessionStart();
}
Conflict Resolution
| Scenario | Resolution | Why it works |
|---|---|---|
| Two daemons save different memories | No conflict — different nanoid PKs | INSERT OR IGNORE on both sides |
| Both daemons supersede different memories | Both supersedes apply independently | Each targets a different id |
| Both daemons supersede the SAME memory | Last-write-wins on superseded_by | Rare, harmless — both point to valid replacements |
| Summary divergence between daemons | No conflict — summaries are local-only | Each daemon summarizes its own merged set |
| Daemon offline for days | Pulls all missed rows via timestamp watermark | synced_at > last_sync_ts catches everything |
| Cloud unreachable on push | Fire-and-forget, memory exists locally | Retry happens on next save or session start |
| Cloud unreachable on pull | Skip, use local state | System works fully offline |
| User not authenticated | All sync operations silently no-op | _deps?.getDb() returns null |
| Very large memory set (10K+) | Paginated pull with batch INSERT OR IGNORE | range() pagination, transaction batches |
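Most of these resolutions reduce to idempotent local writes. A quick illustrative check — a hypothetical snippet reusing the pull path's insertOrIgnore statement, with values borrowed from the diagram above:

// Replaying a row that already exists locally is a no-op: the OR IGNORE
// conflict clause skips the duplicate primary key and reports 0 changes.
const args = ['m_a1b2c3', 'user prefers Bun', '[]', 'agent', null, 0, null,
              '2026-02-27T00:00:00Z'] as const;
const first  = insertOrIgnore.run(...args).changes; // 1 — new row inserted
const replay = insertOrIgnore.run(...args).changes; // 0 — duplicate ignored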
Cost Analysis
| Operation | Network calls | Latency | Notes |
|---|---|---|---|
| Push (per memory_save) | 1 upsert | Fire-and-forget (~20-50ms) | Non-blocking |
| Pull (per session start) | 1 select | ~50-150ms | Only fetches new rows |
| Bootstrap (new device) | N selects (paginated) | ~100-500ms per page | One-time |
| Supersede sync | 1 update | Fire-and-forget (~20-50ms) | Rare |
Supabase cost for a typical user:
- ~50 memories/day × 30 days = 1,500 rows/month
- Each row ≈ 200 bytes → ~300 KB/month storage
- ~60 push operations/day + ~2 pull operations/day → negligible
Future Extensions
Vector Search Upgrade
When logs grow past ~1000 entries, add local vector search:
- `sqlite-vec` extension (SQLite-native vector similarity)
- Generate embeddings via a local model (ONNX) or an API call
- Add an `embedding BLOB` column to the `memories` table
- Use for `memory_search` alongside FTS5 (see the sketch below)
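A minimal sketch of what the query side could look like under those bullets, assuming sqlite-vec is loaded into the existing better-sqlite3 connection and float32 vectors are stored in the embedding column (the vectorSearch helper and the 10-row limit are illustrative, not part of the spec):

// Hypothetical sketch — brute-force cosine ranking over embedding BLOBs.
import * as sqliteVec from 'sqlite-vec';

sqliteVec.load(localDb); // Registers vec_distance_cosine() etc.

function vectorSearch(queryEmbedding: Float32Array, limit = 10) {
  // Linear scan is fine at this scale (~1000s of rows); an ANN index
  // only becomes worth it much later.
  return localDb.prepare(`
    SELECT id, content, vec_distance_cosine(embedding, ?) AS distance
    FROM memories
    WHERE embedding IS NOT NULL
    ORDER BY distance
    LIMIT ?
  `).all(Buffer.from(queryEmbedding.buffer), limit);
}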
Memory CLI
# List recent memories
bot0 memory list --limit 20

# Search
bot0 memory search "deploy key"

# Export
bot0 memory export --format json > backup.json
bot0 memory export --format md > readable.md

# Import
bot0 memory import backup.json

# Stats
bot0 memory stats
# → 342 memories, 8 summaries, 156 KB, oldest: 2025-12-01
Graduation to Full Vault
When a user outgrows the flat log:
- Each memory → `ctx0_entries` with path derived from tags
- Tags map to folders: `preference` → `/preferences/`, `contact` → `/contacts/`
- Summaries → high-level vault entries
- SQLite stays as local cache, vault becomes primary
Related Documentation
- ctx0 System Architecture — Full vault architecture
- ctx0 Sessions — Session/conversation storage
- Flat Memory Log — Original flat log spec (predecessor)
- Git-Like Context Controller — Git-like alternative (evaluated, complexity rejected)
- ctx0 Supabase Data Architecture — Database schema