ctx0 — Persistent Memory
Overview
Persistent Memory is a local-first, summary-primed memory system for AI agents — combining the simplicity of a flat memory log with AI-generated summaries for low-cost context loading. All data lives in a single SQLite file on the user's machine. The agent writes memories freely, searches them on demand via FTS5 full-text search, and starts each session with a concise AI-generated summary instead of thousands of tokens of raw entries.
Core insight: Writing memories should be dead simple (single INSERT). Loading memories into context should be as small as possible (AI summary, not raw entries). Searching memories should be accurate and fast (FTS5 over raw content). These three operations have different needs — optimize each independently.
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ Full vault: Agent → ctx0_remember → DB write → curator agent → │
│ vault tree → embedding queue → DB read → vector search │
│ → agent │
│ (6 hops, 2 subagents, network I/O, DB costs) │
│ │
│ This system: Agent → db.insert() → done │
│ Session start → load summary (~300 tokens) │
│ Need details? → FTS5 search (<1ms) │
│ (2 tools, 0 subagents, local disk only) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Why This Approach
This architecture was chosen after evaluating two alternatives:
| Criterion | Flat Memory Log | Git-Like Context Controller | This System (Hybrid) |
|---|---|---|---|
| Simplicity | 2 tools, ~200 LOC | 5 tools, ~400 LOC | 2 tools, ~300 LOC |
| Context cost | ~4,000 tokens (raw entries) | ~130-330 tokens (AI summary) | ~300-500 tokens (AI summary) |
| Retrieval accuracy | High (raw content + FTS5) | Lower (summaries are lossy) | High (raw content + FTS5) |
| Robustness | No external dependencies | Requires Haiku for every commit | Haiku optional, graceful fallback |
| Complexity | Simple flat list | DAG with branches, commits, merges | Simple flat list + summaries |
The flat log gives us simplicity and accurate retrieval. The git controller gives us low context cost via AI summaries. This hybrid takes both strengths and discards the DAG/branching complexity (which adds cognitive load for the agent without proportional value in a single-user, single-agent MVP).
Why Local SQLite
| Concern | Local SQLite | Remote DB (Supabase) | Flat File (JSONL) |
|---|---|---|---|
| Read latency | <1ms (indexed query) | 50-150ms (network) | <1ms (small), linear growth |
| Write latency | <1ms (INSERT) | 20-50ms (network) | <1ms (append) |
| Update/delete | UPDATE/DELETE | UPDATE/DELETE | Full file rewrite |
| Filtering | SQL WHERE + indexes | SQL WHERE + indexes | Parse all → JS filter |
| Full-text search | FTS5 (built-in, fast) | pg_trgm / tsvector | String.includes() |
| Crash safety | WAL mode, transactions | Server-side | Hope append was atomic |
| Concurrent access | WAL mode (built-in) | Server handles | Manual file locking |
| Cost | $0 | Per-read/write billing | $0 |
| Offline | Yes | No | Yes |
| Privacy | Data never leaves machine | Remote server | Data never leaves machine |
| Disk footprint | Single .db file | N/A | 3 files (log + summary + backup) |
| Human-inspectable | sqlite3 CLI / DB Browser | psql / dashboard | Text editor |
| Cross-device sync | Via cloud sync (row-level) | Built-in | Manual |
| Scales | Millions of rows | Unlimited | ~1000 lines before sluggish |
SQLite is the sweet spot: all the local-first benefits of flat files, plus real querying, indexing, atomic writes, and crash safety. One file on disk (memory.db), zero network, zero cost.
better-sqlite3 is synchronous (no async overhead), battle-tested, and used by Turso, Obsidian, 1Password, and thousands of Electron apps.
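For illustration, a minimal sketch of what that synchronous API looks like in practice — every call returns its result directly, with no await or callbacks. The throwaway path and table here are illustrative, not part of the real schema:

```typescript
import Database from 'better-sqlite3';

// Illustrative throwaway database — the real system uses ~/.bot0/memory.db
const db = new Database('/tmp/demo.db');
db.pragma('journal_mode = WAL');

db.exec('CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)');
db.prepare('INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)').run('greeting', 'hello');

// .get() returns the row synchronously — no promise involved
const row = db.prepare('SELECT value FROM kv WHERE key = ?').get('greeting');
console.log(row); // { value: 'hello' }
```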
Related documentation:
- ctx0 System Architecture — Full vault architecture (for comparison)
- ctx0 Sessions — Session/conversation storage
- Flat Memory Log — Original flat log spec
- Git-Like Context Controller — Git-like alternative
Design Principles
- **Local-first, cloud-synced** — Memory lives on the user's machine as a single SQLite file. All reads and writes are local (<1ms). Cloud sync is optional and fire-and-forget — if the user has multiple daemons, memories sync via Supabase. If offline or unauthenticated, the system works identically without sync.
- **Summary-primed** — Sessions start with a concise AI-generated summary (~300-500 tokens), not thousands of tokens of raw entries. The summary is the "big picture." Raw entries are the "source of truth."
- **Search-accurate** — When the summary isn't enough, the agent searches raw entries via FTS5. Full-text search over actual content is always more accurate than searching summaries.
- **Zero-subagent** — No curator, no librarian, no extractor. The main agent reads and writes memory directly. Summarization runs in the background, not as a blocking subagent.
- **Save liberally, load smartly** — Writing is cheap (single INSERT, <1ms). Loading is where the intelligence lives (what to summarize, when to summarize, how to prime).
- **Graceful degradation** — If AI summarization fails (network down, API error), fall back to loading raw entries with a token budget. The system never breaks — it just gets temporarily less efficient.
- **Single-file simplicity** — One `memory.db` file. Copy it to a new machine, back it up, inspect it with standard tools.
Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ PERSISTENT MEMORY SYSTEM │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ MAIN AGENT │ │
│ │ bot0 daemon │ Claude Code │ Cursor │ any agent │ │
│ │ │ │
│ │ Tools: │ │
│ │ ├── memory_save(content, tags?, pin?) │ │
│ │ └── memory_search(query, tags?, project?) │ │
│ │ │ │
│ │ Auto-injected at session start: │ │
│ │ └── <memory> block in system prompt (summary + pinned entries) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │ │
│ Local disk │ │ Haiku (background) │
│ ▼ ▼ │
│ ┌──────────────────────────┐ ┌──────────────────────────────────┐ │
│ │ LOCAL SQLITE STORAGE │ │ AI SUMMARIZATION (async) │ │
│ │ │ │ │ │
│ │ ~/.bot0/memory.db │ │ Triggered by: │ │
│ │ │ │ • Entry count threshold (20+) │ │
│ │ ┌─────────────────────┐ │ │ • Session end │ │
│ │ │ memories │ │ │ • Session start (catch-up) │ │
│ │ │ memories_fts (FTS5) │ │ │ • Periodic interval (5 min) │ │
│ │ │ memory_summaries │ │ │ • Daily compaction cron │ │
│ │ │ sync_meta │ │ │ • Auto-save hooks (bus event) │ │
│ │ └─────────────────────┘ │ │ Cost: ~$0.0001/summary (Haiku) │ │
│ │ │ │ Fallback: load raw if fails │ │
│ │ WAL mode, <1ms r/w │ └──────────────────────────────────┘ │
│ └────────────┬─────────────┘ │
│ │ │
│ │ Cloud sync (fire-and-forget push, session-start pull) │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ SUPABASE: ctx0_memories (optional, multi-daemon sync) │ │
│ │ │ │
│ │ Daemon A (laptop) ──push──▶ ctx0_memories ◀──push── Daemon B │ │
│ │ Daemon A ◀──pull (session start)──▶ Daemon B │ │
│ │ │ │
│ │ • Row-level sync (append-only, nanoid PKs = no conflicts) │ │
│ │ • Memories sync bidirectionally; summaries stay local-only │ │
│ │ • Graceful degradation: unauthenticated → local-only mode │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Database Schema
Single SQLite file at ~/.bot0/memory.db, accessed through better-sqlite3's synchronous API with WAL journaling.
memories — The Raw Log
Every memory the agent saves. This is the source of truth — summaries are derived from this.
```sql
CREATE TABLE memories (
  id            TEXT PRIMARY KEY,                 -- nanoid (e.g., "m_a1b2c3")
  content       TEXT NOT NULL,                    -- The memory itself (natural language)
  tags          TEXT NOT NULL DEFAULT '[]',       -- JSON array of strings
  source        TEXT NOT NULL DEFAULT 'agent',    -- 'user' | 'agent' | 'auto'
  project       TEXT,                             -- Project context (null = global)
  pinned        INTEGER NOT NULL DEFAULT 0,       -- 1 = always load into context
  superseded_by TEXT,                             -- ID of newer version (soft-update)
  summarized    INTEGER NOT NULL DEFAULT 0,       -- 1 = included in a summary
  created_at    TEXT NOT NULL DEFAULT (datetime('now'))  -- ISO 8601
);

-- Loading: pinned memories (always loaded first)
CREATE INDEX idx_memories_pinned ON memories(pinned, created_at DESC)
  WHERE pinned = 1 AND superseded_by IS NULL;

-- Loading: recent active memories
CREATE INDEX idx_memories_active ON memories(created_at DESC)
  WHERE superseded_by IS NULL;

-- Loading: project-specific memories
CREATE INDEX idx_memories_project ON memories(project, created_at DESC)
  WHERE project IS NOT NULL AND superseded_by IS NULL;

-- Summarization: unsummarized entries
CREATE INDEX idx_memories_unsummarized ON memories(created_at ASC)
  WHERE summarized = 0 AND superseded_by IS NULL;

-- Search: by source
CREATE INDEX idx_memories_source ON memories(source);
```
memories_fts — Full-Text Search
SQLite FTS5 gives us fast, ranked full-text search over memory content — no external dependencies, no embedding pipelines.
```sql
-- FTS5 virtual table for content search
CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  content=memories,
  content_rowid=rowid
);

-- Keep FTS in sync with triggers
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;

CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.rowid, old.content);
END;

CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.rowid, old.content);
  INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
END;
```
memory_summaries — AI-Generated Summaries
Summaries are the primary artifact loaded into context at session start. Two types:
- incremental — Summarizes a batch of recent entries (created throughout the day)
- compacted — Rolls up multiple incremental summaries into a higher-level digest (created by daily cron)
```sql
CREATE TABLE memory_summaries (
  id           TEXT PRIMARY KEY,                    -- nanoid (e.g., "ms_x7y8z9")
  type         TEXT NOT NULL DEFAULT 'incremental', -- 'incremental' | 'compacted'
  summary      TEXT NOT NULL,                       -- AI-generated summary text
  entry_count  INTEGER NOT NULL,                    -- How many entries/summaries were summarized
  entry_ids    TEXT NOT NULL DEFAULT '[]',          -- JSON array of memory IDs included
  period_start TEXT NOT NULL,                       -- Earliest entry timestamp
  period_end   TEXT NOT NULL,                       -- Latest entry timestamp
  created_at   TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX idx_summaries_type ON memory_summaries(type, created_at DESC);
CREATE INDEX idx_summaries_period ON memory_summaries(period_end DESC);
```
sync_meta — Cloud Sync State
Tracks sync watermarks for the cloud sync system. See Cloud Sync for details.
```sql
CREATE TABLE sync_meta (
  key   TEXT PRIMARY KEY,   -- 'last_sync_ts', 'device_id'
  value TEXT NOT NULL
);
```
Initialization
```typescript
import Database from 'better-sqlite3';
import { join } from 'path';
import { homedir } from 'os';

const DB_PATH = join(homedir(), '.bot0', 'memory.db');

function initMemoryDb(): Database.Database {
  const db = new Database(DB_PATH);

  // WAL mode for better concurrent read/write performance
  db.pragma('journal_mode = WAL');

  // Run schema creation (idempotent)
  db.exec(`
    CREATE TABLE IF NOT EXISTS memories (
      id TEXT PRIMARY KEY,
      content TEXT NOT NULL,
      tags TEXT NOT NULL DEFAULT '[]',
      source TEXT NOT NULL DEFAULT 'agent',
      project TEXT,
      pinned INTEGER NOT NULL DEFAULT 0,
      superseded_by TEXT,
      summarized INTEGER NOT NULL DEFAULT 0,
      created_at TEXT NOT NULL DEFAULT (datetime('now'))
    );

    CREATE INDEX IF NOT EXISTS idx_memories_pinned ON memories(pinned, created_at DESC)
      WHERE pinned = 1 AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_active ON memories(created_at DESC)
      WHERE superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_project ON memories(project, created_at DESC)
      WHERE project IS NOT NULL AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_unsummarized ON memories(created_at ASC)
      WHERE summarized = 0 AND superseded_by IS NULL;
    CREATE INDEX IF NOT EXISTS idx_memories_source ON memories(source);

    CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(
      content,
      content=memories,
      content_rowid=rowid
    );

    -- Keep the FTS index in sync (same triggers as the schema section above)
    CREATE TRIGGER IF NOT EXISTS memories_ai AFTER INSERT ON memories BEGIN
      INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
    END;
    CREATE TRIGGER IF NOT EXISTS memories_ad AFTER DELETE ON memories BEGIN
      INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.rowid, old.content);
    END;
    CREATE TRIGGER IF NOT EXISTS memories_au AFTER UPDATE ON memories BEGIN
      INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.rowid, old.content);
      INSERT INTO memories_fts(rowid, content) VALUES (new.rowid, new.content);
    END;

    CREATE TABLE IF NOT EXISTS memory_summaries (
      id TEXT PRIMARY KEY,
      type TEXT NOT NULL DEFAULT 'incremental',
      summary TEXT NOT NULL,
      entry_count INTEGER NOT NULL,
      entry_ids TEXT NOT NULL DEFAULT '[]',
      period_start TEXT NOT NULL,
      period_end TEXT NOT NULL,
      created_at TEXT NOT NULL DEFAULT (datetime('now'))
    );

    CREATE INDEX IF NOT EXISTS idx_summaries_type ON memory_summaries(type, created_at DESC);
    CREATE INDEX IF NOT EXISTS idx_summaries_period ON memory_summaries(period_end DESC);

    CREATE TABLE IF NOT EXISTS sync_meta (
      key TEXT PRIMARY KEY,
      value TEXT NOT NULL
    );
  `);

  return db;
}
```
File on Disk
~/.bot0/
├── config.json # Existing daemon config
└── memory.db # SQLite database (single file)
That's it. One file.
Size estimates:
- 100 memories ≈ 50-100 KB
- 1,000 memories ≈ 500 KB - 1 MB
- 10,000 memories ≈ 5-10 MB
- FTS5 index adds ~30% overhead
SQLite handles millions of rows; storage will never be the bottleneck.
When Memories Are Saved
Memories enter the log through three channels:
1. Explicit User Request
The user says "remember this" or "save this for later."
User: "Remember that the deploy key for staging is in 1Password under 'staging-deploy'"
Agent: [calls memory_save]
→ INSERT into memories: { content: "...", source: "user", tags: ["infra", "credentials"] }
2. Agent Self-Save
The agent encounters information during work that seems worth persisting. The system prompt instructs it to recognize and save:
- Surprising facts — Information that contradicts assumptions or is non-obvious
- Learned patterns — "This codebase uses X pattern for Y"
- User preferences — Implicit preferences revealed through corrections or choices
- Environmental facts — API key locations, deployment targets, team structure
- Decision rationale — Why a particular approach was chosen over alternatives
Agent: [reading code, discovers unconventional pattern]
Agent: [calls memory_save]
→ INSERT: { content: "bot0 wraps Drizzle in OrmClient (packages/db/orm.ts)...",
source: "agent", tags: ["codebase", "pattern"] }
3. Auto-Save on Significant Events
Hooks subscribe to SessionStateChanged events on the daemon event bus. When a session transitions from working → completed, three hooks fire concurrently (fire-and-forget, never blocks the session):
| Event | What Gets Saved | Tags | Example |
|---|---|---|---|
| Task completion | 1-2 sentence summary of what was accomplished | session_summary | "Migrated auth from JWT to session tokens." |
| User correction | The preference/correction as a standalone fact | preference, correction | "User prefers tabs over spaces." |
| Error resolution | Error + fix pattern | debug, error_resolution | "ECONNREFUSED on 5432 → brew services start postgresql" |
All auto-saves use source: 'auto' (vs 'agent' for explicit saves). Significance filtering skips trivial conversations (< 2 tool calls AND < 3 user messages) and sessions where the agent already called memory_save 2+ times. Each hook uses Haiku for summarization with a "SKIP"/"NONE" escape hatch to avoid noise.
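A minimal sketch of that significance filter, assuming a stats object derived from the session transcript — the field and function names here are illustrative, not the actual hooks.ts API:

```typescript
// Hypothetical shape — the real data comes from the daemon's SessionManager
interface SessionStats {
  toolCallCount: number;
  userMessageCount: number;
  agentMemorySaves: number; // explicit memory_save calls during the session
}

function shouldRunAutoSaveHooks(stats: SessionStats): boolean {
  // Trivial conversation: both counts below threshold → skip
  if (stats.toolCallCount < 2 && stats.userMessageCount < 3) return false;

  // Agent was already deliberate about saving → don't duplicate
  if (stats.agentMemorySaves >= 2) return false;

  return true;
}
```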
See Auto-Save Hooks for full architecture details.
Context Loading (Session Start)
At session start, the daemon loads a <memory> block into the system prompt. This uses summary priming — loading the AI-generated summary instead of raw entries, keeping context cost low.
Loading Algorithm
LOAD_MEMORY_CONTEXT(project):
blocks = []
── Phase 1: Pinned memories (always loaded, raw content) ──────────
pinned = SELECT id, content, tags, created_at
FROM memories
WHERE pinned = 1
AND superseded_by IS NULL
ORDER BY created_at DESC
for entry in pinned:
blocks.push({ section: "pinned", content: entry.content })
── Phase 2: Latest compacted summary (big picture) ────────────────
compacted = SELECT summary FROM memory_summaries
WHERE type = 'compacted'
ORDER BY created_at DESC
LIMIT 1
if compacted:
blocks.push({ section: "context", content: compacted.summary })
── Phase 3: Incremental summaries since last compaction ───────────
incrementals = SELECT summary FROM memory_summaries
WHERE type = 'incremental'
AND created_at > COALESCE(
(SELECT MAX(created_at) FROM memory_summaries WHERE type = 'compacted'),
'1970-01-01'
)
ORDER BY created_at DESC
for s in incrementals:
blocks.push({ section: "recent", content: s.summary })
── Phase 4: Unsummarized entries (newest, not yet in any summary) ─
unsummarized = SELECT id, content, created_at
FROM memories
WHERE summarized = 0
AND superseded_by IS NULL
AND pinned = 0
ORDER BY created_at DESC
LIMIT 10
if unsummarized.length > 0:
blocks.push({ section: "latest", content: format_entries(unsummarized) })
RETURN format_memory_block(blocks)
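As a sketch of the final step, here is one way `format_memory_block` could look in TypeScript. The section names follow the pseudocode above; the exact formatting lives in the daemon's loader:

```typescript
interface MemoryBlock {
  section: 'pinned' | 'context' | 'recent' | 'latest';
  content: string;
}

const SECTION_HEADINGS: Record<MemoryBlock['section'], string> = {
  pinned: '## Pinned',
  context: '## Context',
  recent: '## Recent',
  latest: '## Latest (not yet summarized)',
};

function formatMemoryBlock(blocks: MemoryBlock[]): string {
  const lines: string[] = ['<memory>', 'You have persistent memory from previous sessions.'];

  for (const section of ['pinned', 'context', 'recent', 'latest'] as const) {
    const matching = blocks.filter(b => b.section === section);
    if (matching.length === 0) continue;
    lines.push('', SECTION_HEADINGS[section]);
    for (const b of matching) lines.push(b.content);
  }

  lines.push('</memory>');
  return lines.join('\n');
}
```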
Context Window Injection
```
<memory>
You have persistent memory from previous sessions.

## Pinned
- Deploy key for staging is in 1Password under 'staging-deploy' [2026-01-15]
- User prefers Bun over Node for new TypeScript projects [2026-01-20]

## Context
The user works on bot0, a local-first AI agent system using pnpm workspaces
with Turborepo. The daemon is the core component. Authentication uses
hardware-bound device keys via Secure Enclave/TPM 2.0. The project uses
Drizzle ORM for database schema management with Supabase PostgreSQL. The user
prefers TypeScript strict mode and monospace terminal aesthetics.

## Recent
Last few sessions focused on designing a persistent memory system for the
daemon. Evaluated flat memory log vs git-like context controller approaches.
Decided on a hybrid: flat log simplicity with AI summary priming. SQLite with
FTS5 for search.

## Latest (not yet summarized)
- ctx0-persistent-memory.md spec finalized with hybrid approach [2026-02-26]
- Summarization triggers: entry threshold, session end, session start
  catch-up, periodic interval, daily compaction cron [2026-02-26]
</memory>
```
Token Budget
| Component | Tokens | Notes |
|---|---|---|
| Header + section markers | ~30 | <memory> tags, section headings |
| Pinned entries (3-5) | ~100-200 | Raw content, always loaded |
| Compacted summary | ~100-200 | AI-generated, concise |
| Recent incremental summaries (1-3) | ~100-200 | Since last compaction |
| Unsummarized entries (0-10) | ~0-200 | Only if any exist |
| Total | ~300-500 | <1% of 80K context window |
Compare: the original flat log spec loaded ~4,000 tokens of raw entries. This is 8-13x more efficient.
Fallback: No Summaries Yet
If no summaries exist (new user, fresh install), fall back to loading raw entries with a token budget — identical to the original flat log approach:
if no summaries exist:
load pinned (up to 1000 tokens)
load recent entries (up to 2000 tokens)
load project entries (up to 500 tokens)
total: up to ~3500 tokens (one-time cost until first summarization runs)
This ensures the system works immediately on first install, before any summarization has occurred.
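A sketch of that budgeted fallback, assuming the module-level `db` handle from `initMemoryDb` and a rough chars-to-tokens heuristic (a real tokens.ts utility may estimate differently):

```typescript
// Rough heuristic: ~4 chars per token — an assumption, not the real tokenizer
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

function loadRawEntriesWithBudget(budget: number, where: string): string[] {
  // `where` is an internal constant, never user input
  const rows = db.prepare(`
    SELECT content FROM memories
    WHERE ${where} AND superseded_by IS NULL
    ORDER BY created_at DESC
  `).all() as { content: string }[];

  const out: string[] = [];
  let used = 0;
  for (const { content } of rows) {
    const cost = estimateTokens(content);
    if (used + cost > budget) break; // stop once the budget is exhausted
    out.push(content);
    used += cost;
  }
  return out;
}

// Fallback loading with the budgets from the spec above
const pinned = loadRawEntriesWithBudget(1000, 'pinned = 1');
const recent = loadRawEntriesWithBudget(2000, 'pinned = 0');
```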
Agent Tools
memory_save
```typescript
interface MemorySaveTool {
  name: 'memory_save';
  description: 'Save information to your persistent memory. Use when you learn something worth remembering across sessions: user preferences, project patterns, environmental facts, surprising discoveries, or anything the user explicitly asks you to remember.';
  parameters: {
    /** The memory content. Clear, concise, standalone. Max 2000 chars. */
    content: string;
    /** Tags for categorization: "preference", "codebase", "contact", "infra", "decision", "pattern", "debug" */
    tags?: string[];
    /** Pin to always load into context. Use sparingly — pinned entries consume budget every session. */
    pin?: boolean;
  };
}
```
Implementation:
```typescript
import { nanoid } from 'nanoid';

const insertStmt = db.prepare(`
  INSERT INTO memories (id, content, tags, source, project, pinned, created_at)
  VALUES (?, ?, ?, ?, ?, ?, datetime('now'))
`);

function memorySave(
  params: { content: string; tags?: string[]; pin?: boolean },
  context: { source: 'user' | 'agent' | 'auto'; project?: string },
): string {
  const { content, tags = [], pin = false } = params;

  // Enforce max content length (~500 tokens ≈ ~2000 chars)
  if (content.length > 2000) {
    return 'Memory too long. Keep under ~500 tokens (2000 chars). Be more concise.';
  }

  const id = `m_${nanoid(6)}`;
  insertStmt.run(
    id,
    content.trim(),
    JSON.stringify(tags),
    context.source,
    context.project ?? null,
    pin ? 1 : 0,
  );

  // Check if summarization should trigger (entry count threshold)
  checkSummarizationTrigger();

  return `Saved${pin ? ' (pinned)' : ''}: "${content.slice(0, 60)}..."`;
}
```
memory_search
Full-text search over the raw memory log using FTS5. Available when the pre-loaded summary isn't detailed enough.
```typescript
interface MemorySearchTool {
  name: 'memory_search';
  description: 'Search your memory for specific information. Use when the summary in your context doesn\'t contain what you need — this searches raw entries for exact details.';
  parameters: {
    /** Search query — natural language or keywords */
    query: string;
    /** Filter by tags */
    tags?: string[];
    /** Filter by project */
    project?: string;
    /** Maximum results (default 10) */
    limit?: number;
  };
}
```
Implementation:
```typescript
const ftsSearchStmt = db.prepare(`
  SELECT m.id, m.content, m.tags, m.project, m.created_at
  FROM memories m
  JOIN memories_fts fts ON m.rowid = fts.rowid
  WHERE memories_fts MATCH ?
    AND m.superseded_by IS NULL
  ORDER BY rank
  LIMIT ?
`);

const keywordSearchStmt = db.prepare(`
  SELECT id, content, tags, project, created_at
  FROM memories
  WHERE content LIKE ?
    AND superseded_by IS NULL
  ORDER BY created_at DESC
  LIMIT ?
`);

function memorySearch(params: {
  query: string;
  tags?: string[];
  project?: string;
  limit?: number;
}): string {
  const { query, tags, project, limit = 10 } = params;

  // Try FTS5 first (handles phrases, boolean operators, ranking)
  let results: MemoryRow[];
  try {
    results = ftsSearchStmt.all(query, limit) as MemoryRow[];
  } catch {
    // FTS5 syntax error (e.g., unbalanced quotes) — fall back to LIKE
    results = keywordSearchStmt.all(`%${query}%`, limit) as MemoryRow[];
  }

  // Post-filter by tags and project (small result set, fast in JS)
  if (tags?.length) {
    results = results.filter(r => {
      const entryTags: string[] = JSON.parse(r.tags);
      return tags.some(t => entryTags.includes(t));
    });
  }
  if (project) {
    results = results.filter(r => r.project === project);
  }

  if (results.length === 0) {
    return 'No memories found matching your search.';
  }

  return results.map(r => {
    const entryTags: string[] = JSON.parse(r.tags);
    return `- ${r.content} [${entryTags.join(', ')}] (${r.created_at.slice(0, 10)})`;
  }).join('\n');
}
```
Memory Lifecycle
Writing
┌─────────────────────────────────────────────────────────────────┐
│ MEMORY WRITE PATH │
│ │
│ Any save trigger │
│ → Build MemoryEntry object │
│ → db.prepare(INSERT).run(...) (<1ms, local) │
│ → FTS5 trigger auto-updates search index (<1ms, local) │
│ → Check summarization trigger (entry count check) │
│ → Push to cloud (fire-and-forget) (non-blocking) │
│ → Done. <1ms locally. Cloud push is async. │
│ │
└─────────────────────────────────────────────────────────────────┘
Reading (Session Start)
┌─────────────────────────────────────────────────────────────────┐
│ MEMORY READ PATH │
│ │
│ New conversation starts │
│ → Pull new memories from cloud (other daemons' writes) │
│ (INSERT OR IGNORE into local SQLite, ~50-150ms) │
│ → Check for unsummarized entries from crashed sessions │
│ (self-healing catch-up — summarize if needed) │
│ → Load pinned entries (raw content) │
│ → Load latest compacted summary │
│ → Load incremental summaries since last compaction │
│ → Load any unsummarized entries (raw, newest) │
│ → Format into <memory> block string │
│ → Inject into system prompt │
│ → Done. ~300-500 tokens. │
│ │
└─────────────────────────────────────────────────────────────────┘
Updating (Supersede Pattern)
Memories are never edited in place. To update, insert a new entry and mark the old one:
```typescript
const supersedeTx = db.transaction((oldId: string, newContent: string, project?: string) => {
  const old = db.prepare('SELECT * FROM memories WHERE id = ?').get(oldId) as MemoryRow | undefined;
  if (!old) return;

  const newId = `m_${nanoid(6)}`;

  // Insert replacement
  insertStmt.run(newId, newContent, old.tags, 'agent', project ?? old.project, old.pinned);

  // Mark old as superseded
  db.prepare('UPDATE memories SET superseded_by = ? WHERE id = ?').run(newId, oldId);
});
```
Using a transaction ensures atomicity — both the insert and the update happen together or not at all.
Summarization System
Summarization is the bridge between raw entries and context-efficient loading. It runs asynchronously — never blocking the conversation — and uses Haiku via the existing LLM proxy.
Two-Tier Summary Model
| Tier | Type | Contains | Created by | Lifespan |
|---|---|---|---|---|
| Incremental | incremental | Summary of a batch of recent entries | Entry threshold, session end, periodic timer | Accumulates throughout the day |
| Compacted | compacted | Summary of multiple incremental summaries | Daily cron job | Rolls up incrementals, represents longer-term knowledge |
This ensures the memory_summaries table stays bounded regardless of how long the system has been running. After 6 months of daily use, context loading still involves one compacted summary + a few recent incrementals — not hundreds of summaries.
Summarization Triggers
Five trigger points ensure memories are summarized at the right times:
1. Entry Count Threshold (Primary)
After every memory_save, check if unsummarized entries exceed a threshold. If so, trigger summarization.
```typescript
const SUMMARIZE_THRESHOLD = 20;

function checkSummarizationTrigger(): void {
  const { count } = db.prepare(`
    SELECT COUNT(*) as count FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
  `).get() as { count: number };

  if (count >= SUMMARIZE_THRESHOLD) {
    // Run async — don't block the conversation
    summarizeEntries().catch(err =>
      console.error('[memory] Summarization failed:', err)
    );
  }
}
```
This is self-regulating: heavy sessions trigger more summaries, quiet sessions don't waste API calls.
2. Session End
When a session ends, summarize any remaining unsummarized entries. Ensures nothing is left dangling.
```typescript
async function onSessionEnd(sessionId: string): Promise<void> {
  try {
    const { count } = db.prepare(`
      SELECT COUNT(*) as count FROM memories
      WHERE summarized = 0 AND superseded_by IS NULL
    `).get() as { count: number };

    if (count > 0) {
      await summarizeEntries();
      console.log(`[memory] Summarized ${count} entries at session end`);
    }
  } catch (err) {
    // Non-fatal — log but don't break session cleanup
    console.error('[memory] Session-end summarization failed:', err);
  }
}
```
3. Session Start (Self-Healing Catch-Up)
If the previous session crashed or summarization failed, session start detects orphaned unsummarized entries and catches up. This is the self-healing mechanism.
```typescript
async function onSessionStart(): Promise<void> {
  const { count } = db.prepare(`
    SELECT COUNT(*) as count FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
  `).get() as { count: number };

  // If there are unsummarized entries older than the latest summary,
  // a previous session likely crashed before summarizing
  if (count > SUMMARIZE_THRESHOLD) {
    console.log(`[memory] Catch-up: ${count} unsummarized entries from previous session`);
    await summarizeEntries();
  }
}
```
4. Periodic Interval (Long Sessions)
A timer in the daemon catches the case of very long sessions where the entry count threshold hasn't been hit but time has passed.
```typescript
const SUMMARIZE_INTERVAL_MS = 5 * 60 * 1000; // 5 minutes

let summarizeTimer: NodeJS.Timeout | null = null;

function startPeriodicSummarization(): void {
  summarizeTimer = setInterval(async () => {
    const { count } = db.prepare(`
      SELECT COUNT(*) as count FROM memories
      WHERE summarized = 0 AND superseded_by IS NULL
    `).get() as { count: number };

    if (count > 0) {
      await summarizeEntries();
    }
  }, SUMMARIZE_INTERVAL_MS);
}

function stopPeriodicSummarization(): void {
  if (summarizeTimer) {
    clearInterval(summarizeTimer);
    summarizeTimer = null;
  }
}
```
5. Daily Compaction Cron
A daily job rolls up incremental summaries into a single compacted summary. This prevents the memory_summaries table from growing unbounded and keeps context loading fast.
```typescript
async function dailyCompaction(): Promise<void> {
  // Get all incremental summaries older than 24 hours
  const incrementals = db.prepare(`
    SELECT id, summary, entry_count, period_start, period_end
    FROM memory_summaries
    WHERE type = 'incremental'
      AND created_at < datetime('now', '-1 day')
    ORDER BY created_at ASC
  `).all() as SummaryRow[];

  if (incrementals.length < 2) return; // Nothing to compact

  // AI-summarize the incremental summaries into one compacted summary
  const compactedText = await generateCompactionSummary(
    incrementals.map(s => s.summary)
  );

  const totalEntries = incrementals.reduce((sum, s) => sum + s.entry_count, 0);
  const periodStart = incrementals[0].period_start;
  const periodEnd = incrementals[incrementals.length - 1].period_end;

  // Atomic: insert compacted + delete old incrementals
  const compactTx = db.transaction(() => {
    db.prepare(`
      INSERT INTO memory_summaries (id, type, summary, entry_count, entry_ids, period_start, period_end)
      VALUES (?, 'compacted', ?, ?, '[]', ?, ?)
    `).run(`ms_${nanoid(6)}`, compactedText, totalEntries, periodStart, periodEnd);

    const ids = incrementals.map(s => s.id);
    db.prepare(`
      DELETE FROM memory_summaries WHERE id IN (${ids.map(() => '?').join(',')})
    `).run(...ids);
  });
  compactTx();

  console.log(`[memory] Compacted ${incrementals.length} summaries into 1`);
}
```
Trigger Priority
| Priority | Trigger | When | Why |
|---|---|---|---|
| Must have | Entry count threshold | After every memory_save | Self-regulating, handles all session patterns |
| Must have | Session start catch-up | On new session | Self-healing, covers crashes and failures |
| Should have | Session end | On session cleanup | Ensures nothing is left unsummarized |
| Should have | Daily compaction cron | Once daily (e.g., 3am) | Keeps summaries bounded over months |
| Nice to have | Periodic interval | Every 5 minutes | Extra safety net for marathon sessions |
AI Summarization Prompts
Incremental Summary (entries → summary)
```typescript
async function generateIncrementalSummary(
  entries: Array<{ content: string; tags: string[]; source: string }>
): Promise<string> {
  const entryText = entries
    .map((e, i) => `${i + 1}. ${e.content}`)
    .join('\n');

  const response = await haiku.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 300,
    messages: [{
      role: 'user',
      content: `Summarize these memories into a concise paragraph (2-5 sentences).
Preserve key facts, decisions, and preferences. Drop redundancies.
Write in third person present tense ("The user prefers...", "The project uses...").

Memories:
${entryText}

Summary:`,
    }],
  });

  return response.content[0].text.trim();
}
```
Compaction Summary (summaries → summary)
```typescript
async function generateCompactionSummary(
  summaries: string[]
): Promise<string> {
  const summaryText = summaries
    .map((s, i) => `${i + 1}. ${s}`)
    .join('\n\n');

  const response = await haiku.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 500,
    messages: [{
      role: 'user',
      content: `Merge these incremental memory summaries into a single comprehensive summary.
Preserve all key facts, preferences, and decisions. Remove duplicates.
Resolve any contradictions (prefer more recent information).
Write in third person present tense. Keep it concise but complete.

Summaries to merge:
${summaryText}

Merged summary:`,
    }],
  });

  return response.content[0].text.trim();
}
```
Core Summarization Function
```typescript
async function summarizeEntries(): Promise<void> {
  // Get unsummarized entries
  const entries = db.prepare(`
    SELECT id, content, tags, source, created_at
    FROM memories
    WHERE summarized = 0 AND superseded_by IS NULL
    ORDER BY created_at ASC
  `).all() as MemoryRow[];

  if (entries.length === 0) return;

  // Generate AI summary
  const summary = await generateIncrementalSummary(
    entries.map(e => ({
      content: e.content,
      tags: JSON.parse(e.tags),
      source: e.source,
    }))
  );

  const entryIds = entries.map(e => e.id);
  const periodStart = entries[0].created_at;
  const periodEnd = entries[entries.length - 1].created_at;

  // Atomic: insert summary + mark entries as summarized
  const summarizeTx = db.transaction(() => {
    db.prepare(`
      INSERT INTO memory_summaries (id, type, summary, entry_count, entry_ids, period_start, period_end)
      VALUES (?, 'incremental', ?, ?, ?, ?, ?)
    `).run(`ms_${nanoid(6)}`, summary, entries.length, JSON.stringify(entryIds), periodStart, periodEnd);

    db.prepare(`
      UPDATE memories SET summarized = 1
      WHERE id IN (${entryIds.map(() => '?').join(',')})
    `).run(...entryIds);
  });
  summarizeTx();

  console.log(`[memory] Summarized ${entries.length} entries`);
}
```
Cost Analysis
| Operation | Model | Input tokens | Output tokens | Cost |
|---|---|---|---|---|
| Incremental summary (20 entries) | Haiku | ~1000 | ~150 | ~$0.0002 |
| Compaction (5 incrementals) | Haiku | ~1500 | ~300 | ~$0.0003 |
| Typical day (2-3 incrementals + 0-1 compaction) | Haiku | ~3500 | ~600 | ~$0.0007 |
| Heavy day (10 incrementals + 1 compaction) | Haiku | ~12000 | ~2000 | ~$0.003 |
Negligible. A heavy month of daily use costs ~$0.10.
How SQLite, FTS5, and Summarization Work Together
Each component serves a different moment in the workflow:
SESSION START
│
├─ Pull new memories from cloud (other daemons' writes)
│ → SELECT WHERE synced_at > last_sync_ts
│ → INSERT OR IGNORE into local SQLite
│ → Trigger summarization catch-up if new entries arrived
│
├─ SQLite query: load summaries + pinned entries
│ → inject ~300-500 tokens into <memory> system prompt
│ → agent starts with "big picture" context (includes other daemons' knowledge)
│
MID-CONVERSATION (on demand)
│
├─ Agent calls memory_save("user wants dark mode in all UIs")
│ → SQLite INSERT into memories (<1ms)
│ → FTS5 index auto-updated (SQLite trigger)
│ → Check entry count threshold → maybe trigger summarization
│ → Push to cloud (fire-and-forget, non-blocking)
│
├─ Agent calls memory_search("dark mode preference")
│ → FTS5 MATCH query → returns ranked raw entries (<1ms)
│ → Agent sees actual content, high fidelity
│
BACKGROUND (async, non-blocking)
│
├─ Summarization triggers (entry threshold / timer / session end)
│ → Read unsummarized entries from SQLite
│ → Send to Haiku for summary generation (~500-2000ms)
│ → INSERT summary + mark entries as summarized (atomic transaction)
│
SESSION END
│
├─ Final summarization of remaining unsummarized entries
│ → Next session loads fresh summary
│
DAILY (cron)
│
├─ Compaction: roll up incremental summaries into one compacted summary
│ → Keeps context loading bounded regardless of history length
| Moment | Engine | What it does | Token cost |
|---|---|---|---|
| Session start | SQLite (plain queries) | Load summaries + pinned entries | ~300-500 tokens |
| Agent saves | SQLite (INSERT) | Write memory, update FTS5 index | 0 tokens |
| Agent searches | FTS5 (MATCH) | Find relevant raw entries by full-text search | Only results returned |
| Summarization | SQLite + Haiku | Summarize batch of entries | 0 context tokens (background) |
| Daily compaction | SQLite + Haiku | Roll up summaries | 0 context tokens (background) |
SQLite is the storage engine — reads and writes are <1ms, crash-safe, zero network.
FTS5 is the accuracy layer — when the summary isn't enough, search raw entries with full-text ranking. The agent always has access to the original content.
Summarization is the efficiency layer — keeps context costs low by distilling raw entries into concise summaries. Runs in the background, fails gracefully, never blocks.
Latency Profile
| Operation | Latency | Notes |
|---|---|---|
| Load context (summaries + pinned) | <1ms | Local SQLite, indexed |
| Format memory block | <1ms | String concatenation |
| Save memory (INSERT) | <1ms | Single row, WAL mode |
| FTS5 index update (trigger) | <1ms | Automatic |
| Full-text search (FTS5 MATCH) | <1ms | Built-in, pre-indexed |
| Update/supersede | <1ms | Transaction, by primary key |
| Incremental summarization | 500-2000ms | Only network call (Haiku) |
| Compaction | 500-2000ms | Haiku + SQLite transaction |
Compare with remote DB: Supabase takes 50-150ms per query due to network. Local SQLite is 100-1000x faster.
System Prompt Instructions
The agent's system prompt includes instructions for memory usage:
```markdown
## Memory

You have persistent memory stored locally on this machine. A summary of what
you've learned is shown in the <memory> block above.

### When to save memories

Save a memory when you encounter:
- **User requests**: "Remember that...", "Save this for later", "Don't forget..."
- **User preferences**: Implicit or explicit (coding style, tools, communication)
- **Surprising facts**: Information that contradicts assumptions or is non-obvious
- **Project patterns**: Codebase conventions, architecture decisions, deploy configs
- **People info**: Names, roles, preferences, relationships
- **Resolutions**: How a tricky problem was solved (error + fix)
- **Environment**: API endpoints, config locations, access patterns

### When NOT to save

- Transient information (temp debugging steps, one-off commands)
- Information already in the codebase (README content, code comments)
- Obvious facts any LLM would know
- Duplicate of something already in your <memory>

### How to write good memories

- Clear, standalone sentences — useful without the original conversation
- Specific: "User prefers Bun over Node for new TS projects" not "User likes Bun"
- Include the 'why' when relevant: "Chose Supabase over Firebase for pgvector support"
- Tag appropriately: preference, codebase, contact, infra, decision, pattern, debug

### Using your memories

- Check <memory> before asking the user things you might already know
- If the summary doesn't have what you need, use memory_search for exact details
- When you notice a memory is outdated, save a corrected version
```
Edge Cases & Mitigations
Memory Spam
Problem: Agent saves too many memories per conversation.
Mitigation:
- Rate limit: max 5 saves per conversation turn (enforced in tool)
- Dedup: FTS5 search for similar content before inserting — skip when the top match ranks above a similarity threshold (see the sketch after this list)
- System prompt: "Be selective. Not every fact is worth remembering."
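A sketch of that dedup check, assuming the module-level `db` handle. FTS5's bm25-based `rank` is negative, and more negative means a stronger match, so the threshold below is illustrative and would need tuning:

```typescript
// Illustrative threshold — FTS5 bm25 rank is negative; more negative = better match
const DEDUP_RANK_THRESHOLD = -8;

function isLikelyDuplicate(content: string): boolean {
  // Quote each term so arbitrary prose is a valid FTS5 query
  const query = content
    .split(/\s+/)
    .filter(Boolean)
    .map(t => `"${t.replace(/"/g, '')}"`)
    .join(' OR ');
  if (!query) return false;

  const top = db.prepare(`
    SELECT rank FROM memories_fts WHERE memories_fts MATCH ? ORDER BY rank LIMIT 1
  `).get(query) as { rank: number } | undefined;

  return top !== undefined && top.rank < DEDUP_RANK_THRESHOLD;
}
```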
Stale Memories
Problem: Old memories become outdated but still appear in summaries.
Mitigation:
- Supersede pattern: new saves mark old entries via `superseded_by`
- Agent instruction: "When a loaded memory is wrong, save a corrected version"
- Compaction naturally ages out stale information as summaries get re-rolled
Large Individual Memories
Problem: Single memory consumes too much budget.
Mitigation:
- Hard limit: 2000 chars (~500 tokens) per entry, enforced at save time
- Rejection message guides agent to be more concise
Summarization Failure
Problem: Haiku API is down or unreachable.
Mitigation:
- Entries still save normally (no network needed for writes)
- Context loading falls back to raw entries with token budget
- Next session start runs catch-up summarization
- System is fully functional without summaries — just slightly less context-efficient
Summary Quality Drift
Problem: Compaction over many cycles loses nuance.
Mitigation:
- Raw entries are never deleted — they remain searchable via FTS5
- `memory_search` always searches raw content, not summaries
- Agent can always find exact details even if the summary omits them
Database Corruption
Problem: Crash could corrupt the database.
Mitigation:
- WAL mode: SQLite's Write-Ahead Logging provides automatic crash recovery
- Transactions: summarization and supersede operations are atomic
- `PRAGMA integrity_check` can be run as a health check (see the sketch after this list)
- Users can back up by copying `memory.db` (safe to copy when no writers are active)
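Both mitigations map to small helpers with better-sqlite3 — `db.pragma('integrity_check')` for the health check, and the library's built-in online backup API for copies, which is safe even while the daemon is writing:

```typescript
import Database from 'better-sqlite3';

// Health check: PRAGMA integrity_check returns [{ integrity_check: 'ok' }] when healthy
function checkIntegrity(db: Database.Database): boolean {
  const result = db.pragma('integrity_check') as Array<{ integrity_check: string }>;
  return result.length === 1 && result[0].integrity_check === 'ok';
}

// Online backup via SQLite's incremental backup API — no need to stop writers
async function backupMemoryDb(db: Database.Database, dest: string): Promise<void> {
  await db.backup(dest);
}
```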
Cross-Project Pollution
Problem: Memories from Project A load when working on Project B.
Mitigation:
- `project` column on each entry + indexed query
- Summaries include project context naturally (Haiku includes it in the summary text)
- `memory_search` supports project filtering
- `project IS NULL` entries are global — useful for cross-cutting knowledge
Comparison with Other Approaches
| Aspect | This System | Flat Memory Log | Git Context Controller | Full Vault (ctx0 DB) | Claude Code Auto-Memory |
|---|---|---|---|---|---|
| Storage | Local SQLite + cloud sync | Local SQLite | Local SQLite | Supabase PostgreSQL | Local MEMORY.md |
| Context cost | ~300-500 tokens | ~4,000 tokens | ~130-330 tokens | On-demand retrieval | ~200 lines |
| Write latency | <1ms (local) | <1ms | <1ms | 20-50ms (network) | <1ms |
| Read latency | <1ms (local) | <1ms | <1ms | 50-150ms (network) | <1ms |
| Search | FTS5 over raw entries | FTS5 over raw entries | Summaries only (no FTS5) | pg_trgm / tsvector | String grep |
| AI cost | ~$0.001/day | $0 | ~$0.001/commit | ~$0.01/session | $0 |
| Offline | Yes (except sync + summarization) | Fully offline | Yes (except summarization) | No | Fully offline |
| Cross-device | Yes (via cloud sync) | No (manual export) | No (manual export) | Built-in | No (manual) |
| Multi-daemon | Yes (row-level sync) | No | No | Yes | No |
| Complexity | 2 tools, ~400 LOC | 2 tools, ~200 LOC | 5 tools, ~400 LOC | 8+ tables, 3 subagents | 1 file, built-in |
| Self-healing | Yes (session start catch-up) | No | Orphaned observations persist | Server-side | No |
| Scales | Millions of rows + compaction | Millions of rows | Millions of rows | Unlimited | ~200 lines |
Implementation Plan
Phase 1: Core (MVP) ✅
Goal: Agent can save and search memories from local SQLite. Context loading uses raw entries (no summarization yet).
| Task | Files | Effort |
|---|---|---|
| Add better-sqlite3 dependency to daemon | packages/daemon/package.json | Trivial |
| Create MemoryEntry, SummaryEntry types | packages/daemon/src/memory/types.ts | Trivial |
| Implement SQLite init + schema creation | packages/daemon/src/memory/db.ts | Small |
| Implement memory_save tool | packages/daemon/src/tools/memory-save.ts | Small |
| Implement memory_search tool (FTS5) | packages/daemon/src/tools/memory-search.ts | Small |
| Implement raw-entry memory loader (fallback mode) | packages/daemon/src/memory/loader.ts | Medium |
| Inject <memory> block into system prompt | packages/daemon/src/agent/loop.ts | Small |
| Add memory instructions to system prompt | packages/daemon/src/agent/agents.ts | Small |
| Token estimation utility | packages/daemon/src/memory/tokens.ts | Small |
Phase 2: Summary Priming ✅
Goal: Context loading uses AI-generated summaries instead of raw entries. Summarization triggers on entry count threshold and session end.
| Task | Files | Effort |
|---|---|---|
| Implement generateIncrementalSummary (Haiku call) | packages/daemon/src/memory/summarize.ts | Medium |
| Implement summarizeEntries (core function) | summarize.ts | Medium |
| Implement entry count threshold trigger | summarize.ts | Small |
| Implement session-end summarization hook | summarize.ts | Small |
| Update loader to use summary-primed loading | loader.ts | Medium |
| Implement fallback to raw entries when no summaries exist | loader.ts | Small |
Phase 3: Self-Healing + Periodic ✅
Goal: System handles crashes, long sessions, and edge cases gracefully.
| Task | Files | Effort |
|---|---|---|
| Implement session-start catch-up summarization | summarize.ts | Small |
| Implement periodic interval timer (5 min) | summarize.ts | Small |
| Supersede pattern (update existing memories) | db.ts | Small |
| Dedup check before saving (FTS5 similarity) | db.ts | Small |
| Rate limiting (max 5 saves per turn) | memory-save.ts | Small |
Phase 4: Daily Compaction
Goal: System handles long-term growth. Incremental summaries are rolled up into compacted summaries daily.
Status: Not implemented. Deferred — intended to run on the centralized cloud computer. Without this, incremental summaries accumulate over weeks/months and eventually consume too much context budget. The existing loader.ts already supports consuming compacted summaries — Phase 4 only needs to produce them.
Why it matters: After 6 months of daily use, context loading should still involve one compacted summary + a few recent incrementals — not hundreds of summaries.
| Task | Files | Effort |
|---|---|---|
| Implement generateCompactionSummary (Haiku call, max 500 tokens) | packages/daemon/src/memory/summarize.ts | Small |
| Implement dailyCompaction (query incrementals > 24h old, AI-merge, atomic insert + delete) | summarize.ts | Medium |
| Register daily cron trigger in daemon lifecycle (e.g., setInterval at 24h or integrate with ScheduleExecutor) | packages/daemon/src/index.ts | Small |
Key behaviors:
- Only compacts incremental summaries older than 24 hours
- Requires at least 2 incrementals to trigger (otherwise no-op)
- Atomic SQLite transaction: insert compacted summary + delete old incrementals
- Old incrementals are deleted (raw entries they summarized remain in the `memories` table)
- Cost: ~$0.0003 per compaction (Haiku, ~1500 input tokens, ~300 output tokens)
- Reference implementations: `dailyCompaction()` and `generateCompactionSummary()` in the AI Summarization Prompts section (daemon wiring sketched after this list)
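One possible wiring for the daily trigger — a plain 24-hour interval in the daemon lifecycle. This is a sketch; integrating with ScheduleExecutor would replace it:

```typescript
const COMPACTION_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24 hours

let compactionTimer: NodeJS.Timeout | null = null;

function startDailyCompaction(): void {
  compactionTimer = setInterval(() => {
    // Fire-and-forget — compaction failures are logged, never fatal
    dailyCompaction().catch(err =>
      console.error('[memory] Daily compaction failed:', err)
    );
  }, COMPACTION_INTERVAL_MS);
}

function stopDailyCompaction(): void {
  if (compactionTimer) {
    clearInterval(compactionTimer);
    compactionTimer = null;
  }
}
```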
Phase 5: Auto-Save Hooks ✅
Goal: Agent automatically saves memories on significant events without explicit tool calls.
Status: Implemented. Hooks subscribe to SessionStateChanged events on the daemon event bus, fire concurrently via Promise.allSettled on session completion, and use Haiku for summarization.
| Task | Files | Status |
|---|---|---|
| Create hooks module with event bus subscription | packages/daemon/src/memory/hooks.ts | Done |
| Task completion summary (Haiku + digest builder) | hooks.ts | Done |
| User correction detection (last 6 messages → Haiku) | hooks.ts | Done |
| Error resolution detection (heuristic + Haiku) | hooks.ts | Done |
| Significance filtering (tool call + message thresholds) | hooks.ts | Done |
| Dedup check (skip if agent saved 2+ memories) | hooks.ts | Done |
| Export from memory index | packages/daemon/src/memory/index.ts | Done |
| Wire into daemon lifecycle (init + cleanup) | packages/daemon/src/index.ts | Done |
Phase 6: Cloud Sync
Goal: Multiple daemons (different devices) can share the same memory via Supabase.
Status: Not implemented. Deferred — requires the ctx0 Drizzle schema (Supabase side) and proxy DB infrastructure. The sync_meta table already exists in the local SQLite schema. The existing setMemoryToolDeps pattern for signing provider/config can be extended to include getDb for the ProxiedSupabaseClient.
Note on dependency injection: Phase 6 uses a separate getDb: () => ProxiedSupabaseClient | null dep (following the todo.ts/plan.ts pattern), NOT the existing MemoryToolDeps which carries getSigningProvider/getConfig for Haiku calls. The cloud sync deps wire into the Daemon constructor alongside setTodoToolDeps and setPlanToolDeps.
Note on auto-save hook integration: Auto-saved memories (Phase 5, source: 'auto') should also be pushed to cloud. The syncMemoryToCloud() call should be added alongside the insertMemory() calls in hooks.ts, or the push can be wired at the insertMemory() level in db.ts to cover all sources.
| Task | Files | Effort |
|---|---|---|
| Add ctx0_memories Drizzle table + register | packages/ctx0/src/schema/memory.ts, index.ts | Small |
| Create packages/daemon/src/memory/sync.ts with MemorySyncDeps interface | sync.ts (new) | Trivial |
| Implement syncMemoryToCloud() (fire-and-forget upsert) | sync.ts | Small |
| Implement pullMemoriesFromCloud() (watermark-based incremental pull) | sync.ts | Medium |
| Implement bootstrapFromCloud() (new device full pull, paginated) | sync.ts | Medium |
| Implement syncSupersedeToCloud() (push supersede updates) | sync.ts | Small |
| Wire push into memory_save tool (fire-and-forget .catch(() => {})) | packages/daemon/src/tools/memory-save.ts | Trivial |
| Wire push into auto-save hooks (fire-and-forget) | packages/daemon/src/memory/hooks.ts | Trivial |
| Wire pull/bootstrap into initMemoryWithSync() at session start | packages/daemon/src/index.ts | Small |
| Wire MemorySyncDeps in Daemon constructor | packages/daemon/src/index.ts | Trivial |
Auto-Save Hooks
When a session completes, lightweight hooks automatically create memories for significant events — without relying on the LLM to explicitly call memory_save. This ensures the memory system captures useful information even when the agent doesn't think to save it.
Architecture
Session completes → SessionStateChanged event (bus)
│
hooks.ts subscriber
│
┌───────────┼───────────┐
│ │ │
▼ ▼ ▼
Task Summary Correction Error Resolution
(Haiku) (Haiku) (heuristic+Haiku)
│ │ │
▼ ▼ ▼
insertMemory(source: 'auto')
│
▼
checkSummarizationTrigger()
All three hooks run concurrently via Promise.allSettled. Each is fire-and-forget — failures are logged but never block the session or each other.
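A sketch of the subscriber's shape. The event name follows the text above, but the bus handle, payload fields, and per-hook function names are assumptions for illustration:

```typescript
// Assumed shapes — the real event type and bus live in the daemon
interface SessionStateChangedEvent {
  sessionId: string;
  from: string;
  to: string;
}
declare const bus: {
  on(event: 'SessionStateChanged', fn: (e: SessionStateChangedEvent) => void): void;
};
declare function taskSummaryHook(sessionId: string): Promise<void>;
declare function correctionHook(sessionId: string): Promise<void>;
declare function errorResolutionHook(sessionId: string): Promise<void>;

bus.on('SessionStateChanged', (event) => {
  if (event.from !== 'working' || event.to !== 'completed') return;

  // Fire-and-forget: hooks run concurrently; a failure in one never affects the others
  void Promise.allSettled([
    taskSummaryHook(event.sessionId),
    correctionHook(event.sessionId),
    errorResolutionHook(event.sessionId),
  ]).then((results) => {
    for (const r of results) {
      if (r.status === 'rejected') {
        console.error('[memory] Auto-save hook failed:', r.reason);
      }
    }
  });
});
```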
Why Event Bus Over Direct IPC Call
- Decoupled — hooks know nothing about sockets or IPC
- Universal — fires for local, remote, scheduled, and trigger sessions
- `handleRunTask` in server.ts is already 600+ lines; adding more logic there increases coupling
Significance Filtering (Avoid Noise)
Before running any hooks, the system applies filters to avoid saving trivial conversations:
| Filter | Threshold | Rationale |
|---|---|---|
| Tool call count | < 2 tool calls | Conversations with no tool use are typically greetings or simple Q&A |
| User message count | < 3 user messages | Combined with above (both must be below threshold to skip) |
| Agent self-saves | ≥ 2 memory_save calls | Agent was already deliberate about saving — don't duplicate |
| Haiku "SKIP"/"NONE" | Per-hook | Final safety net — Haiku decides if the content is worth remembering |
| Inflight guard | Set per session ID | Prevents duplicate processing if event fires twice |
Hook 1: Task Completion Summary
Builds a condensed digest of the session (first user message, tools used, error count, final assistant response, truncated to ~2000 chars) and asks Haiku for a 1-2 sentence factual summary. Haiku responds "SKIP" if trivial.
Saved as: { tags: ['session_summary'], source: 'auto' }
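A sketch of the digest builder under an assumed message shape — the real types live in the daemon's session model, so the field names here are illustrative:

```typescript
// Assumed message shape — the real transcript type lives in SessionManager
interface SessionMessage {
  role: 'user' | 'assistant' | 'tool_result';
  content: string;
  toolName?: string;
  toolError?: boolean;
}

function buildSessionDigest(messages: SessionMessage[]): string {
  const firstUser = messages.find(m => m.role === 'user')?.content ?? '';
  const toolsUsed = [...new Set(messages.filter(m => m.toolName).map(m => m.toolName))];
  const errorCount = messages.filter(m => m.toolError).length;
  const finalAssistant = [...messages].reverse().find(m => m.role === 'assistant')?.content ?? '';

  const digest = [
    `Request: ${firstUser}`,
    `Tools: ${toolsUsed.join(', ') || 'none'}`,
    `Errors: ${errorCount}`,
    `Outcome: ${finalAssistant}`,
  ].join('\n');

  return digest.slice(0, 2000); // truncate to ~2000 chars per the spec
}
```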
Hook 2: User Correction Detection
Only runs if the session had 3+ user messages (corrections require back-and-forth). Extracts the last 6 messages and asks Haiku: "Did the user correct the agent? If yes, state the preference. If no, respond 'NONE'."
Saved as: { tags: ['preference', 'correction'], source: 'auto' }
Hook 3: Error Resolution Detection
Heuristic scan: looks for tool_result messages with toolError set, followed by a later successful result from the same tool. If the pattern is found, asks Haiku to summarize the error → fix pattern in one sentence.
Saved as: { tags: ['debug', 'error_resolution'], source: 'auto' }
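A sketch of that heuristic scan, reusing the assumed `SessionMessage` shape from the digest sketch above:

```typescript
// Heuristic: a failed tool_result followed later by a success from the same tool
function findErrorResolutions(
  messages: SessionMessage[],
): Array<{ tool: string; error: string }> {
  const resolutions: Array<{ tool: string; error: string }> = [];
  const pendingErrors = new Map<string, string>(); // tool name → error content

  for (const m of messages) {
    if (m.role !== 'tool_result' || !m.toolName) continue;
    if (m.toolError) {
      pendingErrors.set(m.toolName, m.content);
    } else if (pendingErrors.has(m.toolName)) {
      // Same tool later succeeded → the error was resolved during the session
      resolutions.push({ tool: m.toolName, error: pendingErrors.get(m.toolName)! });
      pendingErrors.delete(m.toolName);
    }
  }
  return resolutions;
}
```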
Haiku Access
Uses createProxiedModel(config, signingProvider, HAIKU_MODEL_ID, 'memory-auto-save') — same pattern as memory/summarize.ts. Requires authentication (no-ops silently if not authenticated).
Cost
~$0.0005–$0.001 per session (1-2 Haiku calls, ~500 input tokens + 50-100 output tokens each). Most sessions trigger 1 call (task summary). Correction and error hooks only fire when their heuristics match.
Files
| File | Description |
|---|---|
| packages/daemon/src/memory/hooks.ts | Hook implementations, event bus subscription, lifecycle |
| packages/daemon/src/memory/index.ts | Exports initAutoSaveHooks, stopAutoSaveHooks, AutoSaveHookDeps |
| packages/daemon/src/index.ts | Wires initAutoSaveHooks in Daemon.initialize(), stopAutoSaveHooks in Daemon.cleanup() |
Dependency Injection
```typescript
export interface AutoSaveHookDeps {
  getSessionManager: () => SessionManager | null;
  getSigningProvider: () => SigningProvider | null;
  getConfig: () => DaemonConfig;
}
```
Wired in Daemon.initialize():
```typescript
initAutoSaveHooks({
  getSessionManager: () => this.getOrCreateSessionManager(),
  getSigningProvider: () => this.signingProvider,
  getConfig: () => loadConfig(),
});
```
Cloud Sync
When a user runs multiple daemons (e.g., laptop + desktop), each daemon should see the same memories. Cloud sync makes this work by using Supabase as a shared merge point while keeping local SQLite as the fast path.
Design Principles
- **Local-first, cloud-optional** — All reads and writes hit local SQLite (<1ms). Cloud sync is fire-and-forget. If the user isn't authenticated or the cloud is unreachable, the system works exactly as before.
- **Append-only sync** — Memories use nanoid primary keys, so two daemons can never produce the same ID. This means INSERT OR IGNORE handles all deduplication with zero conflicts.
- **Memories sync, summaries don't** — Raw memory entries are the source of truth and sync bidirectionally. Summaries are local-only — each daemon generates its own from its merged memory set. This avoids summary conflicts entirely.
- **Timestamp watermark** — Each daemon tracks when it last synced. On pull, it only fetches rows newer than its watermark. Efficient even with thousands of memories.
Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│ MULTI-DAEMON CLOUD SYNC │
│ │
│ DAEMON A (laptop) DAEMON B (desktop) │
│ ~/.bot0/memory.db ~/.bot0/memory.db │
│ │
│ memory_save("user prefers Bun") memory_save("deploy key in 1PW") │
│ │ │ │
│ │ 1. INSERT locally (<1ms) │ 1. INSERT locally │
│ │ 2. Push to cloud (fire & forget) │ 2. Push to cloud │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Supabase: ctx0_memories │ │
│ │ │ │
│ │ id │ content │ synced_at │ │
│ │ m_a1b2c3 │ "user prefers Bun" │ 2026-02-27 │ │
│ │ m_x7y8z9 │ "deploy key in 1PW" │ 2026-02-27 │ │
│ │ │ │
│ │ Proxy auto-injects user_id on all queries │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │ │
│ │ 3. Pull on session start │ 3. Pull on session start│
│ │ (WHERE synced_at > last_sync_ts) │ │
│ ▼ ▼ │
│ memory.db now has both memories memory.db now has both memories │
│ → local summarization catches up → local summarization catches up │
│ │
└──────────────────────────────────────────────────────────────────────────┘
Supabase Table: ctx0_memories
Mirrors the local memories table with user_id for RLS. Defined as a Drizzle schema in packages/ctx0/src/schema/memory.ts:
```typescript
export const ctx0Memories = pgTable('ctx0_memories', {
  id: text('id').notNull(),
  userId: uuid('user_id').notNull().references(() => ctx0Users.id),
  content: text('content').notNull(),
  tags: jsonb('tags').notNull().default([]),
  source: text('source').notNull().default('agent'),
  project: text('project'),
  pinned: boolean('pinned').notNull().default(false),
  supersededBy: text('superseded_by'),
  createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
  syncedAt: timestamp('synced_at', { withTimezone: true }).notNull().defaultNow(),
}, (table) => ({
  pk: primaryKey({ columns: [table.id, table.userId] }),
  userSyncIdx: index('idx_ctx0_memories_user').on(table.userId, table.syncedAt),
}));
```
Equivalent SQL:
```sql
CREATE TABLE ctx0_memories (
  id            TEXT NOT NULL,
  user_id       UUID NOT NULL REFERENCES auth.users(id),
  content       TEXT NOT NULL,
  tags          JSONB NOT NULL DEFAULT '[]'::jsonb,
  source        TEXT NOT NULL DEFAULT 'agent',
  project       TEXT,
  pinned        BOOLEAN NOT NULL DEFAULT false,
  superseded_by TEXT,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  synced_at     TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  PRIMARY KEY (id, user_id)
);

CREATE INDEX idx_ctx0_memories_user ON ctx0_memories(user_id, synced_at DESC);
```
Dependency Injection
Follows the todo.ts pattern — the daemon injects its ProxiedSupabaseClient at startup:
```typescript
interface MemoryToolDeps {
  getDb: () => ProxiedSupabaseClient | null;
}

let _deps: MemoryToolDeps | null = null;

export function setMemoryToolDeps(deps: MemoryToolDeps): void {
  _deps = deps;
}
```
If _deps is null or getDb() returns null (user not authenticated), all sync operations silently no-op. The system is fully functional in local-only mode.
Push: Fire-and-Forget After Save
After every local memory_save, push the new row to cloud. Non-blocking, non-fatal.
```typescript
async function syncMemoryToCloud(memory: {
  id: string;
  content: string;
  tags: string[];
  source: string;
  project: string | null;
  pinned: boolean;
}): Promise<void> {
  const db = _deps?.getDb();
  if (!db) return; // Not authenticated — skip silently

  try {
    await db.from('ctx0_memories').upsert({
      id: memory.id,
      content: memory.content,
      tags: memory.tags,
      source: memory.source,
      project: memory.project,
      pinned: memory.pinned,
      synced_at: new Date().toISOString(),
    }, { onConflict: 'id,user_id' });
  } catch (err) {
    console.error('[memory] Cloud push failed:', err);
    // Non-fatal — local SQLite is the source of truth during execution
  }
}
```
Wired into the save path:
```typescript
function memorySave(params: { content: string; tags?: string[]; pin?: boolean }, context) {
  // ... existing local INSERT ...

  // Fire-and-forget cloud push
  syncMemoryToCloud({ id, content, tags, source: context.source, project, pinned: pin })
    .catch(() => {});

  return `Saved: "${content.slice(0, 60)}..."`;
}
```
Push: Supersede Updates
When a memory is superseded locally, push the update:
async function syncSupersedeToCloud(oldId: string, newId: string): Promise<void> {
  const db = _deps?.getDb();
  if (!db) return;

  try {
    await db.from('ctx0_memories')
      .update({ superseded_by: newId, synced_at: new Date().toISOString() })
      .eq('id', oldId);
  } catch (err) {
    console.error('[memory] Cloud supersede sync failed:', err);
  }
}
Pull: Session Start
On session start, pull memories created by other daemons since the last sync. This is the mechanism that gives Daemon B access to Daemon A's memories.
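The pull function below assumes two small pieces of plumbing that aren't shown elsewhere in this section; a sketch of each (the exact shapes are assumptions, mirroring the Drizzle schema and the better-sqlite3 style of the surrounding snippets):

// Row shape returned from ctx0_memories — assumed; mirrors the schema above.
interface CloudMemoryRow {
  id: string;
  content: string;
  tags: string[];
  source: string;
  project: string | null;
  pinned: boolean;
  superseded_by: string | null;
  created_at: string; // ISO timestamp
  synced_at: string;  // ISO timestamp
}

// One-row-per-key watermark table used for last_sync_ts.
localDb.exec(`
  CREATE TABLE IF NOT EXISTS sync_meta (
    key   TEXT PRIMARY KEY,
    value TEXT NOT NULL
  )
`);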
async function pullMemoriesFromCloud(): Promise<number> {
  const db = _deps?.getDb();
  if (!db) return 0;

  try {
    // Read last sync timestamp
    const meta = localDb.prepare(
      'SELECT value FROM sync_meta WHERE key = ?'
    ).get('last_sync_ts') as { value: string } | undefined;
    const lastSyncTs = meta?.value ?? '1970-01-01T00:00:00Z';

    // Fetch new/updated memories from cloud
    const result = await db.from('ctx0_memories')
      .select('id,content,tags,source,project,pinned,superseded_by,created_at,synced_at')
      .gt('synced_at', lastSyncTs)
      .order('synced_at', { ascending: true })
      .limit(1000); // Cap per pull; the watermark advances only to the last
                    // fetched row, so a larger backlog drains over later pulls

    if (!result.data || result.data.length === 0) return 0;
    const rows = result.data as CloudMemoryRow[];

    // INSERT OR IGNORE new memories into local SQLite
    const insertOrIgnore = localDb.prepare(`
      INSERT OR IGNORE INTO memories
        (id, content, tags, source, project, pinned, superseded_by, created_at)
      VALUES (?, ?, ?, ?, ?, ?, ?, ?)
    `);

    // UPDATE superseded_by for existing memories (if cloud has a newer supersede)
    const updateSupersede = localDb.prepare(`
      UPDATE memories SET superseded_by = ?
      WHERE id = ? AND superseded_by IS NULL AND ? IS NOT NULL
    `);

    let newCount = 0;
    const pullTx = localDb.transaction(() => {
      for (const row of rows) {
        const changes = insertOrIgnore.run(
          row.id,
          row.content,
          JSON.stringify(row.tags),
          row.source,
          row.project,
          row.pinned ? 1 : 0,
          row.superseded_by,
          row.created_at,
        ).changes;
        if (changes > 0) newCount++;

        // Apply supersede if it exists on cloud but not locally
        if (row.superseded_by) {
          updateSupersede.run(row.superseded_by, row.id, row.superseded_by);
        }
      }

      // Update watermark
      const latestSyncedAt = rows[rows.length - 1].synced_at;
      localDb.prepare(
        'INSERT OR REPLACE INTO sync_meta (key, value) VALUES (?, ?)'
      ).run('last_sync_ts', latestSyncedAt);
    });
    pullTx();

    if (newCount > 0) {
      console.log(`[memory] Pulled ${newCount} new memories from cloud`);
      // Trigger summarization catch-up for new entries
      checkSummarizationTrigger();
    }
    return newCount;
  } catch (err) {
    console.error('[memory] Cloud pull failed:', err);
    return 0; // Non-fatal — continue with local state
  }
}
New Device Bootstrap
When ~/.bot0/memory.db doesn't exist but the user is authenticated, bootstrap from cloud:
async function bootstrapFromCloud(): Promise<boolean> {
  const db = _deps?.getDb();
  if (!db) return false;

  try {
    // Pull ALL memories for this user (paginated for large sets)
    let offset = 0;
    const pageSize = 500;
    let totalPulled = 0;

    const insertStmt = localDb.prepare(`
      INSERT OR IGNORE INTO memories
        (id, content, tags, source, project, pinned, superseded_by, created_at)
      VALUES (?, ?, ?, ?, ?, ?, ?, ?)
    `);

    while (true) {
      const result = await db.from('ctx0_memories')
        .select('id,content,tags,source,project,pinned,superseded_by,created_at,synced_at')
        .order('created_at', { ascending: true })
        .range(offset, offset + pageSize - 1);

      if (!result.data || result.data.length === 0) break;
      const rows = result.data as CloudMemoryRow[];

      const insertTx = localDb.transaction(() => {
        for (const row of rows) {
          insertStmt.run(
            row.id,
            row.content,
            JSON.stringify(row.tags),
            row.source,
            row.project,
            row.pinned ? 1 : 0,
            row.superseded_by,
            row.created_at,
          );
        }
      });
      insertTx();

      totalPulled += rows.length;
      offset += pageSize;
      if (rows.length < pageSize) break; // Last page
    }

    if (totalPulled > 0) {
      console.log(`[memory] Bootstrapped ${totalPulled} memories from cloud`);
      // Set sync watermark
      localDb.prepare(
        'INSERT OR REPLACE INTO sync_meta (key, value) VALUES (?, ?)'
      ).run('last_sync_ts', new Date().toISOString());
      // Generate summaries for bootstrapped entries
      await summarizeEntries();
    }
    return totalPulled > 0;
  } catch (err) {
    console.error('[memory] Cloud bootstrap failed:', err);
    return false;
  }
}
Session Lifecycle Integration
import { existsSync } from 'node:fs';

// In session start (packages/daemon/src/session/manager.ts or memory init)
async function initMemoryWithSync(): Promise<void> {
  const isNewDb = !existsSync(DB_PATH);

  // Initialize local SQLite (creates tables if needed)
  initMemoryDb();

  if (isNewDb) {
    // New device — try bootstrapping from cloud
    await bootstrapFromCloud();
  } else {
    // Existing device — pull new memories from other daemons
    await pullMemoriesFromCloud();
  }

  // Existing session start logic: catch-up summarization, etc.
  await onSessionStart();
}
Conflict Resolution
| Scenario | Resolution | Why it works |
|---|---|---|
| Two daemons save different memories | No conflict — different nanoid PKs | INSERT OR IGNORE on both sides |
| Both daemons supersede different memories | Both supersedes apply independently | Each targets a different id |
| Both daemons supersede the SAME memory | Last-write-wins on superseded_by | Rare, harmless — both point to valid replacements |
| Summary divergence between daemons | No conflict — summaries are local-only | Each daemon summarizes its own merged set |
| Daemon offline for days | Pulls all missed rows via timestamp watermark | synced_at > last_sync_ts catches everything |
| Cloud unreachable on push | Fire-and-forget, memory exists locally | Retry happens on next save or session start |
| Cloud unreachable on pull | Skip, use local state | System works fully offline |
| User not authenticated | All sync operations silently no-op | _deps?.getDb() returns null |
| Very large memory set (10K+) | Paginated pull with batch INSERT OR IGNORE | range() pagination, transaction batches |
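Most of these resolutions reduce to idempotent local writes. A quick illustrative check — a hypothetical snippet reusing the pull path's insertOrIgnore statement, with values borrowed from the diagram above:

// Replaying a row that already exists locally is a no-op: the OR IGNORE
// conflict clause skips the duplicate primary key and reports 0 changes.
const args = ['m_a1b2c3', 'user prefers Bun', '[]', 'agent', null, 0, null,
              '2026-02-27T00:00:00Z'] as const;
const first  = insertOrIgnore.run(...args).changes; // 1 — new row inserted
const replay = insertOrIgnore.run(...args).changes; // 0 — duplicate ignored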
Cost Analysis
| Operation | Network calls | Latency | Notes |
|---|---|---|---|
| Push (per memory_save) | 1 upsert | Fire-and-forget (~20-50ms) | Non-blocking |
| Pull (per session start) | 1 select | ~50-150ms | Only fetches new rows |
| Bootstrap (new device) | N selects (paginated) | ~100-500ms per page | One-time |
| Supersede sync | 1 update | Fire-and-forget (~20-50ms) | Rare |
Supabase cost for a typical user:
- ~50 memories/day × 30 days = 1,500 rows/month
- Each row ≈ 200 bytes → ~300 KB/month storage
- ~60 push operations/day + ~2 pull operations/day → negligible
Future Extensions
Vector Search Upgrade
When logs grow past ~1000 entries, add local vector search:
- `sqlite-vec` extension (SQLite-native vector similarity)
- Generate embeddings via a local model (ONNX) or an API call
- Add an `embedding BLOB` column to the `memories` table
- Use for `memory_search` alongside FTS5 (see the sketch below)
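A minimal sketch of what the query side could look like under those bullets, assuming sqlite-vec is loaded into the existing better-sqlite3 connection and float32 vectors are stored in the embedding column (the vectorSearch helper and the 10-row limit are illustrative, not part of the spec):

// Hypothetical sketch — brute-force cosine ranking over embedding BLOBs.
import * as sqliteVec from 'sqlite-vec';

sqliteVec.load(localDb); // Registers vec_distance_cosine() etc.

function vectorSearch(queryEmbedding: Float32Array, limit = 10) {
  // Linear scan is fine at this scale (~1000s of rows); an ANN index
  // only becomes worth it much later.
  return localDb.prepare(`
    SELECT id, content, vec_distance_cosine(embedding, ?) AS distance
    FROM memories
    WHERE embedding IS NOT NULL
    ORDER BY distance
    LIMIT ?
  `).all(Buffer.from(queryEmbedding.buffer), limit);
}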
Memory CLI
# List recent memories
bot0 memory list --limit 20

# Search
bot0 memory search "deploy key"

# Export
bot0 memory export --format json > backup.json
bot0 memory export --format md > readable.md

# Import
bot0 memory import backup.json

# Stats
bot0 memory stats
# → 342 memories, 8 summaries, 156 KB, oldest: 2025-12-01
Graduation to Full Vault
When a user outgrows the flat log:
- Each memory → `ctx0_entries` with path derived from tags
- Tags map to folders: `preference` → `/preferences/`, `contact` → `/contacts/`
- Summaries → high-level vault entries
- SQLite stays as local cache, vault becomes primary
Related Documentation
- ctx0 System Architecture — Full vault architecture
- ctx0 Sessions — Session/conversation storage
- Flat Memory Log — Original flat log spec (predecessor)
- Git-Like Context Controller — Git-like alternative (evaluated, complexity rejected)
- ctx0 Supabase Data Architecture — Database schema