ctx0 Skills (DB-as-Skill)
This document defines the skill system for ctx0, which enables agents to query sync tables using the DB-as-Skill pattern.
Overview
External data syncs (Gmail, Slack, Calendar, etc.) are stored in Supabase tables. Rather than exporting thousands of rows as files, we treat database tables as skills - the agent learns the schema and writes SQL queries to extract what it needs.
┌─────────────────────────────────────────────────────────────────────────────┐
│ DB AS SKILL PATTERN │
│ │
│ Traditional approach: DB as Skill: │
│ ──────────────────── ──────────── │
│ │
│ 50,000 emails SKILL.md (schema + examples) │
│ ↓ ↓ │
│ Export to 50,000 .md files Agent reads schema │
│ ↓ ↓ │
│ Agent browses files Agent writes SQL query │
│ ↓ ↓ │
│ Slow, stale, overwhelming Fast, fresh, precise │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Benefits:
- Agent understands data shape (schema)
- Agent sees examples (knows what data looks like)
- Agent writes precise SQL queries
- No need to export thousands of rows
- Always fresh data (queries live DB)
- Agent decides what's relevant to extract/store
Skill File Structure
Each skill lives in its own folder under /skills/:
skills/
├── _index.md # Overview of connected syncs
├── gmail/
│ ├── SKILL.md # Schema + examples + query patterns
│ └── _results/ # Agent-created query result cache
│ ├── investor-threads.md
│ └── term-sheet-emails.md
├── slack/
│ ├── SKILL.md
│ └── _results/
├── calendar/
│ ├── SKILL.md
│ └── _results/
└── {user-skills}/
SKILL.md Format
Each skill has a SKILL.md file that teaches the agent how to query that data source.
--- name: gmail description: Query email data from Gmail sync. Use when user asks about emails, contacts from email, or communication history. table: sync_gmail disable-model-invocation: false --- # Gmail Skill Query email data using `ctx0_sql`. Save useful results to `_results/`. ## Connection Info - **Service:** Gmail - **Last sync:** 2026-01-29T10:30:00Z - **Total records:** 47,832 - **Sync frequency:** Every 15 minutes ## Schema \`\`\`sql CREATE TABLE sync_gmail ( id UUID PRIMARY KEY, user_id UUID NOT NULL, -- Gmail identifiers message_id TEXT UNIQUE, thread_id TEXT, -- Participants from_email TEXT, from_name TEXT, to_emails TEXT[], cc_emails TEXT[], -- Content subject TEXT, body_preview TEXT, -- First 500 chars body_full TEXT, -- Full body (use sparingly, large) -- Metadata labels TEXT[], -- ['INBOX', 'IMPORTANT', 'STARRED', 'SENT'] is_read BOOLEAN, is_starred BOOLEAN, has_attachments BOOLEAN, attachment_names TEXT[], -- Timestamps received_at TIMESTAMPTZ, created_at TIMESTAMPTZ, -- Vector (for semantic search) embedding VECTOR(1536) ); \`\`\` ## Example Data | from_email | from_name | subject | labels | received_at | |------------|-----------|---------|--------|-------------| | sarah@sequoia.com | Sarah Chen | RE: Term sheet follow-up | ['INBOX', 'IMPORTANT'] | 2026-01-29 08:30 | | mike@ycombinator.com | Mike Seibel | Demo day prep | ['INBOX'] | 2026-01-28 14:22 | | notifications@github.com | GitHub | [bot0] PR #42 merged | ['INBOX'] | 2026-01-28 09:15 | ## Query Patterns ### Recent emails from a contact \`\`\`sql SELECT subject, body_preview, received_at FROM sync_gmail WHERE from_email ILIKE '%sequoia%' ORDER BY received_at DESC LIMIT 20; \`\`\` ### Unread important emails \`\`\`sql SELECT from_name, subject, received_at FROM sync_gmail WHERE 'IMPORTANT' = ANY(labels) AND NOT is_read ORDER BY received_at DESC; \`\`\` ### Emails mentioning a topic \`\`\`sql SELECT from_name, subject, body_preview, received_at FROM sync_gmail WHERE body_full ILIKE '%term sheet%' OR subject ILIKE '%term sheet%' ORDER BY received_at DESC LIMIT 20; \`\`\` ## Notes - `body_full` is large - use `body_preview` when possible - RLS automatically filters to current user's emails - Only SELECT queries are allowed (read-only) - Results are limited to 1000 rows max
Frontmatter Fields
| Field | Required | Description |
|---|---|---|
name | Yes | Skill identifier |
description | Yes | When to use this skill |
table | Yes | The sync table this skill queries |
disable-model-invocation | No | If true, only manual invocation |
Agent Workflow
When an agent needs data from a sync source:
┌─────────────────────────────────────────────────────────────────────────────┐
│ AGENT QUERY WORKFLOW │
│ │
│ 1. Agent needs email context │
│ ↓ │
│ 2. Reads /skills/gmail/SKILL.md │
│ "I see the schema: from_email, subject, body_preview, labels..." │
│ ↓ │
│ 3. Writes appropriate SQL query │
│ SELECT from_name, subject, body_preview │
│ FROM sync_gmail │
│ WHERE from_email ILIKE '%investor%' │
│ ORDER BY received_at DESC LIMIT 20 │
│ ↓ │
│ 4. Executes via ctx0_sql │
│ ↓ │
│ 5. Analyzes results │
│ ↓ │
│ 6. Decision: Save to _results/ or just use in context? │
│ - Important/reusable → Save to _results/investor-emails.md │
│ - One-time use → Just use in current response │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Example Agent Interaction
User: "What have investors said about our valuation?"
Agent thinking:
1. Need to search emails from investors about valuation
2. Check /skills/gmail/SKILL.md for schema
3. Also check /contacts/ to identify investor emails
Agent actions:
1. ctx0_retrieve("list of investor contacts")
→ Returns: sarah@sequoia.com, mike@a16z.com, etc.
2. ctx0_sql(`
SELECT from_name, subject, body_preview, received_at
FROM sync_gmail
WHERE from_email IN ('sarah@sequoia.com', 'mike@a16z.com')
AND (body_full ILIKE '%valuation%' OR subject ILIKE '%valuation%')
ORDER BY received_at DESC
LIMIT 20
`)
→ Returns: 8 relevant emails
3. Synthesizes answer from results
4. ctx0_remember({
content: "Summary of investor valuation feedback...",
path: "/skills/gmail/_results/investor-valuation-feedback.md"
})
Security Model
Agent queries must be secure. Four layers of protection:
┌─────────────────────────────────────────────────────────────────────────────┐
│ SECURITY LAYERS │
│ │
│ Layer 1: Application Validation │
│ └── Block dangerous keywords before query reaches DB │
│ │
│ Layer 2: PostgreSQL Function │
│ └── Only execute SELECT, reject all mutations │
│ │
│ Layer 3: Row-Level Security (RLS) │
│ └── user_id filter automatically applied │
│ │
│ Layer 4: Service Role Context │
│ └── Daemon uses service_role, agent queries scoped to user │
│ │
│ Result: Agent can ONLY read their own user's data │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Layer 1: Application Validation
class Ctx0SqlTool { private forbiddenPatterns = [ /\bDROP\b/i, /\bDELETE\b/i, /\bTRUNCATE\b/i, /\bINSERT\b/i, /\bUPDATE\b/i, /\bALTER\b/i, /\bCREATE\b/i, /\bGRANT\b/i, /\bREVOKE\b/i, ]; validateQuery(sql: string): void { if (!/^\s*SELECT\b/i.test(sql)) { throw new Error('Only SELECT queries are allowed'); } for (const pattern of this.forbiddenPatterns) { if (pattern.test(sql)) { throw new Error(`Forbidden keyword in query`); } } } }
Layer 2: PostgreSQL Function
The ctx0_query function wraps all agent queries:
CREATE OR REPLACE FUNCTION ctx0_query(query_text TEXT) RETURNS JSONB LANGUAGE plpgsql SECURITY DEFINER SET search_path = public AS $$ DECLARE result JSONB; BEGIN -- Validate: only SELECT allowed IF NOT (query_text ~* '^\s*SELECT\b') THEN RAISE EXCEPTION 'Only SELECT queries allowed'; END IF; -- Block dangerous patterns IF query_text ~* '\b(DROP|DELETE|TRUNCATE|INSERT|UPDATE|ALTER|CREATE|GRANT|REVOKE|EXEC)\b' THEN RAISE EXCEPTION 'Forbidden keyword in query'; END IF; -- Block multiple statements IF query_text ~ ';\s*\S' THEN RAISE EXCEPTION 'Multiple statements not allowed'; END IF; -- Execute with row limit EXECUTE format( 'SELECT jsonb_agg(row_to_json(t)) FROM (SELECT * FROM (%s) sub LIMIT 1000) t', query_text ) INTO result; RETURN COALESCE(result, '[]'::jsonb); END; $$;
Layer 3: RLS Auto-Filter
Row-Level Security automatically filters queries:
-- Agent writes this query: SELECT * FROM sync_gmail WHERE subject ILIKE '%term sheet%' -- RLS automatically transforms it to: SELECT * FROM sync_gmail WHERE subject ILIKE '%term sheet%' AND user_id = 'current-user-uuid' -- ← Injected by RLS
Sync Table Schemas
sync_gmail
CREATE TABLE sync_gmail ( id UUID PRIMARY KEY, user_id UUID NOT NULL, message_id TEXT UNIQUE, thread_id TEXT, from_email TEXT, from_name TEXT, to_emails TEXT[], cc_emails TEXT[], subject TEXT, body_preview TEXT, body_full TEXT, labels TEXT[], is_read BOOLEAN, is_starred BOOLEAN, has_attachments BOOLEAN, attachment_names TEXT[], received_at TIMESTAMPTZ, created_at TIMESTAMPTZ, embedding VECTOR(1536) );
sync_slack
CREATE TABLE sync_slack ( id UUID PRIMARY KEY, user_id UUID NOT NULL, message_id TEXT UNIQUE, channel_id TEXT, channel_name TEXT, thread_ts TEXT, sender_id TEXT, sender_name TEXT, sender_email TEXT, text TEXT, message_type TEXT, is_thread_reply BOOLEAN, has_attachments BOOLEAN, reactions JSONB, sent_at TIMESTAMPTZ, created_at TIMESTAMPTZ, embedding VECTOR(1536) );
sync_calendar
CREATE TABLE sync_calendar ( id UUID PRIMARY KEY, user_id UUID NOT NULL, event_id TEXT UNIQUE, calendar_id TEXT, title TEXT, description TEXT, location TEXT, organizer_email TEXT, attendees JSONB, start_at TIMESTAMPTZ, end_at TIMESTAMPTZ, is_all_day BOOLEAN, timezone TEXT, status TEXT, recurrence TEXT, meeting_link TEXT, created_at TIMESTAMPTZ, updated_at TIMESTAMPTZ, embedding VECTOR(1536) );
sync_notion
CREATE TABLE sync_notion ( id UUID PRIMARY KEY, user_id UUID NOT NULL, page_id TEXT UNIQUE, parent_id TEXT, title TEXT, content_preview TEXT, content_full TEXT, page_type TEXT, properties JSONB, notion_created TIMESTAMPTZ, notion_updated TIMESTAMPTZ, created_at TIMESTAMPTZ, embedding VECTOR(1536) );
Creating Custom Skills
Step 1: Set Up Sync
First, create the sync table and populate it with data from your integration.
Step 2: Create SKILL.md
--- name: my-crm description: Query CRM data for leads, deals, and customer information. table: sync_crm --- # CRM Skill Query CRM data using `ctx0_sql`. ## Schema \`\`\`sql CREATE TABLE sync_crm ( id UUID PRIMARY KEY, user_id UUID NOT NULL, contact_id TEXT UNIQUE, name TEXT, email TEXT, company TEXT, deal_stage TEXT, deal_value NUMERIC, last_contact TIMESTAMPTZ, notes TEXT, created_at TIMESTAMPTZ, embedding VECTOR(1536) ); \`\`\` ## Query Patterns ### Find high-value deals \`\`\`sql SELECT name, company, deal_stage, deal_value FROM sync_crm WHERE deal_value > 10000 ORDER BY deal_value DESC; \`\`\` ### Contacts needing follow-up \`\`\`sql SELECT name, email, last_contact FROM sync_crm WHERE last_contact < NOW() - INTERVAL '7 days' ORDER BY last_contact ASC; \`\`\`
Step 3: Add to Agent Skills
In your agent's AGENT.md:
--- skills: - gmail - my-crm ---
_results/ Folder
Each skill has a _results/ folder where agents save useful query results.
Purpose:
- Cache frequently-accessed data
- Store synthesized insights
- Avoid re-running expensive queries
Example structure:
skills/gmail/_results/
├── investor-threads.md # Summary of investor email threads
├── term-sheet-emails.md # Emails about term sheets
└── meeting-requests-jan.md # Meeting requests from January
Example result file:
--- query: "investor emails about valuation" created_at: 2026-01-29T10:30:00Z source_table: sync_gmail row_count: 8 --- # Investor Emails About Valuation ## Summary 8 emails from investors mentioning valuation: - Sarah Chen (Sequoia): Comfortable with $50M - Mike at A16Z: Wants to discuss further - ... ## Raw Results | from_name | subject | received_at | |-----------|---------|-------------| | Sarah Chen | RE: Valuation discussion | 2026-01-28 | | ... | ... | ... |
Best Practices
Writing Good Skill Documentation
- Complete schema - Include all columns with types
- Example data - Show what real data looks like
- Query patterns - Provide templates for common queries
- Notes - Document gotchas (large columns, performance tips)
Query Optimization
- Use
body_previewinstead ofbody_fullwhen possible - Always include
LIMITclause - Use indexes (from_email, received_at, labels)
- Filter early with
WHEREclause
Caching Results
- Save reusable insights to
_results/ - Include metadata (query, date, row count)
- Update periodically if data changes
Related Documentation
- ctx0 Agents - Agent system design
- ctx0 Vault Structure - Folder organization
- ctx0 Supabase Data Architecture - Database schema