On this page

AI intelligence

Last updated July 22, 2026

Query vTilt's AI memory layer through MCP — semantic search inside VQL via the :embed() macro, plus the dedicated query_ai_memory and get_person_memory tools.

vTilt continuously distils raw events, sessions, conversations, and recordings into structured intelligence. That layer — AI memory — is exposed to your MCP client as five virtual tables you can read through VQL (run_query), plus two dedicated tools (query_ai_memory, get_person_memory) for cases where SQL is the wrong shape.

There is no separate "AI search" tool. Vector semantic search is part of VQL via the :embed('text') macro and the cosineDistance family. The point is composability: combining "find me semantically similar sessions" with "where person is on the enterprise plan and signed up via Twitter" is one query, not two.

#The catalog

run_query and validate_query accept SQL against the following AI memory tables alongside the existing analytics core (events, persons, person_distinct_ids, person_overrides). Call get_query_schema to see the curated column list for each — the column descriptions list the hot ai JSON keys so you don't have to guess what's inside the blob.

Table	What it holds	Permission
`session_summaries`	LLM-generated narrative + structured `ai` JSON per session (`intent_signals`, `topics`, `sentiment`, `engagement_score`, `key_facts`, `key_quotes`, `pages_visited`, `forms_`, `entities.`). 1536-d `embedding`.	`analytics:read`
`conversation_summaries`	LLM-generated narrative + structured `ai` JSON per chat channel (`user_intent`, `resolution`, `objections`, `buying_signals`, `feature_requests`). 1536-d `embedding`.	`chat:read`
`recording_summaries`	Deterministic behavioural analysis of replay. One `summary` column: the clean page markdown the visitor actually saw (headings, lists, links, images — nav/chrome dropped), then a `--- BEHAVIOR ---` overlay (scroll/engagement stats, the sections they focused on, key clicks, behaviour classification). Use the full `summary` for answers. Metrics columns + 1536-d behaviour-anchored `embedding`.	`replay:read`
`recording_chunks`	Per-heading slices of a recording's page content with behavioural signals bound to each (`reading_quality`, `attention_ms`, `dwell_ms`, `was_revisited`). 1536-d `embedding`. Use for section-level "content + engagement" — vector-search `chunk_text`, rank/filter by engagement.	`replay:read`
`person_memory_latest`	Rolling current-state intelligence per person (`intent_level`, `lifecycle_stage`, `engagement_score`, `recommended_action`, `key_facts`, `topics_of_interest`, plus rolling `summary` and `chat_summary` narratives). 1536-d `embedding`.	`person:read`
`ai_usage_log`	Append-only log of every LLM + embedding call: `operation`, `model`, `input_tokens`, `output_tokens`, `embedding_tokens`, `duration_ms`, `success`. No embedding column.	`settings:read`

#Site knowledge tables (customer website corpus)

Indexed from HTTP crawl + session recordings — not AI memory, but the same :embed('text') pattern. For crawl pipeline, dashboard controls, and when to use site knowledge vs replay transcripts, see Site knowledge. For MCP tools and example queries, see Site knowledge (MCP).

Table	What it holds	Permission
`site_page_chunks`	Latest text chunks (~2k chars) with `heading_path`, `chunk_kind`, 1536-d `embedding`	`analytics:read`
`site_pages`	One row per pathname — title, summary, `auth_gated`, `source`, `embedding`	`analytics:read`
`site_page_elements`	Structured CTAs/links/forms/nav for semantic event enrichment + CRO	`analytics:read`
`site_page_versions`	Append-on-change page history — change detection → impact attribution (no embedding)	`analytics:read`
`site_overview`	Project-level site synopsis + taxonomy JSON	`analytics:read`

Every table is tenant-scoped automatically — project_id filtering happens server-side. Don't add it to your WHERE clause.

The ai column is JSON. Read keys with JSONExtractString(ai, 'intent_level') / JSONExtractInt(ai, 'engagement_score'). The most-used keys are hoisted in each table's get_query_schema description so you can plan a query without a second round trip.

#Vector semantic search — the `:embed('text')` macro

VQL exposes a single macro, :embed('your search phrase'). The compiler generates the query embedding once per distinct literal, deduplicates identical calls, and substitutes a native ClickHouse parameter — the float array never appears in the SQL string (so it doesn't bloat your logs or the wide-log).

SELECT
  session_id,
  narrative,
  JSONExtractInt(ai, 'engagement_score') AS engagement,
  cosineDistance(embedding, :embed('rage clicks on checkout')) AS distance
FROM session_summaries
WHERE started_at > now() - INTERVAL 30 DAY
ORDER BY distance ASC
LIMIT 20

sql

Two rules you cannot break:

:embed() takes a string literal only. :embed('hot leads') is fine; :embed(some_column) is rejected with embed_invalid_argument. The literal cap is 2 000 characters — past that you get embed_text_too_long. This is deliberate: it stops agents from accidentally embedding a 50 KB blob and burning $0.40 of inference per query.
Pair cosineDistance(... :embed(...)) with ORDER BY ... LIMIT N. That's the only pattern that hits the HNSW vector index. A bare WHERE distance < 0.3 predicate causes a full table scan and will hit the 50 M-row read cap on a busy project.

The macro is supported on every embedding column in the catalog (session_summaries, conversation_summaries, recording_summaries, recording_chunks, person_memory_latest).

#Combine vector similarity with filters and joins

This is why the macro lives inside VQL. Compose semantic similarity with whatever WHERE / JOIN you'd write for a normal analytics query:

SELECT
  s.session_id,
  p.public_id,
  JSONExtractString(p.properties, '$initial_utm_source') AS first_utm,
  cosineDistance(s.embedding, :embed('confused about pricing')) AS d
FROM session_summaries s
JOIN persons p ON p.public_id = s.person_id
WHERE JSONExtractString(p.properties, '$initial_utm_source') = 'twitter'
  AND s.started_at > now() - INTERVAL 7 DAY
ORDER BY d ASC
LIMIT 25

sql

Or join two AI memory tables:

SELECT
  r.session_id,
  r.engagement_score,
  r.behavior_pattern,
  JSONExtractString(s.ai, 'sentiment') AS session_sentiment,
  cosineDistance(r.embedding, :embed('deep reading documentation')) AS d
FROM recording_summaries r
JOIN session_summaries s ON s.session_id = r.session_id
WHERE r.engagement_level IN ('deep-read', 'read')
ORDER BY d ASC
LIMIT 20

sql

#Cross-table semantic search

When you want one ranking across multiple tables, use explicit UNION ALL:

SELECT 'session' AS source, session_id AS id,
       cosineDistance(embedding, :embed('integrating with our API')) AS d
FROM session_summaries
UNION ALL
SELECT 'conversation', channel_id,
       cosineDistance(embedding, :embed('integrating with our API'))
FROM conversation_summaries
ORDER BY d ASC LIMIT 30

sql

The two :embed('integrating with our API') calls deduplicate to one embedding API call — the compiler caches identical literals per query.

#Natural-language Q&A — `query_ai_memory`

When the question is fuzzy and you don't yet know which tables / filters to use, call query_ai_memory instead of guessing a VQL query. The server:

Classifies the intent (person snapshot, semantic similarity, lifecycle funnel, cross-table correlation, …).
Generates parameterised ClickHouse SQL from a deterministic template per intent.
Executes the SQL.
Synthesises a prose answer with the project's allowed model tier.

The response contains the synthesised answer, the underlying sql, the raw rows, the intent the planner inferred, and the synthesisModelUsed. The SQL is there so you can iterate via run_query for a different cut, or verify the planner reasoned correctly.

// query_ai_memory({ question: "Who are the hottest leads from Twitter this week?" })
{
  "question": "Who are the hottest leads from Twitter this week?",
  "answer": "Three persons stood out — …",
  "intent": { "target": "person_current", "filters": { "intent_level": "hot" }, ... },
  "sql": "SELECT person_id, ... FROM person_memory_latest FINAL ...",
  "rows": [ ... ],
  "synthesisModelUsed": "openai/gpt-4o-mini"
}

json

#Canonical person snapshot — `get_person_memory`

The single most common AI memory question is "tell me everything we know about this person". get_person_memory is the optimised read for exactly that: the latest person_memory_latest row with the hot ai JSON fields hoisted to top level.

// get_person_memory({ id: 'person_abc' })
{
  "personId": "person_abc",
  "version": 7,
  "changedAt": "2026-05-14T13:15:42Z",
  "triggerType": "session_summary",
  "intentLevel": "hot",
  "lifecycleStage": "MQL",
  "engagementScore": 78,
  "recommendedAction": "Schedule a demo — buying signals on pricing page",
  "keyFacts": ["5-person team", "Compares us with PostHog and Mixpanel"],
  "topicsOfInterest": ["session replay", "GDPR", "self-host"],
  "summary": "...",
  "chatSummary": "...",
  "ai": {
    /* full structured blob */
  },
  "sessionsSummarized": 12,
  "conversationsSummarized": 3,
  "modelUsed": "openai/gpt-4o-mini"
}

json

For everything else on persons — history of all versions, everyone-in-stage-X queries, lifecycle funnels — use run_query against person_memory_latest (or join it with persons for property filters).

#When to use which surface

You want to...	Best surface
Find sessions / conversations / recordings semantically similar to a phrase	`run_query` with `cosineDistance(... :embed(...))`
Search indexed website content (pricing, docs, policies)	`search_site_knowledge` or VQL on `site_page_chunks` — see Site knowledge (MCP)
Combine vector similarity with explicit WHERE/JOIN	`run_query` with VQL (vector + tabular in one query)
Get "everything you know about" one person	`get_person_memory`
Ask a fuzzy NL question; let the planner pick the table	`query_ai_memory`
All persons in lifecycle stage X with property Y	`run_query` against `person_memory_latest` JOIN `persons`
Average engagement score by behavior_pattern	`run_query` against `recording_summaries`
How much have we spent on LLM calls this month?	`run_query` against `ai_usage_log` (needs `settings:read`)

#Error codes

The macro adds three codes on top of the existing VQL error taxonomy:

Code	Meaning	What to do
`embed_invalid_argument`	`:embed(...)` got a non-literal arg (column ref, double-quoted string, expression) or an empty literal	Pass a single-quoted string literal: `:embed('your phrase')`
`embed_text_too_long`	Literal exceeded the 2 000-char cap	Use a short search phrase, not a paragraph
`embed_disabled`	AI memory is not enabled for this project	Enable AI memory in project settings, or remove `:embed(...)`

The full existing VQL error catalog (parse_error, unknown_table, unknown_function, unsupported_construct, permission_denied, invalid_project) still applies. Self-correction loops: validate_query is free — call it first to fix structural errors before paying for run_query. validate_query also reports embedCount so you can see how many embedding API calls a query will cost before you run it.

#Cost model

Component	Cost
One distinct `:embed('text')` literal	One `text-embedding-3-small` API call (≈ $0.00004)
Multiple identical literals in one SQL string	One call total (deduplicated)
`query_ai_memory`	One classify call (`gpt-4o-mini`, ≈ $0.0001) + one ClickHouse query + an optional synthesis call (model decided by project's `allowed_models`)
`get_person_memory`	One ClickHouse query (cheap; uses primary key)
`run_query` against AI memory tables	No LLM cost — pure ClickHouse

Every embedding + LLM call writes a row to ai_usage_log with the operation, model, and token counts so you can audit spend per project.

Agent skills (prompts) — slash-command walkthroughs. The /vtilt:ai-memory slash command is the agent's quick-reference version of this page.
Authentication — how to get a personal API key for MCP.
OAuth — browser-flow auth for Claude Desktop / Cursor / ChatGPT Connector.

On this page

AI intelligence

Last updated July 22, 2026

Query vTilt's AI memory layer through MCP — semantic search inside VQL via the :embed() macro, plus the dedicated query_ai_memory and get_person_memory tools.

#The catalog

Table	What it holds	Permission
`session_summaries`	LLM-generated narrative + structured `ai` JSON per session (`intent_signals`, `topics`, `sentiment`, `engagement_score`, `key_facts`, `key_quotes`, `pages_visited`, `forms_`, `entities.`). 1536-d `embedding`.	`analytics:read`
`conversation_summaries`	LLM-generated narrative + structured `ai` JSON per chat channel (`user_intent`, `resolution`, `objections`, `buying_signals`, `feature_requests`). 1536-d `embedding`.	`chat:read`
`recording_summaries`	Deterministic behavioural analysis of replay. One `summary` column: the clean page markdown the visitor actually saw (headings, lists, links, images — nav/chrome dropped), then a `--- BEHAVIOR ---` overlay (scroll/engagement stats, the sections they focused on, key clicks, behaviour classification). Use the full `summary` for answers. Metrics columns + 1536-d behaviour-anchored `embedding`.	`replay:read`
`recording_chunks`	Per-heading slices of a recording's page content with behavioural signals bound to each (`reading_quality`, `attention_ms`, `dwell_ms`, `was_revisited`). 1536-d `embedding`. Use for section-level "content + engagement" — vector-search `chunk_text`, rank/filter by engagement.	`replay:read`
`person_memory_latest`	Rolling current-state intelligence per person (`intent_level`, `lifecycle_stage`, `engagement_score`, `recommended_action`, `key_facts`, `topics_of_interest`, plus rolling `summary` and `chat_summary` narratives). 1536-d `embedding`.	`person:read`
`ai_usage_log`	Append-only log of every LLM + embedding call: `operation`, `model`, `input_tokens`, `output_tokens`, `embedding_tokens`, `duration_ms`, `success`. No embedding column.	`settings:read`

#Site knowledge tables (customer website corpus)

Table	What it holds	Permission
`site_page_chunks`	Latest text chunks (~2k chars) with `heading_path`, `chunk_kind`, 1536-d `embedding`	`analytics:read`
`site_pages`	One row per pathname — title, summary, `auth_gated`, `source`, `embedding`	`analytics:read`
`site_page_elements`	Structured CTAs/links/forms/nav for semantic event enrichment + CRO	`analytics:read`
`site_page_versions`	Append-on-change page history — change detection → impact attribution (no embedding)	`analytics:read`
`site_overview`	Project-level site synopsis + taxonomy JSON	`analytics:read`

Every table is tenant-scoped automatically — project_id filtering happens server-side. Don't add it to your WHERE clause.

#Vector semantic search — the `:embed('text')` macro

SELECT
  session_id,
  narrative,
  JSONExtractInt(ai, 'engagement_score') AS engagement,
  cosineDistance(embedding, :embed('rage clicks on checkout')) AS distance
FROM session_summaries
WHERE started_at > now() - INTERVAL 30 DAY
ORDER BY distance ASC
LIMIT 20

sql

Two rules you cannot break:

:embed() takes a string literal only. :embed('hot leads') is fine; :embed(some_column) is rejected with embed_invalid_argument. The literal cap is 2 000 characters — past that you get embed_text_too_long. This is deliberate: it stops agents from accidentally embedding a 50 KB blob and burning $0.40 of inference per query.
Pair cosineDistance(... :embed(...)) with ORDER BY ... LIMIT N. That's the only pattern that hits the HNSW vector index. A bare WHERE distance < 0.3 predicate causes a full table scan and will hit the 50 M-row read cap on a busy project.

The macro is supported on every embedding column in the catalog (session_summaries, conversation_summaries, recording_summaries, recording_chunks, person_memory_latest).

#Combine vector similarity with filters and joins

This is why the macro lives inside VQL. Compose semantic similarity with whatever WHERE / JOIN you'd write for a normal analytics query:

SELECT
  s.session_id,
  p.public_id,
  JSONExtractString(p.properties, '$initial_utm_source') AS first_utm,
  cosineDistance(s.embedding, :embed('confused about pricing')) AS d
FROM session_summaries s
JOIN persons p ON p.public_id = s.person_id
WHERE JSONExtractString(p.properties, '$initial_utm_source') = 'twitter'
  AND s.started_at > now() - INTERVAL 7 DAY
ORDER BY d ASC
LIMIT 25

sql

Or join two AI memory tables:

SELECT
  r.session_id,
  r.engagement_score,
  r.behavior_pattern,
  JSONExtractString(s.ai, 'sentiment') AS session_sentiment,
  cosineDistance(r.embedding, :embed('deep reading documentation')) AS d
FROM recording_summaries r
JOIN session_summaries s ON s.session_id = r.session_id
WHERE r.engagement_level IN ('deep-read', 'read')
ORDER BY d ASC
LIMIT 20

sql

#Cross-table semantic search

When you want one ranking across multiple tables, use explicit UNION ALL:

SELECT 'session' AS source, session_id AS id,
       cosineDistance(embedding, :embed('integrating with our API')) AS d
FROM session_summaries
UNION ALL
SELECT 'conversation', channel_id,
       cosineDistance(embedding, :embed('integrating with our API'))
FROM conversation_summaries
ORDER BY d ASC LIMIT 30

sql

The two :embed('integrating with our API') calls deduplicate to one embedding API call — the compiler caches identical literals per query.

#Natural-language Q&A — `query_ai_memory`

When the question is fuzzy and you don't yet know which tables / filters to use, call query_ai_memory instead of guessing a VQL query. The server:

Classifies the intent (person snapshot, semantic similarity, lifecycle funnel, cross-table correlation, …).
Generates parameterised ClickHouse SQL from a deterministic template per intent.
Executes the SQL.
Synthesises a prose answer with the project's allowed model tier.

// query_ai_memory({ question: "Who are the hottest leads from Twitter this week?" })
{
  "question": "Who are the hottest leads from Twitter this week?",
  "answer": "Three persons stood out — …",
  "intent": { "target": "person_current", "filters": { "intent_level": "hot" }, ... },
  "sql": "SELECT person_id, ... FROM person_memory_latest FINAL ...",
  "rows": [ ... ],
  "synthesisModelUsed": "openai/gpt-4o-mini"
}

json

#Canonical person snapshot — `get_person_memory`

// get_person_memory({ id: 'person_abc' })
{
  "personId": "person_abc",
  "version": 7,
  "changedAt": "2026-05-14T13:15:42Z",
  "triggerType": "session_summary",
  "intentLevel": "hot",
  "lifecycleStage": "MQL",
  "engagementScore": 78,
  "recommendedAction": "Schedule a demo — buying signals on pricing page",
  "keyFacts": ["5-person team", "Compares us with PostHog and Mixpanel"],
  "topicsOfInterest": ["session replay", "GDPR", "self-host"],
  "summary": "...",
  "chatSummary": "...",
  "ai": {
    /* full structured blob */
  },
  "sessionsSummarized": 12,
  "conversationsSummarized": 3,
  "modelUsed": "openai/gpt-4o-mini"
}

json

#When to use which surface

You want to...	Best surface
Find sessions / conversations / recordings semantically similar to a phrase	`run_query` with `cosineDistance(... :embed(...))`
Search indexed website content (pricing, docs, policies)	`search_site_knowledge` or VQL on `site_page_chunks` — see Site knowledge (MCP)
Combine vector similarity with explicit WHERE/JOIN	`run_query` with VQL (vector + tabular in one query)
Get "everything you know about" one person	`get_person_memory`
Ask a fuzzy NL question; let the planner pick the table	`query_ai_memory`
All persons in lifecycle stage X with property Y	`run_query` against `person_memory_latest` JOIN `persons`
Average engagement score by behavior_pattern	`run_query` against `recording_summaries`
How much have we spent on LLM calls this month?	`run_query` against `ai_usage_log` (needs `settings:read`)

#Error codes

The macro adds three codes on top of the existing VQL error taxonomy:

Code	Meaning	What to do
`embed_invalid_argument`	`:embed(...)` got a non-literal arg (column ref, double-quoted string, expression) or an empty literal	Pass a single-quoted string literal: `:embed('your phrase')`
`embed_text_too_long`	Literal exceeded the 2 000-char cap	Use a short search phrase, not a paragraph
`embed_disabled`	AI memory is not enabled for this project	Enable AI memory in project settings, or remove `:embed(...)`

#Cost model

Component	Cost
One distinct `:embed('text')` literal	One `text-embedding-3-small` API call (≈ $0.00004)
Multiple identical literals in one SQL string	One call total (deduplicated)
`query_ai_memory`	One classify call (`gpt-4o-mini`, ≈ $0.0001) + one ClickHouse query + an optional synthesis call (model decided by project's `allowed_models`)
`get_person_memory`	One ClickHouse query (cheap; uses primary key)
`run_query` against AI memory tables	No LLM cost — pure ClickHouse

Every embedding + LLM call writes a row to ai_usage_log with the operation, model, and token counts so you can audit spend per project.

Agent skills (prompts) — slash-command walkthroughs. The /vtilt:ai-memory slash command is the agent's quick-reference version of this page.
Authentication — how to get a personal API key for MCP.
OAuth — browser-flow auth for Claude Desktop / Cursor / ChatGPT Connector.

#The catalog

#Site knowledge tables (customer website corpus)

#Vector semantic search — the :embed('text') macro

#Combine vector similarity with filters and joins

#Cross-table semantic search

#Natural-language Q&A — query_ai_memory

#Canonical person snapshot — get_person_memory

#When to use which surface

#Error codes

#Cost model

#Related

#The catalog

#Site knowledge tables (customer website corpus)

#Vector semantic search — the :embed('text') macro

#Combine vector similarity with filters and joins

#Cross-table semantic search

#Natural-language Q&A — query_ai_memory

#Canonical person snapshot — get_person_memory

#When to use which surface

#Error codes

#Cost model

#Related

#Vector semantic search — the `:embed('text')` macro

#Natural-language Q&A — `query_ai_memory`

#Canonical person snapshot — `get_person_memory`

#Vector semantic search — the `:embed('text')` macro

#Natural-language Q&A — `query_ai_memory`

#Canonical person snapshot — `get_person_memory`