AI intelligence
Query vTilt's AI memory layer through MCP — semantic search inside VQL via the :embed() macro, plus the dedicated memory-ask and person-memory-get tools.
vTilt continuously distils raw events, sessions, conversations, and recordings into structured intelligence. That layer — AI memory — is exposed to your MCP client as five virtual tables you can read through VQL (query-run), plus two dedicated tools (memory-ask, person-memory-get) for cases where SQL is the wrong shape.
There is no separate "AI search" tool. Vector semantic search is part of VQL via the :embed('text') macro and the cosineDistance family. The point is composability: combining "find me semantically similar sessions" with "where person is on the enterprise plan and signed up via Twitter" is one query, not two.
#The catalog
query-run and query-validate accept SQL against the following AI memory tables alongside the existing analytics core (events, persons, person_distinct_ids, person_overrides). Call schema-get to see the curated column list for each — the column descriptions list the hot ai JSON keys so you don't have to guess what's inside the blob.
| Table | What it holds | Permission |
|---|---|---|
session_summaries | LLM-generated narrative + structured ai JSON per session (intent_signals, topics, sentiment, engagement_score, key_facts, key_quotes, pages_visited, forms_*, entities.*). 1536-d embedding. | analytics:read |
conversation_summaries | LLM-generated narrative + structured ai JSON per chat channel (user_intent, resolution, objections, buying_signals, feature_requests). 1536-d embedding. | chat:read |
recording_summaries | Deterministic behavioural analysis of replay. One summary column: lines 1–2 are a stable header (Session: + TAKEAWAY:); the rest is a detailed interaction transcript (timeline, scroll dwells, clicks, reading journey). Use the full summary for answers — the header alone is not enough. Metrics columns + 1536-d embedding from the full transcript. | replay:read |
person_memory_latest | Rolling current-state intelligence per person (intent_level, lifecycle_stage, engagement_score, recommended_action, key_facts, topics_of_interest, plus rolling summary and chat_summary narratives). 1536-d embedding. | person:read |
ai_usage_log | Append-only log of every LLM + embedding call: operation, model, input_tokens, output_tokens, embedding_tokens, duration_ms, success. No embedding column. | settings:read |
Every table is tenant-scoped automatically — project_id filtering happens server-side. Don't add it to your WHERE clause.
The ai column is JSON. Read keys with JSONExtractString(ai, 'intent_level') / JSONExtractInt(ai, 'engagement_score'). The most-used keys are hoisted in each table's schema-get description so you can plan a query without a second round trip.
#Vector semantic search — the :embed('text') macro
VQL exposes a single macro, :embed('your search phrase'). The compiler generates the query embedding once per distinct literal, deduplicates identical calls, and substitutes a native ClickHouse parameter — the float array never appears in the SQL string (so it doesn't bloat your logs or the wide-log).
SELECT
session_id,
narrative,
JSONExtractInt(ai, 'engagement_score') AS engagement,
cosineDistance(embedding, :embed('rage clicks on checkout')) AS distance
FROM session_summaries
WHERE started_at > now() - INTERVAL 30 DAY
ORDER BY distance ASC
LIMIT 20Two rules you cannot break:
:embed()takes a string literal only.:embed('hot leads')is fine;:embed(some_column)is rejected withembed_invalid_argument. The literal cap is 2 000 characters — past that you getembed_text_too_long. This is deliberate: it stops agents from accidentally embedding a 50 KB blob and burning $0.40 of inference per query.- Pair
cosineDistance(... :embed(...))withORDER BY ... LIMIT N. That's the only pattern that hits the HNSW vector index. A bareWHERE distance < 0.3predicate causes a full table scan and will hit the 50 M-row read cap on a busy project.
The macro is supported on every embedding column in the catalog (session_summaries, conversation_summaries, recording_summaries, person_memory_latest).
#Combine vector similarity with filters and joins
This is why the macro lives inside VQL. Compose semantic similarity with whatever WHERE / JOIN you'd write for a normal analytics query:
SELECT
s.session_id,
p.public_id,
JSONExtractString(p.properties, '$initial_utm_source') AS first_utm,
cosineDistance(s.embedding, :embed('confused about pricing')) AS d
FROM session_summaries s
JOIN persons p ON p.public_id = s.person_id
WHERE JSONExtractString(p.properties, '$initial_utm_source') = 'twitter'
AND s.started_at > now() - INTERVAL 7 DAY
ORDER BY d ASC
LIMIT 25Or join two AI memory tables:
SELECT
r.session_id,
r.engagement_score,
r.behavior_pattern,
JSONExtractString(s.ai, 'sentiment') AS session_sentiment,
cosineDistance(r.embedding, :embed('deep reading documentation')) AS d
FROM recording_summaries r
JOIN session_summaries s ON s.session_id = r.session_id
WHERE r.engagement_level IN ('deep-read', 'read')
ORDER BY d ASC
LIMIT 20#Cross-table semantic search
When you want one ranking across multiple tables, use explicit UNION ALL:
SELECT 'session' AS source, session_id AS id,
cosineDistance(embedding, :embed('integrating with our API')) AS d
FROM session_summaries
UNION ALL
SELECT 'conversation', channel_id,
cosineDistance(embedding, :embed('integrating with our API'))
FROM conversation_summaries
ORDER BY d ASC LIMIT 30The two :embed('integrating with our API') calls deduplicate to one embedding API call — the compiler caches identical literals per query.
#Natural-language Q&A — memory-ask
When the question is fuzzy and you don't yet know which tables / filters to use, call memory-ask instead of guessing a VQL query. The server:
- Classifies the intent (person snapshot, semantic similarity, lifecycle funnel, cross-table correlation, …).
- Generates parameterised ClickHouse SQL from a deterministic template per intent.
- Executes the SQL.
- Synthesises a prose answer with the project's allowed model tier.
The response contains the synthesised answer, the underlying sql, the raw rows, the intent the planner inferred, and the synthesisModelUsed. The SQL is there so you can iterate via query-run for a different cut, or verify the planner reasoned correctly.
// memory-ask({ question: "Who are the hottest leads from Twitter this week?" })
{
"question": "Who are the hottest leads from Twitter this week?",
"answer": "Three persons stood out — …",
"intent": { "target": "person_current", "filters": { "intent_level": "hot" }, ... },
"sql": "SELECT person_id, ... FROM person_memory_latest FINAL ...",
"rows": [ ... ],
"synthesisModelUsed": "openai/gpt-4o-mini"
}#Canonical person snapshot — person-memory-get
The single most common AI memory question is "tell me everything we know about this person". person-memory-get is the optimised read for exactly that: the latest person_memory_latest row with the hot ai JSON fields hoisted to top level.
// person-memory-get({ id: 'person_abc' })
{
"personId": "person_abc",
"version": 7,
"changedAt": "2026-05-14T13:15:42Z",
"triggerType": "session_summary",
"intentLevel": "hot",
"lifecycleStage": "MQL",
"engagementScore": 78,
"recommendedAction": "Schedule a demo — buying signals on pricing page",
"keyFacts": ["5-person team", "Compares us with PostHog and Mixpanel"],
"topicsOfInterest": ["session replay", "GDPR", "self-host"],
"summary": "...",
"chatSummary": "...",
"ai": {
/* full structured blob */
},
"sessionsSummarized": 12,
"conversationsSummarized": 3,
"modelUsed": "openai/gpt-4o-mini"
}For everything else on persons — history of all versions, everyone-in-stage-X queries, lifecycle funnels — use query-run against person_memory_latest (or join it with persons for property filters).
#When to use which surface
| You want to... | Best surface |
|---|---|
| Find sessions / conversations / recordings semantically similar to a phrase | query-run with cosineDistance(... :embed(...)) |
| Combine vector similarity with explicit WHERE/JOIN | query-run with VQL (vector + tabular in one query) |
| Get "everything you know about" one person | person-memory-get |
| Ask a fuzzy NL question; let the planner pick the table | memory-ask |
| All persons in lifecycle stage X with property Y | query-run against person_memory_latest JOIN persons |
| Average engagement score by behavior_pattern | query-run against recording_summaries |
| How much have we spent on LLM calls this month? | query-run against ai_usage_log (needs settings:read) |
#Error codes
The macro adds three codes on top of the existing VQL error taxonomy:
| Code | Meaning | What to do |
|---|---|---|
embed_invalid_argument | :embed(...) got a non-literal arg (column ref, double-quoted string, expression) or an empty literal | Pass a single-quoted string literal: :embed('your phrase') |
embed_text_too_long | Literal exceeded the 2 000-char cap | Use a short search phrase, not a paragraph |
embed_disabled | AI memory is not enabled for this project | Enable AI memory in project settings, or remove :embed(...) |
The full existing VQL error catalog (parse_error, unknown_table, unknown_function, unsupported_construct, permission_denied, invalid_project) still applies. Self-correction loops: query-validate is free — call it first to fix structural errors before paying for query-run. query-validate also reports embedCount so you can see how many embedding API calls a query will cost before you run it.
#Cost model
| Component | Cost |
|---|---|
One distinct :embed('text') literal | One text-embedding-3-small API call (≈ $0.00004) |
| Multiple identical literals in one SQL string | One call total (deduplicated) |
memory-ask | One classify call (gpt-4o-mini, ≈ $0.0001) + one ClickHouse query + an optional synthesis call (model decided by project's allowed_models) |
person-memory-get | One ClickHouse query (cheap; uses primary key) |
query-run against AI memory tables | No LLM cost — pure ClickHouse |
Every embedding + LLM call writes a row to ai_usage_log with the operation, model, and token counts so you can audit spend per project.
#Related
- Agent skills (prompts) — slash-command walkthroughs. The
/vtilt:ai-memoryslash command is the agent's quick-reference version of this page. - Authentication — how to get a personal API key for MCP.
- OAuth — browser-flow auth for Claude Desktop / Cursor / ChatGPT Connector.