Documentation as a Brain
These docs are not just pages to read. They are the source of truth for the entire Avolve game — and they are machine-readable.
Every document in this collection is embedded as vectors in Supabase pgvector. This means the content can be searched semantically (by meaning, not just keywords) and used as context for AI-generated answers.
How It Works
The pipeline has three stages:
1. Content → Chunks
Each MDX document is split into chunks by section heading. A document with five ## headings becomes six chunks: the introduction plus five sections. Each chunk carries its source file, slug, heading, and frontmatter metadata.
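The splitting described above can be sketched as a small pure function. This is illustrative only; the names (`chunkMdx`, `Chunk`) are hypothetical, not the actual pipeline code, and the real step also attaches source, slug, and frontmatter metadata to each chunk:

```typescript
type Chunk = { heading: string | null; content: string };

// Split an MDX body into an intro chunk plus one chunk per ## heading.
function chunkMdx(body: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading: string | null = null;
  let buf: string[] = [];

  const flush = () => {
    const content = buf.join("\n").trim();
    // Skip an empty intro; keep empty sections so headings are not lost.
    if (content.length > 0 || heading !== null) chunks.push({ heading, content });
    buf = [];
  };

  for (const line of body.split("\n")) {
    const m = line.match(/^## (.+)$/);
    if (m) {
      flush();            // close the previous chunk
      heading = m[1].trim();
    } else {
      buf.push(line);
    }
  }
  flush();                // close the final chunk
  return chunks;
}
```

A document with five `##` headings run through this sketch yields six chunks, matching the behavior described above.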
2. Chunks → Vectors
Each chunk is embedded using Google's text-embedding-005 model via Vercel AI Gateway, producing a 768-dimensional vector that captures the semantic meaning of the text. Similar concepts produce similar vectors, regardless of exact wording.
3. Vectors → Answers
When you search or ask a question:
- Your query is embedded into the same vector space
- The database finds the most similar document chunks using cosine similarity
- For search: the matching chunks are returned directly with relevance scores
- For chat: the matching chunks become context for Claude, which generates a grounded answer citing specific documents
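The retrieval step above can be sketched as a brute-force cosine-similarity ranking. In production, pgvector's HNSW index performs the equivalent search inside the database, so this is purely illustrative of the math, with hypothetical function names:

```typescript
// Cosine similarity: dot product of the vectors divided by the
// product of their magnitudes; 1 means identical direction.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored chunks against a query embedding and keep the top k.
function topK<T>(
  query: number[],
  items: { embedding: number[]; item: T }[],
  k: number
): { item: T; score: number }[] {
  return items
    .map(({ embedding, item }) => ({ item, score: cosineSimilarity(query, embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The `score` here corresponds to the relevance score returned by search; for chat, the top-ranked chunks become the context passed to Claude.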
Who This Serves
The brain serves multiple stakeholders simultaneously:
| Stakeholder | How They Access It | What They Get |
|---|---|---|
| Players | Search bar on /docs, chat on /docs/ask | Find answers by meaning, ask questions in natural language |
| Search engines | Structured MDX with Schema.org JSON-LD | Clean indexable content with rich metadata |
| AI agents | Vector search API, Supabase RPC | Any agent can query the knowledge base programmatically |
| Admin | Supabase dashboard, embedding scripts | Full control over what is indexed and how |
The same content serves a human reading a doc page, Google indexing the site, an AI agent building context, and an admin debugging the system. One source of truth, multiple access patterns.
Architecture
MDX files (content/docs/)
↓ POST /api/embed
Google text-embedding-005 via Vercel AI Gateway (768 dimensions)
↓ upsert
Supabase pgvector (documents table, HNSW index)
↓ query
/api/search (vector similarity → ranked chunks)
/api/chat (vector similarity → Claude context → streamed answer)
The documents table stores:
| Column | Purpose |
|---|---|
| source | Origin path (e.g., docs/game-theory) |
| slug | URL-friendly identifier |
| heading | Section heading (null for introduction) |
| content | The chunk text |
| embedding | 768-dim vector |
| metadata | Frontmatter as JSON (title, tags, category) |
An HNSW index enables fast approximate nearest-neighbor search. The match_documents function handles similarity queries with configurable threshold and result count.
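Calling match_documents from application code might look like the sketch below. The client is injected so the shape is testable; the parameter names (`query_embedding`, `match_threshold`, `match_count`) follow the common Supabase pgvector pattern and are assumptions about this function's exact signature:

```typescript
// Minimal client interface; the real supabase-js client satisfies a superset of this.
type RpcClient = {
  rpc: (fn: string, args: Record<string, unknown>) => Promise<{ data: unknown; error: unknown }>;
};

// Hypothetical wrapper around the match_documents similarity query.
async function matchDocuments(
  client: RpcClient,
  queryEmbedding: number[],
  threshold = 0.5, // assumed default; the doc only says it is configurable
  count = 5        // assumed default
) {
  const { data, error } = await client.rpc("match_documents", {
    query_embedding: queryEmbedding, // assumed parameter names
    match_threshold: threshold,
    match_count: count,
  });
  if (error) throw new Error(`match_documents failed: ${String(error)}`);
  return data;
}
```

Both /api/search and /api/chat would sit on top of a call like this, differing only in what they do with the returned chunks.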
What Gets Embedded
Only public documentation in content/docs/ is embedded. Internal notes, skill files, and admin references are not included. This is intentional — the brain contains exactly what players should be able to find and what AI agents should be able to cite.
If a piece of information is important enough to be in the brain, it belongs in a doc. If it is internal-only, it stays in skill references or admin notes.
Keeping It Current
The embedding pipeline runs as an API route: POST /api/embed with the service role key in the x-embed-secret header. It deletes existing rows for each source before reinserting, ensuring the database always reflects the current state of the docs. There is no drift between what you read on the page and what the brain knows.
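Triggering a re-embed run can be sketched as below. The route and header name come from this page; the base URL and secret handling are placeholders, and the builder is a hypothetical helper, not pipeline code:

```typescript
// Build the re-embed request described above: POST /api/embed
// authenticated via the x-embed-secret header.
function buildEmbedRequest(
  baseUrl: string,
  secret: string
): { url: string; init: { method: string; headers: Record<string, string> } } {
  return {
    url: `${baseUrl}/api/embed`,
    init: {
      method: "POST",
      headers: { "x-embed-secret": secret },
    },
  };
}

// Usage (network call not executed here):
// const { url, init } = buildEmbedRequest("https://example.com", process.env.EMBED_SECRET!);
// await fetch(url, init);
```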
Limitations
- Embedding model: Google text-embedding-005 is fast and cost-effective ($0.025 per 1M tokens) via Vercel AI Gateway, and it is sufficient for a focused documentation set.
- Chunk granularity: Splitting by ## headings means very long sections become single chunks, and short sections may lack context. This works well for the current doc structure.
- Latency: Search adds an embedding call (~100ms) plus a database query. Chat adds a Claude inference step on top. Both are acceptable for the use case.
- Scope: Only docs are embedded. Future iterations could include Genius entry patterns, quest descriptions, or community knowledge.