
The brain: KB graph

Notion and GitBook indexed into embeddings, queried via RAG, visible as a similarity graph between articles.

By Christopher · Updated · 4 min read


The "brain" is what the AI knows. It is your knowledge base, indexed into embeddings, queried at reply time using retrieval-augmented generation, and visible to you as a graph of similarity between articles.

If voice is how the AI sounds, the brain is what it knows. Most AI quality issues are brain problems, not model problems.

What gets indexed

Four sources today:

  • Articles authored in Ochre. Anything you write in the in-app editor. See Authoring articles.
  • Notion. Connect a workspace and pick which pages to expose. See Connecting Notion.
  • GitBook. Connect a space and pick which sections. See Connecting GitBook.
  • A public URL or a Markdown upload. Point the crawler at a docs site, or drop in .md files directly.

You can mix all four. The AI does not care where the article came from. It cares about what it says.

Confluence is not supported today. If you need it, the practical workaround is to mirror the relevant Confluence pages into Notion or GitBook.

How indexing works

When you add a source:

  1. Each page is fetched.
  2. Pages are split into chunks at heading boundaries.
  3. Each chunk is converted into a 1536-dim embedding via OpenAI's embedding model.
  4. Embeddings are stored in Postgres with an HNSW index for fast similarity search (pgvector).
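Step 2 can be sketched as a small splitter. This is a minimal illustration, not Ochre's actual implementation: it assumes pages are Markdown, that every heading starts a new chunk, and that oversized sections fall back to paragraph-level splits. The `max_chars` limit is an assumed knob.

```python
import re

def chunk_at_headings(markdown: str, max_chars: int = 2000) -> list[str]:
    """Split a Markdown page into chunks at heading boundaries.

    A sketch of the splitting step: each heading starts a new chunk,
    and oversized chunks are further split on paragraph breaks.
    """
    # Split just before every Markdown heading line (#, ##, ...).
    sections = re.split(r"(?m)^(?=#{1,6}\s)", markdown)
    chunks: list[str] = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chars:
            chunks.append(section)
            continue
        # Fall back to paragraph-level splits for very long sections.
        buf = ""
        for para in section.split("\n\n"):
            if buf and len(buf) + len(para) + 2 > max_chars:
                chunks.append(buf)
                buf = para
            else:
                buf = f"{buf}\n\n{para}" if buf else para
        if buf:
            chunks.append(buf)
    return chunks
```

Chunking at headings keeps each embedding focused on one topic, which is what makes similarity search precise.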

When a customer message arrives, the message is also embedded, and the system pulls the most similar chunks back into the AI's context. That is RAG.
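The retrieval step reduces to a similarity ranking. A toy sketch, assuming the message has already been embedded and ignoring the pgvector/HNSW machinery (which only makes this lookup fast, not different):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the k chunk texts most similar to the query embedding.

    chunks: (text, embedding) pairs pulled from the index.
    """
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

In production the same ranking runs as a single indexed SQL query; the top chunks are then placed in the AI's context before it drafts a reply.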

Updates are pulled on a schedule (about every 15 minutes for connected sources). Native Ochre articles re-index the moment you publish.

Embeddings find conceptually related content, not just keyword matches. A customer asking "Why isn't my plan upgrading?" finds the article called "Changing your plan" even though the words don't match.

Full-text search misses that pairing. RAG catches it.

The KB graph view

Open AI → Brain. The graph view shows your articles as nodes and similarity as edges. You see:

  • Clusters of related content (e.g. all your billing articles).
  • Outliers (an article unconnected to the rest).
  • Duplicates (two articles describing the same thing).

The graph is the fastest way to spot brain problems:

  • Outliers usually mean an article the AI will rarely retrieve.
  • Duplicates split the AI's attention. Merge them.
  • Sparse clusters mean a topic where you need more articles.

Edges are computed from per-article centroid embeddings, so the graph is cheap to recompute and stays roughly in sync with the live retrieval index.
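The centroid-and-edges idea can be sketched like this. A simplified illustration with an assumed similarity threshold; the real graph's threshold and layout are internal details:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors: list[list[float]]) -> list[float]:
    """Average an article's chunk embeddings into one vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def graph_edges(articles: dict[str, list[float]], threshold: float = 0.8):
    """Connect article pairs whose centroid similarity clears the threshold."""
    names = sorted(articles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sim = cosine(articles[a], articles[b])
            if sim >= threshold:
                edges.append((a, b, round(sim, 3)))
    return edges
```

Because each article collapses to one centroid, recomputing the graph is O(n²) over articles rather than over chunks, which is why it stays cheap.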

What the AI cites

Every AI reply lists the articles it pulled from. They appear in AI receipts and, optionally, as a visible footer in the customer-facing reply ("more info: [link]").

If the AI cites the same article on every reply, that article is doing too much work. Split it.

If the AI cites no articles on a reply, one of two things is usually true: the answer is fully general (greetings, light feedback), or your KB has a gap. The Playground coverage view surfaces those.

Authoring for the brain

Articles that work well for the AI tend to:

  • Have a specific, scoped title ("Refund a single invoice", not "About billing").
  • Start with a one-line answer.
  • Include the actual exact phrasing customers use, not internal jargon.
  • Stay under 1500 words.

Long, sprawling, internal-language articles are harder for the AI to retrieve and harder to draft from. See Authoring articles.

Re-scanning the brain

Re-indexing happens automatically when:

  • A connected source page changes.
  • An Ochre article is published or updated.
  • You connect a new source.

You can force a full re-scan with the Re-scan help center button on the AI overview page. This streams progress and re-embeds every published article. It is rarely needed in normal use; turn to it after a big content rewrite or a model change.

What is not in the brain

  • Past conversations. Those feed the Learned Q&A library separately.
  • Customer data (Stripe, HubSpot). That feeds the AI through the customer 360 sidebar, not the brain.
  • Drafts and archived articles. Only published, public articles are indexed.

Privacy

The brain is per-workspace. Your articles never inform another workspace's AI. Embeddings live in your Postgres rows behind RLS, scoped to your org_id.

If you remove an article, its chunks are deleted on the next index pass.

How brain quality shows up

  • Drafts cite the right article. Brain working.
  • Drafts cite the wrong article. Brain has a confusing duplicate or a missing canonical version.
  • Drafts cite no article. KB gap. The Playground coverage view surfaces this.
  • Drafts make up facts. Rare with good brain coverage. When it happens, the missing article is usually the fix.
Getting started

  1. Connect Notion or GitBook (or both), or write directly in Ochre.
  2. Pick the actual help-center sections, not internal-only spaces.
  3. Add 5 to 20 starter articles in Ochre's KB for the topics most often missing from your existing docs.
  4. Open the graph view weekly for the first month. Merge duplicates. Fill gaps.
  5. Re-check the Playground coverage view after each round of edits.

