We had a knowledge layer and it worked. You asked it about an artist, it answered, it cited a source, it felt smart. Underneath, it was two SQL queries and a model.
A dense arm over pgvector, cosine distance across a 1536-dimensional embedding:
-- semantic arm: nearest chunks by cosine distance
SELECT c.id, c.text,
       (1 - (c.embedding <=> $vec::vector)) AS similarity
FROM "KnowledgeChunk" c
JOIN "KnowledgeDocument" d ON d.id = c."documentId"
WHERE c."artistId" = ANY($artistIds)
  AND c.embedding IS NOT NULL
ORDER BY c.embedding <=> $vec::vector ASC
LIMIT $k
And a sparse arm over a Postgres tsvector, for the exact proper nouns that dense vectors blur into mush (a collaborator's name, a song title, a city):
-- keyword arm: full-text rank, for exact names dense search smears
SELECT c.id, c.text,
       ts_rank(c.tsv, plainto_tsquery('english', $q)) AS similarity
FROM "KnowledgeChunk" c
WHERE c.tsv @@ plainto_tsquery('english', $q)
ORDER BY similarity DESC
LIMIT $k
Merge the two, hand the best chunks to an LLM, let it write a paragraph. That is pgvector ⊕ tsvector → LLM, which is the polite way of saying we built the same RAG chatbot everyone else built.
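The merge step itself is not shown here. One common way to combine two ranked lists whose scores live on different scales (cosine similarity on one side, ts_rank on the other) is reciprocal rank fusion, which looks only at each row's position in its list. The sketch below is an illustration under that assumption, not the production merge: `RankedChunk`, `fuseByRank`, and the constant `k = 60` are hypothetical names and values.

```typescript
// Hypothetical shape of a row returned by either arm, already in ranked order.
interface RankedChunk {
  id: string
  text: string
}

// Reciprocal rank fusion: score(chunk) = sum over lists of 1 / (k + rank),
// where rank starts at 1. k = 60 is a conventional default, assumed here.
function fuseByRank(lists: RankedChunk[][], limit: number, k = 60): RankedChunk[] {
  const scores = new Map<string, { chunk: RankedChunk; score: number }>()
  for (const list of lists) {
    list.forEach((chunk, index) => {
      const entry = scores.get(chunk.id) ?? { chunk, score: 0 }
      entry.score += 1 / (k + index + 1)
      scores.set(chunk.id, entry)
    })
  }
  // Highest fused score first, then keep the top `limit` chunks.
  return [...scores.values()]
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(entry => entry.chunk)
}
```

A chunk that appears in both arms collects a contribution from each list, so it tends to outrank a chunk that only one arm found, without ever comparing a cosine score to a ts_rank score directly.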
Then I reread the design doc I had written a year earlier, which prescribed vector ⊕ keyword ⊕ graph, and I found the embarrassing part.
The graph arm was never built.
It was not missing from the plan. It was missing from production. The type an extractor returns has the field, marked optional, and we never once filled it:
export interface ExtractorResult {
  documents: DraftDocument[]
  entities?: { type: string; name: string }[]
  edges?: { dstType: KnowledgeAssetType; dstId: string
            relation: string; weight?: number }[] // never populated
}
The table to hold those edges existed too, with a unique constraint and an index, sitting empty and waiting:
model KnowledgeEdge {
  srcType  KnowledgeAssetType
  srcId    String
  dstType  KnowledgeAssetType
  dstId    String
  relation String @db.VarChar(40)
  weight   Float  @default(1)
  @@unique([srcType, srcId, dstType, dstId, relation])
}
A ? on a field and an empty table. That is the entire distance between the company I described and the company I had shipped. We owned an authoritative record of which song grew from which demo grew from which jam, entered by hand by the artist as they worked, and we were running cosine similarity over English sentences that happened to mention it.
The embeddings are the commodity
This is the part that should make any AI founder twitch. The embeddings are not the moat. Anyone can embed the corpus. The vendors are racing to embed it cheaper next quarter, and the songs themselves are downloadable by anyone with the link.
The moat is the part that cannot be reconstructed from the artifacts. Which demo a track was generated from. Which sample sits inside which release. Which version supersedes which, and which release is canonical, the spine I argued for in "An Operating System for Creative Assets." Scrape the catalog and you get the audio. You do not get the derivation, because the derivation was never in the audio. A person put it there, through the surface where the work was made.
So when the question is "where did this come from," the wrong move is to embed the question and see what floats back. The right move is to walk an edge.
Derivation is not rights, and that took me a year to say cleanly
Here is the depth I missed the first time I wrote this. "Who made what from what" is really two graphs, and collapsing them into one is how you ship either a lawsuit or a hallucination.
One graph is creative lineage: what grew from what. It is for discovery, for the honest answer to "show me everywhere this idea has appeared." The other is rights: who is owed. They look alike and they are not alike, so in the rebuild they are deliberately separate systems. The lineage file says so in its first comment:
// NON-RIGHTS-BEARING creative/discovery graph.
// The authoritative rights record stays in the Release layer.
The lineage layer is a small, closed vocabulary on purpose. Five node kinds, six relations, nothing open-ended that a model could invent:
const LINEAGE_NODE_TYPES = ["TRACK", "SAMPLE", "INSTRUMENT", "RELEASE", "ORIGIN"]
const LINEAGE_RELATIONS = ["GENERATED_FROM", "INSTRUMENT_OF", "REMIX_OF",
                           "SAMPLE_OF", "RENDERED_TO", "DERIVED_FROM"]
Rights live somewhere stricter: an append-only ProvenanceEvent log, each event hash-chained to the one before it and anchored on-chain, so the history of who touched a release cannot be quietly rewritten after the fact. A lineage edge can be wrong, and when it is you fix it. A provenance event is supposed to be expensive to forge, which is exactly why it is a hash chain and not a column you can UPDATE. Discovery is allowed to be loose. Rights have to be hard. The original mistake was ever letting one model answer for both.
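To make "hash-chained" concrete, here is a minimal sketch of an append-only log in which every event commits to the hash of the one before it. It is an illustration, not the real ProvenanceEvent code: the `ChainedEvent` shape, the `GENESIS` value, and the helper names are assumptions, and on-chain anchoring is left out entirely.

```typescript
import { createHash } from "node:crypto"

// Hypothetical event shape; the real ProvenanceEvent fields are not shown in this post.
interface ChainedEvent {
  payload: string   // serialized event body
  prevHash: string  // hash of the previous event, or GENESIS for the first one
  hash: string      // SHA-256 over prevHash, a newline, and payload
}

// Assumed placeholder for "no previous event".
const GENESIS = "0".repeat(64)

function hashEvent(prevHash: string, payload: string): string {
  return createHash("sha256").update(prevHash).update("\n").update(payload).digest("hex")
}

// Append-only: returns a new log with one more event and never edits old ones.
function appendEvent(log: readonly ChainedEvent[], payload: string): ChainedEvent[] {
  const last = log[log.length - 1]
  const prevHash = last ? last.hash : GENESIS
  return [...log, { payload, prevHash, hash: hashEvent(prevHash, payload) }]
}

// Recompute every link; any edited payload or reordered event breaks the chain.
function verifyChain(log: readonly ChainedEvent[]): boolean {
  let expectedPrev = GENESIS
  for (const event of log) {
    if (event.prevHash !== expectedPrev) return false
    if (event.hash !== hashEvent(event.prevHash, event.payload)) return false
    expectedPrev = event.hash
  }
  return true
}
```

Changing one historical payload invalidates that event's hash and every link after it, which is the property a plain column cannot offer. Someone who can rewrite the whole log can still recompute the whole chain, which is why the latest hash is also anchored somewhere the writer does not control.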
Graph-first, semantic-fallback, for real this time
A read is one hop, two indexed queries, no model anywhere in the loop:
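// One hop in each direction: two indexed lookups, run concurrently.
export async function getNodeLineage(type, id) {
  const [parents, children] = await Promise.all([
    // edges where this node is the child: what it came from
    prisma.lineageEdge.findMany({ where: { childType: type, childId: id } }),
    // edges where this node is the parent: what came from it
    prisma.lineageEdge.findMany({ where: { parentType: type, parentId: id } }),
  ])
  return { node: { type, id }, parents, children }
}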
parents is what a thing came from. children is what came from it. When the question is "what was this track generated from," the engine resolves the GENERATED_FROM edge to its origin and returns a card. If no edge exists yet, it falls back to the old column on the row. If there is nothing there either, the lookup returns null, a thin result; the engine drops to RAG, and the listener never sees the seam:
const { parents } = await getNodeLineage("TRACK", track.id)
const edge = parents.find(e =>
  e.relation === "GENERATED_FROM" && e.parentType === "ORIGIN")
if (edge) return resolveOrigin(edge.parentId, scope) // graph hit
if (track.sourceDemoUrl) return fromColumn(track)    // column fallback
return null                                          // thin: drop to RAG
Two rules keep the walk from lying. First, an edge can point across artists (a collaboration or a third-party sample), so the graph is never trusted to imply ownership; every node a walk surfaces is rechecked against the viewer's scope before it leaves the function. Second, the walk is bounded to sixteen nodes and six hops, because a real creator's lineage is small and an unbounded graph read is a denial of service waiting to happen.
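A bounded, scope-checked walk can be sketched as a breadth-first traversal toward ancestors. This is an illustration under stated assumptions rather than the shipped code: `walkAncestors`, `parentsOf`, and `canView` are hypothetical names, the edge lookup is passed in as a plain function instead of a database query, the start node counts toward the sixteen-node budget, and nodes outside the viewer's scope are neither returned nor expanded.

```typescript
interface NodeRef { type: string; id: string }
interface Edge {
  parentType: string; parentId: string
  childType: string; childId: string
  relation: string
}

const MAX_NODES = 16 // budget from the text; assumed to include the start node
const MAX_HOPS = 6

function walkAncestors(
  start: NodeRef,
  parentsOf: (node: NodeRef) => Edge[],  // stand-in for the indexed edge query
  canView: (node: NodeRef) => boolean,   // stand-in for the viewer-scope check
): NodeRef[] {
  const key = (n: NodeRef) => `${n.type}:${n.id}`
  const seen = new Set<string>([key(start)])
  const visible: NodeRef[] = []
  let frontier: NodeRef[] = [start]

  for (let hop = 0; hop < MAX_HOPS && frontier.length > 0; hop++) {
    const next: NodeRef[] = []
    for (const node of frontier) {
      for (const edge of parentsOf(node)) {
        const parent: NodeRef = { type: edge.parentType, id: edge.parentId }
        if (seen.has(key(parent))) continue            // cycle or duplicate path
        if (seen.size >= MAX_NODES) return visible     // node budget exhausted
        seen.add(key(parent))
        if (!canView(parent)) continue                 // out of scope: hide, do not expand
        visible.push(parent)
        next.push(parent)
      }
    }
    frontier = next
  }
  return visible
}
```

The `seen` set does double duty: it breaks cycles and it is the node counter, so a malformed or hostile graph cannot make one read cost more than sixteen node visits spread over six rounds of lookups.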
The LLM still has a job. It goes last, on a short leash, and it narrates a structured result it is not permitted to expand. When the answer is "every version of this track," there is a SQL query and a card, and no embedding within a mile of it. That, I think, is the entire point.
The honest part
I will say the unflattering thing plainly, because a post that only flatters is not a post. The graph is the part of Casset I have built the least. I shipped the beautiful surface first and the defensible record last, which is exactly backwards, and the edge table sat empty in production for as long as production has existed. The code above is not a victory lap. It is me closing the distance between a design doc and a database, one relation at a time, with the old RAG path still running underneath so nothing breaks while the graph fills in.
The direction is finally right. The thing that is hard to copy is the thing starting to do the work. The embeddings can stay a commodity. The graph is ours, and now, slowly, it is being written down.