From Vector Search to Graph Reasoning
ChromaDB, HippoRAG, and Bayesian inference in one retrieval system.
Vector search finds things that sound similar. Graph search finds things that are connected. Bayesian inference finds things that are probable. I built a system that uses all three.
The standard RAG architecture — embed documents, store vectors, retrieve by cosine similarity — works fine for simple question-answering. But the moment you need multi-hop reasoning, temporal awareness, or probabilistic judgment, pure vector retrieval falls apart.
I spent months building a triple-layer retrieval system that weights 40% vector, 40% graph, and 20% Bayesian.
Where Vector Search Breaks
Vector search operates on one principle: semantic similarity. It fails in three scenarios.
First, multi-hop reasoning. “What decisions led to the Q3 architecture change?” requires connecting a chain of memories that might share zero vocabulary.
Second, temporal reasoning. “What changed between our January and March risk assessments?” requires understanding sequence, not similarity.
Third, probabilistic inference. “Given our current deployment pattern, what is likely to fail next?” requires combining evidence from multiple observations into a probability estimate.
Layer One: ChromaDB (40% Weight)
The vector layer handles bread-and-butter retrieval. I over-retrieve at this layer — top-20 instead of top-5 — and let downstream layers filter. Every vector entry carries the WHO/WHEN/PROJECT/WHY metadata from my tagging protocol, enabling pre-filtering by project scope and time window.
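The pre-filter step can be sketched in plain Python. This is a minimal stand-in for what the vector layer does, assuming illustrative field names (`project`, `when`, `similarity`) for the WHO/WHEN/PROJECT/WHY metadata; the real layer would push the same filters into ChromaDB's query rather than scan a list.

```python
def prefilter(entries, project, start, end, top_k=20):
    """Pre-filter candidates by PROJECT scope and WHEN window, then
    over-retrieve: keep the top_k (20, not 5) by similarity and let the
    graph and Bayesian layers narrow the set downstream. Dates are ISO
    strings here so they compare lexicographically."""
    in_scope = [
        e for e in entries
        if e["project"] == project and start <= e["when"] <= end
    ]
    return sorted(in_scope, key=lambda e: e["similarity"], reverse=True)[:top_k]

entries = [
    {"project": "alpha", "when": "2025-02-01", "similarity": 0.9},
    {"project": "beta",  "when": "2025-02-01", "similarity": 0.95},  # wrong project
    {"project": "alpha", "when": "2024-12-01", "similarity": 0.8},   # outside window
    {"project": "alpha", "when": "2025-03-01", "similarity": 0.7},
]
hits = prefilter(entries, "alpha", "2025-01-01", "2025-03-31")
```

In production the scope filter belongs in the store itself (ChromaDB's `query` accepts a `where` metadata clause), so you never pay to pull out-of-scope vectors at all.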
Layer Two: HippoRAG over NetworkX (40% Weight)
HippoRAG models memory as a knowledge graph where entities are nodes and relationships are edges. When a new memory enters, it gets parsed for entities and relationships.
“Agent-7 recommended migrating the auth service to OAuth2 for Project Alpha” creates nodes for Agent-7, auth-service, OAuth2, and Project-Alpha, with typed edges connecting them.
Multi-hop queries become graph traversals. “What led to the auth migration?” walks backward through incoming edges. Vector search cannot do this reliably.
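A toy version of that traversal, using NetworkX directly. The nodes and edge types below are invented for illustration, not the system's actual schema; the point is that "what led to X?" is just the set of nodes with a path into X.

```python
import networkx as nx

# Illustrative provenance graph; labels are assumptions, not the real schema.
G = nx.DiGraph()
G.add_edge("Agent-7", "auth-migration", type="RECOMMENDED")
G.add_edge("security-audit", "auth-migration", type="TRIGGERED")
G.add_edge("Q2-pentest", "security-audit", type="PRODUCED")
G.add_edge("auth-migration", "OAuth2", type="TARGETS")
G.add_edge("auth-migration", "Project-Alpha", type="SCOPED_TO")

# "What led to the auth migration?" — every node with a path INTO it.
causes = nx.ancestors(G, "auth-migration")
# One-hop version: just the direct incoming edges.
direct = set(G.predecessors("auth-migration"))
```

Note that Q2-pentest shows up in `causes` even though it never touches auth-migration directly — exactly the two-hop connection vector similarity misses when the entries share no vocabulary.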
Layer Three: Bayesian Inference via pgmpy (20% Weight)
The Bayesian layer maintains a directed acyclic graph of causal relationships learned from DECISION and OBSERVATION entries. When enough observations accumulate around a pattern, the network encodes it as a conditional probability.
I weight this layer at 20% because Bayesian inference requires substantial data. I prune nodes with fewer than five observations and edges below 0.3 probability.
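Here is the pruning rule as a stdlib sketch: estimate P(effect | cause) from co-occurrence counts and drop anything under the five-observation and 0.3-probability floors. This stands in for the pgmpy network, which learns proper CPDs over a DAG rather than flat pair frequencies.

```python
from collections import Counter

MIN_OBS, MIN_PROB = 5, 0.3  # pruning thresholds from the text

def learn_edges(observations):
    """Estimate P(effect | cause) from (cause, effect) observation pairs,
    pruning causes seen fewer than MIN_OBS times and edges whose
    conditional probability falls below MIN_PROB."""
    cause_counts = Counter(c for c, _ in observations)
    pair_counts = Counter(observations)
    edges = {}
    for (cause, effect), n in pair_counts.items():
        if cause_counts[cause] < MIN_OBS:
            continue  # too little data to trust: prune the node
        p = n / cause_counts[cause]
        if p >= MIN_PROB:
            edges[(cause, effect)] = p
    return edges

obs = (
    [("payment-change", "compliance-review")] * 7
    + [("payment-change", "no-review")] * 3
    + [("cache-tweak", "outage")] * 2       # only 2 observations: pruned
)
edges = learn_edges(obs)
```

With seven reviews out of ten payment changes, the surviving edge carries a 0.7 conditional probability, while the two-observation cache pattern never makes it into the network.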
The Hybrid Scoring Formula
Each candidate's final score is the weighted sum of its three layer scores: 0.4 × vector similarity + 0.4 × graph relevance + 0.2 × Bayesian probability. Results below 0.35 get dropped, then deduplicated and re-ranked.
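The whole scoring-cutoff-dedupe-rank sequence fits in one function. A sketch under the 40/40/20 weights and 0.35 cutoff stated above; field names are illustrative, and a layer that returned nothing for a candidate contributes zero.

```python
import hashlib

W_VECTOR, W_GRAPH, W_BAYES, CUTOFF = 0.4, 0.4, 0.2, 0.35

def hybrid_rank(candidates):
    """Score each candidate with the 40/40/20 weighted sum, drop anything
    below the cutoff, deduplicate by content hash keeping the
    highest-scoring copy, and return the survivors sorted descending."""
    best = {}
    for c in candidates:
        score = (W_VECTOR * c.get("vector", 0.0)
                 + W_GRAPH * c.get("graph", 0.0)
                 + W_BAYES * c.get("bayes", 0.0))
        if score < CUTOFF:
            continue
        key = hashlib.sha256(c["content"].encode()).hexdigest()
        if key not in best or score > best[key][0]:
            best[key] = (score, c)
    return [c for _, c in sorted(best.values(), key=lambda t: t[0], reverse=True)]

ranked = hybrid_rank([
    {"content": "A", "vector": 0.9, "graph": 0.8, "bayes": 0.5},  # 0.78
    {"content": "B", "vector": 0.2},                               # 0.08: cut
    {"content": "A", "vector": 0.5, "graph": 0.5},                 # dup, lower score
    {"content": "C", "graph": 1.0},                                # 0.40
])
```

Content-hash dedupe matters here because the same memory can surface from two or even all three layers; the highest-scoring copy is the one that survives.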
The Five-Step Nexus SOP
RECALL: Hit all three layers in parallel. 200-400ms typical.
FILTER: Apply WHO/WHEN/PROJECT/WHY metadata filters. Cuts results 50-70%.
DEDUPE: Content-hash deduplication. Highest score survives.
RANK: Apply hybrid scoring. Sort descending.
COMPRESS: Trim to target context window by mode.
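The RECALL step is the only one that needs concurrency; a minimal sketch using a thread pool, with the three retrievers stubbed as plain callables (the real ones wrap ChromaDB, NetworkX, and pgmpy). The later FILTER/DEDUPE/RANK/COMPRESS steps then run over the flattened result list.

```python
from concurrent.futures import ThreadPoolExecutor

def recall(query, layers):
    """Step 1 of the SOP: fan the query out to every retrieval layer in
    parallel and flatten the hits. `layers` is any list of callables
    taking a query string and returning a list of hits."""
    with ThreadPoolExecutor(max_workers=len(layers)) as pool:
        futures = [pool.submit(layer, query) for layer in layers]
        return [hit for f in futures for hit in f.result()]

# Stub layers standing in for the vector, graph, and Bayesian retrievers.
layers = [
    lambda q: [f"v:{q}"],
    lambda q: [f"g:{q}"],
    lambda q: [f"b:{q}"],
]
hits = recall("auth migration", layers)
```

Threads are enough here because each layer is I/O- or C-extension-bound; collecting `f.result()` in submission order keeps the merged list deterministic.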
What This Actually Buys You
An agent asks: “What security concerns were raised about the payment service refactor?”
Vector search returns three documents: two relevant, one from the wrong project.
Graph search traverses from payment-service through CONCERN-typed edges, finding four entries including one about input validation gaps that never mentions “security” explicitly.
Bayesian inference surfaces a PCI compliance checklist based on the 0.7 probability that payment changes trigger compliance reviews.
Final result: six entries, zero false positives, 340ms. Single-layer RAG would have returned three results, one wrong, and missed compliance entirely.
Upgrading from simple RAG? I can architect your retrieval system.