The V3 Architecture
Six retrieval channels. Three mathematical layers. One principle: your data, your machine, mathematically grounded.
Six-Channel Hybrid Retrieval
No single retrieval method handles every query type. V3.3 runs six channels in parallel, including a query-completion channel that infers your full intent from a partial query, and fuses the results for maximum recall.
Semantic Channel
Fisher-Rao weighted embedding similarity. Models each memory as a probability distribution — not a flat vector. Graduated ramp from cosine to information-geometric distance over the first 10 accesses.
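The graduated ramp can be sketched as follows. This is a minimal illustration, not the shipped implementation: it uses the closed-form Fisher-Rao distance between univariate Gaussians, aggregated over dimensions for the diagonal case, and the function names, the exponential distance-to-similarity mapping, and the linear ramp are assumptions.

```python
import numpy as np

def fisher_rao_diag(mu1, sig1, mu2, sig2):
    """Fisher-Rao distance between diagonal Gaussians, aggregated
    from the per-dimension closed form for univariate normals."""
    num = (mu1 - mu2) ** 2 + 2.0 * (sig1 - sig2) ** 2
    den = (mu1 - mu2) ** 2 + 2.0 * (sig1 + sig2) ** 2
    delta = np.sqrt(num / np.maximum(den, 1e-12))
    per_dim = 2.0 * np.sqrt(2.0) * np.arctanh(np.clip(delta, 0.0, 1.0 - 1e-9))
    return float(np.sqrt(np.sum(per_dim ** 2)))

def blended_score(q_mu, q_sig, m_mu, m_sig, n_accesses, ramp=10):
    """Graduated ramp: pure cosine at 0 accesses, pure
    information-geometric similarity after `ramp` accesses."""
    cos = float(q_mu @ m_mu / (np.linalg.norm(q_mu) * np.linalg.norm(m_mu)))
    w = min(n_accesses / ramp, 1.0)
    fr_sim = np.exp(-fisher_rao_diag(q_mu, q_sig, m_mu, m_sig))  # distance -> similarity
    return (1.0 - w) * cos + w * fr_sim
```

The key property: a memory with tight (low-variance) dimensions sits farther from an uncertain one than cosine alone would indicate, so confidence shapes ranking once enough access statistics exist.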
BM25 Channel
Classical keyword matching with persisted tokens. Handles exact names, rare terms, and technical identifiers that semantic similarity misses.
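The scoring this channel relies on is standard. A self-contained sketch of BM25 over pre-tokenized documents (parameter defaults k1=1.5, b=0.75 are common conventions, not values taken from the product):

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query with classic BM25.
    `docs` is a list of token lists (e.g. the persisted tokens)."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```

Because matching is on exact tokens, identifiers like `H1` or a rare surname rank highly even when their embeddings are uninformative.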
Entity Graph Channel
Spreading activation across the knowledge graph. 3-hop traversal with 0.7 decay. Finds relational connections between people, places, and concepts.
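The traversal with the stated parameters (3 hops, 0.7 decay) can be sketched directly; the adjacency-map representation and function name are illustrative assumptions.

```python
from collections import defaultdict

def spreading_activation(graph, seeds, hops=3, decay=0.7):
    """Propagate activation from seed entities across an adjacency map.
    graph: {node: [neighbor, ...]}; returns {node: activation level}."""
    activation = defaultdict(float)
    frontier = {s: 1.0 for s in seeds}
    activation.update(frontier)
    for _ in range(hops):
        nxt = defaultdict(float)
        for node, a in frontier.items():
            for nb in graph.get(node, []):
                nxt[nb] += a * decay          # each hop attenuates by the decay factor
        for node, a in nxt.items():
            activation[node] = max(activation[node], a)
        frontier = nxt
    return dict(activation)
```

A node three hops out receives at most 0.7³ ≈ 0.34 activation, so distant entities surface only when several paths reinforce them.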
Temporal Channel
Date-aware retrieval with a 3-date model: authored, valid-from, valid-until. Handles questions like 'What was decided last Tuesday?'
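The 3-date model amounts to a validity-window check. A minimal sketch (class and field names are assumptions, though the three dates mirror the model described above):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Memory:
    text: str
    authored: date                       # when the memory was written down
    valid_from: Optional[date] = None    # when the fact starts holding
    valid_until: Optional[date] = None   # when it stops holding (None = still true)

def valid_on(mem: Memory, day: date) -> bool:
    """A fact matches a dated query if `day` falls inside its validity window."""
    start = mem.valid_from or mem.authored
    return start <= day and (mem.valid_until is None or day <= mem.valid_until)
```

Separating authored from valid-from is what lets "What was decided last Tuesday?" match a note written on Wednesday about Tuesday's decision.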
Retrieval Pipeline
Query → Strategy Classification → 4 Parallel Channels
→ Weighted RRF Fusion (k=60)
→ Scene Expansion (pull all facts from matched scenes)
→ Bridge Discovery (multi-hop: Steiner tree + spreading activation)
→ Cross-Encoder Reranking (energy-weighted blending)
→ Top-K Results with per-channel scores
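The fusion step above is weighted reciprocal rank fusion. A compact sketch with the stated k=60 (the per-channel weight values are an assumption; the source only says the fusion is weighted):

```python
def weighted_rrf(rankings, weights, k=60):
    """rankings: {channel: [doc_id, ...] best-first};
    weights: {channel: float}. Fused score per doc:
    sum over channels of w_c / (k + rank_c(doc))."""
    scores = {}
    for ch, ranked in rankings.items():
        w = weights.get(ch, 1.0)
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only rank positions, never raw scores, so channels with incomparable score scales (cosine similarity vs. BM25) fuse cleanly.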
Three Novel Contributions
Each technique addresses a fundamental limitation in current memory systems. To our knowledge, none have been applied to agent memory before.
Fisher-Rao Geometry
Retrieval
The natural metric on statistical manifolds. Each memory embedding is modeled as a diagonal Gaussian with learned mean and variance. High-confidence memories score differently from uncertain ones — retrieval improves with use.
Sheaf Cohomology
Consistency
Algebraic topology for contradiction detection. The knowledge graph is modeled as a cellular sheaf. Computing H¹(G,F) reveals global inconsistencies that pairwise checking cannot detect — even when every local pair looks consistent.
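A toy version of this idea, under simplifying assumptions: take difference constraints x_j − x_i = b on the edges of a graph (a very simple sheaf of real stalks). Any component of the edge data b outside the image of the coboundary map d⁰ is an H¹ obstruction — a global inconsistency that no single edge reveals, since each constraint is satisfiable on its own.

```python
import numpy as np

def h1_obstruction(n_nodes, edges):
    """edges: list of (i, j, b) encoding constraints x_j - x_i = b.
    Returns the least-squares residual of d0 @ x = b; a nonzero
    residual means b has a component outside im(d0), i.e. the
    constraints are globally inconsistent."""
    d0 = np.zeros((len(edges), n_nodes))
    b = np.zeros(len(edges))
    for row, (i, j, val) in enumerate(edges):
        d0[row, i], d0[row, j], b[row] = -1.0, 1.0, val
    x, *_ = np.linalg.lstsq(d0, b, rcond=None)
    return float(np.linalg.norm(d0 @ x - b))

# Each pairwise constraint is fine alone, but the cycle sum 1+1+1 != 0,
# so the three "facts" cannot hold simultaneously.
res = h1_obstruction(3, [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)])
```

The actual system uses richer stalks and restriction maps than scalar differences, but the mechanism is the same: inconsistency lives in the cokernel of the coboundary map, not in any pairwise check.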
Langevin Dynamics
Lifecycle
Memory lifecycle evolves via stochastic gradient flow on the Poincaré ball. The potential encodes access frequency, trust, and recency. Provable convergence to the stationary distribution — no hardcoded thresholds.
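A rough sketch of one Langevin step on the Poincaré ball, under loud assumptions: this rescales the Euclidean gradient and noise by the ball's conformal factor and retracts back inside the unit ball, which approximates rather than exactly reproduces Riemannian Langevin dynamics. The step size, retraction, and function names are all illustrative; consult the V3 paper for the exact update.

```python
import numpy as np

def langevin_step(x, grad_U, eta=0.01, rng=None, eps=1e-5):
    """One approximate Langevin step on the Poincare ball:
    x <- x - eta * grad_R U(x) + sqrt(2*eta) * noise, where the
    Riemannian gradient and noise use the conformal factor
    lam(x) = 2 / (1 - |x|^2). grad_U returns the Euclidean gradient."""
    rng = rng or np.random.default_rng()
    lam = 2.0 / (1.0 - np.dot(x, x))
    drift = -eta * grad_U(x) / lam**2                      # metric-rescaled gradient
    noise = np.sqrt(2.0 * eta) * rng.standard_normal(x.shape) / lam
    x_new = x + drift + noise
    norm = np.linalg.norm(x_new)
    if norm >= 1.0:                                        # retract strictly inside the ball
        x_new = x_new / norm * (1.0 - eps)
    return x_new
```

The point of the stochastic term is that memories never hit a hard cutoff: promotion and decay follow the potential's stationary distribution instead of a threshold.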
Full mathematical treatment with proofs and theorems in the V3 paper.
Read the Paper
11-Step Ingestion Pipeline
Every memory is processed through structured encoding before storage. This transforms raw text into a rich, queryable knowledge structure.
Metadata extraction — timestamps, source, importance
Entity resolution — canonical names with alias tracking
Fact extraction — atomic, typed facts (world / experience / opinion / temporal)
Knowledge graph construction — entities as nodes, relationships as edges
Temporal parsing — 3-date model (authored, valid-from, valid-until)
Emotional signal extraction — sentiment and emotional context
Scene clustering — group facts by temporal-semantic coherence
Observation building — structured entity profiles
Foresight generation — anticipatory indexing for future queries
Entropy gating — information-theoretic filtering (low-entropy = skip)
Compression and storage — write to 21-table SQLite schema
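Step 10's entropy gate can be sketched in a few lines. This is a minimal stand-in: the real system presumably gates on a richer information measure, and the threshold here is an arbitrary illustrative value.

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy (bits per token) of a token sequence."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def passes_entropy_gate(text, threshold=1.5):
    """Information-theoretic filter: skip low-entropy (repetitive,
    near-content-free) text before it reaches storage."""
    tokens = text.lower().split()
    if not tokens:
        return False
    return shannon_entropy(tokens) >= threshold
```

Filler like "ok ok ok" carries near-zero entropy and is skipped; a sentence with distinct entities clears the gate easily.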
You Choose the Privacy-Accuracy Tradeoff
Local Guardian
Zero cloud calls. All processing on your machine. EU AI Act compliant by architecture — data never leaves your device.
LoCoMo (data stays local)
Smart Local
The Local Guardian mode plus a local LLM via Ollama. Answer synthesis stays on your machine. Still fully private — nothing sent to any cloud.
Full Power
Cloud LLM for maximum accuracy. Cross-encoder reranking and agentic retrieval with multi-round refinement.
LoCoMo (full power)
The Full Implementation is Open Source
Every algorithm, every test, every benchmark — available under AGPL v3.