The Investigator -- Spreading Activation at Scale

The demo searches 1,200 city case files that contain 5 hidden investigation chains. Each chain links 4 cases from different domains (building inspection, health, shipping, environmental) through a shared location or entity. Flat retrieval finds only the cases that directly match the query, 1.6 of 4 on average. Entity-linked spreading activation recovers the full chain in four of the five investigations.

What It Demonstrates

  • Spreading activation with dual-path expansion (FAISS + entity links)
  • Entity linking via spaCy NER
  • Cross-domain association that pure embedding similarity cannot capture
  • Scale -- 1,200 memories in the FAISS index
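The dual-path expansion listed above can be sketched in a few lines. This is a minimal illustration, not the demo's actual code: a brute-force cosine search stands in for the FAISS index, the entity sets are hand-written where the demo would use spaCy NER output, and the names `spread`, `decay`, `sim_floor`, and the toy case ids are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def spread(query_vec, memories, top_k=1, hops=2, decay=0.5, sim_floor=0.8):
    """Return {memory_id: activation} after dual-path spreading."""
    # Inverted index: entity -> ids of the memories that mention it.
    by_entity = defaultdict(set)
    for mid, mem in memories.items():
        for ent in mem["entities"]:
            by_entity[ent].add(mid)

    # Seed step (this alone is flat retrieval): top_k nearest to the query.
    ranked = sorted(
        memories,
        key=lambda m: cosine(query_vec, memories[m]["vec"]),
        reverse=True,
    )
    activation = {mid: 1.0 for mid in ranked[:top_k]}
    frontier = set(activation)

    for _ in range(hops):
        next_frontier = set()
        for mid in frontier:
            src = memories[mid]
            # Path 1: embedding neighbours (stand-in for a FAISS lookup).
            neighbours = {
                other for other in memories
                if other != mid
                and cosine(src["vec"], memories[other]["vec"]) >= sim_floor
            }
            # Path 2: memories that share at least one entity.
            for ent in src["entities"]:
                neighbours |= by_entity[ent] - {mid}
            # Newly reached memories get a decayed share of the activation.
            for other in neighbours:
                if other not in activation:
                    activation[other] = activation[mid] * decay
                    next_frontier.add(other)
        frontier = next_frontier
    return activation


# Toy data: four cases share "Industrial Way" but have orthogonal embeddings.
memories = {
    "inspection": {"vec": [1, 0, 0, 0], "entities": {"Industrial Way"}},
    "health": {"vec": [0, 1, 0, 0], "entities": {"Industrial Way"}},
    "shipping": {"vec": [0, 0, 1, 0], "entities": {"Industrial Way"}},
    "environment": {"vec": [0, 0, 0, 1], "entities": {"Industrial Way"}},
    "parking": {"vec": [0.5, 0.5, 0.5, 0.5], "entities": {"Elm Street"}},
}
query = [1, 0, 0, 0]
flat = spread(query, memories, hops=0)  # seed only
result = spread(query, memories)
print(sorted(flat), sorted(result))
```

In the toy data the embeddings alone never link the four cases, so the embedding path finds nothing beyond the seed; only the shared entity pulls the other three in, and the unrelated "parking" case stays inactive.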

Results

| Chain | Flat Retrieval | With Spreading | Improvement |
|---|---|---|---|
| Thornfield (industrial contamination) | 1/4 | 4/4 | +3 |
| Ravenswood (financial fraud) | 2/4 | 4/4 | +2 |
| Westbrook (prescription ring) | 1/4 | 4/4 | +3 |
| Harborview (construction corruption) | 1/4 | 3/4 | +2 |
| Greenfield (data theft) | 3/4 | 4/4 | +1 |
| Average | 1.6/4 | 3.8/4 | +2.2 |

On average, entity-linked spreading activation recovered 3.8 of 4 cases per chain versus 1.6 for flat retrieval, about 2.4x as many.
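The averages and the 2.4x figure follow directly from the per-chain counts. A short check, with the counts copied from the results table (the variable names are illustrative only):

```python
# (flat, with_spreading) cases found out of 4, per chain, from the table.
results = {
    "Thornfield": (1, 4),
    "Ravenswood": (2, 4),
    "Westbrook": (1, 4),
    "Harborview": (1, 3),
    "Greenfield": (3, 4),
}

flat_total = sum(flat for flat, _ in results.values())
spread_total = sum(spreading for _, spreading in results.values())

flat_avg = flat_total / len(results)      # average cases found, flat
spread_avg = spread_total / len(results)  # average cases found, spreading
ratio = spread_total / flat_total         # same as spread_avg / flat_avg

print(f"flat {flat_avg:.1f}/4, spreading {spread_avg:.1f}/4, ratio {ratio:.3f}x")
```

The totals are 8 and 19 cases out of 20, so the ratio is 19/8 = 2.375, which rounds to the 2.4x quoted.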

[Figure: Spreading Activation Results]

Key Finding

Entity-linked spreading traverses "Industrial Way" across a warehouse inspection, a hospital cluster, shipping records, and environmental readings -- four completely different city departments that no single query could connect. The LLM synthesized a coherent investigation brief from the scattered evidence.

Architecture Lesson

The original 384D embedding model (all-MiniLM-L6-v2) could not connect cross-domain cases -- "warehouse inspection" and "hospital patients" scored only 0.18 similarity. Upgrading to 768D (all-mpnet-base-v2) raised this to 0.41, still too weak to link the cases on embeddings alone. Adding entity linking via spaCy NER connected all four cases through the shared "Industrial Way" entity.
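A comparison like the one above could be reproduced roughly as follows. This is a sketch under assumptions: it treats the reported scores as cosine similarity, which the text does not state, and `compare_models` is a hypothetical helper. It needs the `sentence-transformers` package and downloads both models on first use, so the scores it returns depend on the exact texts embedded and may not match the figures quoted here.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def compare_models(text_a, text_b,
                   model_names=("all-MiniLM-L6-v2", "all-mpnet-base-v2")):
    """Embed both texts with each model and return {model_name: similarity}."""
    # Imported lazily because sentence-transformers is a heavy dependency.
    from sentence_transformers import SentenceTransformer

    scores = {}
    for name in model_names:
        model = SentenceTransformer(name)  # fetches weights on first use
        vec_a, vec_b = model.encode([text_a, text_b])
        scores[name] = cosine(vec_a.tolist(), vec_b.tolist())
    return scores


if __name__ == "__main__":
    # Assumption: short phrases stand in for the full case-file text.
    for name, score in compare_models("warehouse inspection",
                                      "hospital patients").items():
        print(f"{name}: {score:.2f}")
```

Even when a larger model raises the score, a threshold-based neighbour search can still miss the link, which is why the entity path matters for cross-domain chains.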

Running

python -m demos.investigator --backend anthropic  # ~8 min, ~$1.50