Scoring Formula
Retrieval scoring combines multiple signals multiplicatively:
score = similarity * decay * importance * priming
Components
Similarity
Raw cosine similarity from the FAISS index between the query embedding and the memory embedding. Range: 0.0 to 1.0.
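As a rough illustration of the similarity term, the sketch below computes cosine similarity in plain Python. Clamping negative values to 0.0 is an assumption made here to match the documented range; the source does not say how that range is enforced. The function name `cosine_similarity` is hypothetical and is not part of FAISS.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, clamped to [0.0, 1.0]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    # Assumption: negative similarities are clamped to 0.0 so the result
    # stays in the documented 0.0 to 1.0 range.
    return max(0.0, min(1.0, dot / (norm_a * norm_b)))
```

With FAISS itself, cosine similarity is commonly obtained by L2-normalizing the vectors and searching an inner-product index.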
Decay
Temporal decay based on age and access frequency:
decay = 1.0 (if within grace period)
decay = e^(-lambda_eff * age) (after grace period)
decay = 1.0 (if access frequency >= lock threshold)
Where:
- age = seconds since last access (not creation)
- lambda_eff = lambda / (1 + freq_weight * access_frequency_per_week), so rehearsed memories decay slower
- Grace period default: 2 weeks (1,209,600 seconds)
- Lock threshold default: 0.5 accesses/week
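The decay rules above can be sketched as a single function. This is a minimal sketch, not the actual implementation: the source gives no defaults for `lambda` or `freq_weight`, so the caller must pass them (as `lam` and `freq_weight`). Treating an age exactly equal to the grace period as still "within" it is also an assumption.

```python
import math

GRACE_PERIOD_S = 1_209_600   # 2 weeks in seconds (documented default)
LOCK_THRESHOLD = 0.5         # accesses per week (documented default)

def decay(age_s, accesses_per_week, lam, freq_weight,
          grace_period_s=GRACE_PERIOD_S, lock_threshold=LOCK_THRESHOLD):
    """Return the temporal decay factor in (0.0, 1.0].

    age_s is seconds since the memory was last accessed, not created.
    lam and freq_weight have no documented defaults; they are
    caller-supplied assumptions in this sketch.
    """
    # Frequently accessed memories are locked and do not decay.
    if accesses_per_week >= lock_threshold:
        return 1.0
    # Recently accessed memories are inside the grace period.
    if age_s <= grace_period_s:
        return 1.0
    # Rehearsal lowers the effective decay rate.
    lambda_eff = lam / (1.0 + freq_weight * accesses_per_week)
    return math.exp(-lambda_eff * age_s)
```

For example, with a hypothetical `lam=1e-7` and `freq_weight=1.0`, a memory last accessed 10,000,000 seconds ago decays less when it has been accessed 0.25 times per week than when it has never been re-accessed.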
Importance
Auto-scored at storage time:
| Pattern | Score |
|---|---|
| Corrections ("actually", "I was wrong") | 2.0 |
| Instructions ("always", "never", "remember") | 2.0 |
| Novel information (new facts, preferences) | 1.5 |
| Normal conversation | 1.0 |
| Routine exchanges ("hello", "thanks") | 0.5 |
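One way to picture the table is a small keyword-based scorer. This is only a sketch. The source does not describe how patterns are detected, so the regular expressions, the `is_novel` flag (standing in for whatever novelty detection the system uses), and the check order are all assumptions.

```python
import re

# Hypothetical keyword patterns built from the examples in the table.
CORRECTION = re.compile(r"\b(actually|i was wrong)\b", re.IGNORECASE)
INSTRUCTION = re.compile(r"\b(always|never|remember)\b", re.IGNORECASE)
# Routine only when the whole message is a greeting or a thank-you.
ROUTINE = re.compile(r"^\W*(hello|thanks)\W*$", re.IGNORECASE)

def importance(text, is_novel=False):
    """Return an importance score following the documented table.

    is_novel is supplied by the caller; novelty detection itself is
    outside the scope of this sketch.
    """
    if CORRECTION.search(text) or INSTRUCTION.search(text):
        return 2.0
    if is_novel:
        return 1.5
    if ROUTINE.match(text):
        return 0.5
    return 1.0  # normal conversation
```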
Priming Boost
Turn-decaying boost for recently activated memories:
Default: boost_strength = 0.3, decay_rate = 0.5. This gives a 1.3x boost at activation, decaying to 1.0 over ~10 turns.
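The exact boost formula is not shown here, so the sketch below assumes an exponential form, `1 + boost_strength * e^(-decay_rate * turns)`. That form is a guess, chosen because it reproduces the stated behavior: 1.3x at activation, and within about 0.002 of 1.0 after 10 turns. The function name and the `turns_since_activation` parameter are hypothetical.

```python
import math

def priming_boost(turns_since_activation, boost_strength=0.3, decay_rate=0.5):
    """Return a multiplier >= 1.0 that fades as turns pass.

    Assumption: the boost decays exponentially per turn. The source gives
    only the defaults and the resulting behavior, not the formula.
    """
    if turns_since_activation < 0:
        raise ValueError("turns_since_activation must be non-negative")
    return 1.0 + boost_strength * math.exp(-decay_rate * turns_since_activation)
```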
Why Multiplicative?
CMM uses multiplicative scoring (sim * decay * importance * priming), not additive. This means a zero or near-zero value in any factor effectively eliminates the memory, no matter how high the other factors are.
This is a deliberate design choice: relevance is a prerequisite, not a bonus. An important but irrelevant memory should not surface. A relevant but fully decayed memory should not surface. This better models human retrieval than additive approaches where high importance can compensate for low relevance.
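To make the contrast concrete, the sketch below scores the same memory both ways. The additive variant (a plain sum) is a hypothetical baseline for comparison only, and the input values are invented for illustration.

```python
def multiplicative_score(sim, decay, importance, priming):
    """Documented combination: every factor scales the others."""
    return sim * decay * importance * priming

def additive_score(sim, decay, importance, priming):
    """Hypothetical additive baseline, shown only for comparison."""
    return sim + decay + importance + priming

# An important but nearly irrelevant memory (illustrative values).
irrelevant = dict(sim=0.05, decay=1.0, importance=2.0, priming=1.0)
# A relevant memory of ordinary importance (illustrative values).
relevant = dict(sim=0.8, decay=0.9, importance=1.0, priming=1.0)

# Multiplicative scoring ranks the relevant memory higher...
mult_prefers_relevant = (
    multiplicative_score(**relevant) > multiplicative_score(**irrelevant)
)
# ...while the additive baseline lets importance outweigh low relevance.
add_prefers_irrelevant = (
    additive_score(**irrelevant) > additive_score(**relevant)
)
```

Under multiplication, the low similarity of the first memory caps its score no matter how important it is. Under addition, its importance of 2.0 lifts it past the relevant memory.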
Threshold Filtering
After scoring, only memories above a configurable threshold (default 0.3) are surfaced to the LLM. This prevents low-confidence noise from entering the context.
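The filtering step can be sketched as follows. The `(memory_id, score)` tuple shape and the best-first sort are assumptions for illustration. "Above" is read here as strictly greater than the threshold.

```python
DEFAULT_THRESHOLD = 0.3  # documented default

def filter_by_threshold(scored, threshold=DEFAULT_THRESHOLD):
    """Keep (memory_id, score) pairs scoring above threshold, best first.

    Assumption: results are sorted by descending score; the source only
    specifies the cutoff, not the ordering.
    """
    kept = [(memory_id, score) for memory_id, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

For instance, given scores of 0.72, 0.30, and 0.10, only the 0.72 memory would be surfaced under the default threshold.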
Spreading Activation (Post-Scoring)
After initial retrieval and scoring, spreading activation expands the result set:
- Each seed result spawns neighbor searches via FAISS and entity links
- Neighbor scores decay per hop: neighbor_score = parent_score * spread_factor
- Default spread_factor = 0.5, so 1-hop neighbors start at 50% of the parent score
- Results are merged and de-duplicated with the seed results
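The spreading step above can be sketched as a breadth-first expansion. Several details are assumptions, not documented behavior: the `neighbors_of` callback stands in for the FAISS and entity-link lookups, `max_hops` is a hypothetical parameter, and duplicates are resolved by keeping the highest score.

```python
def spread_activation(seeds, neighbors_of, spread_factor=0.5, max_hops=1):
    """Expand seed results to their neighbors with per-hop score decay.

    seeds: dict mapping memory_id -> score from initial retrieval.
    neighbors_of: callable returning an iterable of neighbor ids
                  (hypothetical stand-in for FAISS / entity-link lookups).
    Returns a dict of memory_id -> score with duplicates merged.
    """
    results = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_hops):
        next_frontier = {}
        for parent_id, parent_score in frontier.items():
            for neighbor_id in neighbors_of(parent_id):
                score = parent_score * spread_factor
                # Assumption: on a duplicate, keep the higher score.
                if score > results.get(neighbor_id, 0.0):
                    results[neighbor_id] = score
                    next_frontier[neighbor_id] = score
        frontier = next_frontier
    return results
```

Because a neighbor is re-expanded only when its score improves, the loop stays bounded even on graphs with cycles.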