
Integrations Overview

How to connect the cognitive memory system to your LLM setup.

Autoassociative vs. Manual Memory

The key distinction:

  • Autoassociative (fully automatic): The memory system passively monitors all conversation turns -- both user messages and LLM responses -- and automatically surfaces relevant memories without anyone deciding to "look something up." This is the core thesis of the project. Every integration path supports autoassociative memory except MCP tools.

  • MCP tools (semi-automatic): The LLM has access to memory_recall and memory_store tools and decides when to use them. This is the only option for environments that don't support hooks or middleware wrapping (e.g., Claude Desktop without hooks).
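The autoassociative flow above can be sketched as a toy wrapper. Everything here (`MemoryStore`, `recall`, the word-overlap scoring) is illustrative only, not the library's actual API; it just shows the shape of "store every turn, surface memories before every call, no explicit tool use":

```python
# Illustrative sketch -- class and method names are hypothetical,
# not the library's actual API.
class MemoryStore:
    def __init__(self):
        self.turns = []

    def store(self, role, text):
        # Every turn is stored automatically; nobody "decides" to save.
        self.turns.append((role, text))

    def recall(self, query, limit=3):
        # Toy relevance: stored turns that share words with the query.
        words = set(query.lower().split())
        scored = [(len(words & set(text.lower().split())), role, text)
                  for role, text in self.turns]
        return [text for score, role, text in sorted(scored, reverse=True)
                if score > 0][:limit]


def autoassociative_chat(store, llm, user_msg):
    # 1. Passively recall relevant memories before the LLM sees the prompt.
    memories = store.recall(user_msg)
    # 2. Call the LLM with the memories supplied as extra context.
    reply = llm(user_msg, memories)
    # 3. Store both sides of the turn -- again without any tool call.
    store.store("user", user_msg)
    store.store("assistant", reply)
    return reply
```

The point of the sketch is where the memory operations live: in the wrapper around every turn, not in tools the LLM chooses to invoke.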

Which Integration Should I Use?

| I want to... | Use this | Autoassociative? |
| --- | --- | --- |
| Add memory to Claude Code | Claude Code Hooks | Yes |
| Add memory to Claude Desktop / Cursor | MCP Server | Semi (tool-based) |
| Build a Python app with memory | Python Middleware or Direct Library | Yes |
| Build a non-Python app with memory | HTTP Memory Server | Yes |
| Run a multi-agent team with shared memory | HTTP Memory Server (networked) | Yes |
| Just try it out quickly | Direct Python Library | Yes |

Integration Summary

| Integration | How | Autoassociative? | Language |
| --- | --- | --- | --- |
| HTTP Memory Server | REST API on localhost or network | Yes | Any |
| Claude Code Hooks | UserPromptSubmit + Stop hooks | Yes | Any |
| Python Middleware | Wraps OpenAI/Anthropic API calls | Yes | Python |
| MCP Server | memory_recall/store tools | Semi (tool-based) | Any MCP client |
| Direct Library | CognitiveMemoryPipeline API | Yes | Python |

Key Features Across All Integrations

  • Persistence: pipeline.save(dir) / pipeline.load(dir) -- memories survive restarts
  • Think-out-loud: capture LLM reasoning as THOUGHT-type memories via <thinking> tags
  • Multi-agent teams: agent_id/session_id tagging, scoped visibility, local-to-central merge, nightly consolidation
  • Context framing: recalled memories are clearly marked as "from memory, not current user input" with relevance scores and agent attribution
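The context-framing feature can be illustrated with a small helper. The exact wording, field names (`score`, `agent_id`, `gist`), and layout below are assumptions for illustration, not the library's actual output format:

```python
# Hypothetical framing helper -- field names and marker text are
# illustrative, not the library's actual output format.
def frame_memories(memories):
    """Mark recalled memories so the LLM cannot mistake them for user input."""
    lines = ["[Recalled from memory -- NOT part of the current user message]"]
    for m in memories:
        lines.append(
            f"- (relevance {m['score']:.2f}, agent {m['agent_id']}) {m['gist']}"
        )
    return "\n".join(lines)
```

For example, `frame_memories([{"score": 0.91, "agent_id": "researcher", "gist": "User prefers metric units"}])` yields a block that carries the relevance score and agent attribution alongside the explicit "from memory" marker.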

Gist Encoder Options

The gist encoder compresses conversation turns into summaries. Choose based on what you have available:

| Encoder | Requires | Best for |
| --- | --- | --- |
| OllamaGistEncoder | Local Ollama server | Privacy-first, free, no API key needed |
| OpenAIGistEncoder | OpenAI API key (or compatible API) | Broadest compatibility |
| AnthropicGistEncoder | Anthropic API key | Claude users |
| PassthroughGistEncoder | Nothing | Testing, minimal setup, embedding-only |
For example, constructing a pipeline with the local Ollama encoder:

```python
from cmm.pipeline.conversation import CognitiveMemoryPipeline
from cmm.encoding.ollama_gist_encoder import OllamaGistEncoder

pipeline = CognitiveMemoryPipeline(gist_encoder=OllamaGistEncoder())
```