Skip to content

HTTP Memory Server

A lightweight HTTP API that any programming language can call. It maintains the full cognitive memory pipeline (FAISS index, entity index, decay, priming, working memory) in-process.

Starting the Server

# Single user, no persistence
python -m integrations.claude-code.memory_server

# With persistence (survives restarts)
python -m integrations.claude-code.memory_server --data-dir ./memory_data

# Multi-agent team (network-accessible, auto-save every 5 min)
python -m integrations.claude-code.memory_server --host 0.0.0.0 --data-dir ./memory_data --auto-save 300

Endpoints

Method Endpoint Description
Core
POST /ingest Store a message: {"role": "user", "content": "...", "agent_id": "..."}
POST /recall Search memory: {"query": "...", "top_k": 5, "agent_id": "..."}
POST /ingest_and_recall Ingest + recall in one call (for hooks)
CRUD
GET /memory/{id} Get a single memory by ID
PUT /memory/{id} Update a memory's gist, tags, or importance
DELETE /memory/{id} Delete a single memory
DELETE /memories/agent/{agent_id} Delete all memories from a specific agent
GET /memories?agent_id=X&limit=50 List memories (filterable, newest first)
Management
POST /save Save memory to disk
POST /consolidate Run nightly consolidation
POST /contradictions Detect cross-agent contradictions
POST /merge Merge a local agent's memory store into central
POST /end_of_day Full nightly workflow: consolidate + maintain + save
GET /health Memory system health check
GET /stats Per-agent memory counts

All endpoints accept optional agent_id and session_id fields for multi-agent use.

Autoassociative Pattern

To get autoassociative memory, wrap your LLM calls:

1. User sends message
2. POST /ingest_and_recall with user message -> get recalled memories
3. Prepend recalled memories to LLM prompt
4. Send augmented prompt to LLM -> get response
5. POST /ingest with LLM response (role: "assistant")
6. Return response to user

This pattern works with any LLM in any language. If your code can make HTTP requests and call an LLM API, you have autoassociative memory.

Example Usage

# Store a memory
curl -X POST http://localhost:7832/ingest \
  -H "Content-Type: application/json" \
  -d '{"role": "user", "content": "I am allergic to peanuts."}'

# Recall memories
curl -X POST http://localhost:7832/recall \
  -H "Content-Type: application/json" \
  -d '{"query": "ordering food for the team"}'