# HTTP Memory Server

A lightweight HTTP API that any programming language can call. It maintains the full cognitive memory pipeline (FAISS index, entity index, decay, priming, working memory) in-process.
## Starting the Server

```bash
# Single user, no persistence
python -m integrations.claude-code.memory_server

# With persistence (survives restarts)
python -m integrations.claude-code.memory_server --data-dir ./memory_data

# Multi-agent team (network-accessible, auto-save every 5 min)
python -m integrations.claude-code.memory_server --host 0.0.0.0 --data-dir ./memory_data --auto-save 300
```
## Endpoints

| Method | Endpoint | Description |
|---|---|---|
| **Core** | | |
| POST | `/ingest` | Store a message: `{"role": "user", "content": "...", "agent_id": "..."}` |
| POST | `/recall` | Search memory: `{"query": "...", "top_k": 5, "agent_id": "..."}` |
| POST | `/ingest_and_recall` | Ingest + recall in one call (for hooks) |
| **CRUD** | | |
| GET | `/memory/{id}` | Get a single memory by ID |
| PUT | `/memory/{id}` | Update a memory's gist, tags, or importance |
| DELETE | `/memory/{id}` | Delete a single memory |
| DELETE | `/memories/agent/{agent_id}` | Delete all memories from a specific agent |
| GET | `/memories?agent_id=X&limit=50` | List memories (filterable, newest first) |
| **Management** | | |
| POST | `/save` | Save memory to disk |
| POST | `/consolidate` | Run nightly consolidation |
| POST | `/contradictions` | Detect cross-agent contradictions |
| POST | `/merge` | Merge a local agent's memory store into central |
| POST | `/end_of_day` | Full nightly workflow: consolidate + maintain + save |
| GET | `/health` | Memory system health check |
| GET | `/stats` | Per-agent memory counts |
All endpoints accept optional `agent_id` and `session_id` fields for multi-agent use.
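As a sketch of what that scoping looks like, the same request body works with or without the extra fields (the field names come from the table above; the example values are illustrative):

```python
import json

# A plain single-agent recall payload, as in the table above.
single = {"query": "deployment checklist", "top_k": 5}

# The same request scoped to one agent and session; both fields are optional.
scoped = {**single, "agent_id": "builder-1", "session_id": "sprint-42"}

body = json.dumps(scoped)  # send as the POST body with Content-Type: application/json
```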
## Autoassociative Pattern
To get autoassociative memory, wrap your LLM calls:
1. User sends message
2. POST /ingest_and_recall with user message -> get recalled memories
3. Prepend recalled memories to LLM prompt
4. Send augmented prompt to LLM -> get response
5. POST /ingest with LLM response (role: "assistant")
6. Return response to user
This pattern works with any LLM in any language. If your code can make HTTP requests and call an LLM API, you have autoassociative memory.
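The six steps above can be sketched in Python using only the standard library. The `/ingest_and_recall` and `/ingest` endpoints and the `agent_id` field come from the table above; the shape of the recall response (a `memories` list whose items carry a `gist` field) and the `call_llm()` stub are assumptions to adapt to your server version and LLM SDK:

```python
import json
import urllib.request

SERVER = "http://localhost:7832"  # default port used in the curl examples below

def post(path: str, payload: dict) -> dict:
    """POST a JSON body to the memory server and decode the JSON reply."""
    req = urllib.request.Request(
        SERVER + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def build_prompt(user_message: str, memories: list) -> str:
    # Step 3: prepend recalled memories as context above the user's message.
    context = "\n".join(f"- {m.get('gist', '')}" for m in memories)
    return f"Relevant memories:\n{context}\n\nUser: {user_message}"

def call_llm(prompt: str) -> str:
    # Step 4 placeholder: swap in your actual LLM SDK call here.
    raise NotImplementedError

def chat(user_message: str, agent_id: str = "demo") -> str:
    # Step 2: ingest the user message and recall related memories in one call.
    recalled = post("/ingest_and_recall",
                    {"role": "user", "content": user_message, "agent_id": agent_id})
    prompt = build_prompt(user_message, recalled.get("memories", []))
    # Step 4: send the augmented prompt to the model.
    response = call_llm(prompt)
    # Step 5: store the assistant's reply so future recalls can surface it.
    post("/ingest", {"role": "assistant", "content": response, "agent_id": agent_id})
    # Step 6: return the response to the user.
    return response
```

The same shape ports directly to any language with an HTTP client; only `call_llm()` and the prompt template are specific to your stack.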
## Example Usage

```bash
# Store a memory
curl -X POST http://localhost:7832/ingest \
  -H "Content-Type: application/json" \
  -d '{"role": "user", "content": "I am allergic to peanuts."}'

# Recall memories
curl -X POST http://localhost:7832/recall \
  -H "Content-Type: application/json" \
  -d '{"query": "ordering food for the team"}'
```