# Gist Encoders API
Gist encoders compress conversation turns into short summaries with extracted tags. All encoders implement the GistEncoder abstract base class.
## GistEncoder (Base Class)

`cmm.encoding.gist_encoder.GistEncoder`

```python
from cmm.encoding.gist_encoder import GistEncoder

class GistEncoder(ABC):
    @abstractmethod
    def encode(
        self,
        turn: ConversationTurn,
        context: list[ConversationTurn] | None = None,
    ) -> Gist:
        """Compress a conversation turn into a gist.

        Args:
            turn: The conversation turn to encode.
            context: Optional preceding turns for context.

        Returns:
            A Gist with compressed text and extracted tags.
        """
```
## PassthroughGistEncoder

`cmm.encoding.gist_encoder.PassthroughGistEncoder`

Passes through the raw turn content as the gist. Extracts simple keyword tags. Useful as a baseline and for testing.

```python
from cmm.encoding.gist_encoder import PassthroughGistEncoder

encoder = PassthroughGistEncoder(max_length=512)
```

| Parameter | Default | Description |
|---|---|---|
| `max_length` | `512` | Truncate content beyond this length |
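Conceptually, the passthrough encoder does little more than truncate and tag. A minimal sketch of that behavior — not the actual cmm implementation, whose tag-extraction heuristic may differ — looks like this:

```python
import re
from collections import Counter

# Illustrative stopword list; the real encoder's keyword heuristic may differ.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "in", "it", "for"}

def passthrough_gist(content: str, max_length: int = 512, num_tags: int = 3):
    """Truncate content and extract the most frequent non-stopword tokens as tags."""
    text = content[:max_length]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    tags = [w for w, _ in Counter(words).most_common(num_tags)]
    return text, tags

text, tags = passthrough_gist("The cache stores gists. The cache evicts old gists.")
print(tags)  # 'cache' and 'gists' each appear twice, so both are tags
```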
## OllamaGistEncoder

`cmm.encoding.ollama_gist_encoder.OllamaGistEncoder`

Uses a local Ollama model for gist compression. Falls back to passthrough if the Ollama server is unavailable.

```python
from cmm.encoding.ollama_gist_encoder import OllamaGistEncoder

encoder = OllamaGistEncoder(
    model="mistral:7b-instruct-q4_K_M",
    base_url="http://localhost:11434",
    timeout=30.0,
    temperature=0.1,
)
```

| Parameter | Default | Description |
|---|---|---|
| `model` | `"mistral:7b-instruct-q4_K_M"` | Ollama model name |
| `base_url` | `"http://localhost:11434"` | Ollama server URL |
| `timeout` | `30.0` | Request timeout in seconds |
| `temperature` | `0.1` | LLM temperature |
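The fall-back-to-passthrough pattern can be sketched with the standard library against Ollama's `/api/generate` endpoint. This is an illustration of the pattern, not cmm's actual code — the encoder's prompt and error handling are internal to the library:

```python
import json
import urllib.request

def ollama_gist(text: str,
                model: str = "mistral:7b-instruct-q4_K_M",
                base_url: str = "http://localhost:11434",
                timeout: float = 30.0,
                max_length: int = 512) -> str:
    """Ask a local Ollama server for a summary; fall back to truncated
    passthrough if the server is unreachable or errors out."""
    payload = json.dumps({
        "model": model,
        "prompt": f"Summarize in one sentence: {text}",
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read())["response"]
    except OSError:
        # Server unavailable: degrade gracefully to passthrough behavior.
        return text[:max_length]

# With no server listening, this returns the (truncated) input unchanged.
print(ollama_gist("hello world", base_url="http://localhost:1", timeout=0.5))
```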
## OpenAIGistEncoder

`cmm.encoding.openai_gist_encoder.OpenAIGistEncoder`

Works with any OpenAI-compatible API: OpenAI, Together AI, Groq, Anyscale, vLLM, LM Studio.

```python
from cmm.encoding.openai_gist_encoder import OpenAIGistEncoder

# OpenAI
encoder = OpenAIGistEncoder(api_key="sk-...", model="gpt-4o-mini")

# Together AI
encoder = OpenAIGistEncoder(
    api_key="...",
    base_url="https://api.together.xyz/v1",
    model="meta-llama/Llama-3-8b-chat-hf",
)

# Local vLLM / LM Studio
encoder = OpenAIGistEncoder(
    base_url="http://localhost:8000/v1",
    model="local-model",
    api_key="not-needed",
)
```

| Parameter | Default | Description |
|---|---|---|
| `model` | `"gpt-4o-mini"` | Model name |
| `base_url` | `"https://api.openai.com/v1"` | API base URL |
| `api_key` | `None` | API key (falls back to the `OPENAI_API_KEY` env var) |
| `timeout` | `30.0` | Request timeout in seconds |
| `temperature` | `0.1` | LLM temperature |
## AnthropicGistEncoder

`cmm.encoding.anthropic_gist_encoder.AnthropicGistEncoder`

Uses the Anthropic Claude API.

```python
from cmm.encoding.anthropic_gist_encoder import AnthropicGistEncoder

encoder = AnthropicGistEncoder()  # uses ANTHROPIC_API_KEY env var
encoder = AnthropicGistEncoder(model="claude-haiku-4-5-20251001")  # cheapest option
```

| Parameter | Default | Description |
|---|---|---|
| `model` | `"claude-haiku-4-5-20251001"` | Anthropic model name |
| `api_key` | `None` | API key (falls back to the `ANTHROPIC_API_KEY` env var or a `.env` file) |
| `temperature` | `0.1` | LLM temperature |
| `max_tokens` | `256` | Max tokens in the response |
## EmbeddingModel

`cmm.encoding.embedding.EmbeddingModel`

Wraps a sentence-transformers model for text-to-vector encoding. Not a gist encoder, but used alongside them.

```python
from cmm.encoding.embedding import EmbeddingModel

model = EmbeddingModel(
    model_name="all-mpnet-base-v2",  # 768-D embeddings
    device=None,                     # auto-detect (CUDA if available)
)

vector = model.embed("some text")        # single text -> np.ndarray
vectors = model.embed_batch(["a", "b"])  # batch -> np.ndarray
```

| Parameter | Default | Description |
|---|---|---|
| `model_name` | `"all-mpnet-base-v2"` | Sentence-transformers model name |
| `device` | `None` | Device (`"cuda"`, `"cpu"`, or `None` for auto-detect) |

| Property | Type | Description |
|---|---|---|
| `dim` | `int` | Embedding dimension (768 for the default model) |
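Embedding vectors are typically compared by cosine similarity when retrieving related gists. A minimal NumPy sketch — using toy 4-D vectors where `EmbeddingModel.embed` would return 768-D arrays:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for model.embed(...) output.
v1 = np.array([1.0, 0.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0, 0.0])
v3 = np.array([0.0, 1.0, 0.0, 1.0])

print(cosine_similarity(v1, v2))  # 1.0 (identical direction)
print(cosine_similarity(v1, v3))  # 0.0 (orthogonal)
```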
## Choosing an Encoder

| Encoder | Requires | Best For |
|---|---|---|
| `OllamaGistEncoder` | Local Ollama server | Privacy-first, free, no API key |
| `OpenAIGistEncoder` | OpenAI API key (or compatible) | Broadest compatibility |
| `AnthropicGistEncoder` | Anthropic API key | Claude users |
| `PassthroughGistEncoder` | Nothing | Testing, minimal setup |
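The choice can also be made at runtime. A hypothetical helper (not part of cmm) that picks an encoder name based on what credentials are configured, falling back to passthrough when nothing is set:

```python
import os
from typing import Mapping

# Hypothetical selection helper for illustration; cmm does not ship this.
def choose_encoder(env: Mapping[str, str] = os.environ) -> str:
    """Pick an encoder name based on available configuration."""
    if env.get("OLLAMA_HOST"):
        return "ollama"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    return "passthrough"  # always works, needs nothing

print(choose_encoder({}))                           # passthrough
print(choose_encoder({"OPENAI_API_KEY": "sk-..."})) # openai
```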