
Memory

The memory system gives the agent durable, scoped operational context across sessions. Prior incidents, runbooks, service patterns, user preferences, and verification checklists are stored in Postgres and automatically retrieved before each model turn.

Memory is advisory, not authoritative. It provides starting hypotheses and prior context. The agent is always required to verify live infrastructure state with tools before acting on what memory says.

Stores and Scopes

Memory is organized into stores. Each store belongs to one of three scopes.

| Scope | Store examples | Who can write | Who can read |
| --- | --- | --- | --- |
| workspace | workspace_conventions, workspace_runbooks | Admin API + memory_propose_promotion | All users |
| user | User-owned stores | Agent (user_preference type only) | Store owner only |
| conversation | conversation_memory | Agent (any document type) | Conversation participants |

Database Schema

memory_stores (workspace | user | conversation)
  └── memory_documents (path-addressed, content-sha256-versioned)
        └── memory_versions (immutable write-ahead log)
        └── memory_source_refs (evidence provenance: tool call IDs, run IDs)
conversation_memory_usage (audit: which docs were injected per run)
dream_jobs / dream_review_items (promotion workflow: agent → admin review → workspace)

Documents are path-addressed. A path like workspace_runbooks/nginx-crashloop-playbook uniquely identifies a document within its store. Content is SHA256-versioned; each write creates a new memory_version entry rather than overwriting.
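The path-addressed, content-hashed versioning described above can be sketched as follows. This is a minimal illustration, not the actual schema API; the `VersionedDoc` class and its methods are invented names.

```python
import hashlib

def content_sha256(content: str) -> str:
    """Hash document content; each distinct content yields a new version."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

class VersionedDoc:
    """Sketch of a path-addressed document with an append-only version log.
    Each write appends an immutable version instead of overwriting."""
    def __init__(self, path: str):
        self.path = path       # e.g. "workspace_runbooks/nginx-crashloop-playbook"
        self.versions = []     # append-only log of (sha, content)

    def write(self, content: str) -> str:
        sha = content_sha256(content)
        # Skip the append if content is unchanged (same hash, same version).
        if not self.versions or self.versions[-1][0] != sha:
            self.versions.append((sha, content))
        return sha
```

Because versions are keyed by content hash, rewriting identical content is a no-op, and the full history remains addressable by SHA.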

Retrieval Pipeline

On every new user turn, the memory_prepare graph node runs before the LLM sees the request.

latest user message
  → MemoryRetrievalService.retrieve_for_turn()
     → get_accessible_store_ids()        # workspace + user + conversation stores
     → fts_search() via plainto_tsquery  # Postgres GIN-indexed tsvector column
     → over-fetch (max_docs × 3)
     → re-rank per document:
          score = 0.45 × text_score
                + 0.20 × trust_score
                + 0.15 × entity_match_score
                + 0.10 × doc_type_boost
                + 0.10 × recency_score
     → token budget trim (MEMORY_CONTEXT_TOKEN_BUDGET, default 2,200)
  → MemoryContextFormatter.format()
     → trusted hits   → <skyflo_memory_context> block
     → draft hits     → <skyflo_untrusted_memory_candidates> block
  → injected as system message before model turn
  → memory.context.loaded SSE event emitted

Re-Ranking Weights

| Component | Weight | Details |
| --- | --- | --- |
| Text relevance | 0.45 | Full-text match score from Postgres plainto_tsquery |
| Trust score | 0.20 | admin_approved=1.0, system_seeded=0.95, user_authored=0.85, agent_draft=0.45 |
| Entity match | 0.15 | Namespace, resource name, environment, service extracted from the query |
| Document type boost | 0.10 | runbook=0.9, checklist=0.85, incident=0.8, other types lower |
| Recency | 0.10 | Exponential decay with 30-day half-life |
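The blended score can be written out directly. A minimal sketch, assuming each component is normalized to [0, 1]; the weights match the table above and the recency term uses the 30-day half-life:

```python
# Weights from the re-ranking table above.
WEIGHTS = {"text": 0.45, "trust": 0.20, "entity": 0.15, "doc_type": 0.10, "recency": 0.10}
HALF_LIFE_DAYS = 30.0

def recency_score(age_days: float) -> float:
    # Exponential decay: the score halves every 30 days.
    return 0.5 ** (age_days / HALF_LIFE_DAYS)

def rerank_score(text: float, trust: float, entity: float,
                 doc_type: float, age_days: float) -> float:
    """Weighted blend of component scores, each assumed to be in [0, 1]."""
    return (WEIGHTS["text"] * text
            + WEIGHTS["trust"] * trust
            + WEIGHTS["entity"] * entity
            + WEIGHTS["doc_type"] * doc_type
            + WEIGHTS["recency"] * recency_score(age_days))

# A fresh, admin-approved runbook with a strong text match scores ≈ 0.90.
s = rerank_score(text=0.8, trust=1.0, entity=1.0, doc_type=0.9, age_days=0.0)
```

Note how trust and recency can separate two documents with identical text relevance: an admin-approved runbook outranks an agent draft of the same age by 0.20 × (1.0 − 0.45) = 0.11.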

Trusted hits (above a configurable threshold) go into the <skyflo_memory_context> block. Draft hits (agent-authored, unreviewed) go into <skyflo_untrusted_memory_candidates> so the model treats them with appropriate skepticism.

SSE Visibility

The memory.context.loaded event is emitted after each retrieval. It includes:

  • Document IDs and paths
  • Store slugs
  • Trust levels
  • Token count injected

This is visible in the Command Center and persisted to conversation_memory_usage for auditing.

Memory Tools

Read tools are always available to the agent. Write tools require load_toolset("memory", include_write_tools=true).

| Tool | Write | Description |
| --- | --- | --- |
| memory_search | No | Full-text search across accessible stores |
| memory_read | No | Read a specific document by ID or path |
| memory_list | No | Browse documents under a path prefix |
| memory_history | No | Inspect version history for a document |
| memory_remember | Yes | Create or update a memory document (policy + safety gated) |
| memory_patch | Yes | Update document content with SHA256 optimistic concurrency |
| memory_propose_promotion | Yes | Propose a conversation draft for admin promotion to workspace |
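The SHA256 optimistic concurrency used by memory_patch follows a standard compare-before-write pattern. A sketch with illustrative names (`patch_document`, `StaleWriteError` are not the real API):

```python
import hashlib

class StaleWriteError(Exception):
    """Raised when the caller's base SHA no longer matches the stored document."""

def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def patch_document(store: dict, path: str, expected_sha: str, new_content: str) -> str:
    """Apply a patch only if the document has not changed since it was read."""
    current = store.get(path, "")
    if sha256(current) != expected_sha:
        # A concurrent write happened in between: caller must re-read and retry.
        raise StaleWriteError(f"stale base SHA for {path}")
    store[path] = new_content
    return sha256(new_content)
```

The caller reads the document, remembers its content hash, and passes that hash with the patch; a mismatch means another writer won the race, and the safe response is to re-read and reapply.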

Memory tool calls bypass the MCP approval gate. They are dispatched directly to MemoryVirtualToolExecutor in the gate node, and are excluded from the tools.pending display in the UI.

memory_remember and memory_patch run through both the safety scanner and the policy engine before any write is persisted. Neither check is skippable.

Safety Scanner

The safety scanner (memory/safety.py) runs before every write. It blocks content containing:

  • AWS access keys and secret keys
  • GitHub PATs (ghp_, github_pat_)
  • Private key blocks (-----BEGIN ... PRIVATE KEY-----)
  • Kubeconfigs and embedded certificates
  • JWTs (three-part base64url strings)
  • Database URLs with embedded passwords
  • Bearer and Basic auth header values
  • High-entropy strings (Shannon entropy ≥ 4.5 over 40+ characters)
  • Raw log volumes (more than 30 timestamped log lines)
  • Prompt injection phrases (ignore previous instructions, skip approval, disregard safety, etc.)
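The high-entropy rule can be approximated with a standard Shannon entropy computation. The 4.5-bit and 40-character thresholds come from the list above; the function names are illustrative, not the actual memory/safety.py API:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character over the string's symbol distribution."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_secret(token: str) -> bool:
    # Flag long, high-entropy tokens (random API keys, tokens, etc.).
    # English prose sits well below 4.5 bits/char; random base64 sits above it.
    return len(token) >= 40 and shannon_entropy(token) >= 4.5
```

This is why raw secrets trip the scanner even when they match no known key format: randomness itself is the signal.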

Prompt injection phrases are allowed in conversation_memory (ephemeral agent drafts) but blocked in all other stores. This prevents agent drafts from being used as injection vectors if they are promoted without review.

If sensitive data is passed to memory_remember, the scanner rejects the write. The rejection is non-fatal: the agent receives a memory.write.blocked event and can continue the workflow without the write.

Policy Engine

The policy engine (memory/policy.py) enforces per-scope write permissions independently of the safety scanner.

| Scope | Agent write permission |
| --- | --- |
| workspace | Not directly writable. Use memory_propose_promotion to submit for admin review. |
| conversation | Any document type. |
| user | user_preference type only. |

Denied writes emit a memory.policy.denied SSE event. Like safety rejections, they are non-fatal; the workflow continues.
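The per-scope rules amount to a small decision function. A sketch, with an illustrative name:

```python
def agent_may_write(scope: str, doc_type: str) -> bool:
    """Mirror the per-scope rules: workspace is never directly writable,
    conversation accepts any document type, user accepts preferences only."""
    if scope == "workspace":
        return False                  # must go through memory_propose_promotion
    if scope == "conversation":
        return True
    if scope == "user":
        return doc_type == "user_preference"
    return False                      # unknown scope: deny by default
```

Denying unknown scopes by default keeps the policy fail-closed if a new scope is added before the policy engine learns about it.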

Promotion Workflow

Agents cannot write directly to workspace stores. The promotion workflow provides a structured path:

  1. Agent writes a draft to conversation_memory during a session.
  2. Agent calls memory_propose_promotion with the document path and a rationale.
  3. A dream_job record is created and a memory.promotion.proposed SSE event is emitted.
  4. An admin reviews the proposal in the Command Center or API.
  5. On approval, the document is copied to the target workspace store with admin_approved trust.
  6. On rejection, the dream_job is marked rejected; the conversation draft remains intact.

This ensures workspace-level runbooks are always human-reviewed before they influence future agent turns.
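The review step can be sketched as a single transition over a dream_job record; approval copies the draft into the workspace store with admin-approved trust, rejection leaves the draft untouched. Field names here are illustrative:

```python
def review_promotion(job: dict, approved: bool, workspace_store: dict) -> dict:
    """Admin review of a promotion proposal.

    On approval, the conversation draft is copied to the target workspace
    path with admin_approved trust (1.0 per the trust table). On rejection,
    only the job status changes; the conversation draft is untouched.
    """
    if approved:
        workspace_store[job["target_path"]] = {
            "content": job["draft_content"],
            "trust": 1.0,             # admin_approved
        }
        job["status"] = "approved"
    else:
        job["status"] = "rejected"
    return job
```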

Startup Seeding

On first engine startup (or when stores do not exist), seed_memory_stores() runs automatically after init_db(). It creates three default stores and seeds workspace_conventions with four operational documents:

| Document | Purpose |
| --- | --- |
| Verification Policy | Every mutation must be verified with a read tool before declaring success |
| Mutation Approval Policy | All mutating actions require human approval enforced at the engine level |
| Memory Write Policy | What the agent should and must not save |
| Advisory Memory Disclaimer | Memory is not proof of current cluster state |

Seeding is idempotent. Running it again on an existing deployment has no effect.
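Idempotent seeding typically means create-if-absent. A sketch of the pattern; the paths and summary strings below are illustrative, not the actual seed content:

```python
# Hypothetical seed set mirroring the four documents in the table above.
DEFAULT_DOCS = {
    "workspace_conventions/verification-policy": "Verify every mutation with a read tool.",
    "workspace_conventions/mutation-approval-policy": "Mutations require human approval.",
    "workspace_conventions/memory-write-policy": "What the agent should and must not save.",
    "workspace_conventions/advisory-memory-disclaimer": "Memory is not proof of cluster state.",
}

def seed_memory_stores(store: dict) -> int:
    """Create missing seed documents only; re-running is a no-op."""
    created = 0
    for path, content in DEFAULT_DOCS.items():
        if path not in store:          # never overwrite an existing document
            store[path] = content
            created += 1
    return created
```

Because existing documents are skipped rather than overwritten, admin edits to the seeded policies survive engine restarts.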

Configuration

| Setting | Default | Purpose |
| --- | --- | --- |
| MEMORY_ENABLED | true | Toggle memory system on/off |
| MEMORY_CONTEXT_TOKEN_BUDGET | 2200 | Max tokens injected from memory per turn |

Setting MEMORY_ENABLED=false skips memory_prepare entirely. No retrieval, no injection, no SSE events. The memory tools remain available in the tool list but writes will fail policy checks when stores do not exist.

Advisory Model

Memory is a context signal, not ground truth.

The agent receives memory context formatted with explicit advisory framing. The system prompt instructs the agent to treat memory as a starting hypothesis, not a replacement for tool-confirmed evidence. Confidence scores derived solely from memory cannot reach the 90% threshold required to trigger mutations.

Every piece of memory-derived context must be validated against live infrastructure state before the agent acts on it. This is enforced by the system prompt, not a runtime check — which is why the prompt is treated as a behavioral contract, not a suggestion.