
Memory

The memory system gives the agent durable, scoped operational context across sessions. Prior incidents, runbooks, service patterns, user preferences, and verification checklists are stored in Postgres and automatically retrieved before each model turn.

Memory is advisory, not authoritative. It provides starting hypotheses and prior context. The agent is always required to verify live infrastructure state with tools before acting on what memory says.

Stores and Scopes

Memory is organized into stores. Each store belongs to one of three scopes.

| Scope | Store examples | Who can write | Who can read |
| --- | --- | --- | --- |
| workspace | workspace_conventions, workspace_runbooks | Admin API + memory_propose_promotion | All users |
| user | User-owned stores | Agent (user_preference type only) | Store owner only |
| conversation | conversation_memory | Agent (any document type) | Conversation participants |

Database Schema

memory_stores (workspace | user | conversation)
  └── memory_documents (path-addressed, content-sha256-versioned)
        └── memory_versions (immutable write-ahead log)
        └── memory_source_refs (evidence provenance: tool call IDs, run IDs)
conversation_memory_usage (audit: which docs were injected per run)
dream_jobs / dream_review_items (promotion workflow: agent → admin review → workspace)

Documents are path-addressed. A path like workspace_runbooks/nginx-crashloop-playbook uniquely identifies a document within its store. Content is SHA256-versioned; each write creates a new memory_version entry rather than overwriting.
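The path-addressed, content-hashed versioning described above can be sketched as follows. This is a minimal illustration, not the actual schema API; the `VersionedDoc` class and its methods are invented names.

```python
import hashlib

def content_sha256(content: str) -> str:
    """Hash document content; each distinct content yields a new version."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

class VersionedDoc:
    """Sketch of a path-addressed document with an append-only version log.
    Each write appends an immutable version instead of overwriting."""
    def __init__(self, path: str):
        self.path = path       # e.g. "workspace_runbooks/nginx-crashloop-playbook"
        self.versions = []     # append-only log of (sha, content)

    def write(self, content: str) -> str:
        sha = content_sha256(content)
        # Skip the append if content is unchanged (same hash, same version).
        if not self.versions or self.versions[-1][0] != sha:
            self.versions.append((sha, content))
        return sha
```

Because versions are keyed by content hash, rewriting identical content is a no-op, and the full history remains addressable by SHA.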

Retrieval Pipeline

On every new user turn, the memory_prepare graph node runs before the LLM sees the request.

latest user message
  → MemoryRetrievalService.retrieve_for_turn()
     → get_accessible_store_ids()        # workspace + user + conversation stores
     → fts_search() via plainto_tsquery  # Postgres GIN-indexed tsvector column
     → over-fetch (max_docs × 3)
     → re-rank per document:
          score = 0.45 × text_score
                + 0.20 × trust_score
                + 0.15 × entity_match_score
                + 0.10 × doc_type_boost
                + 0.10 × recency_score
     → token budget trim (MEMORY_CONTEXT_TOKEN_BUDGET, default 2,200)
  → MemoryContextFormatter.format()
     → trusted hits   → <skyflo_memory_context> block
     → draft hits     → <skyflo_untrusted_memory_candidates> block
  → injected as system message before model turn
  → memory.context.loaded SSE event emitted

Re-Ranking Weights

| Component | Weight | Details |
| --- | --- | --- |
| Text relevance | 0.45 | Full-text match score from Postgres plainto_tsquery |
| Trust score | 0.20 | admin_approved=1.0, system_seeded=0.95, user_authored=0.85, agent_draft=0.45 |
| Entity match | 0.15 | Namespace, resource name, environment, service extracted from the query |
| Document type boost | 0.10 | runbook=0.9, checklist=0.85, incident=0.8, other types lower |
| Recency | 0.10 | Exponential decay with 30-day half-life |
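The blended score can be written out directly. A minimal sketch, assuming each component is normalized to [0, 1]; the weights match the table above and the recency term uses the 30-day half-life:

```python
# Weights from the re-ranking table above.
WEIGHTS = {"text": 0.45, "trust": 0.20, "entity": 0.15, "doc_type": 0.10, "recency": 0.10}
HALF_LIFE_DAYS = 30.0

def recency_score(age_days: float) -> float:
    # Exponential decay: the score halves every 30 days.
    return 0.5 ** (age_days / HALF_LIFE_DAYS)

def rerank_score(text: float, trust: float, entity: float,
                 doc_type: float, age_days: float) -> float:
    """Weighted blend of component scores, each assumed to be in [0, 1]."""
    return (WEIGHTS["text"] * text
            + WEIGHTS["trust"] * trust
            + WEIGHTS["entity"] * entity
            + WEIGHTS["doc_type"] * doc_type
            + WEIGHTS["recency"] * recency_score(age_days))

# A fresh, admin-approved runbook with a strong text match scores ≈ 0.90.
s = rerank_score(text=0.8, trust=1.0, entity=1.0, doc_type=0.9, age_days=0.0)
```

Note how trust and recency can separate two documents with identical text relevance: an admin-approved runbook outranks an agent draft of the same age by 0.20 × (1.0 − 0.45) = 0.11.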

Trusted hits (above a configurable threshold) go into the <skyflo_memory_context> block. Draft hits (agent-authored, unreviewed) go into <skyflo_untrusted_memory_candidates> so the model treats them with appropriate skepticism.

SSE Visibility

The memory.context.loaded event is emitted after each retrieval. It includes:

  • Document IDs and paths
  • Store slugs
  • Trust levels
  • Token count injected

This is visible in the Command Center and persisted to conversation_memory_usage for auditing.

Memory Tools

Read tools are always available to the agent. Write tools require load_toolset("memory", include_write_tools=true).

| Tool | Write | Description |
| --- | --- | --- |
| memory_search | No | Full-text search across accessible stores |
| memory_read | No | Read a specific document by ID or path |
| memory_list | No | Browse documents under a path prefix |
| memory_history | No | Inspect version history for a document |
| memory_remember | Yes | Create or update a memory document (policy + safety gated) |
| memory_patch | Yes | Update document content with SHA256 optimistic concurrency |
| memory_propose_promotion | Yes | Propose a conversation draft for admin promotion to workspace |
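The SHA256 optimistic concurrency used by memory_patch follows a standard compare-before-write pattern. A sketch with illustrative names (`patch_document`, `StaleWriteError` are not the real API):

```python
import hashlib

class StaleWriteError(Exception):
    """Raised when the caller's base SHA no longer matches the stored document."""

def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def patch_document(store: dict, path: str, expected_sha: str, new_content: str) -> str:
    """Apply a patch only if the document has not changed since it was read."""
    current = store.get(path, "")
    if sha256(current) != expected_sha:
        # A concurrent write happened in between: caller must re-read and retry.
        raise StaleWriteError(f"stale base SHA for {path}")
    store[path] = new_content
    return sha256(new_content)
```

The caller reads the document, remembers its content hash, and passes that hash with the patch; a mismatch means another writer won the race, and the safe response is to re-read and reapply.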

Memory tool calls bypass the MCP approval gate. They are dispatched directly to MemoryVirtualToolExecutor in the gate node, and are excluded from the tools.pending display in the UI.

memory_remember and memory_patch run through both the safety scanner and the policy engine before any write is persisted. Neither check is skippable.

Safety Scanner

The safety scanner (memory/safety.py) runs before every write. It blocks content containing:

  • AWS access keys and secret keys
  • GitHub PATs (ghp_, github_pat_)
  • Private key blocks (-----BEGIN ... PRIVATE KEY-----)
  • Kubeconfigs and embedded certificates
  • JWTs (three-part base64url strings)
  • Database URLs with embedded passwords
  • Bearer and Basic auth header values
  • High-entropy strings (Shannon entropy ≥ 4.5 over 40+ characters)
  • Raw log volumes (more than 30 timestamped log lines)
  • Prompt injection phrases (ignore previous instructions, skip approval, disregard safety, etc.)
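The high-entropy rule can be approximated with a standard Shannon entropy computation. The 4.5-bit and 40-character thresholds come from the list above; the function names are illustrative, not the actual memory/safety.py API:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character over the string's symbol distribution."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_secret(token: str) -> bool:
    # Flag long, high-entropy tokens (random API keys, tokens, etc.).
    # English prose sits well below 4.5 bits/char; random base64 sits above it.
    return len(token) >= 40 and shannon_entropy(token) >= 4.5
```

This is why raw secrets trip the scanner even when they match no known key format: randomness itself is the signal.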

Prompt injection phrases are allowed in conversation_memory (ephemeral agent drafts) but blocked in all other stores. This prevents agent drafts from being used as injection vectors if they are promoted without review.

If sensitive data is passed to memory_remember, the scanner rejects the write. The rejection is non-fatal: the agent receives a memory.write.blocked event and can continue the workflow without the write.

Policy Engine

The policy engine (memory/policy.py) enforces per-scope write permissions independently of the safety scanner.

| Scope | Agent write permission |
| --- | --- |
| workspace | Not directly writable. Use memory_propose_promotion to submit for admin review. |
| conversation | Any document type. |
| user | user_preference type only. |

Denied writes emit a memory.policy.denied SSE event. Like safety rejections, they are non-fatal; the workflow continues.
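The per-scope rules amount to a small decision function. A sketch, with an illustrative name:

```python
def agent_may_write(scope: str, doc_type: str) -> bool:
    """Mirror the per-scope rules: workspace is never directly writable,
    conversation accepts any document type, user accepts preferences only."""
    if scope == "workspace":
        return False                  # must go through memory_propose_promotion
    if scope == "conversation":
        return True
    if scope == "user":
        return doc_type == "user_preference"
    return False                      # unknown scope: deny by default
```

Denying unknown scopes by default keeps the policy fail-closed if a new scope is added before the policy engine learns about it.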

Promotion Workflow

Agents cannot write directly to workspace stores. The promotion workflow provides a structured path:

  1. Agent writes a draft to conversation_memory during a session.
  2. Agent calls memory_propose_promotion with the document path and a rationale.
  3. A dream_job record is created and a memory.promotion.proposed SSE event is emitted.
  4. An admin reviews the proposal in the Command Center or API.
  5. On approval, the document is copied to the target workspace store with admin_approved trust.
  6. On rejection, the dream_job is marked rejected; the conversation draft remains intact.

This ensures workspace-level runbooks are always human-reviewed before they influence future agent turns.
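The review step can be sketched as a single transition over a dream_job record; approval copies the draft into the workspace store with admin-approved trust, rejection leaves the draft untouched. Field names here are illustrative:

```python
def review_promotion(job: dict, approved: bool, workspace_store: dict) -> dict:
    """Admin review of a promotion proposal.

    On approval, the conversation draft is copied to the target workspace
    path with admin_approved trust (1.0 per the trust table). On rejection,
    only the job status changes; the conversation draft is untouched.
    """
    if approved:
        workspace_store[job["target_path"]] = {
            "content": job["draft_content"],
            "trust": 1.0,             # admin_approved
        }
        job["status"] = "approved"
    else:
        job["status"] = "rejected"
    return job
```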

Startup Seeding

On first engine startup (or when stores do not exist), seed_memory_stores() runs automatically after init_db(). It creates three default stores and seeds workspace_conventions with four operational documents:

| Document | Purpose |
| --- | --- |
| Verification Policy | Every mutation must be verified with a read tool before declaring success |
| Mutation Approval Policy | All mutating actions require human approval enforced at the engine level |
| Memory Write Policy | What the agent should and must not save |
| Advisory Memory Disclaimer | Memory is not proof of current cluster state |

Seeding is idempotent. Running it again on an existing deployment has no effect.
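Idempotent seeding typically means create-if-absent. A sketch of the pattern; the paths and summary strings below are illustrative, not the actual seed content:

```python
# Hypothetical seed set mirroring the four documents in the table above.
DEFAULT_DOCS = {
    "workspace_conventions/verification-policy": "Verify every mutation with a read tool.",
    "workspace_conventions/mutation-approval-policy": "Mutations require human approval.",
    "workspace_conventions/memory-write-policy": "What the agent should and must not save.",
    "workspace_conventions/advisory-memory-disclaimer": "Memory is not proof of cluster state.",
}

def seed_memory_stores(store: dict) -> int:
    """Create missing seed documents only; re-running is a no-op."""
    created = 0
    for path, content in DEFAULT_DOCS.items():
        if path not in store:          # never overwrite an existing document
            store[path] = content
            created += 1
    return created
```

Because existing documents are skipped rather than overwritten, admin edits to the seeded policies survive engine restarts.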

Configuration

| Setting | Default | Purpose |
| --- | --- | --- |
| MEMORY_ENABLED | true | Toggle memory system on/off |
| MEMORY_CONTEXT_TOKEN_BUDGET | 2200 | Max tokens injected from memory per turn |

Setting MEMORY_ENABLED=false skips memory_prepare entirely. No retrieval, no injection, no SSE events. The memory tools remain available in the tool list but writes will fail policy checks when stores do not exist.

Advisory Model

Memory is a context signal, not ground truth.

The agent receives memory context formatted with explicit advisory framing. The system prompt instructs the agent to treat memory as a starting hypothesis, not a replacement for tool-confirmed evidence. Confidence scores derived solely from memory cannot reach the 90% threshold required to trigger mutations.

Every piece of memory-derived context must be validated against live infrastructure state before the agent acts on it. This is enforced by the system prompt, not a runtime check — which is why the prompt is treated as a behavioral contract, not a suggestion.