ConceptsExecution Model

Execution Model

Skyflo uses a graph-based workflow powered by LangGraph. The workflow enforces a deterministic loop for every infrastructure change. Not a single LLM call. A compiled graph with distinct phases: Plan → Approve → Execute → Verify.

Overview

The execution model is the core differentiator. Every operation flows through the same control loop. Read operations can auto-continue. Mutations always pause for explicit approval.

The graph has five nodes: entry → memory_prepare → model → gate → final. Memory context is loaded before each planning turn. Tools are loaded on-demand. History is windowed to keep context predictable.

Plan

Before the agent reasons about your request, memory_prepare runs. It queries Postgres for relevant prior incidents, runbooks, cluster conventions, and user preferences. Retrieved documents are injected as a system message before the LLM sees your request.

The agent then analyzes your natural language to determine intent. Performs lightweight discovery when needed. Produces structured tool calls for execution.

Memory is advisory. The agent is required to verify live cluster state with tools before acting on memory-derived hypotheses. See Memory.

Approve

Every mutating tool call requires explicit approval before execution. Read-only operations flow freely. The approval gate is driven by MCP tool annotations (readOnlyHint, destructiveHint). Approval enforcement is implemented in the Engine runtime and cannot be disabled through configuration.

When the agent proposes a mutation, the Command Center surfaces the plan. You approve or reject. No shortcuts. No "trust mode" that bypasses the gate.

Execute

Tools run via MCP inside the MCP server container. Kubernetes, Helm, Argo Rollouts, and Jenkins are available. Toolsets are loaded on-demand — the agent starts with Kubernetes read-only tools and requests additional schemas only when the query requires them. See Context Management.

Each tool receives schema-validated inputs matching only the parameters it declares. No raw shell injection. No arbitrary command execution.

Memory tool calls (memory_remember, memory_patch, etc.) are handled in-process by MemoryVirtualToolExecutor. They bypass the MCP approval gate but run through the safety scanner and policy engine.

Verify

The agent evaluates outcomes against original intent. Decides whether to auto-continue, request approval, or stop. Routes context back to the model for refinement if issues are detected.

Verification is not a separate human step. The model consumes tool results and determines next actions. For mutations, the next action is always "wait for approval" before any further writes.

Verified findings can be persisted to memory via memory_remember for future sessions.

Stop Condition

The loop uses native stop semantics:

  • Tool calls produced → route to gate (continue)
  • No tool calls (text-only response) → route to final (done)

No separate LLM call is made to determine whether to stop. This eliminates the 1–4 second latency a judge call would add after every final response.

Persistence

Every tool call, its parameters, and results are stored in Postgres. Supports audit and replay. Token usage, TTFT, and thinking segments are persisted per turn. Memory injection records (conversation_memory_usage) are stored separately and are exportable for compliance.

Persistence is append-only. No deletion of audit records.

Auto-Continue

The engine continues automatically for read operations. Discovery, logs, status checks. These flow without approval.

For mutations, the engine always pauses. Apply, scale, rollback, delete, upgrade, promote. Every write waits for explicit approval before the gate routes back to the model.