AI Agent for Kubernetes.
Approval-Gated by Design.
Self-hosted AI agent for Kubernetes and CI/CD. Converts intent into typed, approval-gated infrastructure operations inside your cluster.
Code Ships Faster Than Ever.
Your Operations Haven't Kept Up.
AI coding tools accelerated development. But deploying, operating, and keeping production alive? That's still manual, fragmented, and risky.
Brittle Scripts & Manual kubectl
Your deployment process is held together by tribal knowledge and shell scripts that break at 2 AM.
Fragmented Visibility
Prometheus in one tab, Grafana in another, kubectl in the terminal, Slack on fire. Four tools, zero unified context when it matters most.
Every Mutation Is a Blind Risk
kubectl apply with fingers crossed. No dry-run, no rollback plan, no verification that the change did what you intended.
Not a Black Box.
A Deterministic Control Loop.
Every infrastructure change follows four auditable steps. The agent plans, you decide, the outcome is verified.
Plan
Memory context loads automatically before each planning turn: prior incidents, runbooks, and cluster conventions. The agent then analyzes your intent and generates a concrete, evidence-grounded action plan.
Approve
Every mutating tool call pauses for explicit human approval. Read operations flow freely. The gate is enforced by the engine, not a UI toggle you can turn off.
Execute
Typed tools run via MCP. Schema-validated inputs. Sandboxed execution. Full audit trail for every operation — who asked, what was planned, who approved, what ran.
Verify
The agent validates the outcome against your original intent. If cluster state drifts from the plan, it flags the discrepancy and suggests remediation. Findings can be persisted to memory for future incidents.
The approval gate is enforced at the engine level. Not a setting. Not configurable off.
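As a rough illustration of what an engine-level gate means, here is a minimal Python sketch. It is not Skyflo's engine code: ToolCall, AuditLog, run_tool_call, the mutating flag, and the tool names are all hypothetical. The point it shows is that the approval check sits in the only path that executes a tool, so a caller cannot skip it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolCall:
    """A typed tool invocation. Every name in this sketch is hypothetical."""
    tool: str
    args: dict
    mutating: bool  # assumption: each tool declares whether it changes state


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, call, requested_by, decision):
        self.entries.append(
            {"tool": call.tool, "args": call.args,
             "requested_by": requested_by, "decision": decision}
        )


def run_tool_call(call, requested_by, approver, execute, audit):
    """Execute a tool call; mutating calls need an explicit approval first."""
    if call.mutating:
        if not approver(call):
            audit.record(call, requested_by, "denied")
            return None  # nothing runs without approval
        decision = "approved"
    else:
        decision = "read-only"  # reads flow freely
    audit.record(call, requested_by, decision)
    return execute(call)


# Usage with a stand-in executor instead of a real cluster.
audit = AuditLog()
fake_execute = lambda call: f"ran {call.tool}"
read = ToolCall("get_pods", {"namespace": "default"}, mutating=False)
write = ToolCall("scale_deployment", {"name": "api", "replicas": 3}, mutating=True)

read_result = run_tool_call(read, "alice", lambda c: False, fake_execute, audit)
denied_result = run_tool_call(write, "alice", lambda c: False, fake_execute, audit)
approved_result = run_tool_call(write, "alice", lambda c: True, fake_execute, audit)
```

The read runs even though the approver always says no, the first write is blocked, and every attempt, including the denied one, lands in the audit log.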
The Agent Learns Your Infrastructure.
Context That Outlasts the Incident.
Prior incidents, runbooks, and cluster conventions are retrieved from Postgres before each model turn. No context loss between sessions.
Runbooks and Incident History
Prior incidents, remediation playbooks, and cluster conventions retrieved automatically before each diagnostic. The agent starts informed, not from scratch.
Trust-Ranked Retrieval
Admin-approved workspace docs outrank user notes. Agent drafts are surfaced as candidates, not ground truth. Context quality is controlled, not just aggregated.
Safety-Scanned Writes
The safety scanner blocks credentials, API tokens, private keys, and high-entropy strings from being persisted. Sensitive data never reaches the memory store.
Advisory by Design
Memory provides starting hypotheses. The agent always verifies live cluster state with tools before acting on memory. Prior context never bypasses tool confirmation.
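To make the safety-scanning idea concrete, here is a minimal Python sketch of a pre-write check. It is an illustration under stated assumptions, not Skyflo's scanner: the patterns, the entropy threshold, the minimum token length, and the function names are all chosen for the example.

```python
import math
import re
from collections import Counter

# Assumed patterns; a real scanner would cover many more credential formats.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(api[_-]?key|token|password)\b\s*[:=]\s*\S+"),
]

ENTROPY_THRESHOLD = 4.0  # bits per character; an illustrative cutoff
MIN_TOKEN_LENGTH = 20    # short tokens are skipped to limit false positives


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_safe_to_persist(note):
    """Return False if the note looks like it contains a secret."""
    if any(pattern.search(note) for pattern in SECRET_PATTERNS):
        return False
    for token in note.split():
        if len(token) >= MIN_TOKEN_LENGTH and shannon_entropy(token) >= ENTROPY_THRESHOLD:
            return False
    return True
```

A plain operational note passes, while a key header, a password assignment, or a long random-looking token is rejected before it reaches storage.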
Admin-approved runbooks and conventions shared across the team.
Personal preferences and operational shortcuts owned by each operator.
Ephemeral agent drafts from the current session. Can be proposed for workspace promotion.
Memory is advisory. Live tool verification always takes precedence over what memory says.
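Trust-ranked retrieval can be pictured as a sort on two keys. The Python sketch below is an assumption-laden illustration, not Skyflo's retrieval code: the tier names "workspace", "user", and "session", the relevance score, and the functions rank_memories and render are hypothetical.

```python
from dataclasses import dataclass

# Assumed ordering: a lower number means a more trusted tier.
TRUST_RANK = {"workspace": 0, "user": 1, "session": 2}


@dataclass(frozen=True)
class MemoryItem:
    text: str
    tier: str         # "workspace", "user", or "session"
    relevance: float  # hypothetical similarity score in [0, 1]


def rank_memories(items, limit=3):
    """Order by trust tier first, then by relevance within a tier."""
    ordered = sorted(items, key=lambda m: (TRUST_RANK[m.tier], -m.relevance))
    return ordered[:limit]


def render(item):
    """Label agent drafts as candidates so they are not read as ground truth."""
    prefix = "[candidate] " if item.tier == "session" else ""
    return f"{prefix}{item.text}"


items = [
    MemoryItem("Draft: OOM kills follow the nightly batch job", "session", 0.95),
    MemoryItem("I prefer rollout restarts over pod deletes", "user", 0.80),
    MemoryItem("Runbook: drain nodes before kernel upgrades", "workspace", 0.60),
]
ranked = [render(item) for item in rank_memories(items)]
```

Here the admin-approved runbook comes first despite the lowest relevance score, and the session draft is last and marked as a candidate.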
Real Operations.
Against a Live Cluster.
Not mockups or scripted demos. Watch Skyflo handle the workflows your team runs every day.
Faster Diagnosis. Safer Changes.
Auditable Operations.
Architecture is table stakes. These are the operational outcomes that matter to your team.
Faster Incident Diagnosis
The agent correlates logs, events, and resource state in a single pass. No more context-switching across dashboards.
Consistent, Auditable Deployments
No more ad-hoc kubectl runs or untracked mutations. Every change is repeatable and auditable.
Approval Gates on Writes, Not Reads
Read operations flow freely. Mutating tool calls require explicit approval. Your developers move fast. Your infrastructure stays safe.
Your Cluster. Your Agent.
Running in Minutes.
Deploy on your cluster with your own LLM. No Skyflo telemetry or phone-home.
Ready for Your Team?
Scale with Confidence
Team adds collaboration, governance, and integrations. Same agent. Same control loop. Same approval gates.
Chat Integration
Operate from Slack, Microsoft Teams, and more
SCM Integration
Persist changes to GitHub, GitLab, Bitbucket
AI Alerting Agent
Anomaly correlation and proactive detection
RBAC & Governance
Team permissions, audit trails, SSO
An Execution Runtime.
Not a Chat Wrapper.
Every capability maps to an operational outcome.
Natural Language to Typed Execution
Describe what you need in plain English. Skyflo converts intent into schema-validated tool calls, not shell-injected strings.
Unified Cluster Context
Logs, events, resource state, and configuration correlated in one pass. Diagnose a CrashLoopBackOff without switching between five terminals.
Graph-Based Workflow Engine
A LangGraph-powered workflow with distinct phases: planning, approval gate, execution, verification. Deterministic. Replayable. Not a monolithic LLM call.
Live Agent Reasoning
Agent thoughts, tool progress, memory retrievals, and results streamed in real time via SSE. Full visibility into every decision.
Post-Action Verification
The agent validates outcomes against your original intent. Drifts are flagged. Verified findings can be saved to memory for future incidents.
Extensible via MCP
Every tool follows the Model Context Protocol. Typed inputs, sandboxed execution, defined safety model per tool. Community contributions welcome.
Persistent Memory Context
Prior incidents, runbooks, and cluster conventions are retrieved from Postgres before each planning turn. The agent starts informed. Writes are policy-gated and safety-scanned. Memory is advisory. Live tool verification always takes precedence.
Every capability ships with open source.
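The difference between a schema-validated tool call and a shell-injected string is easiest to see in code. The Python sketch below is hypothetical: ScaleDeployment and to_argv are illustration names, and this page does not say whether Skyflo's tools invoke kubectl or call the Kubernetes API directly. An argument list is used only to show that validated fields never pass through a shell.

```python
import re
from dataclasses import dataclass

# RFC 1123 label, the format Kubernetes requires for namespace names.
# The sketch applies the same, stricter rule to the deployment name.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


@dataclass(frozen=True)
class ScaleDeployment:
    """Hypothetical typed input for a 'scale deployment' tool."""
    namespace: str
    name: str
    replicas: int

    def __post_init__(self):
        for value in (self.namespace, self.name):
            if not isinstance(value, str) or not DNS_LABEL.fullmatch(value):
                raise ValueError(f"invalid resource name: {value!r}")
        is_int = isinstance(self.replicas, int) and not isinstance(self.replicas, bool)
        if not is_int or self.replicas < 0:
            raise ValueError("replicas must be a non-negative integer")

    def to_argv(self):
        """Build an argument list; no string is ever handed to a shell."""
        return ["kubectl", "scale", f"deployment/{self.name}",
                f"--replicas={self.replicas}", "-n", self.namespace]


argv = ScaleDeployment(namespace="prod", name="api", replicas=3).to_argv()

try:
    ScaleDeployment(namespace="prod", name="api; rm -rf /", replicas=3)
    rejected = False
except ValueError:
    rejected = True
```

The malformed name is rejected at construction time, before anything that could execute it exists.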
Deep Coverage.
Efficient Context by Design.
Each toolset is loaded on demand, not all at once. The agent starts lean and requests only the schemas it needs for your specific query.
Kubernetes
Orchestration: Discovery, logs, exec, apply, diff
- Discover resources across namespaces
- Stream pod logs and exec into containers
- Drain and cordon nodes safely
- Preview changes with diff before apply
Helm
Package Management: Search, install, upgrade, rollback
- Install charts with custom values
- Upgrade releases with dry-run preview
- Roll back to any previous revision
- Manage chart repositories
Argo Rollouts
Progressive Delivery: Pause, resume, promote, abort
- Run canary and blue-green deployments
- Promote or abort with human gate
- Monitor analysis runs and experiments
- Track full rollout history and status
Jenkins
CI/CD: Jobs, builds, logs, SCM, identity
- Manage and trigger build jobs
- Stream build logs in real time
- Inspect SCM configurations
- Authenticate via Kubernetes Secrets
The agent starts with Kubernetes read-only schemas in context. It calls load_toolset to add Helm, Argo, or Jenkins only when your query needs them. Deep coverage without bloating every turn with unused schemas.
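On-demand loading can be sketched as a small registry. Only the name load_toolset comes from the description above. The catalog contents, the tool names, and the ToolContext class are hypothetical, and real schemas would be full documents rather than the bare names used here.

```python
# Hypothetical catalog: toolset name -> tool schemas (tool names stand in
# for the full schemas in this sketch).
TOOLSET_CATALOG = {
    "kubernetes_read": ["list_resources", "get_logs", "diff"],
    "helm": ["helm_install", "helm_upgrade", "helm_rollback"],
    "argo": ["rollout_promote", "rollout_abort"],
    "jenkins": ["trigger_build", "get_build_logs"],
}


class ToolContext:
    """Tracks which tool schemas are currently visible to the model."""

    def __init__(self):
        # Start lean: only the read-only Kubernetes toolset is in context.
        self.loaded = ["kubernetes_read"]

    def load_toolset(self, name):
        """Add a toolset to the context; loading twice has no extra effect."""
        if name not in TOOLSET_CATALOG:
            raise KeyError(f"unknown toolset: {name}")
        if name not in self.loaded:
            self.loaded.append(name)
        return TOOLSET_CATALOG[name]

    def visible_tools(self):
        return [tool for name in self.loaded for tool in TOOLSET_CATALOG[name]]


ctx = ToolContext()
tools_before = ctx.visible_tools()
ctx.load_toolset("helm")  # the query turned out to need Helm
tools_after = ctx.visible_tools()
```

Jenkins and Argo schemas stay out of the context until a query asks for them, which is the saving the paragraph above describes.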
On the Roadmap
Same typed, sandboxed pattern. All open source.
An AI Agent in Your Cluster
Should Be Yours to Audit.
Apache 2.0 licensed. The agent, the control loop, and the safety model are all inspectable and under your control.
Full Source Transparency
Every tool call, every decision path, every safety check is in the source.
Self-Hosted, In-Cluster
Runs inside your Kubernetes cluster. LLM calls go only to the provider you configure.
Bring Your Own LLM, No Lock-in
OpenAI, Anthropic, Gemini, Groq, or self-hosted models. Switch providers without changing workflows.
Safety Is Not a Premium Feature
Approval gates ship with open source. No feature gates on safety. No usage limits.
No black-box agent decisions. No Skyflo telemetry.
Built in the Open
Transparent, auditable, and built for operators managing production Kubernetes.
Open Source
Full source code available under the Apache 2.0 license. Audit every line. No black boxes in your production stack.
Join Our Channels
Connect with operators and developers building on Skyflo.
Frequently Asked Questions
Common questions about Skyflo and approval-gated operations.
Install and Run Your First Operation
Install Skyflo on your cluster and run your first operation today.
curl -fsSL https://skyflo.ai/install.sh | bash