Kubernetes Metrics for AI Agents: `kubectl top` Tools and What They Unlock

What Is `kubectl top` and Why Does It Matter for AI Agents?

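`kubectl top` is a kubectl subcommand that displays near-real-time CPU and memory usage for pods and nodes by querying the Metrics API, which Metrics Server serves. The numbers are recent samples rather than an instantaneous reading, because Metrics Server collects them on a periodic interval. For AI agents, these metrics provide immediate "ground truth" about cluster resource pressure.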
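Because `kubectl top` depends on the Metrics API, it can help to confirm that the API is available before trusting an empty or failed result. The sketch below assumes Metrics Server is registered under its usual APIService name, `v1beta1.metrics.k8s.io`; the function name `metrics_api_ready` is made up for illustration.

```bash
# Returns 0 when the Metrics API reports Available=True, non-zero otherwise.
# Assumption: Metrics Server is registered as APIService v1beta1.metrics.k8s.io.
metrics_api_ready() {
  local status
  status="$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null)" || status=""
  [ "$status" = "True" ]
}

# Usage (needs cluster access):
#   metrics_api_ready && kubectl top nodes
```

If the check fails, `kubectl top` will usually report that metrics are not available, so an agent can surface that condition instead of treating it as "no load".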

The first question in most Kubernetes incidents isn't philosophical. It's:

"What's hot right now?"

CPU and memory usage are blunt metrics, but they quickly answer critical questions:

Is this a resource pressure incident?
Is one pod misbehaving while others are fine?
Are nodes approaching saturation?
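As a rough illustration of the second question, the filter below reads `kubectl top pods` output on stdin and prints only the pods at or above a CPU threshold. The function name `hot_pods` and the 800m default are hypothetical choices, not values taken from any tool.

```bash
# Prints "<pod> <cpu> <memory>" for pods whose CPU is at or above a millicore threshold.
# Expects the header row that `kubectl top pods` prints by default
# (do not combine with --no-headers, or the first pod is skipped).
hot_pods() {
  local threshold_m="${1:-800}"   # hypothetical default: 800 millicores
  awk -v limit="$threshold_m" '
    NR == 1 { next }              # skip the NAME/CPU/MEMORY header
    NF < 3  { next }              # ignore blank or malformed rows
    {
      cpu = $2
      if (cpu ~ /m$/) { sub(/m$/, "", cpu); millicores = cpu + 0 }
      else            { millicores = cpu * 1000 }   # defensive: value given in whole cores
      if (millicores >= limit) print $1, millicores "m", $3
    }'
}

# Usage (needs cluster access):
#   kubectl top pods -n production | hot_pods 800
```

One busy pod among several quiet ones points at a single workload; many pods over the threshold at once suggests broader resource pressure.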

How Does Skyflo Expose Kubernetes Metrics as AI Tools?

Skyflo's MCP server wraps kubectl top commands as read-only tools:

Tool	Command	Use Case
`k8s_top_pods`	`kubectl top pods -n <namespace>`	Find memory/CPU-heavy pods
`k8s_top_nodes`	`kubectl top nodes`	Check node-level saturation

Example tool output:

NAME                     CPU(cores)   MEMORY(bytes)
api-server-7c9d8f6b4-x   450m         512Mi
api-server-7c9d8f6b4-y   120m         256Mi
payment-svc-5f4d3-abc    980m         1024Mi  ← Potential issue

These tools are marked as readOnlyHint: true, meaning they execute immediately without approval, which is exactly what you want for diagnostic operations.
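The effect of that hint can be pictured as a small gate in front of tool execution. This is only a sketch of the idea, not Skyflo's implementation: the functions `is_read_only` and `dispatch` are hypothetical, and any tool not listed is assumed to need approval.

```bash
# Hypothetical approval gate. Only the two tools from the table above are
# treated as read-only; everything else defaults to "needs approval".
is_read_only() {
  case "$1" in
    k8s_top_pods|k8s_top_nodes) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints what the gate would do with a tool call instead of executing it.
dispatch() {
  local tool="$1"
  if is_read_only "$tool"; then
    echo "run:$tool"
  else
    echo "await-approval:$tool"
  fi
}
```

Defaulting unknown tools to the approval path is the conservative choice: a missing annotation slows a diagnostic call down but never lets a mutating call through unreviewed.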

Why Are Read-Only Metrics Ideal "Grounding Tools" for AI Agents?

Metrics queries establish context before the agent proposes actions. The diagnostic flow becomes:

Agent calls `k8s_top_pods` → Sees api-server-7c9 has high memory
Agent calls `k8s_describe` → Finds OOMKilled events
Agent calls `k8s_logs` → Identifies memory leak in recent logs
Agent proposes → "Recommend rollback to previous version"

This "gather signals → analyze → propose" pattern only works when metrics tools are:

Low-friction: No approval required
Fast: Sub-second response
Safe: Cannot modify cluster state
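The first three steps of that flow map onto plain read-only kubectl calls. The sketch below chains them for one namespace; the function name `gather_signals` and the 50-line log tail are illustrative assumptions, and the final "propose" step is left to the agent or operator.

```bash
# Gathers read-only signals for the pod using the most memory in a namespace.
gather_signals() {
  local ns="$1" pod
  # Step 1: metrics. Pick the pod with the highest memory usage.
  pod="$(kubectl top pods -n "$ns" --sort-by=memory --no-headers | head -n 1 | awk '{print $1}')"
  [ -n "$pod" ] || { echo "no pod metrics in namespace $ns" >&2; return 1; }
  echo "== top memory pod: $pod"
  # Step 2: describe. Look for a previous termination such as OOMKilled.
  kubectl describe pod "$pod" -n "$ns" | grep -E -A5 'Last State|OOMKilled' || true
  # Step 3: logs. Show recent lines for context (50 is an arbitrary choice).
  kubectl logs "$pod" -n "$ns" --tail=50
}

# Usage (needs cluster access):
#   gather_signals production
```

Every command here only reads cluster state, which is what lets the whole sequence run without an approval step.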

How Do Metrics Tools Reduce Operator Cognitive Load?

With `kubectl top` tools available, operators can ask questions in natural language instead of recalling commands and flags:

Without AI agent:

kubectl top pods -n production --sort-by=memory | head -10
kubectl top nodes
kubectl describe pod api-server-7c9 -n production | grep -A5 "Last State"

With AI agent:

"Show me the hottest pods in production and check if we have a node CPU bottleneck"

The agent translates intent to commands, executes them, and synthesizes results. The operator gets answers without remembering flag syntax.
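The node half of that request could be answered with a filter like the one below, which reads `kubectl top nodes` output and keeps nodes at or above a CPU percentage. The function name `saturated_nodes` and the 80% default are hypothetical; the column order assumed is the standard NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY%.

```bash
# Prints "<node> <cpu%>" for nodes whose CPU% is at or above a threshold.
# Expects the default header row from `kubectl top nodes`.
saturated_nodes() {
  local limit="${1:-80}"          # hypothetical default: 80 percent
  awk -v limit="$limit" '
    NR == 1 { next }              # skip the header row
    {
      pct = $3                    # third column is CPU%
      sub(/%$/, "", pct)
      if (pct + 0 >= limit) print $1, $3
    }'
}

# Usage (needs cluster access):
#   kubectl top nodes | saturated_nodes 80
```

Nodes that report `<unknown>` usage evaluate to 0 here and are not flagged, so missing metrics should be checked separately.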

FAQ: Kubernetes Metrics for AI Agents

What is kubectl top? `kubectl top` is a kubectl subcommand that shows near-real-time CPU and memory usage for pods and nodes by querying the Metrics API served by Metrics Server.

Do kubectl top tools require approval in Skyflo? No. Metrics queries are read-only operations that execute immediately without human approval, making them ideal for rapid diagnostics.

What's the difference between kubectl top pods and kubectl top nodes? kubectl top pods shows per-pod resource usage within a namespace. kubectl top nodes shows aggregate resource usage per node across the cluster.

How do AI agents use metrics for incident diagnosis? Agents use metrics as "grounding tools" to gather initial signals, then combine with describe, logs, and events to build a complete diagnostic picture before proposing remediation.