What Is Programmatic Tool Calling and When Should You Use It?
Programmatic Tool Calling (PTC) is when an AI agent generates and executes code to orchestrate multiple tool calls, rather than making sequential natural language requests.
There's a point where natural language orchestration becomes inefficient. Consider this request:
"Check the health of all pods in production across 12 namespaces and summarize the top failure modes."
Without PTC: The agent makes 12+ sequential tool calls, each requiring a full LLM turn, burning tokens and adding latency.
With PTC: The agent writes a small script that calls tools in parallel, aggregates results locally, and returns a concise summary.
PTC is ideal when you need:
- Loops over multiple resources
- Parallel execution across namespaces/clusters
- Local aggregation before returning results
- Reduced context pollution from intermediate outputs
Why Does Code Beat Natural Language for Loops and Batching?
LLMs have fundamental limitations when it comes to repetitive operations:
| Task | LLM via Natural Language | Code |
|---|---|---|
| Long sequences | Poor consistency, drift | Deterministic |
| Parallel execution | Sequential only | Native async support |
| Intermediate results | Pollute context window | Stay local to script |
| Batching | Inconsistent grouping | Precise control |
The architecture becomes:
- LLM writes a small, sandboxed script
- Script calls
call_tool(...)repeatedly (possibly in parallel) - Only the final printed output enters the conversation context
This keeps context clean while enabling complex orchestration.
How Do You Sandbox LLM-Generated Code Safely?
The moment you allow "execute code," aggressive sandboxing becomes mandatory:
| Security Control | Implementation |
|---|---|
| Import restrictions | Whitelist only safe modules (json, asyncio, etc.) |
| Execution timeouts | Hard limit (e.g., 30 seconds) per script |
| Tool call limits | Maximum number of tool invocations per script |
| Tool restrictions | Default to read-only tools; require explicit opt-in for writes |
| Resource limits | Memory and CPU constraints on execution environment |
PTC should be an orchestration mechanism, not a backdoor shell. The code can only call pre-defined tools—it cannot make arbitrary system calls or network requests.
What Does Programmatic Tool Calling Look Like in Practice?
Here's an example of PTC for checking pod health across namespaces:
async def main():
namespaces = ["default", "kube-system", "prod", "staging"]
# Parallel tool calls - doesn't pollute conversation context
results = await asyncio.gather(*[
call_tool("k8s_get", resource_type="pods", namespace=ns, output="json")
for ns in namespaces
])
# Local aggregation
unhealthy = []
for ns, pods in zip(namespaces, results):
for pod in pods.get("items", []):
if pod["status"]["phase"] != "Running":
unhealthy.append(f"{ns}/{pod['metadata']['name']}")
# Only this output enters the conversation
print(json.dumps({
"namespaces_checked": len(namespaces),
"unhealthy_pods": unhealthy[:10], # Limit output size
"total_unhealthy": len(unhealthy)
}, indent=2))The conversation sees only the final JSON summary, not 12 separate tool outputs.
Related articles:
- The Case for Tool Search: Shrinking Context Without Losing Capability
- MCP in Practice: Standardizing DevOps Tools So AI Can't Go Rogue
FAQ: Programmatic Tool Calling for AI Agents
What is programmatic tool calling (PTC)? PTC is when an AI agent generates executable code to orchestrate multiple tool calls, enabling parallel execution and local aggregation without polluting conversation context.
When should an AI agent use code instead of natural language? Use code for loops over multiple resources, parallel execution across namespaces, batch operations, and any task where intermediate results would bloat context.
How do you secure LLM-generated code execution? Sandbox execution with import restrictions, timeouts, tool call limits, and default read-only access. The code should only interact with pre-defined tools, never arbitrary system calls.
Does PTC replace regular tool calling? No. PTC complements regular tool calling. Simple, single-resource operations should use direct tool calls. PTC is for complex orchestration that would otherwise require many sequential turns.