How coding agents work - Agentic Engineering Patterns

Source

Summary

Simon Willison breaks down the exact mechanics of how coding agents (Claude Code, Codex, etc.) work under the hood: an LLM in a loop with a system prompt and callable tools. The piece demystifies the full stack from tokenisation and chat-templated prompts through tool calling, token caching optimisation, and reasoning/thinking modes.

Key Insight

  • A coding agent is just a harness around an LLM: system prompt + tools + a loop. The fundamental mechanics are surprisingly simple — a basic tool loop can be built in a few dozen lines of code on top of an LLM API.
  • Tool calling works by injecting tool definitions into the system prompt, parsing structured function calls from the model’s response (often via regex), executing them, and feeding results back as a new message. Most coding agents define 12+ tools; the most powerful are Bash() and Python() for arbitrary code execution.
  • Token caching is a critical cost optimisation: agents are deliberately designed to avoid modifying earlier conversation content so that the provider’s cached input token rate applies. This is an architectural constraint that shapes how agents manage state.
  • Conversations are stateless replays — the entire conversation is re-sent to the model on every turn, so input-token costs grow roughly quadratically over a session. Understanding this explains why long agent sessions get expensive and why context window size matters.
  • Reasoning/thinking modes (a major advance of 2025) let models spend extra tokens “thinking out loud” before responding. This is especially useful for debugging because it lets the model trace complex code paths and follow function calls back to the source of an issue.
  • The system prompt is the invisible control layer — often hundreds of lines long. Willison links to the OpenAI Codex system prompt as a concrete example of what these look like.
  • Vision LLMs process images as tokens in the same pipeline as text (not via separate OCR/image analysis) — a common misconception worth correcting.
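The “harness around an LLM” described in the first two bullets can be sketched in a few dozen lines. Everything below is illustrative, not any vendor's API: the `TOOL:name{json}` call format, the `run_agent` function, and the stubbed model are hypothetical stand-ins, and a safe calculator replaces the Bash()/Python() tools a real agent would expose.

```python
import json
import re

# Toy tool registry -- real agents expose Bash()/Python(); here a safe calculator.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only
}

# Regex for structured calls the "model" emits, e.g. TOOL:calculator{"expr": "2+2"}
TOOL_CALL = re.compile(r'TOOL:(\w+)(\{.*\})')

def run_agent(model, user_message, max_turns=10):
    """The whole agent: send messages, parse tool calls, execute, loop."""
    messages = [
        {"role": "system", "content": "You may call tools as TOOL:name{json args}."},
        {"role": "user", "content": user_message},
    ]
    for _ in range(max_turns):
        reply = model(messages)              # full conversation sent every turn
        messages.append({"role": "assistant", "content": reply})
        match = TOOL_CALL.search(reply)
        if not match:                        # no tool call -> final answer
            return reply
        name, args = match.group(1), json.loads(match.group(2))
        result = TOOLS[name](args["expr"])
        messages.append({"role": "tool", "content": result})  # feed result back
    return "max turns exceeded"

# Stub model: calls the calculator once, then answers with the result it saw.
def stub_model(messages):
    tool_results = [m["content"] for m in messages if m["role"] == "tool"]
    if not tool_results:
        return 'TOOL:calculator{"expr": "6 * 7"}'
    return f"The answer is {tool_results[-1]}."

print(run_agent(stub_model, "What is 6 * 7?"))  # -> The answer is 42.
```

Swapping the stub for a real API client and the calculator for subprocess-backed Bash()/Python() tools is what turns this sketch into a working coding agent.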
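The stateless-replay cost can be made concrete with a little arithmetic. The function and the token counts below are illustrative assumptions (a fixed-size system prompt, a fixed number of new tokens per turn), but they show why spend per session grows quadratically rather than linearly:

```python
def total_input_tokens(turns, system_tokens=2000, tokens_per_turn=500):
    """Total input tokens billed across a session when every turn
    replays the full conversation so far (illustrative numbers)."""
    total = 0
    history = system_tokens
    for _ in range(turns):
        history += tokens_per_turn   # new user/tool content this turn
        total += history             # the whole history is sent again
    return total

print(total_input_tokens(10))    # -> 47500
print(total_input_tokens(100))   # -> 2725000, far more than 10x the 10-turn cost
```

A 10x longer session costs roughly 57x the input tokens under these assumptions, which is exactly why cached input rates and context compaction matter so much to agent economics.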