Prompting 101: Anthropic's 10-block prompt skeleton

Tags: claude, prompt-engineering, system-prompt, few-shot, xml-tags, extended-thinking, prefill, vision
Originally from youtube.com

My notes

Summary

Anthropic Applied AI engineers Hannah and Christian walk a Swedish car-insurance vision task from a one-line prompt (whose answer hallucinates a skiing accident on Chappangan street) to a production-grade prompt that returns confident, structured fault verdicts. The session is essentially a live tour of Anthropic’s recommended 10-block prompt skeleton plus the levers (XML tags, prefill, input ordering, extended thinking) that move quality the most.

Key Insight

The 10-block prompt skeleton (in this exact order; a minimal assembly sketch in Python follows the list):

  1. Task context (role and what Claude is doing)
  2. Tone context (e.g. “stay factual and confident, refuse if unsure”)
  3. Background data / documents / images (the static stuff that never changes, perfect for prompt caching)
  4. Detailed task description and rules
  5. Examples (few-shot, ideally wrapped in XML)
  6. Conversation history (if user-facing)
  7. Immediate task / request
  8. Reminder of important guidelines (anti-hallucination instructions, refusal rules)
  9. Thinking step-by-step
  10. Output formatting
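
As a concrete illustration, here is one way the skeleton could be wired up with the Python SDK. All block contents, tag names, and the model ID are placeholders of mine, not the demo's exact prompt; read it as a minimal sketch of the ordering, not a drop-in implementation.

```python
# Minimal sketch of the 10-block skeleton as one Messages API call.
# Block contents, tag names, and the model ID are illustrative placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

system_prompt = "\n".join([
    # 1. Task context
    "You are an insurance-claims analyst for a Swedish car insurer.",
    # 2. Tone context
    "Stay factual and confident; refuse if the sketch is unintelligible.",
    # 3. Background data (static, so a good prompt-caching candidate)
    "<form_structure>The form has 17 rows of accident scenarios; each row"
    " has one checkbox for Vehicle A and one for Vehicle B.</form_structure>",
    # 4. Detailed task description and rules
    "<rules>Read the form first, then the sketch. Decide which vehicle"
    " is at fault.</rules>",
    # 5. Examples (few-shot, wrapped in XML)
    "<examples><example>Row 3 checked for B only => Vehicle B at fault."
    "</example></examples>",
])

user_message = "\n".join([
    # 6. Conversation history would precede this in a user-facing app (omitted)
    # 7. Immediate task / request
    "Assess the claim form and accident sketch.",
    # 8. Reminder of important guidelines
    "Answer only if confident, and cite the checkbox rows you relied on.",
    # 9. Thinking step-by-step
    "Think step by step before answering.",
    # 10. Output formatting
    "Wrap your final answer in <final_verdict> tags.",
])

resp = client.messages.create(
    model="claude-sonnet-4-20250514",  # "Claude 4 (Sonnet)" per the demo; exact ID may differ
    max_tokens=2048,
    temperature=0,  # deterministic structured output, as in the demo
    system=system_prompt,
    messages=[{"role": "user", "content": user_message}],
)
print(resp.content[0].text)
```

The numbered comments map one-to-one onto the skeleton; in a real call the form and claim images would travel as image content blocks (see the few-shot sketch further down).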

Specific tactics surfaced (not generic “be specific” advice):

  • Order matters more than people realise. The presenters explicitly told Claude to read the form before the sketch, the same order a human would follow, since the doodle is meaningless without the form’s context. Reordering alone moved Claude from “unsure” to “Vehicle B at fault.” Whenever multiple inputs feed a judgment, prescribe the inspection order.
  • Static vs dynamic content split. The 17-row Swedish form description belongs in the system prompt (cacheable, never changes); the filled-in image goes in the user message (see the caching sketch after this list). This is the biggest cost lever once you scale, and Claude stops re-narrating the form structure on every call.
  • XML tags beat Markdown for sectioning. Claude was fine-tuned heavily on XML, so wrappers like <form_structure>, <examples>, <final_verdict> give it explicit anchors to refer back to. They also make output parsing trivial (the prefill sketch after this list parses one of these tags).
  • Prefill = forced output shape. Stuffing [ or <final_verdict> in as the first assistant tokens forces JSON or wrapped output with no preamble (see the prefill sketch after this list). Use this when piping Claude into a SQL/JSON pipeline.
  • Extended thinking is a prompt-engineering crutch. Turn it on, read the thinking transcript, see how Claude reasons about your data, then bake the missing intuition back into the system prompt (see the extended-thinking sketch after this list). Cheaper long-term than leaving thinking always-on.
  • Tone instruction = hallucination control. “Stay factual, only commit when confident, refuse if the sketch is unintelligible” is a cheap, reliable refusal lever. Pair it with “cite which checkbox you saw” so that every factual claim must reference its source row.
  • Few-shot with images works. You can base64-encode example images inside the system prompt and label them with the right verdict (see the few-shot sketch after this list). For ambiguous edge cases, the presenters recommend tens-to-hundreds of human-labeled examples; that’s production scale, not 2-3.
  • “Carefully examine” has side effects. Adding “carefully examine each box” made Claude verbose, narrating every checkbox. Verbosity is a knob: dial it in via instructions like “list only checked boxes” if you don’t want the play-by-play.
  • Settings used in the demo: Claude 4 (Sonnet), temperature 0, generous max_tokens. Standard for deterministic structured-output tasks.
  • The reminder block matters. Repeating critical rules at the end of the prompt (after all the context) measurably reduces drift. “Answer only if confident. Cite your evidence. Wrap final answer in <final_verdict>.”
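
The static/dynamic split, sketched with prompt caching: a cache_control marker tells the API to cache the static system block, while the per-claim image rides in the user message. Form text and file name are my placeholders.

```python
# The 17-row form description is marked cacheable in the system prompt;
# only the filled-in claim image changes per call. Note that caching only
# kicks in once the block exceeds a minimum token count.
import base64
import anthropic

client = anthropic.Anthropic()

with open("claim_form.jpg", "rb") as f:  # hypothetical scan of the filled-in form
    claim_b64 = base64.b64encode(f.read()).decode()

resp = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    temperature=0,
    system=[{
        "type": "text",
        "text": "<form_structure>(the full 17-row Swedish form description)</form_structure>",
        "cache_control": {"type": "ephemeral"},  # static block: cached across calls
    }],
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/jpeg", "data": claim_b64}},
            {"type": "text",
             "text": "Who is at fault? Wrap your answer in <final_verdict> tags."},
        ],
    }],
)
print(resp.content[0].text)
```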
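The prefill tactic: ending the messages list with a partial assistant turn forces Claude to continue from that token, and the same XML tag makes parsing trivial. A minimal sketch with placeholder prompt text:

```python
# Prefill: the conversation ends with a partial assistant turn, so Claude's
# reply continues from "<final_verdict>" with no preamble. The API does not
# echo the prefill back, so re-attach it before parsing.
import re
import anthropic

client = anthropic.Anthropic()

resp = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    temperature=0,
    messages=[
        {"role": "user",
         "content": "Based on the form, which vehicle is at fault? "
                    "Answer inside <final_verdict> tags."},
        # The prefill: forces the output shape (use "[" instead to force a JSON array)
        {"role": "assistant", "content": "<final_verdict>"},
    ],
)

raw = "<final_verdict>" + resp.content[0].text
match = re.search(r"<final_verdict>(.*?)</final_verdict>", raw, re.DOTALL)
print(match.group(1).strip() if match else raw)
```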
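Extended thinking as a diagnostic, per the bullet above: enable it, print the transcript, mine it for intuition, then move that intuition into the system prompt and switch thinking back off. A sketch, assuming the standard thinking parameter (the budget must sit below max_tokens):

```python
# Extended thinking as a prompt-engineering crutch: run it once, read the
# transcript, bake the useful reasoning into the system prompt, then turn
# it off again for production.
import anthropic

client = anthropic.Anthropic()

resp = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=8000,  # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 4000},
    # temperature stays at its default; it cannot be lowered while thinking is on
    messages=[{"role": "user",
               "content": "Which vehicle is at fault in the attached claim?"}],
)

for block in resp.content:
    if block.type == "thinking":
        # Mine this transcript for intuition to move into the system prompt.
        print("--- thinking ---\n" + block.thinking)
    elif block.type == "text":
        print("--- answer ---\n" + block.text)
```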
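Few-shot with images, sketched as labeled example turns. Whether the examples live in the system prompt (as in the demo) or as prior conversation turns is an implementation choice; file names and verdicts here are hypothetical.

```python
# Few-shot with images: each example is a labeled (image -> verdict) pair.
# Per the demo, production systems want tens to hundreds of human-labeled
# examples, not two or three.
import base64
import anthropic

client = anthropic.Anthropic()

def image_block(path: str) -> dict:
    """Package a JPEG file as a base64 image content block."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode()
    return {"type": "image",
            "source": {"type": "base64", "media_type": "image/jpeg", "data": data}}

resp = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=512,
    temperature=0,
    messages=[
        # Labeled example turn (repeat this pair once per example)
        {"role": "user", "content": [image_block("example_claim_1.jpg"),
                                     {"type": "text", "text": "Who is at fault?"}]},
        {"role": "assistant",
         "content": "<final_verdict>Vehicle B at fault</final_verdict>"},
        # The actual claim to assess
        {"role": "user", "content": [image_block("new_claim.jpg"),
                                     {"type": "text", "text": "Who is at fault?"}]},
    ],
)
print(resp.content[0].text)
```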