Part 2 — Context and Instruction Engineering

Instruction Engineering for Agents

Sections in this chapter

1The three tiers of agent context
2What belongs in the system prompt
3Structured outputs
4Constraint design
5Failure-aware prompting
6Prompt versioning
7Prompt libraries and policy variables
8Spotlighting: untrusted content marking
9Context window economics

Key Takeaways

Insight

A useful rule of thumb: if it can be authored by someone who shouldn't be able to change the agent's behaviour, it doesn't belong in the system prompt. Retrieved content, user messages, and tool outpu

Insight

The ten-failure-modes exercise is the single best prompt-quality investment a team can make. Do it on paper, then do it on traces (real production failures), then repeat monthly. Prompts that ignore f

Common Trap

Confidence fields are only useful if they are calibrated. A model that returns 0.9 on every response is not reporting confidence; it is reporting a constant. Calibrate by evaluating: sample 100 output

Common Trap

The most expensive prompt is the one no one can find the current version of. "I think Sarah updated it a few weeks ago" is a production-incident sentence. If the current live prompt is not the one a

Interview Questions

Your agent keeps hallucinating file paths. Fix it without changing the model.

▲

Frame: this is an instruction-layer and context problem, not a model problem. Prescribe: (a) enumerate the valid file paths in the system prompt or as retrieved context; (b) add a list_files tool and require the agent to call it before referencing any file; (c) add an output guardrail that validates any p

How do you version a prompt and prevent regressions when the model provider silently updates?

▲

Frame: prompts in source control with changelogs; prompt pinning by version; a regression eval suite in CI; scheduled daily evaluation runs against the golden dataset with alerting on regression; canary traffic for new model versions. The eval suite is the only actual line of defence; everything else is or

Explain spotlighting and when you'd use it.

▲

Frame: structural separation of untrusted content from instructions via unambiguous delimiters. Use it whenever any external content (retrieved documents, tool outputs, web pages, user messages from low-trust channels) is in the agent's context. Acknowledge it is a mitigation, not a cure — pair with output

What belongs in the system prompt versus the task prompt versus retrieved context?

▲

Frame: the three-tier model (trust decreasing top to bottom), concrete examples of each, and the rule that anything an untrusted party could author does not belong in the system prompt.

How do you design constraints in a prompt that the model will actually follow?

▲

Frame: concrete beats abstract; pair with fallback; ground the why when it matters; check for contradiction; verify with evals, not intuition.

What's the economics of the context window?

▲

Frame: tokens cost money, add latency, dilute attention, and have the lost-in-the-middle effect. Design rules: order by importance, evict aggressively, retrieve less-but-better, cache stable prefixes. Cache hit rate is a tracked metric.