AI Harness EngineeringChapter 2 of 19

Part 1Foundations

02

Core Agent Concepts

Sections in this chapter

  1. 1The agent loop
  2. 2State and continuation
  3. 3Tools and function schemas
  4. 4Handoffs and specialist sub-agents
  5. 5Single-agent versus multi-agent: a decision framework
  6. 6When not to use an agent at all
  7. 7Canonical loop architectures
  8. 8Agent looping failure

Key Takeaways

Insight

The hardest production bug class is state-shape inconsistency: the agent thinks the system is in state , but it is actually in state because a previous tool call partially succeeded. Mitigations: idem

Insight

These four are not alternatives you pick from a menu. They are primitives you compose. Most real agents are ReAct with a Plan-and-Execute phase at the top and a Reflexion-style memory update at the bo

Common Trap

Let's use a supervisor and three workers" is not an answer. It is a design starting point that you now have to defend with specific failure modes, specific coordination costs, and specific benefits

Interview Questions

1

Walk me through the ReAct loop. Where does it fail?

Frame: state the four operations (observe, plan, act, verify). Name the three classic failures: repeated-action loop, premature conclusion, action drift. Contrast with Plan-and-Execute (better plan artefact, worse plan staleness) and Reflexion (learns across attempts, risks reflection poisoning).

2

Your agent enters an infinite loop of the same tool call. Give me three layers of defence.

Frame: hard budgets (step/token/wall-clock) as the floor; behavioural duplicate-call detection in the middle; structural idempotency and forced exploration as the ceiling. Note that layer 1 alone is insufficient because it burns budget before firing.

3

When would you recommend a simple chain over an agent?

Frame: if the decision graph can be drawn in advance, use a chain. Agents pay for non-determinism; pay only when the next action genuinely depends on the previous result.

4

Design a handoff protocol between a reader agent and a writer agent.

Frame: what context is serialised at the boundary; what the reader's output contract looks like; how authority flips; how the handoff-depth limit is enforced; where the approval gate sits; who owns the final answer.

5

Given a task, how do you decide single-agent vs. multi-agent?

Frame: the single-agent defaults (one context, one permission scope, one trace) against the multi-agent triggers (role separation, parallel exploration, permission boundaries, scale, model routing). Insist on at least two triggers before adopting multi-agent.

6

What's the difference between a tool and a handoff?

Frame: a tool returns a result to the same agent; a handoff transfers control. Tools share the caller's context; handoffs serialise a new one. A handoff is typically used when the sub-problem wants its own permission scope or its own model.