Collected Heuristics and Rules of Thumb

Prompt Engineering for LLMs, by John Berryman and Albert Ziegler

Key Principle

These heuristics are distilled from across the entire book. Each follows from the core thesis that LLMs are text completion engines. Use this as a quick-reference checklist when designing, debugging, or reviewing prompt-based systems.

Why This Matters

Prompt engineering failures are predictable: the same mistakes recur when engineers have not internalized the model's constraints. These rules encode the most common failure modes and their fixes, which saves iteration cycles.

Prompt Design

  • Think document, not conversation. Shape every prompt to resemble a document the model saw in training. (Ch. 2, 4)
  • Instructions before content. The model reads left to right in a single pass, so instructions placed after the content arrive too late to guide how that content is read. (Ch. 2)
  • Avoid the Valley of Meh. Place critical content at the beginning and end, not the middle. (Ch. 6)
  • Use inception. Write the first few tokens of the model's answer to control format and direction. (Ch. 6)
  • Refocus after long context. Restate the question before the transition to answering. (Ch. 6)
  • Use familiar formats. Homework problems, markdown, transcripts — not ad hoc structures. (Ch. 4)
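
As a rough illustration of how several of these rules combine, the sketch below assembles a prompt with instructions first, context in the middle, the question restated at the end, and an optional answer prefix (inception). The helper name `build_prompt`, its parameters, and the `Question:`/`Answer:` labels are illustrative assumptions, not taken from the book.

```python
def build_prompt(instructions: str, context: str, question: str,
                 answer_prefix: str = "") -> str:
    """Assemble a document-style prompt (hypothetical helper, not from the book)."""
    parts = [
        instructions.strip(),               # instructions before content
        context.strip(),                    # bulk context sits in the middle
        f"Question: {question.strip()}",    # refocus: restate the question last
    ]
    # Inception: begin the answer ourselves so the model continues in that format.
    answer_start = "Answer:" + (f" {answer_prefix}" if answer_prefix else "")
    return "\n\n".join(parts + [answer_start])


prompt = build_prompt(
    instructions="Answer the question using only the report below. Reply in JSON.",
    context="Report: Q3 revenue rose while support tickets doubled.",
    question="What are the key risks?",
    answer_prefix='{"risks": [',
)
print(prompt)
```

The prompt ends with the opening of a JSON object, so a completion model is likely to continue in that format rather than start with free-form prose.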

Content Selection

  • Precision matters as much as recall. Irrelevant context actively misleads, because the model assumes everything in the prompt is there for a reason (Chekhov's Gun). Filter aggressively. (Ch. 5)
  • Instructions outrank context. In the token budget, task clarification always gets higher priority than retrieved content. (Ch. 5)
  • Shuffle few-shot examples. Prevent spurious pattern detection from incidental ordering. (Ch. 5)
  • Keep few-shot examples minimal when context is rich. They compete for the same token budget. (Ch. 5)
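
One possible way to apply these priorities is a greedy packer that reserves budget for the instructions, then admits only sufficiently relevant snippets, and finally fills what is left with shuffled few-shot examples. This is a minimal sketch: `select_content`, the relevance scores, the `min_relevance` cutoff, and the whitespace-based `count_tokens` are all assumptions made for illustration.

```python
import random


def count_tokens(text: str) -> int:
    # Crude stand-in (assumption): whitespace words. Use the model's tokenizer in practice.
    return len(text.split())


def select_content(instructions, snippets, examples, budget,
                   min_relevance=0.5, seed=0):
    """Fill a token budget: instructions first, then relevant snippets, then examples.

    snippets: list of (relevance_score, text) pairs; the scores are hypothetical.
    """
    remaining = budget - count_tokens(instructions)
    if remaining < 0:
        raise ValueError("budget is too small to hold the instructions")

    chosen_snippets = []
    for score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        if score < min_relevance:
            continue  # filter aggressively: weak matches can mislead
        cost = count_tokens(text)
        if cost <= remaining:
            chosen_snippets.append(text)
            remaining -= cost

    # Few-shot examples get whatever budget is left, in shuffled order.
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    chosen_examples = []
    for example in shuffled:
        cost = count_tokens(example)
        if cost <= remaining:
            chosen_examples.append(example)
            remaining -= cost
    return chosen_snippets, chosen_examples


snippets = [(0.9, "Refunds are issued within 14 days."),
            (0.2, "Our office dog is named Biscuit.")]
examples = ["Q: Can I return shoes? A: Yes.",
            "Q: Is shipping free? A: No.",
            "Q: Do you ship abroad? A: Yes."]
chosen_snippets, chosen_examples = select_content(
    "Answer using the policy text.", snippets, examples, budget=25)
```

With this budget the low-relevance snippet is dropped and only two of the three examples fit, which shows how examples and context compete for the same space.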

Reasoning and Output

  • Use chain-of-thought for complex tasks. The model deliberates only through the tokens it generates; without reasoning tokens it has no compute budget for deliberation. (Ch. 8)
  • Reasoning before answer, never after. Post-hoc explanations are rationalizations that cannot improve the committed answer. (Ch. 8)
  • Use logprobs as quality signals. Average the token probabilities, exp(logprob), rather than the raw logprobs. (Ch. 7)
  • Unique first tokens for classification. Prevent token conflation when options share prefixes. (Ch. 7)
  • Temperature 0 for factual tasks. Higher temperature compounds errors through self-reinforcing patterns. (Ch. 2)
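
To make the logprob heuristic concrete, the sketch below computes both averages for two completions and shows that they can rank the completions differently. The logprob values are invented for illustration, and the function names are assumptions.

```python
import math


def mean_logprob(logprobs):
    # Average of the raw log-probabilities.
    return sum(logprobs) / len(logprobs)


def mean_probability(logprobs):
    # Convert each logprob to a probability first, then average.
    return sum(math.exp(lp) for lp in logprobs) / len(logprobs)


# Hypothetical per-token logprobs for two completions.
mostly_confident = [-0.1, -0.1, -0.1, -5.0]   # one very surprising token
uniformly_unsure = [-1.3, -1.3, -1.3, -1.3]   # every token moderately unlikely

for name, lps in [("mostly_confident", mostly_confident),
                  ("uniformly_unsure", uniformly_unsure)]:
    print(name, round(mean_logprob(lps), 3), round(mean_probability(lps), 3))
```

The raw-logprob average is dominated by the single very negative value, so it scores the first completion slightly lower. The probability average scores the first completion clearly higher.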

Tools and Agency

  • Never trust prompt-only safety. The model will sometimes do exactly what you told it not to. Intercept dangerous requests in code. (Ch. 8)
  • Minimal tool definitions. Few tools, few arguments, meaningful names, no superfluous output fields. (Ch. 8)
  • Agents for exploration, workflows for execution. Use conversational agents for open-ended tasks; use structured workflows for reliable complex tasks. (Ch. 9)
  • If your system message keeps growing, build a workflow. Excessive rules in a system message signal that you need deterministic control. (Ch. 9)
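
Intercepting dangerous requests in code could look roughly like the gate below, which sits between the model's requested tool call and its execution. The tool names, the two policy sets, and the `authorize` function are hypothetical; a real system would define its own policy.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    arguments: dict = field(default_factory=dict)


# Hypothetical policy tables; a real system would define its own.
SAFE_TOOLS = {"search_docs", "read_file"}
NEEDS_CONFIRMATION = {"delete_file", "send_email"}


def authorize(call: ToolCall, user_confirmed: bool = False) -> bool:
    """Decide in code, not in the prompt, whether a requested tool call may run."""
    if call.name in SAFE_TOOLS:
        return True
    if call.name in NEEDS_CONFIRMATION:
        return user_confirmed      # a human must approve, whatever the model says
    return False                   # unknown tools are rejected by default


requested = ToolCall("delete_file", {"path": "notes.txt"})
print(authorize(requested))                       # False
print(authorize(requested, user_confirmed=True))  # True
```

Because the check runs outside the model, it holds even when the model ignores an instruction in the system message.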

Evaluation and Operations

  • Evaluation first, always. Write evaluation code before any prompt code. (Ch. 10)
  • Use SOMA for LLM-as-judge. Specific questions, ordinal scales, multiple aspects. (Ch. 10)
  • LLM assessment is relative. Compare versions against each other rather than trusting absolute scores. (Ch. 10)
  • Record latency and token consumption. Cheap to collect, critical for catching regressions. (Ch. 10)
  • Don't bake model choice into code. Use abstraction layers, because the best available model changes frequently. (Ch. 7)
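
A thin wrapper can cover two of these rules at once: callers depend on one interface rather than a specific model, and every call leaves a latency and token record. The sketch below assumes a backend is any function that returns `(text, prompt_tokens, completion_tokens)`; that shape, `recorded_call`, `CallRecord`, and `fake_model` are illustrative inventions rather than any library's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed shape: a completion function returns (text, prompt_tokens, completion_tokens).
CompletionFn = Callable[[str], Tuple[str, int, int]]


@dataclass
class CallRecord:
    model: str
    latency_s: float
    prompt_tokens: int
    completion_tokens: int


def recorded_call(complete: CompletionFn, model: str, prompt: str,
                  log: List[CallRecord]) -> str:
    """Call any backend through one interface and log latency and token counts."""
    start = time.perf_counter()
    text, prompt_tokens, completion_tokens = complete(prompt)
    latency = time.perf_counter() - start
    log.append(CallRecord(model, latency, prompt_tokens, completion_tokens))
    return text


def fake_model(prompt: str) -> Tuple[str, int, int]:
    # Stand-in backend for the demo; token counts are whitespace word counts.
    reply = "Paris"
    return reply, len(prompt.split()), len(reply.split())


log: List[CallRecord] = []
answer = recorded_call(fake_model, "fake-model-v1",
                       "What is the capital of France?", log)
print(answer, log[0])
```

Swapping models then means passing a different `CompletionFn`, and the accumulated records can be compared across versions to spot regressions.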

Debugging

  • Ask "What document does the model think this is?" Before debugging content, check if the format is misleading the model. (Ch. 4)
  • Check content placement before content quality. The Valley of Meh causes more issues than content gaps. (Ch. 6)
  • Use echo logprobs to find prompt anomalies. Tokens with very negative logprobs indicate typos or surprises. (Ch. 7)
  • Character-level tasks fail by design. The model operates on tokens, not characters. Offload to code. (Ch. 2)
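
The echo-logprobs check can be sketched as a simple threshold scan over the prompt's own tokens. The example assumes the tokens and their logprobs have already been obtained from an API that can echo the prompt with logprobs. The data, the threshold of -8.0, and the function name are hypothetical.

```python
def find_surprising_tokens(tokens, logprobs, threshold=-8.0):
    """Return (index, token, logprob) for prompt tokens the model found very unlikely.

    The threshold is an arbitrary assumption; tune it on real prompts.
    """
    flagged = []
    for i, (token, lp) in enumerate(zip(tokens, logprobs)):
        if lp is None:
            continue  # the first token may have no logprob (nothing precedes it)
        if lp < threshold:
            flagged.append((i, token, lp))
    return flagged


# Hypothetical echoed prompt tokens and logprobs; " brwn" is a typo for " brown".
tokens = ["The", " quick", " brwn", " fox"]
logprobs = [None, -2.1, -11.4, -3.0]
print(find_surprising_tokens(tokens, logprobs))  # [(2, ' brwn', -11.4)]
```

Flagged positions are candidates for review, not proof of an error: rare names and identifiers also score low.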

Key Quotes

"Don't stray far from the path upon which the model was trained." — Berryman & Ziegler, Chapter 4

"Models are inherently undependable, and with a strategy like this, we guarantee that a small portion of the time, the model will do exactly the thing you told it not to do." — Berryman & Ziegler, Chapter 8
