Key Principle
Because the model has a fixed compute budget per token and processes text unidirectionally, the arrangement of content matters as much as the content itself. The Valley of Meh is an attention dead zone in the early-middle of a prompt, caused by the interaction of in-context learning and the lost-middle phenomenon. The sandwich technique, inception, and knapsack-style assembly algorithms treat prompt construction as an engineering optimization problem rather than freeform writing.
Why This Matters
Engineers naively pack context in document order, causing critical information to land in the attention dead zone. The result: outputs that seem to miss obvious context, leading teams to debug content quality when the real problem is content placement. Systematic prompt assembly with scored elements, elastic snippets, and token budget optimization prevents both overflow (truncating critical content) and underutilization (leaving valuable context out).
Good Examples
The Valley of Meh. Two independent phenomena interact: (1) in-context learning gives later tokens disproportionate influence on the completion, and (2) the "lost middle" phenomenon means models recall the beginning and end of a prompt better than the middle. Their intersection creates a trough in the early-middle where context has minimal impact. Place high-importance elements outside this valley, and keep prompts concise: shorter prompts have shallower valleys. (Chapter 6)
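One way to keep high-importance elements out of the valley is to order them by score before assembly. The sketch below is an illustration, not a method taken from the chapter: the function name `arrange_outside_valley` and the alternating placement rule are assumptions, and real prompts often have logical ordering constraints that override it.

```python
def arrange_outside_valley(elements):
    """Order (text, importance) pairs so low-importance items sit in the early-middle.

    Hypothetical helper: the most important element goes last, the second
    most important goes first, and the rest fill inward from both ends.
    """
    ranked = sorted(elements, key=lambda e: e[1], reverse=True)
    head, tail = [], []
    for rank, (text, _importance) in enumerate(ranked):
        # Even ranks (0, 2, 4, ...) are placed toward the end of the prompt,
        # odd ranks toward the start.
        if rank % 2 == 0:
            tail.append(text)
        else:
            head.append(text)
    # tail was built from the outside in, so reverse it to put rank 0 last.
    return head + tail[::-1]
```

With five elements scored 5 down to 1, the element scored 1 lands in the middle, and the early-middle slot holds a lower score than the late-middle slot.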
The sandwich technique. Structure prompts in four parts: introduction (states the question, activates relevant knowledge early), context elements (the "meat"), refocus (restates the question with specifics after the context), and transition (shifts from problem description to solving). This anchors the task at the two high-attention positions. (Chapter 6)
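The four-part structure can be sketched as a small assembly function. The name `build_sandwich`, its parameters, and the blank-line separator are assumptions chosen for illustration; the chapter describes the structure, not this code.

```python
def build_sandwich(introduction, context_elements, refocus, transition):
    """Assemble a prompt as introduction, context, refocus, transition.

    Hypothetical helper. Parts are separated by blank lines so that each
    one starts on a clean whitespace boundary.
    """
    parts = [introduction.strip()]
    parts.extend(element.strip() for element in context_elements)
    parts.append(refocus.strip())      # restates the question after the context
    parts.append(transition.strip())   # shifts from describing to solving
    return "\n\n".join(parts)


prompt = build_sandwich(
    introduction="Question: why does the nightly export job time out?",
    context_elements=["Log excerpt: ...", "Job configuration: ..."],
    refocus="Given the log and configuration above, why does the export job time out?",
    transition="Analysis:",
)
```

The question appears at both ends of the prompt, which are the two positions the text identifies as high-attention.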
Inception. Write the first portion of the model's answer as part of the prompt. Because autoregressive generation cannot backtrack, the model accepts this as its own and continues consistently. Removes uncertainty about the completion's opening, making outputs reliable and parseable. (Chapter 6)
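A minimal sketch of inception for a JSON answer follows. The helper names `with_inception` and `parse_completion` are hypothetical, and the completion string in the usage example is a stand-in for model output, not a real response.

```python
import json


def with_inception(prompt, answer_prefix):
    """Append the start of the desired answer to the prompt (hypothetical helper)."""
    return prompt.rstrip() + "\n\n" + answer_prefix


def parse_completion(answer_prefix, completion):
    """Re-attach the prefix before parsing, since the model only returns the rest."""
    return json.loads(answer_prefix + completion)


prefix = '{"sentiment": "'
full_prompt = with_inception("Classify the sentiment of this review: ...", prefix)

# Stand-in for what a model might return after the prefix.
fake_completion = 'positive"}'
result = parse_completion(prefix, fake_completion)
```

Because the prefix fixes the opening of the answer, the parser can assume the output begins as JSON rather than with free-form preamble.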
Document type selection. Three archetypes: advice conversation (dialogue), analytic report (structured exposition), structured document (XML/YAML/JSON). LLMs respect scope boundaries more consistently in reports than in dialogues. GitHub Copilot used code comments with explicit side remarks to provide cross-file context naturally. (Chapter 6)
Counterpoints
Tokenization is not inert, which breaks budget calculations. Token counts are not additive under concatenation: "cat" + "tail" (2 tokens separately) becomes "cattail" (3 tokens: [c][att][ail]). Separate elements with whitespace, and prefer elements that start with a space, so that each element tokenizes the same way in the assembled prompt as it does on its own. (Chapter 6)
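One defensive pattern is to join elements with a whitespace separator and check the budget against the assembled string instead of summing per-element counts. In this sketch, `join_elements` and `fits_budget` are hypothetical names, and `count_tokens` is an assumed callable that wraps whatever tokenizer the target model uses.

```python
def join_elements(elements, separator="\n"):
    """Join non-empty elements with whitespace so adjacent tokens cannot merge."""
    cleaned = [element.strip() for element in elements]
    return separator.join(element for element in cleaned if element)


def fits_budget(elements, budget, count_tokens, separator="\n"):
    """Check the budget on the final string, not on the sum of the parts.

    count_tokens is an assumed callable: str -> int.
    """
    return count_tokens(join_elements(elements, separator)) <= budget
```

Counting on the final string costs an extra tokenizer call, but it catches any case where the separator or a boundary tokenizes differently than expected.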
Elastic snippets avoid binary include/exclude. Elements with multiple length versions (full chapter down to key quotes) let the assembly engine ask "what is the longest version that fits?" rather than all-or-nothing inclusion. This maximizes information density. (Chapter 6)
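The "longest version that fits" question can be sketched as a single selection function. The name `longest_fitting` is hypothetical, and `count_tokens` is again an assumed callable standing in for the model's tokenizer.

```python
def longest_fitting(versions, remaining_tokens, count_tokens):
    """Return the longest version of a snippet that fits, or None if none fit.

    versions: alternative renderings of the same element, in any order
    (for example full text, summary, key quotes).
    """
    fitting = [v for v in versions if count_tokens(v) <= remaining_tokens]
    if not fitting:
        return None
    return max(fitting, key=count_tokens)
```

Returning `None` lets the caller distinguish "no version fits" from an empty snippet, so the element can be dropped explicitly.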
Without a refocus, the model drifts. After long context stretches, the model's attention shifts away from the original question, producing off-topic completions. Without a transition, completion models may continue adding context rather than answering. (Chapter 6)
Key Quotes
"The Valley of Meh is caused by the interaction of in-context learning and the lost middle phenomenon, creating an attention dead zone in the early-middle of the prompt." — Berryman & Ziegler, Chapter 6
Rules of Thumb
- Place the most important content at the beginning and end of the prompt, not the middle
- Always include a refocus section after long context — restate the question with specifics
- Use inception to control the opening of the completion and ensure parseability
- Treat prompt assembly as a knapsack problem: maximize information value within token budget
- Use elastic snippets (multiple length versions) instead of binary include/exclude
- Separate snippets with whitespace to maintain tokenization inertness
- Choose document format deliberately: reports for scope control, dialogues for naturalness, structured formats for parsing
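The knapsack rule above can be sketched with a standard 0/1 knapsack dynamic program. This is an illustration under assumptions, not the book's algorithm: the `Element` fields and `select_elements` are hypothetical, token counts are taken as precomputed integers, and elements are assumed to be whitespace-separated so that their counts add up.

```python
from dataclasses import dataclass


@dataclass
class Element:
    text: str
    value: float    # importance score assigned upstream
    tokens: int     # precomputed token cost
    position: int   # intended order in the final prompt


def select_elements(elements, budget):
    """Pick the subset with maximum total value whose token cost fits the budget."""
    # best[b] = (total value, indices chosen) using at most b tokens
    best = [(0.0, ())] * (budget + 1)
    for i, element in enumerate(elements):
        # Iterate budgets downward so each element is used at most once.
        for b in range(budget, element.tokens - 1, -1):
            candidate = best[b - element.tokens][0] + element.value
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - element.tokens][1] + (i,))
    chosen = [elements[i] for i in best[budget][1]]
    # Restore the intended prompt order after selection.
    return sorted(chosen, key=lambda e: e.position)
```

Selecting by value and then re-sorting by position keeps the budget decision separate from the placement decision, which makes each easier to tune.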
Related References
- How LLMs Process Information - Unidirectional attention explains why position matters
- What Goes Into the Prompt - What to include; this reference covers how to arrange it
- Designing LLM Applications - The feedforward pass pipeline that feeds into assembly