Diagnosing and Fixing Bad Outputs

TL;DR - When output is wrong, vague, or off-tone, don't blame the AI - diagnose which part of the prompt failed and fix that part. Bad output is feedback on your prompt.

Why it matters

Beginners often give up after one bad result. Experts treat it like debugging: each failure points to a missing piece. Fix that piece and the next attempt usually improves.

Worked example - the diagnosis table

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Too generic | Missing context/persona | Add a role and specifics |
| Wrong length/shape | No format | State length + structure |
| Made-up facts | Asked for something it can't know | Give it the data, or use a tool that browses |
| Inconsistent | No examples | Add few-shot examples or lower temperature |
| Off-tone | Tone unstated | Name the tone or paste a sample |
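
If you keep a notes file or a small script next to your prompts, the table above can be sketched as a plain lookup. This is only an illustration: the names `DIAGNOSIS` and `diagnose` are made up here, and the symptom keys are simply the table's rows in lower case.

```python
# Hypothetical lookup mirroring the diagnosis table above.
# Keys are symptoms; values are (likely cause, fix).
DIAGNOSIS = {
    "too generic": ("missing context/persona", "add a role and specifics"),
    "wrong length/shape": ("no format", "state length + structure"),
    "made-up facts": (
        "asked for something it can't know",
        "give it the data, or use a tool that browses",
    ),
    "inconsistent": ("no examples", "add few-shot examples or lower temperature"),
    "off-tone": ("tone unstated", "name the tone or paste a sample"),
}


def diagnose(symptom: str) -> str:
    """Return the likely cause and fix for a symptom from the table."""
    key = symptom.strip().lower()
    if key not in DIAGNOSIS:
        # Symptoms outside the table get no guess, only a pointer back to the checklist.
        return f"No entry for {symptom!r}; work through the fix checklist instead."
    cause, fix = DIAGNOSIS[key]
    return f"Likely cause: {cause}. Fix: {fix}."
```

Calling `diagnose("Off-tone")` returns the cause and fix from the last row. Anything not in the table falls through to the checklist rather than a guess.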

Steal this - the fix checklist

Bad output? Ask in order:
1. Did I give context/role?      -> add it
2. Did I state the format?       -> add length + structure
3. Did I give the input/facts?   -> paste them
4. Did I show an example?        -> add 1-2
5. Still off? -> refine in one small follow-up, don't restart
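
The checklist above is ordered, so it can be sketched as a function that returns the first gap it finds. This is a minimal sketch under assumptions: `PromptParts` and `next_fix` are hypothetical names, and treating a prompt as separate fields is a simplification (real prompts are usually one block of text, and not every task needs pasted facts or examples).

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """Hypothetical container for the pieces of a prompt (field names are illustrative)."""
    task: str
    context: str = ""        # role, audience, background
    output_format: str = ""  # length + structure
    facts: str = ""          # pasted input data
    examples: list[str] = field(default_factory=list)


def next_fix(parts: PromptParts) -> str:
    """Return the first checklist item the prompt has not covered yet."""
    if not parts.context.strip():
        return "add context/role"
    if not parts.output_format.strip():
        return "add length + structure"
    if not parts.facts.strip():
        return "paste the input/facts"
    if not parts.examples:
        return "add 1-2 examples"
    # Everything is present: follow step 5 of the checklist.
    return "refine in one small follow-up, don't restart"
```

Because the function stops at the first missing part, it nudges you to fix one thing per attempt instead of several at once.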

Common mistakes (and the fix)

Blaming the model. Fix: assume the prompt is the lever.
Changing five things at once. Fix: change one, re-run, learn what mattered.
Rewriting from scratch. Fix: iterate on the existing prompt.
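
The "change one, re-run" fix can be sketched as a helper that builds one variant per candidate change, each differing from the current prompt in exactly one part. The function name `one_change_variants` and the part names in the demo are assumptions for illustration. Running the variants against a model is left out on purpose.

```python
def one_change_variants(
    base: dict[str, str], changes: dict[str, str]
) -> dict[str, dict[str, str]]:
    """Return one variant per change; each differs from `base` in exactly one part."""
    variants = {}
    for part, new_value in changes.items():
        variant = dict(base)       # copy so the base prompt is never mutated
        variant[part] = new_value  # apply only this one change
        variants[part] = variant
    return variants


if __name__ == "__main__":
    # Hypothetical prompt parts; replace with your own.
    base = {"role": "", "format": "", "tone": ""}
    changes = {
        "role": "You are a support agent.",
        "format": "Reply in 3 bullet points.",
    }
    for changed_part, variant in one_change_variants(base, changes).items():
        print(changed_part, "->", variant)
```

Re-running each variant separately shows which single change moved the output, which is the information you lose when you edit five things at once.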

Good to know

This diagnose-and-iterate habit is the difference between "AI is unreliable" and "AI is a power tool." It carries straight into building evals for prompts (testing across many cases) when you start shipping AI in workflows and apps.
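
To make "testing across many cases" concrete, here is a minimal sketch of a prompt eval loop. Everything in it is an assumption for illustration: `run_eval` is a made-up name, `fake_model` is a stand-in that just upper-cases its input (swap in whatever function calls your actual model), and the template is assumed to contain an `{input}` placeholder.

```python
from typing import Callable

# A case is an input plus a check that decides whether the output is acceptable.
Case = tuple[str, Callable[[str], bool]]


def run_eval(template: str, cases: list[Case], model: Callable[[str], str]) -> float:
    """Run one prompt template over many cases and return the pass rate (0.0 to 1.0)."""
    if not cases:
        return 0.0
    passed = 0
    for user_input, check in cases:
        output = model(template.format(input=user_input))
        if check(output):
            passed += 1
    return passed / len(cases)


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; it only upper-cases the prompt."""
    return prompt.upper()


if __name__ == "__main__":
    cases = [
        ("hello", lambda out: "HELLO" in out),
        ("bye", lambda out: len(out) < 5),  # fails on purpose: the output is longer
    ]
    print(run_eval("Say: {input}", cases, fake_model))
```

The same one-change-at-a-time habit applies here: edit the template once, re-run the eval, and compare pass rates before and after.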