TL;DR - When output is wrong, vague, or off-tone, don't blame the AI - diagnose which prompt part failed and fix that. Bad output is feedback on your prompt.
Why it matters
Beginners often give up after one bad result. Experts treat it like debugging: each failure points to a missing piece of the prompt. Fix that piece and the next attempt usually improves.
Worked example - the diagnosis table
| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Too generic | Missing context/persona | Add a role and specifics |
| Wrong length/shape | No format | State length + structure |
| Made-up facts | Asked for something it can't know | Give it the data, or use a tool that browses |
| Inconsistent | No examples | Add few-shot examples or lower temperature |
| Off-tone | Tone unstated | Name the tone or paste a sample |
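If you want the table as something you can call, here is a minimal Python sketch. The names `DIAGNOSIS` and `diagnose` are made up for illustration; the rows are copied from the table above, and nothing here talks to a real model.

```python
# Hypothetical lookup table mirroring the diagnosis rows above.
# Keys are lowercase symptoms; values are (likely cause, fix).
DIAGNOSIS = {
    "too generic": ("missing context/persona", "add a role and specifics"),
    "wrong length/shape": ("no format", "state length + structure"),
    "made-up facts": (
        "asked for something it can't know",
        "give it the data, or use a tool that browses",
    ),
    "inconsistent": ("no examples", "add few-shot examples or lower temperature"),
    "off-tone": ("tone unstated", "name the tone or paste a sample"),
}


def diagnose(symptom: str) -> str:
    """Return a one-line cause-and-fix hint for a symptom from the table."""
    key = symptom.strip().lower()
    if key not in DIAGNOSIS:
        # Anything not in the table falls back to the manual checklist.
        return f"Unknown symptom {symptom!r}: walk the fix checklist instead."
    cause, fix = DIAGNOSIS[key]
    return f"{symptom}: likely {cause}. Fix: {fix}."


print(diagnose("Off-tone"))
# prints: Off-tone: likely tone unstated. Fix: name the tone or paste a sample.
```

The point is not the code itself but the habit: name the symptom first, then look up the fix, instead of rewriting blindly.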
Steal this - the fix checklist
Bad output? Ask in order:
1. Did I give context/role? -> add it
2. Did I state the format? -> add length + structure
3. Did I give the input/facts? -> paste them
4. Did I show an example? -> add 1-2
5. Still off? -> refine in one small follow-up, don't restart
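The checklist can be sketched as a small function that returns the first missing piece, in the same order. This is an illustration only: `PromptParts`, `CHECKS`, and `next_fix` are hypothetical names, and "present" here just means "not empty", which is a much cruder test than your own judgment.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """Hypothetical container for the pieces the checklist asks about."""
    task: str
    context: str = ""        # role, audience, background
    output_format: str = ""  # length + structure
    facts: str = ""          # pasted input or data
    examples: list = field(default_factory=list)


# (check, fix) pairs in the same order as checklist steps 1-4.
CHECKS = [
    (lambda p: bool(p.context.strip()), "add context/role"),
    (lambda p: bool(p.output_format.strip()), "add length + structure"),
    (lambda p: bool(p.facts.strip()), "paste the input/facts"),
    (lambda p: len(p.examples) > 0, "add 1-2 examples"),
]


def next_fix(parts: PromptParts) -> str:
    """Return the first fix whose check fails; step 5 if all pass."""
    for passed, fix in CHECKS:
        if not passed(parts):
            return fix
    return "refine in one small follow-up, don't restart"


draft = PromptParts(
    task="Summarize this report",
    context="You are a financial analyst writing for executives",
)
print(next_fix(draft))
# prints: add length + structure
```

Returning only the first failing step is deliberate: it nudges you to fix one thing, re-run, and check again.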
Common mistakes (and the fix)
- Blaming the model. Fix: assume the prompt is the lever.
- Changing five things at once. Fix: change one, re-run, learn what mattered.
- Rewriting from scratch. Fix: iterate on the existing prompt.
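One way to practice "change one, re-run" is to build variants that each differ from your current prompt in exactly one part. A rough sketch, assuming you keep the prompt as a dict of named parts (the part names, `single_change_variants`, and `render` are all invented for this example):

```python
def single_change_variants(base: dict, candidates: dict) -> dict:
    """Build one variant per candidate change.

    Each variant is a copy of `base` with exactly one part replaced,
    so any difference in output can be traced to that part.
    """
    variants = {}
    for part, new_value in candidates.items():
        if base.get(part) == new_value:
            continue  # same value as the base, so not a real change
        variant = dict(base)  # shallow copy; the base stays untouched
        variant[part] = new_value
        variants[part] = variant
    return variants


def render(parts: dict) -> str:
    """Join the non-empty parts into one prompt string, in a fixed order."""
    order = ("role", "task", "format", "tone")
    return "\n".join(parts[k] for k in order if parts.get(k))


base = {"role": "", "task": "Write a product update email.", "format": "", "tone": ""}
candidates = {
    "role": "You are a customer-success manager.",
    "format": "Under 120 words, three short paragraphs.",
    "tone": "Warm and direct.",
}

for name, variant in single_change_variants(base, candidates).items():
    print(f"--- changed: {name} ---")
    print(render(variant))
```

Run each variant separately and compare the outputs against the base prompt's output; whichever single change helps most is the piece that was missing.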
Good to know
This diagnose-and-iterate habit is the difference between "AI is unreliable" and "AI is a power tool." It carries straight into building evals for prompts (testing across many cases) when you start shipping AI in workflows and apps.