TL;DR - Three power-user controls: a system prompt (standing behavior), temperature (creativity vs consistency), and the context window (how much it can hold). They turn good prompting into reliable systems.
Why it matters
These don't change what you ask - they change how reliably you get it. They're what separates a competent user from someone who can make AI behave the same way every time, at scale.
The three controls
- System prompt - a standing instruction applied to every reply: "You are a concise support agent; never give legal advice." Set once, persists.
- Temperature - the creativity dial. Low (0-0.3) = consistent, factual. High (0.8+) = varied, creative.
- Context window - the model's working memory. Huge today, but for long inputs, put key instructions at the top and restate them at the end.
Worked example
Building a support bot? Use a firm system prompt ("always apologize first, never promise refunds over $50") and a low temperature so answers stay consistent - you want reliability, not surprise.
Steal this - settings by job
Factual / templated work -> temperature LOW, firm system prompt
Brainstorming / creative -> temperature HIGH, loose prompt
Long document -> instructions at top AND restated at end
Common mistakes (and the fix)
- High temperature for factual work -> inconsistent answers. Fix: turn it down.
- Losing instructions in a long input. Fix: restate the task at the end.
- No system prompt for repeated tasks. Fix: set the behavior once.
Good to know
Temperature and system prompts live in the APIs for ChatGPT, Claude, and Gemini, and inside builder tools (Custom GPTs, Claude Projects). Context windows are large now - Claude and Gemini handle very long documents - but the "top and tail your instructions" rule still pays off. You'll use all three when you build with APIs in Level 5.