Most prompt-engineering advice reads like superstition: add "you are an expert", promise the model a tip, threaten it politely. For someone building real software, that is the wrong frame entirely. A prompt is an interface between your code and a probabilistic system — and the same discipline that makes a good API makes a good prompt.
Specify the contract, not just the request
Vague prompts produce vague, drifting output. The fix is to state precisely what you want: the role, the task, the constraints, and the exact shape of the result. The more deterministic your downstream code, the more rigid that contract should be.
Treat the model like a brilliant contractor with no context. Spell out the deliverable, or accept whatever you get.
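One way to make that contract concrete is to keep its four parts as data and render them into the system prompt. This is a minimal sketch: `PromptContract` and `renderSystemPrompt` are hypothetical names chosen for illustration, not part of any library.

```typescript
// Hypothetical shape for the four parts of the contract named above.
type PromptContract = {
  role: string;
  task: string;
  constraints: string[];
  outputFormat: string;
};

// Render the contract as a plain-text system prompt, one part per line.
function renderSystemPrompt(c: PromptContract): string {
  const lines = [
    `Role: ${c.role}`,
    `Task: ${c.task}`,
    "Constraints:",
    ...c.constraints.map((x) => `- ${x}`),
    `Output format: ${c.outputFormat}`,
  ];
  return lines.join("\n");
}
```

Keeping the parts separate means a missing constraint or an unspecified output shape shows up as an empty field in review, rather than hiding inside a paragraph of prose.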
Show, don't just tell
A single worked example often teaches the model more about format and tone than a paragraph of instructions. Few-shot prompting (a couple of input/output pairs in the prompt) is usually among the highest-return techniques for anything with a consistent structure. Pick examples that cover your edge cases, not just the happy path.
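A few-shot prompt can be assembled mechanically from stored pairs. The sketch below assumes a simple `Input:` / `Output:` layout; that layout and the names `Example` and `buildFewShotPrompt` are illustrative choices, not a standard.

```typescript
// One worked input/output pair shown to the model.
type Example = { input: string; output: string };

// Build a prompt: instruction, then each example, then the real query
// with an open "Output:" slot for the model to complete.
function buildFewShotPrompt(
  instruction: string,
  examples: Example[],
  query: string,
): string {
  const shots = examples.map(
    (ex) => `Input: ${ex.input}\nOutput: ${ex.output}`,
  );
  const parts = [instruction, ...shots, `Input: ${query}\nOutput:`];
  return parts.join("\n\n");
}
```

Because the examples live in an array, adding an edge case later is a one-line change that can be reviewed like any other diff.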
Demand structured output
If your code consumes the response, never parse prose. Ask for JSON against a schema and validate it, so a malformed reply fails loudly instead of corrupting state downstream.
// ask for a contract, then verify it
// (`llm` stands in for your model client, `Schema` for a validator such as a zod schema)
const res = await llm.complete({
  system: "Reply with JSON only: { sentiment: 'pos'|'neg'|'neu', score: number }",
  prompt: review,
});
const out = Schema.parse(JSON.parse(res)); // throws on malformed JSON or schema drift
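The snippet above leaves `Schema` undefined. As a dependency-free sketch of what that validation step might look like, the function below checks the same two fields by hand; `parseSentiment` and the `Sentiment` type are hypothetical names, and a schema library would do the same job with less code.

```typescript
type Label = "pos" | "neg" | "neu";
type Sentiment = { sentiment: Label; score: number };

const LABELS: readonly string[] = ["pos", "neg", "neu"];

// Type guard: true only for one of the three allowed labels.
function isLabel(x: unknown): x is Label {
  return typeof x === "string" && LABELS.indexOf(x) !== -1;
}

// Parse and validate a raw model reply. Throws on anything off-contract
// so a bad reply cannot flow into downstream state.
function parseSentiment(raw: string): Sentiment {
  const data: unknown = JSON.parse(raw); // throws SyntaxError on non-JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("expected a JSON object");
  }
  const fields = data as Record<string, unknown>;
  const sentiment = fields["sentiment"];
  const score = fields["score"];
  if (!isLabel(sentiment)) {
    throw new Error("sentiment must be 'pos', 'neg', or 'neu'");
  }
  if (typeof score !== "number" || !isFinite(score)) {
    throw new Error("score must be a finite number");
  }
  return { sentiment, score };
}
```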
Test prompts like code
A prompt that works on three inputs can quietly fail on the fourth, and you will never notice without a harness. Build a small evaluation set of representative inputs with expected properties, and re-run it whenever you change the prompt or the model. Prompts are code with fuzzy edges — they deserve the same regression safety net.
- Pin the model version so behaviour does not shift under you.
- Keep an eval set of tricky inputs and assert on properties of the output.
- Version your prompts in the repo, next to the code that uses them.
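A regression harness for prompts does not need to be elaborate. The sketch below assumes the model is reachable through an async function from prompt to reply; `Model`, `EvalCase`, `checkOutput`, and `runEvals` are hypothetical names, and each check asserts a property of the output rather than an exact string.

```typescript
// One representative input plus a property its output must satisfy.
type EvalCase = {
  name: string;
  input: string;
  check: (output: string) => boolean;
};

// Assumed shape of a model call: prompt in, reply text out.
type Model = (prompt: string) => Promise<string>;

// Returns a failure message, or null when the output satisfies the case.
function checkOutput(c: EvalCase, output: string): string | null {
  return c.check(output)
    ? null
    : `${c.name}: unexpected output ${JSON.stringify(output)}`;
}

// Run every case through the model and collect failure messages.
async function runEvals(
  model: Model,
  buildPrompt: (input: string) => string,
  cases: EvalCase[],
): Promise<string[]> {
  const failures: string[] = [];
  for (const c of cases) {
    // A rejected model call counts as a failure instead of aborting the run.
    const output = await model(buildPrompt(c.input)).catch(() => null);
    const failure =
      output === null ? `${c.name}: model call failed` : checkOutput(c, output);
    if (failure !== null) failures.push(failure);
  }
  return failures;
}
```

Run it whenever the prompt or the pinned model version changes, and treat a non-empty result the way you would treat a failing unit test.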
The teams that get reliable results from LLMs are not the ones with the cleverest wording. They are the ones who treat the prompt as a tested, versioned part of the system — because that is exactly what it is.