Prompting · Practical · May 12, 2026 · 6 min read

Prompting still matters. Here's the cheap stuff that works.

You don't need a 4,000-token system prompt to ship a good AI feature. You need clear roles, three examples, and a stopping condition. Here's a working template.

Gautam Manak

Founder, doc2mcp

Prompt engineering went from "the future of jobs" to "kind of embarrassing to talk about" in 18 months. Both takes are wrong. A good prompt is still the cheapest reliability lever you have; it just doesn't have to be precious.

Four blocks. Always the same four.

Every system prompt we ship in production has exactly four sections:

1. Role        — who the model is, in one sentence
2. Capabilities — what tools/data it has
3. Style       — how to format, what to avoid
4. Stop rule   — when to ask vs guess vs escalate

That's it. The 4,000-token monstrosities you see online are 90% ritual. The model isn't reading them as carefully as you think.
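
One way to keep every prompt in that shape is to build it from a plain object instead of a free-form string. This is a minimal sketch: `buildSystemPrompt` and its field names are made up for illustration, not part of any library.

```javascript
// Hypothetical helper: assembles the four sections into one system prompt.
// Field names (role, capabilities, style, stopRule) are assumptions for illustration.
function buildSystemPrompt(sections) {
  const order = [
    ["Role", "role"],
    ["Capabilities", "capabilities"],
    ["Style", "style"],
    ["Stop rule", "stopRule"],
  ];
  return order
    .map(([title, key]) => {
      // Accept either a single string or a list of lines; drop empty entries.
      const lines = [].concat(sections[key] ?? []).filter(Boolean);
      // Refuse to build a prompt with a missing block, so all four are always present.
      if (lines.length === 0) throw new Error(`missing section: ${key}`);
      return `# ${title}\n${lines.join("\n")}`;
    })
    .join("\n\n");
}

const systemPrompt = buildSystemPrompt({
  role: "You are doc2mcp's onboarding assistant.",
  capabilities: ["- search_docs(query)", "- generate_mcp(url)"],
  style: ["- Be direct. No filler intros."],
  stopRule: ["- If a tool fails twice in a row, surface the error verbatim and stop."],
});

console.log(systemPrompt);
```

Throwing on a missing section is a design choice: it turns "we forgot the stop rule" into a build-time error rather than a production surprise.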

Three examples beat ten rules

Models learn faster from examples than from constraints. Pick the edge cases that broke you in dev. Show them as input → expected output. Two or three is usually enough. The "few-shot" framing isn't magic; it's just clearer than prose.
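
In code, few-shot examples are usually just extra chat turns placed before the real user message. The sketch below assumes a generic `{ role, content }` message shape, and some providers take the system prompt as a separate parameter instead. The helper name and the example pairs are invented for illustration.

```javascript
// Hypothetical helper: turns input → expected-output pairs into chat messages.
// The { role, content } shape is an assumption; adapt it to your provider's API.
function buildMessages(systemPrompt, examples, userInput) {
  const shots = examples.flatMap(({ input, output }) => [
    { role: "user", content: input },
    { role: "assistant", content: output },
  ]);
  return [
    { role: "system", content: systemPrompt },
    ...shots,
    { role: "user", content: userInput },
  ];
}

// Made-up edge cases for this sketch; use the ones that actually broke you in dev.
const examples = [
  { input: "https://example.com/docs", output: "Generating an MCP server for https://example.com/docs." },
  { input: "example.com/docs", output: "Generating an MCP server for https://example.com/docs." },
  { input: "What's the weather?", output: "I can only help with MCP, docs, and AI tooling." },
];

const messages = buildMessages("You are an onboarding assistant.", examples, "https://example.org/api");
console.log(messages.length); // 8: one system message, three example pairs, one real input
```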

Temperature is a setting, not a religion

For tool calls and structured outputs, set temperature: 0 (or close to it). For brainstorming and copy, use 0.7–1.0. There is no principled reason to default to 0.5; that's the "I have no idea" temperature.
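
A small lookup keeps that decision in one place. The task names below are assumptions, and the values are simply picked from the ranges above.

```javascript
// Illustrative defaults picked from the ranges in the text; task names are assumptions.
const TEMPERATURE_BY_TASK = new Map([
  ["tool_call", 0],
  ["structured_output", 0],
  ["copy", 0.7],
  ["brainstorm", 1.0],
]);

function temperatureFor(task) {
  if (!TEMPERATURE_BY_TASK.has(task)) {
    // Fail loudly instead of silently falling back to a middling value.
    throw new Error(`no temperature configured for task: ${task}`);
  }
  return TEMPERATURE_BY_TASK.get(task);
}

console.log(temperatureFor("tool_call")); // 0
console.log(temperatureFor("copy")); // 0.7
```

Throwing on an unknown task forces whoever adds a new route to pick a temperature on purpose.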

Chain-of-thought, with restraint

Asking the model to "think step by step" still works on hard reasoning tasks. It also adds latency and tokens. We only enable it on the routes where evals show it actually moves the needle. Default off, opt in.
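
"Default off, opt in" can be as simple as a per-route flag. This is a sketch under assumptions: the route names, the `chainOfThought` flag, and the exact suffix wording are all hypothetical.

```javascript
// Hypothetical per-route config: chain-of-thought is off unless a route opts in.
const ROUTES = {
  "/summarize": {},
  "/plan-migration": { chainOfThought: true },
};

const COT_SUFFIX = "\n\nThink step by step before giving the final answer.";

function promptForRoute(route, basePrompt) {
  // Unknown routes and routes without the flag both fall through to "off".
  const enabled = ROUTES[route]?.chainOfThought === true;
  return enabled ? basePrompt + COT_SUFFIX : basePrompt;
}

console.log(promptForRoute("/summarize", "Summarize the page.") === "Summarize the page."); // true
console.log(promptForRoute("/plan-migration", "Plan the migration.").endsWith("final answer.")); // true
```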

Cheap evals, today

Set up the smallest possible eval before you ship: 20 hand-picked inputs, each with an expected output (or a "must include" substring). Run them in CI on every prompt change. You'll catch most regressions and feel a lot more comfortable touching the prompt at 3 AM.

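import assert from "node:assert/strict";

// `cases` (a list of { input, expect } pairs) and `runAgent` come from your own project.
for (const { input, expect } of cases) {
  const out = await runAgent(input);
  // Pass if the expected substring appears anywhere in the output.
  assert.ok(out.includes(expect), `failed: ${input}`);
}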
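
If you would rather see every failing case in one run instead of stopping at the first, a slightly longer variant collects failures and supports several "must include" substrings per case. This is a sketch: `runEvals` and the `mustInclude` field are hypothetical names, and the agent here is a stub standing in for a real model call.

```javascript
// Sketch of an eval runner that reports all failures instead of stopping at the first.
// `runAgent` is passed in so the example stays self-contained.
async function runEvals(cases, runAgent) {
  const failures = [];
  for (const { input, mustInclude } of cases) {
    const out = await runAgent(input);
    // Record which required substrings are absent from the output.
    const missing = mustInclude.filter((s) => !out.includes(s));
    if (missing.length > 0) failures.push({ input, missing });
  }
  return failures;
}

// Stub agent for demonstration only; replace it with a real model call.
const fakeAgent = async (input) => `echo: ${input}`;

runEvals(
  [
    { input: "hello", mustInclude: ["echo", "hello"] },
    { input: "bye", mustInclude: ["goodbye"] },
  ],
  fakeAgent
).then((failures) => {
  // One failure here: the "bye" case is missing "goodbye".
  console.log(failures);
});
```

In CI you would typically exit non-zero when the returned list is non-empty.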

A template you can paste

# Role
You are doc2mcp's onboarding assistant. You help developers paste a docs URL and ship a working MCP server.

# Capabilities
You can call:
- search_docs(query)
- generate_mcp(url)
- explain_config(editor)

# Style
- Be direct. No filler intros.
- If the user pastes a URL, generate immediately. Don't ask for confirmation.
- Use short, scannable answers with code blocks.

# Stop rule
- If a tool fails twice in a row, surface the error verbatim and stop.
- If the user asks for advice outside MCP/docs/AI tooling, politely redirect.
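
The first stop rule is an instruction to the model, but it can also be backed up in the calling code. The sketch below is one possible way to do that, not a description of how doc2mcp works; `withStopRule` and the shape of its return value are assumptions.

```javascript
// Hypothetical wrapper: enforces "two consecutive failures, then stop" in code.
// `tool` is any async function; the names here are assumptions for this sketch.
function withStopRule(tool, maxConsecutiveFailures = 2) {
  let consecutiveFailures = 0;
  return async (...args) => {
    try {
      const result = await tool(...args);
      consecutiveFailures = 0; // any success resets the streak
      return { ok: true, result };
    } catch (err) {
      consecutiveFailures += 1;
      const stop = consecutiveFailures >= maxConsecutiveFailures;
      // Pass the error text through unchanged so it can be shown verbatim.
      return { ok: false, error: String(err?.message ?? err), stop };
    }
  };
}

// Stub tool that always fails, for demonstration only.
const generate = withStopRule(async (url) => {
  throw new Error(`could not fetch ${url}`);
});

generate("https://example.com/docs")
  .then((first) => {
    console.log(first.stop); // false: only one failure so far
    return generate("https://example.com/docs");
  })
  .then((second) => {
    console.log(second.stop); // true: two failures in a row
  });
```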

You don't need a prompt longer than your function. You need a prompt that tells the model exactly when to stop guessing.
