Why are LLMs so verbose by default?
Models are trained and tuned to be helpful, thorough, and safe, which biases them toward longer answers: they add context, restate your question, hedge with caveats, and wrap the answer in preamble and a friendly sign-off. None of that is a malfunction; it's the default register of an assistant optimized to leave no question unaddressed. When you want a terse answer, you're asking the model to work against that default, which is why you have to be explicit.
"Be concise" fails because it's a soft cue with no target: the model trims a little and still rambles. What works is converting concision into something checkable: a word or item count, an answer-first ordering, a list of banned filler patterns, and a fixed format. The same principles underlie clear instruction design, as covered in the complete guide to prompt engineering and how to write better prompts.
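
As a rough illustration of what "checkable" can mean, the sketch below turns the four constraints above into a prompt wrapper plus a validator you could run on the model's reply. Everything here is an assumption made for illustration: the function names, the 50-word default, and the filler phrases are hypothetical and not taken from any particular library or model. It only does string handling, so it makes no API calls.

```python
import re

# Hypothetical filler patterns; adjust them to the habits of the model you use.
BANNED_OPENERS = ("great question", "certainly", "of course", "as an ai")
BANNED_CLOSERS = ("let me know", "hope this helps", "feel free to")


def build_concise_prompt(question, max_words=50):
    """Wrap a question in explicit, checkable brevity constraints."""
    rules = [
        f"Answer in at most {max_words} words.",                       # count
        "Put the answer in the first sentence.",                       # answer-first
        "Do not restate the question, add caveats, or sign off.",      # banned filler
        "Use plain text: one short paragraph, no headings or lists.",  # fixed format
    ]
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    return f"{question.strip()}\n\nConstraints:\n{rule_lines}"


def check_response(text, max_words=50):
    """Return the list of violated constraints (empty if the reply passes)."""
    problems = []
    word_count = len(text.split())
    if word_count > max_words:
        problems.append(f"too long: {word_count} words > {max_words}")
    lowered = text.strip().lower()
    if lowered.startswith(BANNED_OPENERS):
        problems.append("starts with filler")
    if any(phrase in lowered for phrase in BANNED_CLOSERS):
        problems.append("contains sign-off filler")
    # Any line that begins with '#', '- ', or '* ' counts as a format violation.
    if re.search(r"^\s*(#|[-*]\s)", text, flags=re.MULTILINE):
        problems.append("uses headings or bullets")
    return problems
```

One way to use it: send the output of `build_concise_prompt(...)` to the model, pass the reply to `check_response(...)`, and retry or tighten the rules when the returned list is not empty. The checks are deliberately crude (substring and prefix matching), so treat a pass as a floor, not as proof of a good answer.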