What does temperature do?
At each step a model assigns a probability to every possible next token. Temperature rescales those probabilities before one is sampled. A low temperature sharpens the distribution so the most likely tokens dominate, making output more deterministic and repetitive. A high temperature flattens the distribution so less-likely tokens get a real chance, making output more varied and unpredictable.
Practically: low temperature for tasks with a right answer (extraction, classification, code, factual Q&A) and higher temperature for tasks where variety is the goal (brainstorming, fiction, marketing copy). Per the OpenAI API reference, the chat `temperature` parameter ranges from 0 to 2, with higher values producing more random output.