By The DDH Team · Digital Dashboard Hub

What Is a Token in AI? (2026)

Tokens are how a language model counts text — and how every major provider counts your bill. Here is the plain-English version.

A token is the basic unit of text an AI language model reads and writes. In English a token is roughly 4 characters, or about 0.75 words — so 1,000 tokens is around 750 words (a rough estimate, per OpenAI and Anthropic docs). Models do not see letters or words directly; they see sequences of tokens.

Tokens matter because they are the unit every provider uses to bill you and to measure how much text fits in a request. If you want to turn token counts into dollars for a specific model, use our AI prompt cost calculator.

Token planning cheatsheet (English, rough estimates)

Quantity | Approximate equivalent
1 token | ~4 characters / ~0.75 words
1 word | ~1.33 tokens
1 sentence (~15 words) | ~20 tokens
1 paragraph (~100 words) | ~133 tokens
1 single-spaced page (~500 words) | ~667 tokens
1,000 tokens | ~750 words

Rough English estimates per OpenAI and Anthropic documentation; actual counts vary by language, code, and formatting. Count real tokens for cost-sensitive work. Pricing as of June 2026 — see OpenAI (https://developers.openai.com/api/docs/pricing), Anthropic (https://claude.com/pricing), Google (https://ai.google.dev/gemini-api/docs/pricing).

How does tokenization work?

Before a model processes text, a tokenizer splits it into tokens using a learned vocabulary. Most modern models use subword tokenization (variants of byte-pair encoding), which means common words are usually a single token while rare words, long words, code, and non-English text get broken into several pieces.

A few practical consequences fall out of this. Common English words like 'the' or 'time' are one token each. A word like 'tokenization' may be two or three tokens. Whitespace and punctuation count too — a leading space is often part of a token. Emoji and many non-Latin scripts cost more tokens per visible character, so the same sentence in, say, Japanese or Arabic can use noticeably more tokens than its English equivalent.
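To make the subword idea concrete, here is a toy byte-pair-encoding sketch. It is an illustration only: real tokenizers work on bytes, fold leading spaces into tokens, and learn tens of thousands of merges from huge corpora. The training words and the function names `learn_merges`, `apply_merge`, and `tokenize` are made up for this example.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace each adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of training words."""
    # Start with every word split into single characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pairs = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Merge the most frequent pair everywhere it appears.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged_corpus = Counter()
        for symbols, freq in corpus.items():
            merged_corpus[tuple(apply_merge(symbols, best))] += freq
        corpus = merged_corpus
    return merges


def tokenize(word, merges):
    """Split a word into characters, then replay the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


# Hypothetical training data: two common words and one rare word.
training_words = ["the"] * 50 + ["time"] * 30 + ["token"] * 5
merges = learn_merges(training_words, num_merges=5)

print(tokenize("the", merges))           # a frequent word collapses to one token
print(tokenize("tokenization", merges))  # an unseen long word stays in many pieces
```

With only five merges, the frequent words each become a single token while the long, unseen word is left in many small pieces. That is the same pattern, at toy scale, as the consequences described above.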

The 4-characters-per-token rule is only a planning heuristic. For anything cost-sensitive, count real tokens with the provider's own tooling rather than estimating from word count.


Why do tokens matter for cost?

Every major API prices per million tokens (MTok), and almost always charges more for output (the tokens the model generates) than for input (the tokens you send). That asymmetry is the single most important thing to internalize: a long answer can cost several times more than a prompt of the same length. At the rates listed below, output tokens cost roughly 5 to 8 times as much as input tokens.

As of June 2026, for example, GPT-5.4 is $2.50 in / $15.00 out per 1M tokens and the cheaper GPT-5.4-mini is $0.75 / $4.50 (see OpenAI pricing). Claude Sonnet 4.6 is $3 / $15 and Claude Haiku 4.5 is $1 / $5 (see Anthropic pricing). Gemini 2.5 Flash is $0.30 / $2.50 (see Google Gemini pricing). Prices change, so treat these as directional and check the live pages.
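The per-request arithmetic can be sketched as follows. The prices are copied from the paragraph above and will go stale. The dictionary keys are informal labels chosen for this example, not official API model identifiers, and `request_cost` is a hypothetical helper.

```python
# USD per 1M tokens as (input, output), copied from the June 2026 figures above.
# Treat these as directional; check the providers' live pricing pages.
PRICES_PER_MTOK = {
    "gpt-5.4": (2.50, 15.00),
    "gpt-5.4-mini": (0.75, 4.50),
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-haiku-4.5": (1.00, 5.00),
    "gemini-2.5-flash": (0.30, 2.50),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request at the listed per-million rates."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: a 2,000-token prompt that produces a 500-token answer.
for model in PRICES_PER_MTOK:
    print(f"{model}: ${request_cost(model, 2_000, 500):.4f}")
```

On the Claude Sonnet 4.6 row, for instance, the 500 output tokens cost $0.0075 while the 2,000 input tokens cost $0.0060. The shorter answer costs more than the prompt that is four times its length.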

Two features cut token costs sharply when your prompts repeat. Prompt caching lets you reuse the processed form of a stable prefix — on Anthropic, a cache read is about 10% of the base input price. Batch APIs trade latency for a discount — Anthropic's Batch API is 50% off both input and output. Both are documented on the Claude API pricing detail page.
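A rough sketch of what those two discounts do to a single call, using only the multipliers quoted above. It is deliberately simplified: it ignores any surcharge for writing to the cache in the first place, and it does not model combining caching with batching. The function names are hypothetical.

```python
CACHE_READ_MULTIPLIER = 0.10  # cache read is about 10% of base input price (per the text)
BATCH_MULTIPLIER = 0.50       # Batch API is 50% off input and output (per the text)


def base_cost(input_tokens, output_tokens, input_price, output_price):
    """USD cost of one call at full price; prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cost_with_cache(cached_tokens, fresh_input_tokens, output_tokens,
                    input_price, output_price):
    """USD cost when `cached_tokens` of the prompt are served from the cache."""
    # Assumption: cache-write charges are ignored in this sketch.
    billable_input = cached_tokens * CACHE_READ_MULTIPLIER + fresh_input_tokens
    return (billable_input * input_price + output_tokens * output_price) / 1_000_000


def cost_with_batch(input_tokens, output_tokens, input_price, output_price):
    """USD cost of the same call sent through a half-price batch endpoint."""
    return BATCH_MULTIPLIER * base_cost(input_tokens, output_tokens,
                                        input_price, output_price)


# Example at $3 in / $15 out: 10,000-token stable prefix, 500 fresh input, 300 output.
print(base_cost(10_500, 300, 3.00, 15.00))
print(cost_with_cache(10_000, 500, 300, 3.00, 15.00))
print(cost_with_batch(10_500, 300, 3.00, 15.00))
```

In this example the full-price call is about $0.036, the cached version about $0.009, and the batched version about $0.018. Caching wins here because most of the prompt is a reusable prefix; with a short prefix and long outputs, the batch discount would matter more.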


Why do tokens matter for context?

The other reason tokens matter is the context window: the maximum number of tokens — prompt plus output — a model can consider in a single request. Everything you send (system prompt, instructions, retrieved documents, conversation history) and everything the model generates has to fit inside that budget.

Run out of room and you have to truncate, summarize, or retrieve only the most relevant pieces rather than dumping everything in. That is exactly why estimating token counts up front is a core skill. For the full picture on limits and 1M-token windows, see "What is a context window."
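One simple truncation strategy is to keep the system prompt, reserve room for the answer, and drop the oldest conversation turns until the rest fits. The sketch below assumes the 4-characters-per-token heuristic in place of a real tokenizer, and the window sizes in the example are invented numbers, not any model's actual limits.

```python
import math


def estimate_tokens(text):
    """Rough token estimate using the 4-characters-per-token heuristic."""
    return math.ceil(len(text) / 4)


def fit_history(system_prompt, history, context_window, reserved_output):
    """Return the most recent messages that fit next to the system prompt.

    `history` is a list of message strings, oldest first. `reserved_output`
    is the number of tokens set aside for the model's reply.
    """
    budget = context_window - reserved_output - estimate_tokens(system_prompt)
    kept = []
    # Walk from newest to oldest, keeping messages while they still fit.
    for message in reversed(history):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore oldest-first order
    return kept


# Example with made-up sizes: a 40-token window and 10 tokens reserved for output.
history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
print(len(fit_history("s" * 20, history, context_window=40, reserved_output=10)))
```

Dropping whole messages from the oldest end is the bluntest option. Summarizing old turns or retrieving only relevant passages follows the same budget logic but preserves more information per token.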


How do I count or estimate tokens?

For quick planning, divide your character count by 4, or multiply your word count by about 1.33. That gets you close enough to decide whether something will fit and roughly what it will cost.
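Both rules of thumb fit in a few lines. This is a planning aid only. Rounding up and taking the larger of the two estimates are conservative choices made for this sketch, not something the provider docs prescribe.

```python
import math


def estimate_from_chars(text):
    """Estimate tokens as characters divided by 4, rounded up."""
    return math.ceil(len(text) / 4)


def estimate_from_words(text):
    """Estimate tokens as whitespace-separated words times 1.33, rounded up."""
    return math.ceil(len(text.split()) * 1.33)


def estimate_tokens(text):
    """Take the larger of the two heuristics as a conservative planning figure."""
    return max(estimate_from_chars(text), estimate_from_words(text))


sample = "the quick brown fox jumps over the lazy dog"  # 43 characters, 9 words
print(estimate_from_chars(sample))  # 11
print(estimate_from_words(sample))  # 12
```

The two heuristics disagree slightly even on plain English, which is a reminder of how rough they are.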

For real budgeting, count actual tokens. Providers expose tokenizers and token-counting endpoints, and most SDKs report exact input/output token usage on every response — log those numbers and you will know your true cost per call. When in doubt, measure on your own representative text rather than trusting a generic ratio, because code, JSON, and non-English content all skew the estimate.
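Logging reported usage can be as simple as the sketch below. The `UsageLog` class is hypothetical, and so is the idea that your SDK hands you plain input and output token counts under those names. Field names differ between providers, so adapt the `record` call to whatever your responses actually expose.

```python
from dataclasses import dataclass, field


@dataclass
class UsageLog:
    """Accumulates per-call token usage and prices it at fixed rates."""
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    calls: list = field(default_factory=list)

    def record(self, input_tokens, output_tokens):
        """Store the token counts reported for one response."""
        self.calls.append((input_tokens, output_tokens))

    def total_cost(self):
        """USD cost of all recorded calls."""
        total_in = sum(i for i, _ in self.calls)
        total_out = sum(o for _, o in self.calls)
        return (total_in * self.input_price + total_out * self.output_price) / 1_000_000

    def cost_per_call(self):
        """Average USD cost per recorded call (0.0 if nothing is recorded)."""
        return self.total_cost() / len(self.calls) if self.calls else 0.0


# Example with made-up usage numbers at $1 in / $5 out per 1M tokens.
log = UsageLog(input_price=1.00, output_price=5.00)
log.record(input_tokens=1_000, output_tokens=200)
log.record(input_tokens=3_000, output_tokens=400)
print(f"total ${log.total_cost():.4f}, average ${log.cost_per_call():.4f} per call")
```

Averages built from your own traffic are more trustworthy than any generic ratio, because they already reflect your mix of code, JSON, and languages.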

Frequently Asked Questions

How many tokens is one word?

In English a word averages about 1.33 tokens, and one token is roughly 0.75 words or 4 characters — a rough estimate per OpenAI and Anthropic docs. Rare or long words and non-English text use more tokens per word.

How many words is 1,000 tokens?

Roughly 750 words of English, give or take depending on vocabulary and formatting. Code and structured data such as JSON usually pack fewer words into the same token count.

Do input and output tokens cost the same?

No. Almost every provider charges more for output (generated) tokens than input (sent) tokens, often 5x or more. See current rates on the OpenAI, Anthropic, and Google Gemini pricing pages.

Why does my non-English text cost more tokens?

Most subword tokenizers learn their vocabulary from text that is predominantly English, so English words tend to get their own tokens while many non-Latin scripts and accented characters get split into more tokens per visible character. The same sentence can cost noticeably more in another language.

How do I count tokens accurately?

Use the provider's tokenizer or token-counting endpoint, or read the input/output token usage that the SDK returns on each response. For estimates, divide characters by 4. To turn counts into dollars, try our AI prompt cost calculator.

How are tokens related to the context window?

The context window is the maximum number of tokens (prompt plus output) a model can handle in one request. Everything you send and everything it generates must fit inside it. See "What is a context window."

Can I reduce token costs?

Yes. Trim redundant instructions, use a smaller model where quality allows, cache stable prompt prefixes (cache reads are about 10% of input price on Anthropic), and use Batch APIs for non-urgent jobs (50% off on Anthropic). See the Claude API pricing detail.
