Model card · Verified against xAI docs · 2026-06-20

Grok-4: Full Spec Sheet (June 2026)

By The DDH Team at Digital Dashboard Hub·Updated June 20, 2026

Stop writing AI prompts from scratch.

Tell us your business + your task + your model. We write the prompt — perfectly tuned for ChatGPT, Claude, Grok, Gemini, Midjourney, or any model. Plus 500+ pre-built prompts in your library.

Grok-4 is xAI's flagship model, released July 2025 as the successor to Grok 3 and Grok 3 Beta. It is the most distinctive frontier model in the 2026 menu: native real-time access to the X (Twitter) firehose, an explicit 'Think' mode for extended reasoning, and a competitive position on benchmark scores against GPT-5, Claude Opus 4.7, and Gemini 2.5 Pro despite xAI being the youngest of the major labs.

Headline numbers: $3 per 1M input tokens, $15 per 1M output, $0.75 per 1M for cached input (75% off). Context window is 256,000 tokens. Max output is 64,000 tokens per response. Knowledge cutoff is November 2024 (with real-time X data on top via the search tool). Modalities are text + image input; text output. Function calling, parallel tool use, structured outputs (JSON Schema), real-time X search, and the Live Search tool are all supported.

Below: full spec table, when Grok-4 is the right call vs GPT-5 or Claude Opus, what the X integration actually does, the minimal API request, and 8 FAQs. Sibling pages: GPT-5 spec sheet · Claude Opus 4.7 spec sheet · Grok-4 cost calculator. Write a Grok-tuned prompt free with our ChatGPT prompt generator.

Digital Dashboard Hub

Writing good prompts for ONE AI is hard. Writing them for GPT-5, Claude, Gemini, Perplexity, Midjourney and 6 more is a full-time job. DDH's AI Prompt Builder writes once, runs everywhere — locked to your niche, voice, and brand tone.

Free 14 days, no card. →

Grok-4 — Full spec sheet (June 2026)

Feature	Grok-4 spec
Provider	xAI
Model ID (API)	grok-4
Released	July 2025
Input price (per 1M)	$3.00
Cached input price (per 1M)	$0.75 (75% off)
Output price (per 1M)	$15.00
Live Search tool (per 1k sources)	$0.025
Context window	256,000 tokens
Max output tokens	64,000 tokens
Modalities (input)	Text, image
Modalities (output)	Text
Function calling
Parallel tool use
Structured outputs (JSON Schema)
Streaming
Think mode (reasoning)
Live Search (X + web)
Vision (image understanding)
Knowledge cutoff	November 2024
Endpoint	api.x.ai/v1/chat/completions (OpenAI-compatible)

Sources verified 2026-06-20: xAI models documentation (https://docs.x.ai/docs/models), xAI pricing (https://docs.x.ai/docs/models#models-and-pricing), xAI Live Search docs (https://docs.x.ai/docs/guides/live-search). The Live Search tool is billed separately at $0.025 per 1,000 sources processed. Grok's API is OpenAI-compatible — point the OpenAI SDK at api.x.ai/v1 with an xAI key. Re-verify the live pages before budgeting.

What Grok-4 actually is (and what makes it different)

Grok-4 is xAI's first model to land on the same benchmark tier as the GPT-5 / Claude Opus 4.7 / Gemini 2.5 Pro frontier. Where Grok 3 lagged the frontier by 6-12 months on most benchmarks, Grok-4 is competitive on reasoning, code, and math — and ahead on a small set of benchmarks that reward real-time data access.

What makes it structurally different: native real-time access to the X (Twitter) firehose via the Live Search tool. Ask Grok-4 'what are people saying about the iPhone 18 launch right now?' and it issues a live X search, processes the recent posts, and synthesizes a response with citations to specific X posts. No other frontier model has this — Perplexity comes closest with web search but doesn't have first-class X data.

Think mode is xAI's name for explicit extended reasoning, comparable to GPT-5's `reasoning_effort: high` or Claude's extended thinking. Enabled via the `reasoning_effort` parameter (`low` or `high` — Grok-4 does not expose `medium`). When enabled, Grok-4 burns reasoning tokens before producing the visible answer, billed at the output rate.

The API is OpenAI-compatible: point the OpenAI Python or Node SDK at `api.x.ai/v1` with an xAI API key, change the model ID to `grok-4`, and existing OpenAI integration code works without further changes. That portability lowers the migration cost from GPT-5 dramatically.

Pricing math: what Grok-4 actually costs per call

Standard rates: `cost = (input_tokens / 1M) × $3 + (output_tokens / 1M) × $15`. The representative 1,000-in / 500-out call: `0.001 × $3 + 0.0005 × $15 = $0.003 + $0.0075 = $0.0105`. About 1¢ per call — same as Claude Sonnet 4.6, about 5× more than gpt-5-mini, about 5× less than Claude Opus 4.7.

Cached input bills at $0.75/M (75% off). xAI's caching is automatic and prefix-based, similar to OpenAI's: structure your prompt prefix-first (stable system + tools), put dynamic context last in user messages, and cache hits trigger automatically on subsequent calls within the cache window.

Live Search adds $0.025 per 1,000 sources processed. A typical query that pulls 20-50 recent X posts adds <$0.002 per call — usually a negligible fraction of the LLM cost. For high-volume real-time monitoring (e.g., scanning 10K posts per query), Live Search billing becomes a meaningful line item.

Reasoning tokens (Think mode `high`) bill at the output rate. A 3,000-token reasoning burn adds `0.003 × $15 = $0.045` per call — meaningfully more than the base call. Use Think mode deliberately. Worked $ at scale: Grok-4 cost calculator.

The X integration: when it actually matters

Grok-4's Live Search tool can target three sources: the X (Twitter) firehose, the open web, or RSS feeds (per the xAI docs). Configure via the `search_parameters` block in the API request. The model decides when to invoke Live Search based on the user query's temporal sensitivity ('what happened today', 'current sentiment', 'recent posts about').

When the X integration actually matters: social-listening workloads (sentiment on a brand, reactions to a news event, viral content tracking), real-time event response (during a product launch, an outage, a market event), influencer-aware research (what are the relevant accounts in this space saying), trend detection.

When it doesn't matter: classification, extraction, code synthesis, content generation, anything that doesn't depend on what was posted in the last 24 hours. For these workloads, Grok-4 is competing on raw model quality vs GPT-5 / Claude Opus / Gemini 2.5 Pro — and the choice should be made on cost, ecosystem fit, and benchmark performance on your specific tasks.

Caveat: the X integration's value depends on whether the X firehose has signal for your use case. B2B software developers will find less utility than consumer-brand marketing teams or news organizations.

Context window: 256K, between Opus and GPT-5

Grok-4 ships with a 256,000-token context window — larger than Claude Opus 4.7 (200K), smaller than GPT-5 (400K), much smaller than Gemini 2.5 Pro (1M). Comfortably handles RAG workloads, multi-turn conversations, code review of moderate-sized files. Not the right pick for full-codebase or full-book reasoning where Gemini 2.5 Pro's 1M context is necessary.

Max output is 64,000 tokens per response — same as Claude. For generated content longer than 64K, chunk via multiple calls.

Recall across the 256K window is generally strong per xAI's public benchmarks, comparable to Opus 4.7 at similar context utilization. The practical bottleneck for most workloads is cost, not recall.

Think mode: explicit reasoning, two levels

Grok-4 exposes `reasoning_effort` with two levels: `low` (model-decided light reasoning, default) and `high` (extended thinking, comparable to GPT-5's `reasoning_effort: high`). Unlike GPT-5's four-level dial (`minimal`/`low`/`medium`/`high`), Grok-4 is binary on the production-API surface.

Reasoning tokens are billed at the output rate ($15/M) and are not returned to you (xAI returns reasoning content as a separate `reasoning` block on supported endpoints; consult docs for current behavior).

When to use `high`: complex code synthesis, math/proof tasks, multi-step planning with branching logic, deep analysis tasks where the output quality justifies the 2-5× cost. When to use `low`: chat, content generation, classification, extraction, anything where the model's instinct response is good enough.

Function calling, structured outputs, OpenAI-compatible API

Grok-4 supports the OpenAI-compatible function calling API: define tools as JSON Schema in the `tools` parameter, the model picks one (or several in parallel) and returns arguments. Parallel tool use is supported and on by default.

Structured outputs via `response_format: {type: 'json_schema', json_schema: {...}}` — same shape as OpenAI's structured outputs. xAI guarantees the output validates against the schema for the responses where the model would have produced a tool call.

The OpenAI-compatible API is the easiest migration story in the frontier menu. `client = OpenAI(base_url='https://api.x.ai/v1', api_key=os.environ['XAI_API_KEY'])` and your existing OpenAI integration code runs against Grok-4 with the model ID change.

When to pick Grok-4 vs GPT-5 vs Claude Opus vs Gemini 2.5 Pro

**Pick Grok-4** when real-time X data is part of the task, when you want OpenAI-compatible API migration speed at a mid-tier frontier price, or when you've benchmarked Grok-4 and it specifically wins on your task class. The price-performance position vs Claude Sonnet 4.6 is competitive ($3/$15 both); the differentiator is X access and reasoning effort.

**Pick GPT-5** when broader ecosystem matters (Responses API, Assistants, structured-output maturity), when 400K context is needed, or when reasoning-effort granularity (4 levels vs 2) matters for cost optimization.

**Pick Claude Opus 4.7** when long-form writing voice, refusal-calibration discipline, or extended thinking depth on the hardest tasks is the bottleneck. 5× more expensive than Grok-4 on input — pay the premium only when the quality gap is measurable.

**Pick Gemini 2.5 Pro** when 1M context, native video/audio input, or built-in tools (code execution, Search grounding) replace custom orchestration. Better feature-for-price ratio than Grok-4 on multimodal workloads.

Verified sources and how to re-check the numbers

Every number on this page was verified against xAI's live documentation on 2026-06-20. Sources: docs.x.ai/docs/models for context, modalities, and feature support; docs.x.ai/docs/models#models-and-pricing for input/output/cached prices; docs.x.ai/docs/guides/live-search for Live Search pricing and configuration.

xAI's pricing has been stable since Grok-4's launch, but xAI is the youngest of the major labs and pricing changes are likely as the platform matures. Re-verify quarterly.

Methodology: when a number could not be cross-confirmed against an official xAI page on the verification date, it was omitted from this card rather than guessed.

Make your first Grok-4 API call in 5 steps

1
Get an xAI API key
console.x.ai → API Keys → Create. Add credits before first call. Set `XAI_API_KEY=...` in `.env`.
2
Use the OpenAI SDK with the xAI base URL
No xAI-specific SDK needed. Python: `from openai import OpenAI; client = OpenAI(base_url='https://api.x.ai/v1', api_key=os.environ['XAI_API_KEY'])`. Existing OpenAI integration code works.
3
Send a minimal call
Python: `r = client.chat.completions.create(model='grok-4', messages=[{'role': 'user', 'content': 'Hello'}]); print(r.choices[0].message.content)`. Same shape as GPT-5 chat completions; only the model ID and base URL differ.
4
Enable Live Search for real-time queries
For X-aware queries: pass `search_parameters={'mode': 'auto', 'sources': [{'type': 'x'}]}`. Grok-4 decides when to invoke Live Search based on the query's temporal sensitivity. Adds $0.025 per 1k sources processed.
→ Open the ChatGPT prompt generator
5
Enable Think mode for hard reasoning
For complex code synthesis, math, or multi-step planning: pass `reasoning_effort='high'`. Grok-4 burns reasoning tokens before producing the visible answer; reasoning tokens bill at the output rate ($15/M). Use deliberately — `high` typically 2-5× the cost of `low`.

Digital Dashboard Hub

The prompt patterns above work 10x better when they live in a library you actually own — tunable to your niche, exportable to GPT-5, Claude, Gemini, Perplexity, Midjourney, Llama. Stop pasting across 6 tools.

Try DDH's AI Prompt Builder — free 14 days, no card. →

Related calculators

OpenAI Pricing Calculator →GPT-5.5, 5.4, mini, nano — full per-call cost in one input.Claude Pricing Calculator →Opus 4.8, Sonnet 4.6, Haiku 4.5, Fable 5 — input + output combined.Context Window Comparison →Max input length and price per 1M for every current model.

Related prompt tools

Prompt generator (Grok-tuned)→GPT-5 spec sheet→Claude Opus 4.7 spec sheet→Gemini 2.5 Pro spec sheet→Grok-4 cost calculator→

Frequently Asked Questions

How much does Grok-4 cost in 2026?

$3 per 1M input tokens, $15 per 1M output tokens, $0.75 per 1M for cached input (75% off). Live Search bills separately at $0.025 per 1,000 sources processed. A representative 1,000-in / 500-out call costs ~$0.0105 — same as Claude Sonnet 4.6, about 5× more than gpt-5-mini. Source: docs.x.ai/docs/models, verified 2026-06-20.

What is Grok-4's context window?

256,000 tokens — larger than Claude Opus 4.7 (200K), smaller than GPT-5 (400K), much smaller than Gemini 2.5 Pro (1M). Comfortably handles RAG, multi-turn chat, and moderate-sized code review. Max output is 64,000 tokens per response.

Does Grok-4 have real-time access to X (Twitter)?

Yes, via the Live Search tool. Configure with `search_parameters={'mode': 'auto', 'sources': [{'type': 'x'}]}`. Grok-4 decides when to invoke Live Search based on query's temporal sensitivity. Bills at $0.025 per 1,000 sources processed. The only frontier model with first-class X data access.

Is Grok-4's API compatible with OpenAI?

Yes. Point the OpenAI Python or Node SDK at `https://api.x.ai/v1` with an xAI API key, change the model ID to `grok-4`. Existing OpenAI integration code works without further changes — fastest migration path of any frontier model.

What is Think mode on Grok-4?

xAI's name for explicit extended reasoning. Enable via `reasoning_effort='high'` (default is `low`). Grok-4 burns reasoning tokens before producing the visible answer; reasoning tokens bill at the output rate ($15/M). Use for complex code synthesis, math, multi-step planning; skip for classification/chat.

What is Grok-4's knowledge cutoff?

November 2024 for the base model. With Live Search enabled, Grok-4 can fetch real-time X posts and web pages to augment the base knowledge. The cutoff is effectively 'now' for queries that trigger Live Search; baseline cutoff applies for everything else.

Does Grok-4 support vision?

Yes — text + image input via the standard OpenAI-compatible vision message format. Pass images as URLs or base64-encoded data inside a user message's content array. Output is text only.

Where is Grok-4 available?

xAI API (api.x.ai), the Grok consumer apps (grok.com, X.com Premium and SuperGrok tiers), and the X mobile/web apps for X Premium subscribers. API and consumer billing are separate.

Real-time X access is power. Wasted Think tokens are bill.

Our AI Prompt Generator writes Grok-tuned prompts (reasoning-mode aware, Live-Search-targeted, OpenAI-API-compatible) based on YOUR business + task. 14-day free trial of DDH Pro, no card.

Browse all prompt tools →