By The DDH Team · Digital Dashboard Hub

Mistral API Cost Calculator (2026)

Mistral charges per token through La Plateforme, its hosted API. Every call has two priced streams: input tokens (the prompt, system message, prior turns you replay, tool definitions) and output tokens (everything the model writes back, including tool-call arguments and reasoning where the model emits it). Input and output bill at different per-1M rates, with output typically 3-5x more expensive than input across the Mistral family.

As of June 2026, Mistral list prices run from Mistral Small 4 ($0.10 input / $0.30 output per 1M tokens) up to Mistral Medium 3.5 ($1.50 / $7.50), a 15x spread on input and 25x on output. The headline story of the year is **Mistral Large 3 (2512)**: same model family at $0.50 input / $1.50 output, a 75% price drop versus Large 2 ($2.00 / $6.00) on what is, in most evals, a stronger model. That single move repositioned Mistral as one of the cheapest serious frontier APIs in the market.

Mistral's market position is distinctive: EU-sovereign data residency, GDPR-native, EU AI Act compliant, open-weight friendly. The hosted La Plateforme API sits price-wise between DeepSeek (cheapest) and OpenAI (most expensive), but the regulatory story is the moat: for EU enterprises that cannot send data to US-controlled clouds, Mistral is the default frontier vendor.

Below: the full June-2026 price table verified against Mistral's live pricing page, the canonical cost formula, four worked examples (1k, 100k, 1M, and a 5-turn agent loop), comparisons to OpenAI and DeepSeek, the breakeven math on La Plateforme vs self-hosting open weights, and the FAQ that captures everything that trips teams up on their first invoice. Sibling calculators: OpenAI API cost · GPT-5 cost · DeepSeek cost. Quickly draft Mistral-tuned prompts with our free ChatGPT prompt generator.

Mistral La Plateforme API price per 1M tokens — June 2026

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Notes |
|---|---|---|---|
| Mistral Large 2 | $2.00 | $6.00 | Legacy flagship (pre-2512) |
| Mistral Large 3 (2512) | $0.50 | $1.50 | Current flagship; 75% drop vs Large 2 |
| Mistral Medium 3 | $0.40 | $2.00 | Balanced cost/quality |
| Mistral Medium 3.5 | $1.50 | $7.50 | Premium reasoning tier |
| Mistral Small 4 | $0.10 | $0.30 | Cheapest production-grade tier |

Source, as of June 2026: Mistral pricing (https://mistral.ai/pricing/). No public cached-input discount on La Plateforme as of this snapshot. Free tier: up to 50 free monthly requests on La Plateforme for evaluation. Open-weight models (Mistral 7B, Mixtral 8x7B/8x22B, Mistral NeMo, the Codestral family) can be self-hosted under permissive licenses — see the self-hosting section below for the breakeven math. Pricing for fine-tuning, agents, and embeddings (Mistral Embed) is published separately and not covered in this calculator.

The cost formula (memorize this one)

Every Mistral La Plateforme API call follows the same math. There is no platform fee, no per-call fee, no minimum. You pay for what you send and what you get back, at the model's per-1M-token rate:

```
cost = (input_tokens / 1,000,000) × input_price_per_M
     + (output_tokens / 1,000,000) × output_price_per_M
```
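The formula translates directly into a few lines of Python. This is a minimal sketch: the function name `call_cost` and its parameter names are illustrative choices, not part of any Mistral SDK, and the prices passed in are the list rates quoted in this guide.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one API call given per-1M-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# 1,000 tokens in / 500 tokens out on Mistral Large 3 ($0.50 / $1.50 per 1M)
print(f"${call_cost(1_000, 500, 0.50, 1.50):.5f}")  # $0.00125
```

Keeping input and output as separate terms makes it easy to see which stream dominates a given workload.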

Unlike OpenAI and DeepSeek, Mistral does **not** currently publish a cached-input discount on La Plateforme. That means there is no automatic 90%-off lever for stable system prompts — if you are migrating from OpenAI and your bill is cache-dominated, model that explicitly when you compare quotes. The compensation is that headline rates are lower across the board (Large 3 at $0.50 input is already 80% cheaper than GPT-5.4 at $2.50, before any cache discounts on the OpenAI side).

Mistral does not bill 'reasoning tokens' as a separate stream the way OpenAI's o-series does. When a Mistral model emits chain-of-thought in its response, those tokens count as normal output tokens at the standard rate. This is simpler to budget against, but it means the reasoning trace is not a separate line item you can inspect or cap on its own, so cap `max_tokens` to keep verbose outputs in check.

The free tier matters at evaluation time: La Plateforme grants up to 50 free monthly requests per account, enough to prototype against every tier in the table before you ever attach a card. Use it to confirm Small 4 hits your quality bar before you default to Large 3.


Worked example 1: a single 1,000-in / 500-out call

Take a representative call — a 1,000-token prompt that returns a 500-token answer, roughly equivalent to a 750-word brief in and a 375-word reply out. At standard rates, the per-call cost lands as:

Mistral Large 2: (1000 / 1,000,000) × $2.00 + (500 / 1,000,000) × $6.00 = $0.002 + $0.003 = **$0.005 per call**.

Mistral Large 3: 0.001 × $0.50 + 0.0005 × $1.50 = $0.0005 + $0.00075 = **$0.00125 per call** (75% cheaper than Large 2 on identical tokens).

Mistral Medium 3: 0.001 × $0.40 + 0.0005 × $2.00 = $0.0004 + $0.001 = **$0.0014 per call**.

Mistral Medium 3.5: 0.001 × $1.50 + 0.0005 × $7.50 = $0.0015 + $0.00375 = **$0.00525 per call** (more expensive than Large 3 — see the Medium 3 vs 3.5 section for when this is worth it).

Mistral Small 4: 0.001 × $0.10 + 0.0005 × $0.30 = $0.0001 + $0.00015 = **$0.00025 per call**.

Notice the 21x spread between Small 4 ($0.00025) and Medium 3.5 ($0.00525) on identical token volumes. The right model is rarely the most expensive one — it is the cheapest tier that meets your quality bar on the actual task. For most production traffic (classification, extraction, summarization, simple Q&A), Small 4 is the answer.
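To rerun this comparison for a different token mix, the per-call figures can be reproduced with a short script. It is a sketch only: the `PRICES` dictionary is copied from the price table in this guide and the helper name `per_call_costs` is an illustrative choice.

```python
# (input, output) list prices in USD per 1M tokens, copied from the price table.
PRICES = {
    "Mistral Large 2": (2.00, 6.00),
    "Mistral Large 3": (0.50, 1.50),
    "Mistral Medium 3": (0.40, 2.00),
    "Mistral Medium 3.5": (1.50, 7.50),
    "Mistral Small 4": (0.10, 0.30),
}


def per_call_costs(input_tokens, output_tokens, prices=PRICES):
    """Map each model name to the USD cost of one call with this token mix."""
    return {
        name: (input_tokens * inp + output_tokens * out) / 1_000_000
        for name, (inp, out) in prices.items()
    }


costs = per_call_costs(1_000, 500)
for name, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name:<20} ${cost:.5f}")

# Ratio between the most and least expensive tier for this token mix.
spread = max(costs.values()) / min(costs.values())
print(f"spread: {spread:.0f}x")
```

Changing the two arguments to `per_call_costs` shows how the spread shifts for output-heavy or input-heavy calls.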


Worked example 2: 100,000 calls per month

Multiply the per-call numbers by 100,000. This is a realistic mid-size workload — daily classification on 3,000+ records, weekly summarization, a low-volume agent loop, a handful of long-form generations per business hour:

Mistral Large 2: $500/month. Mistral Large 3: **$125/month** (the same workload, 75% off, just by moving model versions). Mistral Medium 3: $140/month. Mistral Medium 3.5: $525/month. Mistral Small 4: **$25/month**.

Put this next to OpenAI: the same 100k calls on gpt-5.4 ($2.50 / $15.00) cost $1,000/month — 8x what Large 3 costs and 40x what Small 4 costs. On gpt-5.5 ($5.00 / $30.00), the bill is $2,000/month — 16x Large 3.

The single highest-EV move on a Mistral bill is not finding a 90% cache discount that does not exist. It is making sure you have not defaulted to Medium 3.5 or Large 2 when Small 4 or Large 3 would do the job. Re-run a held-out eval whenever you size up — most teams discover that 70-80% of their production traffic was Small-4-suitable all along.
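The monthly figures are just the per-call cost multiplied by call volume, so they are easy to recompute for your own traffic. The sketch below assumes the same 1,000-in / 500-out token mix and uses the list prices quoted in this guide; `monthly_cost` is a hypothetical helper name.

```python
def monthly_cost(calls_per_month, input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m):
    """USD per month for a fixed token mix at a given call volume."""
    per_call = (input_tokens * input_price_per_m
                + output_tokens * output_price_per_m) / 1_000_000
    return calls_per_month * per_call


# (input, output) USD per 1M tokens, as quoted in this guide.
MODELS = {
    "Mistral Small 4": (0.10, 0.30),
    "Mistral Large 3": (0.50, 1.50),
    "OpenAI gpt-5.4": (2.50, 15.00),
    "OpenAI gpt-5.5": (5.00, 30.00),
}

for volume in (100_000, 1_000_000):
    print(f"--- {volume:,} calls/month ---")
    for name, (inp, out) in MODELS.items():
        print(f"{name:<18} ${monthly_cost(volume, 1_000, 500, inp, out):,.0f}")
```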


Worked example 3: scaling to 1,000,000 calls

Now scale to 1M calls — a full-scale production workload (e.g., per-user summarization across a SaaS app with 30,000 active users running 33 calls/month each, or a high-volume classification pipeline):

Mistral Large 2: **$5,000/month**. Mistral Large 3: **$1,250/month**. Mistral Medium 3: $1,400/month. Mistral Medium 3.5: $5,250/month. Mistral Small 4: **$250/month**.

Same volume on OpenAI gpt-5.4: $10,000/month. Same volume on OpenAI gpt-5.5: $20,000/month. Same volume on DeepSeek-V4-Flash ($0.14 / $0.28): $280/month. At list price on this token mix, Mistral Small 4 ($250) comes in slightly below DeepSeek-V4-Flash; DeepSeek pulls ahead only where its cache-hit discount applies, and Small 4 ships from EU infrastructure.

At 1M+ calls/month the question shifts from 'what's the cheapest API' to 'what's the cheapest API that meets my latency, residency, and compliance requirements.' For US-only consumer apps with cache-friendly prompts, DeepSeek can win on raw price. For EU enterprises with GDPR exposure or any workload touching regulated data (health, finance, public sector), Mistral Small 4 at $250/month for 1M calls is the cheapest defensible option on the market.

The canonical lever order at this scale: (1) pick the cheapest Mistral tier that hits quality on a held-out eval, (2) cap `max_tokens` aggressively (output is 3-5x input price), (3) summarize agent context past 5,000 tokens instead of replaying it, (4) for batch-style async work, build the queue yourself — Mistral does not currently offer a Batch API analogue, so the discount comes from your own scheduling.


Worked example 4: a real production stack (5-turn agent loop on Large 3)

An agent loop is the worst-case cost shape: the model takes multiple turns per user query, replaying the full transcript each turn. Take a typical 5-turn loop with a 2,000-token system prompt + tool definitions and 800 tokens of initial turn content, with context then growing 200 tokens per turn as each reply is replayed:

Turn 1: 2,800 in / 200 out. Turn 2: 3,000 in / 200 out. Turn 3: 3,200 in / 200 out. Turn 4: 3,400 in / 200 out. Turn 5: 3,600 in / 200 out. Total: 16,000 input tokens + 1,000 output tokens.

On Mistral Large 3: 0.016 × $0.50 + 0.001 × $1.50 = $0.008 + $0.0015 = **$0.0095 per query** — about 7.6x a single call.

On Mistral Small 4 (if the agent logic survives the smaller model): 0.016 × $0.10 + 0.001 × $0.30 = $0.0016 + $0.0003 = **$0.0019 per query** — a fifth of Large 3's cost.

Compare to the same agent loop on OpenAI gpt-5.5: $0.080 + $0.030 = $0.11 per query. Mistral Large 3 is about **11.6x cheaper** on this workload before any caching discounts on the OpenAI side. With OpenAI's cache stack applied to a stable system prefix, gpt-5.5 drops to roughly $0.074 per query, still 7.8x more expensive than Mistral Large 3.

For 100k agent queries/month: Mistral Large 3 = $950, Mistral Small 4 = $190, OpenAI gpt-5.5 = $11,000 standard / $7,400 cached. The gap is so wide that for greenfield EU-based agent products, building on Mistral Large 3 and only escalating to Medium 3.5 or a competitor model on specific failure modes is the rational default. Build cache-friendly, instruction-tuned agent prompts free with our code prompt builder.
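The agent-loop totals can be reproduced by summing the per-turn input sizes. This sketch uses the per-turn token counts listed above (2,800 input tokens on turn 1, rising by 200 each turn, 200 output tokens per turn); the function name and parameters are illustrative.

```python
def agent_loop_tokens(first_turn_input, growth_per_turn, output_per_turn, turns):
    """Total (input, output) tokens billed when every turn replays the transcript."""
    total_input = sum(first_turn_input + growth_per_turn * t for t in range(turns))
    total_output = output_per_turn * turns
    return total_input, total_output


total_in, total_out = agent_loop_tokens(2_800, 200, 200, turns=5)
print(total_in, total_out)  # 16000 1000

# Mistral Large 3 list prices: $0.50 input / $1.50 output per 1M tokens.
cost_per_query = (total_in * 0.50 + total_out * 1.50) / 1_000_000
print(f"${cost_per_query:.4f} per query")
print(f"${cost_per_query * 100_000:,.0f} per 100k queries")
```

Because input grows with every turn, raising `turns` increases cost faster than linearly, which is why long agents benefit from trimming or summarizing context.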


Why Mistral Large 3 dropped 75% versus Large 2

Mistral Large 2 launched at $2.00 input / $6.00 output. Mistral Large 3 (codename 2512) launched eighteen months later at $0.50 / $1.50 — same family, 75% cheaper, and on most published benchmarks a stronger model. Three forces converged to make that pricing both possible and necessary.

**Architecture wins.** The Large 3 family pushes harder on sparse mixture-of-experts routing, more aggressive quantization-aware training, and improved attention kernels (FlashAttention-3 derivatives plus speculative decoding at serving time). The per-token cost of running an active expert is materially lower than running the dense Large 2 weights. Mistral did not lose margin on Large 3 — it pushed serving cost down faster than it cut the price, then handed the rest to customers.

**Competitive pressure from DeepSeek.** DeepSeek-V3 at $0.14 / $0.28 and DeepSeek-R1 at $0.55 / $2.19 (see our DeepSeek cost calculator) made every Western frontier vendor's pricing look stale. Mistral could either cede the high-volume tier to a Chinese-origin model that EU enterprises were quietly testing under the table, or aggressively reprice and keep the workload. They chose the latter, with the EU-residency story as the moat.

**Competitive pressure from open-weight Llama 4.** Meta's Llama 4 Scout and Maverick (see Groq pricing at $0.11-$0.50 input on our Llama 4 cost calculator) ship under permissive licenses, run on commodity inference providers, and approach Large-2-class quality on many tasks. The implicit anchor for any hosted model with comparable quality became Groq's Llama 4 Scout at $0.11 input. Large 3 had to land within striking distance to remain defensible as a hosted choice.

The lesson for buyers: prices in this market move down annually, sometimes 50-75% in a single version bump. If your contracts or projections lock in last year's rates as the floor, you are leaving money on the table. Re-quote at every major version cycle.


Mistral's EU data-residency moat (when this is worth paying for)

Mistral is a French company headquartered in Paris, with primary inference infrastructure in EU data centers (initial deployment via OVHcloud and Scaleway, with expanded sovereign-cloud partnerships through 2025-2026). Customer data sent to La Plateforme stays in-region under EU jurisdiction. This is not marketing — it is a material legal posture that changes which contracts are buyable.

**GDPR exposure.** Under GDPR, transferring personal data of EU residents to US-controlled cloud infrastructure requires a lawful transfer mechanism — Standard Contractual Clauses (SCCs) post-Schrems II, with documented Transfer Impact Assessments. Many EU enterprises have either banned US-cloud-hosted LLM APIs outright or wrapped them in heavy legal review that adds 6-12 weeks to procurement. Mistral on La Plateforme bypasses that entirely: the data does not leave the EU. Procurement becomes a normal vendor-onboarding conversation.

**EU AI Act compliance.** The AI Act (in force since 2024, with high-risk obligations phasing through 2026-2027) imposes documentation, risk-management, and post-market monitoring requirements on AI systems used in regulated contexts. Mistral has built explicit tooling and contractual language around these obligations. US providers offer comparable language, but for EU public sector and regulated industries (banking, insurance, healthcare, critical infrastructure), procuring an EU-domiciled vendor with EU-domiciled processing is materially simpler.

**When this is worth a price premium.** If you are a US-only B2C product with no EU users, none of this matters and DeepSeek or Groq Llama 4 are likely cheaper choices. If you are processing personal data of EU residents at any volume, the implicit cost of *not* being on a sovereign vendor — legal review time, transfer impact assessments, customer pushback, the residual risk of an enforcement action — frequently exceeds the per-token premium versus the cheapest available API. Mistral Large 3 at $0.50/$1.50 is already cheaper than OpenAI gpt-5.4, so for EU workloads the choice often pays for itself.

**When residency doesn't help you.** Mistral cannot make a non-compliant workload compliant on its own. If your application stores prompts or outputs in a US-hosted database, the residency story is broken at the storage layer. Treat the API choice as one input to a residency architecture, not the entire answer.


La Plateforme vs self-hosting Mistral open weights (the breakeven)

Mistral publishes most of its non-flagship models under permissive open-weight licenses — Mistral 7B, Mixtral 8x7B and 8x22B, Mistral NeMo, the Codestral family. Large 2 and Large 3 are hosted-only under commercial terms, but for many production workloads the open-weight models are sufficient and can be self-hosted on your own GPUs (or rented GPUs on Lambda, RunPod, CoreWeave, Together, etc.).

**The breakeven heuristic.** A single H100 GPU costs roughly $2-3/hour on spot pricing, $4-8/hour on-demand, or $20k+ up front if you buy one outright and amortize it over its service life. To run a Mixtral 8x22B-class workload at production-grade throughput and latency you typically need 4-8 H100s with redundancy, call it $15-30k/month in raw GPU cost, plus engineering time for serving (vLLM, TGI, or Triton), monitoring, autoscaling, and on-call.

Rough rule: **if your Mistral hosted bill is under ~$5,000/month, La Plateforme almost always wins** on total cost once you account for engineering overhead. Between $5k and $20k/month it is a judgment call that depends on whether you already run GPU infrastructure for other workloads (lower marginal cost) and how much engineering capacity you have to spend on serving. Above $20-30k/month, self-hosting an open-weight equivalent (Mixtral 8x22B for Large-2-class tasks, Mistral NeMo for Small-4-class tasks) starts to pencil out.
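The bands above can be encoded as a rough first-pass check. This is a sketch under stated assumptions: 730 hours per month and a $5/hour H100 rate (a point inside the on-demand range quoted above) are illustrative inputs, and the thresholds are this guide's rule of thumb, not a vendor formula.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a calendar month


def gpu_monthly_cost(num_gpus, hourly_rate_usd):
    """Raw rental cost of running num_gpus around the clock for a month."""
    return num_gpus * hourly_rate_usd * HOURS_PER_MONTH


def hosting_recommendation(hosted_bill_usd_per_month):
    """Apply the guide's rough bands to a monthly La Plateforme bill."""
    if hosted_bill_usd_per_month < 5_000:
        return "stay on La Plateforme"
    if hosted_bill_usd_per_month <= 20_000:
        return "judgment call"
    return "run the self-hosting breakeven"


# 4 and 8 H100s at an assumed $5/hour land close to the $15-30k/month range.
print(gpu_monthly_cost(4, 5.0), gpu_monthly_cost(8, 5.0))
print(hosting_recommendation(3_000))
print(hosting_recommendation(45_000))
```

Raw GPU rental is only part of the picture; engineering time for serving and on-call is not modeled here.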

**What self-hosting cannot give you.** Self-hosting buys cost control and data sovereignty at infrastructure level. It does not give you Mistral Large 3 — that model is hosted-only. It does not give you free model updates as Mistral ships new versions. It does not give you the EU-residency contractual posture without your own legal work to backstop it. And it pulls engineering attention away from product work.

**The hybrid pattern that works.** Many EU enterprises run a hybrid: hosted Mistral Large 3 on La Plateforme for the small fraction of traffic that needs frontier quality, self-hosted Mixtral 8x22B or Mistral NeMo on internal GPUs for the high-volume, lower-stakes traffic. The hosted bill stays small, the self-host bill amortizes against existing infrastructure, and the residency story is consistent across both. This is the closest equivalent to OpenAI Batch + cache in the Mistral ecosystem.


Mistral Medium 3 vs Medium 3.5 (when the 3.75x premium is worth it)

Medium 3 lists at $0.40 / $2.00. Medium 3.5 lists at $1.50 / $7.50 — 3.75x the input price, 3.75x the output price. Same naming family, very different models. The pricing tells you what Mistral thinks the use cases are.

**Medium 3 is positioned as the balanced-default tier**: high-throughput summarization, structured extraction, RAG question-answering, content moderation at scale. It is roughly the price-quality equivalent of OpenAI gpt-5.4-mini and meaningfully cheaper than gpt-5.4. If you are not specifically reasoning-bound, start here and only escalate if eval scores demand it.

**Medium 3.5 is a premium-reasoning tier.** The output price ($7.50) is higher than Mistral Large 3 ($1.50), which is the tell — Mistral is signaling 'this model thinks longer and emits more chain-of-thought before answering.' For tasks where the upgrade from Medium 3 to 3.5 measurably moves accuracy (complex code synthesis, multi-step planning, ambiguous-input classification), the premium can pencil out. For straight chat and extraction, it almost never does.

The cost-aware default: route to Medium 3 first, escalate to Medium 3.5 only on tasks where a held-out eval shows a quality delta of at least 5 percentage points. For most production teams, that filter catches under 10% of traffic and keeps the bulk of the bill on Medium 3 or Small 4.
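The escalation rule can be written as a small per-task check. In this sketch the eval scores are hypothetical placeholders, not measured results, and `pick_medium_tier` is an illustrative name; only the 5-percentage-point threshold comes from the text above.

```python
def pick_medium_tier(score_medium_3, score_medium_3_5, min_lift_pp=5.0):
    """Escalate to Medium 3.5 only if the held-out eval lift is large enough.

    Scores are percentages (0-100) from your own held-out eval.
    """
    lift = score_medium_3_5 - score_medium_3
    return "Medium 3.5" if lift >= min_lift_pp else "Medium 3"


# HYPOTHETICAL eval scores per task type: (Medium 3, Medium 3.5).
evals = {
    "extraction": (91.0, 92.5),
    "multi-step planning": (68.0, 77.0),
}

routes = {task: pick_medium_tier(m3, m35) for task, (m3, m35) in evals.items()}
print(routes)
```

Running the check per task type, rather than once for the whole workload, is what keeps the premium tier confined to the traffic that benefits from it.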


Mistral vs OpenAI vs DeepSeek (side-by-side, June 2026)

Same 1,000-in / 500-out call across the flagship models compared in this guide:

**Mistral Large 3**: $0.00125. **OpenAI gpt-5.4**: $0.010 (8x). **OpenAI gpt-5.5**: $0.020 (16x). **DeepSeek-V3**: $0.000280 (4.5x cheaper than Mistral Large 3). **DeepSeek-R1**: $0.001645 (1.3x more than Mistral Large 3, but reasoning-output-heavy).

Same 1,000-in / 500-out on cheap tiers: **Mistral Small 4**: $0.00025. **OpenAI gpt-5.4-mini**: $0.0008 (3.2x). **DeepSeek-V4-Flash**: $0.00028 (1.1x). For sub-cent traffic, Mistral Small 4 and DeepSeek-V4-Flash are roughly co-equal on price; Mistral wins on EU residency, DeepSeek wins on cache-hit pricing (90% off).

**Where Mistral Large 3 wins**: EU residency, frontier-class quality, predictable per-token pricing with no Batch/cache complexity, instruction-tuning friendly to JSON-mode and tool-use. Best for EU enterprises and any team that wants the cheapest credible frontier API without managing two discount levers.

**Where OpenAI wins**: highest absolute quality ceiling (gpt-5.5-pro for hardest reasoning), deepest tool ecosystem, mature Batch API and cache stack that can take the bill below Mistral on cache-dominated workloads. Best for US-first products and reasoning-heavy frontier work.

**Where DeepSeek wins**: cheapest per-token rates in the market, aggressive cache-hit discounts. Best for high-volume cost-sensitive workloads where the China-domicile is acceptable. See our DeepSeek cost calculator for the full breakdown.

The honest cross-vendor advice: do not pick one and lock in. Build with a thin abstraction layer (OpenAI-compatible chat completions interface), route traffic per task to the cheapest model that hits the quality bar, and re-evaluate quarterly as prices move. Compare prompts head-to-head with our GPT vs Claude vs Mistral cost calculator.
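One way to express "cheapest model that hits the quality bar" is a selection function over candidate models. The sketch below is illustrative: the `Candidate` class and `cheapest_passing` helper are hypothetical names, the prices are the list rates quoted in this guide, and the eval scores are made-up placeholders you would replace with your own held-out results.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    eval_score: float    # HYPOTHETICAL held-out eval score, 0-100


def call_cost(c: Candidate, input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * c.input_price + output_tokens * c.output_price) / 1_000_000


def cheapest_passing(candidates: Sequence[Candidate], quality_bar: float,
                     input_tokens: int = 1_000,
                     output_tokens: int = 500) -> Optional[Candidate]:
    """Return the lowest-cost candidate whose eval score meets the bar, if any."""
    passing = [c for c in candidates if c.eval_score >= quality_bar]
    if not passing:
        return None
    return min(passing, key=lambda c: call_cost(c, input_tokens, output_tokens))


CANDIDATES = [
    Candidate("Mistral Small 4", 0.10, 0.30, eval_score=81.0),
    Candidate("Mistral Large 3", 0.50, 1.50, eval_score=90.0),
    Candidate("OpenAI gpt-5.4", 2.50, 15.00, eval_score=92.0),
]

print(cheapest_passing(CANDIDATES, quality_bar=80).name)  # Mistral Small 4
print(cheapest_passing(CANDIDATES, quality_bar=85).name)  # Mistral Large 3
```

Re-running the selection each quarter with fresh prices and eval scores is a cheap way to act on the re-evaluation advice.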


Frequent mistakes that inflate the Mistral bill

**Mistake 1: still running on Mistral Large 2.** If your code calls `mistral-large-2` and has not been updated to `mistral-large-2512` (Large 3), you are paying 4x what you should for equivalent or better quality. This is the single highest-EV migration on the platform — change one model string, cut 75% off the bill.

**Mistake 2: defaulting to Medium 3.5 because the name sounds newer.** Medium 3.5 is 3.75x more expensive than Medium 3 and is positioned for reasoning-heavy tasks. For RAG, extraction, and summarization, Medium 3 is the right default and frequently the right answer outright.

**Mistake 3: not testing Small 4 before defaulting to Large 3.** Small 4 at $0.10 input is 5x cheaper than Large 3 and handles a remarkable amount of production traffic. The 50-free-requests/month free tier exists precisely so you can run this test before committing.

**Mistake 4: assuming Mistral has a cache discount like OpenAI.** It does not, as of this snapshot. If your bill estimate was built on a 90%-cache-hit assumption from your OpenAI experience, redo the math at full input price on Mistral.

**Mistake 5: not capping output.** A 200-token answer that balloons to 1,200 tokens because you forgot to set `max_tokens` costs 6x as much on the output side. On Medium 3.5, that is $0.009 of output per call vs $0.0015. Output is 3-5x input price on every Mistral model. Cap aggressively.

**Mistake 6: replaying full history every turn in a long agent.** Summarize earlier turns into a compact 200-token recap once context exceeds 5,000 tokens. You will save 50-80% on input across long sessions with no perceptible quality loss — and without a cache discount to fall back on, this is your highest-EV agent-cost lever on Mistral.
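The effect of summarizing history can be estimated with a simple simulation of billed input tokens. This sketch makes several assumptions that are not from Mistral: a 2,000-token base prompt, 1,000 new tokens of context per turn, a 20-turn session, and a free summarization step (in practice the recap itself costs one extra call).

```python
def session_input_tokens(turns, base_context, tokens_per_turn,
                         summarize_over=None, recap_tokens=200):
    """Total input tokens billed across a session.

    Each turn sends the current context, then tokens_per_turn are appended.
    If summarize_over is set and the context exceeds it, the history is
    replaced by the base prompt plus a compact recap before sending.
    """
    context = base_context
    total = 0
    for _ in range(turns):
        if summarize_over is not None and context > summarize_over:
            context = base_context + recap_tokens
        total += context
        context += tokens_per_turn
    return total


replayed = session_input_tokens(20, base_context=2_000, tokens_per_turn=1_000)
summarized = session_input_tokens(20, base_context=2_000, tokens_per_turn=1_000,
                                  summarize_over=5_000)
saving = 1 - summarized / replayed
print(replayed, summarized, f"{saving:.0%}")
```

The saving depends heavily on session length and per-turn growth, so it is worth plugging in numbers from your own transcripts.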


Sourcing methodology and how to keep these numbers current

Every price in this guide comes from Mistral's live pricing page at mistral.ai/pricing, fetched on 2026-06-20 and verified against independent corroborating sources (community pricing aggregators, recent integration commits in popular open-source projects, public Mistral SDK docs). When a number could not be verified against the official page, it was omitted — we'd rather ship a guide missing a row than ship a guide with a fabricated number.

Mistral has historically pushed price changes without explicit changelog entries, with material moves at major model launches (Large 2 → Large 3 was a 75% cut). Expect the cadence to continue — competitive pressure from DeepSeek, Llama 4, and the next generation of frontier models from OpenAI and Anthropic keeps the floor moving. If your monthly bill is over $1,000, re-verify quarterly.

**How to verify before you budget**: open mistral.ai/pricing in an incognito window (no logged-in session interfering with rendering), copy the numbers for your target models into a spreadsheet, compare against this guide. If they match, this guide is current. If they don't, trust the live page. Cross-check against the Mistral SDK release notes — meaningful pricing changes typically coincide with SDK version bumps.

**Why we omitted some rows**: pricing for Mistral Embed (the embeddings model), the Codestral code-specialized model, fine-tuning, and the dedicated agent runtime is published separately and changes more frequently. We covered the chat completion API only in this calculator — for embeddings cost see our Embeddings cost calculator which sources from each provider directly.

**Reproducible methodology**: every row in the table above has a citation; every worked example uses those rows; every FAQ answer reflects them. If you find a discrepancy with the live page, treat the live page as canonical and tell us — we re-fetch and update.

How to estimate any Mistral API call cost in 5 steps

  1. Estimate your input tokens

    Take your prompt's character count and divide by 4, or its word count and divide by 0.75. Rule of thumb: 1 token ≈ 4 characters ≈ 0.75 English words. Mistral tokenizers tokenize French and other Latin-script European languages roughly equivalently. A 500-word system prompt + a 200-word user message is roughly (500 + 200) ÷ 0.75 ≈ 933 input tokens.

  2. Estimate your output tokens

    Estimate output the same way: words ÷ 0.75. Output usually drives cost because output prices are 3-5x input on every Mistral model. If you set a `max_tokens` cap, that is your worst-case ceiling. Use it to budget conservatively, especially on Medium 3.5 where output is $7.50/1M.

  3. Look up the input and output price per 1M

    From the table above (verified June 2026): Large 2 $2.00 / $6.00, Large 3 $0.50 / $1.50, Medium 3 $0.40 / $2.00, Medium 3.5 $1.50 / $7.50, Small 4 $0.10 / $0.30. Always check the live page before shipping: Mistral has shipped 75%+ price moves at major version cycles.

  4. Apply the cost formula

    cost = (input_tokens / 1,000,000) × input_price + (output_tokens / 1,000,000) × output_price. A 1,000-in / 500-out call on Mistral Large 3 = 0.001 × $0.50 + 0.0005 × $1.50 = $0.0005 + $0.00075 = $0.00125. The same call on Small 4 = $0.00025, five times cheaper.

  5. Decide on residency and Plateforme vs self-host

    If your workload touches EU personal data and your monthly bill is under ~$5,000, La Plateforme is the right answer almost every time. Above $20k/month with steady traffic, run the breakeven on self-hosting Mixtral 8x22B or Mistral NeMo on your own GPUs. Between those bands, it is a judgment call based on existing infrastructure.
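Steps 1 through 4 can be chained into a quick estimator. This is a rough sketch that relies on the rule of thumb above (1 token ≈ 0.75 English words ≈ 4 characters) rather than a real tokenizer, so treat its output as a budgeting estimate; the function names are illustrative.

```python
def estimate_tokens_from_words(word_count):
    """Rule of thumb: 1 token is roughly 0.75 English words."""
    return round(word_count / 0.75)


def estimate_tokens_from_chars(char_count):
    """Rule of thumb: 1 token is roughly 4 characters."""
    return round(char_count / 4)


def estimate_call_cost(input_words, output_words, input_price_per_m, output_price_per_m):
    """Estimated USD cost of one call from word counts and per-1M-token prices."""
    input_tokens = estimate_tokens_from_words(input_words)
    output_tokens = estimate_tokens_from_words(output_words)
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# 500-word system prompt + 200-word user message
print(estimate_tokens_from_words(500 + 200))  # 933

# Same prompt with a 375-word reply on Mistral Large 3 ($0.50 / $1.50 per 1M)
print(f"${estimate_call_cost(700, 375, 0.50, 1.50):.6f}")
```

For invoices that need to match to the cent, count tokens with the model's actual tokenizer instead of these ratios.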

Frequently Asked Questions

How much does Mistral cost per 1 million tokens in 2026?

As of June 2026, Mistral Large 3 (the current flagship) charges $0.50 per 1M input tokens and $1.50 per 1M output tokens. Large 2 (legacy) is $2.00 / $6.00. Medium 3 is $0.40 / $2.00. Medium 3.5 is $1.50 / $7.50. Small 4 is $0.10 / $0.30. La Plateforme grants up to 50 free requests/month for evaluation. There is no public cached-input discount. Source: Mistral's live pricing page.

Mistral Large 2 vs Large 3 pricing — what changed?

Mistral Large 3 (codename 2512) launched at $0.50 input / $1.50 output per 1M tokens — a 75% drop versus Large 2's $2.00 / $6.00. Same model family, same hosted API, on most published benchmarks a stronger model. The drop was driven by architecture improvements (sparse MoE routing, better serving stack) plus competitive pressure from DeepSeek and Llama 4. If your code still references `mistral-large-2`, migrating to Large 3 cuts the bill 75% with no quality regression.

Is Mistral cheaper than GPT-5?

Yes, significantly. Mistral Large 3 at $0.50 / $1.50 is 80% cheaper on input and 90% cheaper on output than OpenAI gpt-5.4 ($2.50 / $15.00). Versus gpt-5.5 ($5.00 / $30.00), Mistral Large 3 is 90% cheaper on input and 95% cheaper on output. On a 1,000-in / 500-out call: Mistral Large 3 = $0.00125, gpt-5.4 = $0.010 (8x more), gpt-5.5 = $0.020 (16x more). OpenAI can claw back some of that gap on cache-hit-heavy workloads (90% off cached input) and Batch API (50% off both streams) — Mistral does not currently offer either lever.

Does Mistral have a free tier?

Yes. La Plateforme grants up to 50 free requests per month per account, enough to evaluate every model tier (Small 4, Medium 3, Medium 3.5, Large 3) before attaching a card. Use it specifically to test whether Small 4 hits your quality bar before defaulting to a more expensive tier — most teams discover Small 4 handles 70-80% of production traffic. The free tier does not include rate-limit relief or dedicated capacity; it is for evaluation only.

Mistral Medium 3 vs Medium 3.5 — which should I use?

Medium 3 ($0.40 input / $2.00 output) is the balanced default for high-throughput summarization, extraction, RAG, and content moderation — roughly equivalent in positioning to OpenAI gpt-5.4-mini. Medium 3.5 ($1.50 / $7.50) is a premium reasoning tier where the model emits longer chain-of-thought; the output price is higher than Mistral Large 3, which signals the use case. Start on Medium 3 and only escalate to Medium 3.5 on tasks where a held-out eval shows at least 5 percentage points of quality lift.

How much does Mistral Small 4 API cost?

Mistral Small 4 lists at $0.10 input / $0.30 output per 1M tokens, making it one of the cheapest serious APIs on the market. A 1,000-in / 500-out call costs $0.00025. At 1M calls/month: $250. At 1M calls/month on OpenAI gpt-5.4-mini for comparison, using the $0.0008 per-call figure from the side-by-side section: $800 (3.2x). For classification, extraction, simple Q&A, intent routing, and most RAG retrieval-augmented answers, Small 4 is production-grade and the right default for cost-sensitive EU workloads.

Is Mistral GDPR compliant?

Yes. Mistral is a French company with primary inference infrastructure in EU data centers. Data sent to La Plateforme stays within EU jurisdiction, which materially simplifies GDPR compliance versus US-hosted alternatives — no Standard Contractual Clauses, no Transfer Impact Assessment for the API call itself, no Schrems II exposure on the model-serving leg. Mistral also publishes EU AI Act-aligned documentation for regulated-industry buyers. Note: GDPR compliance at the application level still requires you to handle storage, logging, and downstream data flows correctly — Mistral's residency posture covers the API call, not your entire stack.

La Plateforme vs self-hosting Mistral weights — when does each win?

La Plateforme almost always wins under ~$5,000/month hosted spend once you account for engineering overhead (vLLM/TGI serving, monitoring, autoscaling, on-call). Between $5k and $20k/month it is a judgment call based on whether you already run GPU infrastructure. Above $20-30k/month, self-hosting an open-weight equivalent (Mixtral 8x22B for Large-2-class workloads, Mistral NeMo for Small-4-class workloads) starts to pencil out. Caveat: Large 3 itself is hosted-only and not available as open weights; self-hosting buys you Mixtral or NeMo, not the current flagship. The hybrid pattern — Large 3 on La Plateforme for premium traffic, self-hosted open weights for high-volume tail — is the most common production setup.
