Overview
Mistral Small 3 is Mistral AI's latest open-weight model for everyday production tasks: fast, affordable, and capable enough to handle the vast majority of real-world use cases without reaching for a larger, more expensive model. A 24-billion-parameter model released under the Apache 2.0 license, it occupies an appealing middle ground: strong benchmark performance (81.0 on MMLU), practical hardware requirements for self-hosting, and near-zero cost via Mistral's free API tier.
Apache 2.0 License
Unlike many model releases that impose restrictive commercial licenses or usage policies, Mistral Small 3 is released under the Apache 2.0 license, one of the most permissive mainstream open-source licenses applied to AI models. In practice this means:
- Commercial use: Build and sell products using this model without royalties or restrictions.
- Modification: Fine-tune, adapt, and redistribute modified versions.
- Minimal attribution: No requirement to credit Mistral in your product's user interface, although redistributed copies and derivative works must retain the license text and copyright notices.
- No usage caps: The license doesn't restrict how many users or requests you can serve.
For organisations with legal or compliance teams that scrutinise AI vendor agreements, Apache 2.0 removes most of the friction.
24B Parameters — The Sweet Spot
At 24 billion parameters, Mistral Small 3 is substantially smaller than 70B models but meaningfully larger than 7B models. This places it in a hardware sweet spot:
- Full BF16 precision: ~48GB VRAM — fits on 2× RTX 4090 (48GB total) or 1× A100 80GB.
- 4-bit quantised: ~13GB VRAM — fits on a single RTX 3090/4090, or Apple Silicon with 16GB unified memory.
- 8-bit quantised: ~24GB VRAM — single A100 40GB or high-end consumer GPU.
This means a capable team can self-host Mistral Small 3 for a fraction of the infrastructure cost required by larger models.
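The arithmetic behind these figures is straightforward: the weights alone need roughly parameter count times bytes per parameter, and quantisation scales, activations, and the KV cache add overhead on top (which is why the 4-bit figure above is ~13GB rather than a bare 12GB). A minimal sketch of the weights-only calculation:

```python
def weight_vram_gb(params_billion: float, bits_per_param: float) -> float:
    """VRAM needed for the weights alone, in decimal GB.

    Quantisation scales, activations, and the KV cache add more on
    top -- real usage varies with context length and batch size.
    """
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# Mistral Small 3 at 24B parameters:
for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_vram_gb(24, bits):.0f} GB")
# BF16: ~48 GB, 8-bit: ~24 GB, 4-bit: ~12 GB
```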
Strong Coding Performance
Despite its relatively modest size, Mistral Small 3 is a strong code generation model, reflecting Mistral's focus on coding capability across its model family. It handles:
- Code generation in Python, JavaScript, TypeScript, Go, Rust, and other mainstream languages.
- Code explanation and documentation.
- Bug identification and fix suggestions.
- Simple algorithmic reasoning and data transformation.
For coding tasks that don't require the depth of Codestral, Mistral's code-specialist model, but benefit from a capable generalist, Mistral Small 3 is a practical choice.
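Calling the model for a coding task from Python is a few lines against Mistral's chat completions endpoint. A minimal stdlib-only sketch, assuming the documented `https://api.mistral.ai/v1/chat/completions` endpoint, the `mistral-small-latest` model id, and an OpenAI-style response shape (verify these against Mistral's current API docs before relying on them):

```python
import json
import urllib.request

API_URL = "https://api.mistral.ai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str, model: str = "mistral-small-latest") -> dict:
    """Assemble an OpenAI-style chat payload for Mistral's API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str, api_key: str) -> str:
    """Send one prompt and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # Assumed response shape: OpenAI-compatible choices list.
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires a La Plateforme API key):
# ask("Explain what this regex matches: ^\\d{4}-\\d{2}-\\d{2}$", api_key="...")
```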
Fast Inference
The 24B parameter count enables fast inference compared to larger models. On managed providers, this translates to lower time-to-first-token and higher throughput — meaning better user experience for interactive applications and lower cost for batch workloads.
Free via La Plateforme
Mistral Small 3 is available for free (with rate limits) through Mistral's La Plateforme API. This makes it accessible for:
- Prototyping and development without billing setup.
- Low-volume personal projects.
- Evaluation before committing to a production deployment model.
The paid tier lifts the rate limits and is priced at $0.10 per million input tokens and $0.30 per million output tokens, among the most affordable managed API options available.
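At those rates, estimating a bill is simple linear arithmetic over token volume. A quick sketch (the 100M/20M token volumes below are illustrative, not from the source):

```python
def monthly_cost_usd(input_tokens: int, output_tokens: int,
                     in_rate: float = 0.10, out_rate: float = 0.30) -> float:
    """Cost in USD at per-million-token rates ($0.10 in / $0.30 out)."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# e.g. a workload of 100M input and 20M output tokens per month:
print(f"${monthly_cost_usd(100_000_000, 20_000_000):.2f}")  # → $16.00
```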
Best Use Cases
- High-volume text tasks: Classification, summarisation, extraction at scale where cost matters.
- Self-hosted deployments: Teams that want open-weight flexibility without 70B hardware requirements.
- Coding assistance: Integrated development tools, code review, and documentation generation.
- Conversational agents: Chatbots and assistants where response speed and cost are priorities.
- Fine-tuning base: A capable starting point for domain-specific fine-tuning on consumer hardware.