Overview
Mistral Small 3 is Mistral AI's latest open-weight model for everyday production tasks: fast, affordable, and capable enough to handle the vast majority of real-world use cases without reaching for a larger, more expensive model. A 24-billion-parameter model released under the Apache 2.0 license, it occupies an appealing middle ground: strong benchmark performance (81.0 on MMLU), practical hardware requirements for self-hosting, and near-zero cost via Mistral's free API tier.
Apache 2.0 License
Unlike many model releases that impose restrictive commercial licenses or usage policies, Mistral Small 3 is released under the Apache 2.0 license, one of the most permissive mainstream open-source licenses applied to AI models. In practice this means:
- Commercial use: Build and sell products using this model without royalties or restrictions.
- Modification: Fine-tune, adapt, and redistribute modified versions.
- Minimal attribution: No requirement to credit Mistral in your product's user interface, although redistributed copies and derivative works must retain the license text and copyright notices.
- No usage caps: The license doesn't restrict how many users or requests you can serve.
For organisations with legal or compliance teams that scrutinise AI vendor agreements, Apache 2.0 removes most of the friction.
24B Parameters — The Sweet Spot
At 24 billion parameters, Mistral Small 3 is substantially smaller than 70B models but meaningfully larger than 7B models. This places it in a hardware sweet spot:
- Full BF16 precision: ~48GB VRAM — fits on 2× RTX 4090 (48GB total) or 1× A100 80GB.
- 4-bit quantised: ~13GB VRAM — fits on a single RTX 3090/4090, or Apple Silicon with 16GB unified memory.
- 8-bit quantised: ~24GB VRAM — single A100 40GB or high-end consumer GPU.
This means a capable team can self-host Mistral Small 3 for a fraction of the infrastructure cost required by larger models.
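The arithmetic behind these figures is straightforward: the weights alone need roughly parameter count times bytes per parameter, and quantisation scales, activations, and the KV cache add overhead on top (which is why the 4-bit figure above is ~13GB rather than a bare 12GB). A minimal sketch of the weights-only calculation:

```python
def weight_vram_gb(params_billion: float, bits_per_param: float) -> float:
    """VRAM needed for the weights alone, in decimal GB.

    Quantisation scales, activations, and the KV cache add more on
    top -- real usage varies with context length and batch size.
    """
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# Mistral Small 3 at 24B parameters:
for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_vram_gb(24, bits):.0f} GB")
# BF16: ~48 GB, 8-bit: ~24 GB, 4-bit: ~12 GB
```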
Strong Coding Performance
Despite its relatively modest size, Mistral Small 3 is a strong code generation model, reflecting Mistral's focus on coding capability across its model family. It handles:
- Code generation in Python, JavaScript, TypeScript, Go, Rust, and other mainstream languages.
- Code explanation and documentation.
- Bug identification and fix suggestions.
- Simple algorithmic reasoning and data transformation.
For coding tasks that don't require the depth of Codestral, Mistral's code-specialist model, but benefit from a capable generalist, Mistral Small 3 is a practical choice.
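Calling the model for a coding task from Python is a few lines against Mistral's chat completions endpoint. A minimal stdlib-only sketch, assuming the documented `https://api.mistral.ai/v1/chat/completions` endpoint, the `mistral-small-latest` model id, and an OpenAI-style response shape (verify these against Mistral's current API docs before relying on them):

```python
import json
import urllib.request

API_URL = "https://api.mistral.ai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str, model: str = "mistral-small-latest") -> dict:
    """Assemble an OpenAI-style chat payload for Mistral's API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str, api_key: str) -> str:
    """Send one prompt and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # Assumed response shape: OpenAI-compatible choices list.
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires a La Plateforme API key):
# ask("Explain what this regex matches: ^\\d{4}-\\d{2}-\\d{2}$", api_key="...")
```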
Fast Inference
The 24B parameter count enables fast inference compared to larger models. On managed providers, this translates to lower time-to-first-token and higher throughput — meaning better user experience for interactive applications and lower cost for batch workloads.
Free via La Plateforme
Mistral Small 3 is available for free (with rate limits) through Mistral's La Plateforme API. This makes it accessible for:
- Prototyping and development without billing setup.
- Low-volume personal projects.
- Evaluation before committing to a production deployment model.
The paid tier lifts the rate limits and is priced at $0.10 per million input tokens and $0.30 per million output tokens, among the most affordable managed API options available.
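At those rates, estimating a bill is simple linear arithmetic over token volume. A quick sketch (the 100M/20M token volumes below are illustrative, not from the source):

```python
def monthly_cost_usd(input_tokens: int, output_tokens: int,
                     in_rate: float = 0.10, out_rate: float = 0.30) -> float:
    """Cost in USD at per-million-token rates ($0.10 in / $0.30 out)."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# e.g. a workload of 100M input and 20M output tokens per month:
print(f"${monthly_cost_usd(100_000_000, 20_000_000):.2f}")  # → $16.00
```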
Best Use Cases
- High-volume text tasks: Classification, summarisation, extraction at scale where cost matters.
- Self-hosted deployments: Teams that want open-weight flexibility without 70B hardware requirements.
- Coding assistance: Integrated development tools, code review, and documentation generation.
- Conversational agents: Chatbots and assistants where response speed and cost are priorities.
- Fine-tuning base: A capable starting point for domain-specific fine-tuning on consumer hardware.