GPT-4o mini is OpenAI's small, fast model optimised for high-throughput, cost-sensitive applications. At $0.15/M input and $0.60/M output tokens it is roughly 15× cheaper than GPT-4o while still outperforming GPT-3.5 Turbo on most benchmarks. It supports the same function calling, JSON mode, and vision capabilities as its larger sibling.
Key capabilities
- Ultra-low latency: optimised for real-time responses in chatbots and copilots
- Vision support: processes images alongside text at the same low price point
- Function calling and structured outputs: suitable for tool use and agentic pipelines
- 128K context window: the same context length as GPT-4o, with up to 16K output tokens per request
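As a sketch of the function-calling capability, the snippet below builds a Chat Completions request body with one tool attached. The `get_weather` tool and its schema are purely illustrative, not part of any real API; the surrounding request shape follows the standard Chat Completions `tools` format.

```python
import json

# Hypothetical tool, for illustration only: a weather lookup the model
# may choose to call. Parameters are described with JSON Schema.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Paris'"},
            },
            "required": ["city"],
        },
    },
}

# Request body in the Chat Completions shape; it would be sent with the
# official SDK (client.chat.completions.create(**request)) or plain HTTPS.
request = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [get_weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

print(json.dumps(request, indent=2))
```

When the model decides to call the tool, the response contains a `tool_calls` entry with JSON arguments matching the schema, which your code executes and feeds back as a `tool` role message.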
What it's best for
High-volume production workloads where per-call cost matters: customer support automation, document triage, coding autocomplete backends, content classification, and any pipeline that calls the model hundreds of thousands of times per day.
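To make the per-call economics concrete, here is a back-of-the-envelope estimate using the prices quoted above ($0.15/M input, $0.60/M output). The workload shape (call volume and token counts) is an invented example, not a benchmark.

```python
# Prices stated for GPT-4o mini, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def daily_cost(calls_per_day: int, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD spend per day for a fixed-shape workload."""
    input_cost = calls_per_day * input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = calls_per_day * output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost

# Illustrative workload: 500k classification calls/day,
# 400 input + 20 output tokens per call.
print(f"${daily_cost(500_000, 400, 20):.2f} per day")  # → $36.00 per day
```

At this scale the input side dominates ($30 of the $36), which is why trimming prompts and system messages is usually the first optimisation for classification-style pipelines.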