Why it matters
- Industry standard API format adopted by 30+ other providers (Groq, Together AI, Fireworks, Cerebras) — code written for OpenAI works with most competitors with minimal changes.
- Broadest capability surface — text, vision, images, audio, and code interpreter in one API account; no need for multiple providers.
- Massive ecosystem of libraries, tutorials, and community resources — virtually every AI integration guide starts with OpenAI examples.
- Assistants API provides managed state for user-facing AI assistants — eliminates conversation history management and RAG pipeline code.
Key capabilities
- Chat Completions: GPT-4o, GPT-4o mini, o1, o3, and o4-mini for text + vision tasks.
- Function calling: Tool use with parallel calls for agent workflows.
- Structured outputs: Force model to output valid JSON matching a schema.
- DALL-E 3: Text-to-image generation; 1024x1024, 1024x1792, or 1792x1024 resolution.
- Whisper: Speech-to-text in 99 languages; timestamped transcription.
- TTS: 6 voice options; MP3/Opus/AAC output.
- Embeddings: text-embedding-3-small/large for RAG and semantic search.
- Assistants API: Persistent threads, File Search (RAG), Code Interpreter.
- Batch API: Async batch processing at 50% discount.
- Fine-tuning: Fine-tune GPT-4o mini and GPT-3.5 on custom datasets.
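The function-calling flow above can be sketched as follows. This is a minimal sketch, not the definitive pattern: `get_weather`, its schema, and the canned weather string are hypothetical stand-ins for a real tool, and running it requires `pip install openai` plus an `OPENAI_API_KEY`.

```python
import json

# Hypothetical tool schema in the Chat Completions "tools" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def run_weather_agent(question: str) -> str:
    """One round of tool use: model picks a tool, we run it, model answers."""
    from openai import OpenAI  # needs OPENAI_API_KEY in the environment

    client = OpenAI()
    messages = [{"role": "user", "content": question}]
    first = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=TOOLS
    )
    # Assumes the model chose to call the tool (a real agent would check).
    call = first.choices[0].message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = f"Sunny, 22 C in {args['city']}"  # stand-in for a real lookup
    messages.append(first.choices[0].message)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    second = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=TOOLS
    )
    return second.choices[0].message.content
```

The model can also return several `tool_calls` in one response (parallel calls); the loop then executes each and appends one `role: "tool"` message per call ID.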
Technical notes
- Models: GPT-4o, GPT-4o mini, o1, o3, o4-mini; DALL-E 3; Whisper; TTS; text-embedding-ada-002/3-small/3-large
- Context: 128K tokens (GPT-4o); 200K (o-series reasoning models)
- Python:
pip install openai
- TypeScript:
npm install openai
- API base: https://api.openai.com/v1
- Pricing: GPT-4o $2.50/M input; $10/M output tokens
- Rate limits: Vary by usage tier; see platform.openai.com
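At the listed GPT-4o rates, per-request cost is simple arithmetic; a minimal helper, with the prices hard-coded from the pricing line above (so they will drift as list prices change):

```python
# GPT-4o list prices per million tokens (from the pricing note above).
INPUT_PER_M = 2.50
OUTPUT_PER_M = 10.00

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one GPT-4o call at list prices."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
        + (output_tokens / 1_000_000) * OUTPUT_PER_M

# A 2,000-token prompt with a 1,000-token reply costs about $0.015.
print(f"${estimate_cost_usd(2_000, 1_000):.4f}")
```

Halving these figures approximates Batch API pricing, per the 50% discount noted under Key capabilities.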
Usage example
from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_API_KEY")

# Streaming chat completion
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Structured output parsed into a Pydantic model
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    occupation: str

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Extract: John is a 30-year-old engineer"}],
    response_format=Person,
)
person = response.choices[0].message.parsed
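The embeddings models listed under Key capabilities power RAG and semantic search; a sketch of similarity ranking, where `cosine` and `rank_by_similarity` are local helper names for illustration (not SDK functions):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query: str, docs: list[str]) -> list[tuple[float, str]]:
    """Embed the query and docs in one call; return docs ranked by similarity."""
    from openai import OpenAI  # needs OPENAI_API_KEY in the environment

    client = OpenAI()
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=[query] + docs,
    )
    vectors = [item.embedding for item in resp.data]
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(cosine(query_vec, v), d) for v, d in zip(doc_vecs, docs)]
    return sorted(scored, reverse=True)
```

Batching the query and documents into a single `embeddings.create` call keeps the example to one round trip; production search would precompute and store document vectors instead.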
Ideal for
- New AI projects that want the most documented, community-supported API with the broadest feature surface.
- Applications requiring multiple modalities (text + images + audio) without managing multiple APIs.
- Teams building user-facing AI assistants who want managed conversation state via the Assistants API.
Not ideal for
- Cost-sensitive applications at scale — Llama via Groq/Together AI is 10x+ cheaper for comparable open-model quality.
- 200K+ token context needs — Anthropic Claude supports longer context windows.
- Fully offline/on-premise — OpenAI is cloud-only (Azure OpenAI offers managed cloud alternative).
See also
- Anthropic API — Claude API; 200K context, computer use, strong instruction following.
- Vercel AI SDK — TypeScript SDK with native OpenAI support for web apps.
- Groq — Ultra-fast Llama inference at fraction of GPT-4 cost for latency-sensitive apps.