Why it matters
- The simplest way to run a local LLM — download an app, pick a model, and start chatting in under 10 minutes.
- Apple Silicon support via Metal acceleration makes M1/M2/M3 Macs genuinely fast for 7B–13B models.
- Built-in OpenAI-compatible local server lets Continue.dev, Cline, and any other OpenAI-SDK tool use your local model — zero code changes needed.
- Privacy-first: all inference runs on your machine, nothing sent to any external server.
Key capabilities
- Model browser: Search and download GGUF models from HuggingFace, filtered by size and compatibility.
- Chat UI: Multi-turn chat with system prompt configuration, temperature, and context window controls.
- Local inference: GPU-accelerated inference via CUDA (NVIDIA), Metal (Apple Silicon), or CPU.
- OpenAI-compatible API server: Built-in server at localhost:1234; a drop-in replacement for OpenAI API calls.
- Multi-model management: Download and switch between multiple models; manage storage.
- Context window configuration: Adjust context size to balance memory vs. capability.
- GGUF quantization support: Q4_K_M, Q5_K_M, Q8_0 — balance between quality and memory usage.
- Preset system prompts: Save and reuse custom system prompts for different use cases.
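Because the built-in server speaks the OpenAI chat-completions protocol, you can exercise it with nothing but the standard library. A minimal sketch, assuming the server is running on the default port; the `"local-model"` name is a placeholder, since LM Studio serves whichever model is currently loaded:

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"

# Standard OpenAI-style chat-completions payload; "local-model" is a
# placeholder name, as LM Studio answers for whatever model is loaded.
payload = {
    "model": "local-model",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain GGUF quantization in one sentence."},
    ],
    "temperature": 0.7,
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# The call only succeeds when LM Studio's local server is running,
# so guard it rather than assume a live endpoint.
try:
    with urllib.request.urlopen(req, timeout=30) as resp:
        reply = json.loads(resp.read())["choices"][0]["message"]["content"]
        print(reply)
except OSError:
    print("LM Studio server not reachable at", BASE_URL)
```

The same request works unchanged against the real OpenAI API, which is the point of the compatibility layer: swap the base URL and nothing else.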
Technical notes
- Supported hardware: Apple Silicon (M1/M2/M3 via Metal), NVIDIA CUDA, AMD ROCm (experimental), CPU
- Platforms: macOS (10.14+), Windows (10/11), Linux
- Model format: GGUF (via llama.cpp); automatic download from HuggingFace Hub
- API: OpenAI-compatible REST API at http://localhost:1234/v1
- Pricing: Free for personal use; commercial license required for enterprise deployment
- Use with other tools: Works with Continue.dev, Cline (via custom base URL), Jan, and any OpenAI SDK
- Maintained by: LM Studio team; regularly updated with new model support
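Pointing a tool such as Continue.dev at the local server comes down to a custom base URL. One plausible shape for a Continue `config.json` entry is sketched below; the exact field names follow Continue's documented model config, but its schema evolves, so treat this as an illustration and check the current docs:

```json
{
  "models": [
    {
      "title": "LM Studio (local)",
      "provider": "lmstudio",
      "model": "local-model",
      "apiBase": "http://localhost:1234/v1"
    }
  ]
}
```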
Ideal for
- Developers and researchers who want a private, offline ChatGPT experience with frontier-quality open models.
- Teams building applications who want a local development LLM without API costs or internet dependency.
- Power users on Apple Silicon who want maximum local LLM performance in a polished desktop app.
Not ideal for
- Very large model deployments (70B+ without quantization), which require 64GB+ RAM and powerful GPUs.
- Server or headless deployments — LM Studio is a GUI app; use Ollama or vLLM for server/API serving.
- Teams needing enterprise support, SLAs, or auditing capabilities.
See also
- Ollama — Command-line tool for running local LLMs, simpler for scripting.
- Text Generation WebUI — More feature-rich local LLM UI with extension ecosystem.
- Open WebUI — Self-hosted ChatGPT-like UI designed to connect to Ollama backends.