What is Stable Diffusion?

Stable Diffusion is an open-source latent diffusion model for text-to-image generation, developed by the CompVis group at LMU Munich together with Runway and Stability AI. It runs locally on consumer GPUs (8GB+ VRAM recommended) and has spawned a large ecosystem of fine-tuned models, LoRAs, and tools, many of which outperform the base model for specific styles and subjects.

What hardware do I need to run Stable Diffusion?

SD 1.5 runs on GPUs with 4GB+ VRAM. SDXL needs 8–12GB VRAM for full quality. Apple Silicon Macs (M1/M2/M3) can run it through PyTorch's MPS backend. CPU-only inference is possible but very slow. For cloud-based use without local hardware, services like Getimg.ai, NightCafe, or the Stability AI API offer pay-per-image pricing.
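
As a rough illustration of the guidance above, the sketch below maps available VRAM to the model families that should fit, then picks a PyTorch device. The thresholds are the figures quoted on this page (the lower bound of 8GB is used for SDXL, and 16GB for SD3 / Flux.1); the function names `models_that_fit` and `pick_device` are made up for this example, and real memory use depends on resolution, precision, and the UI in use.

```python
# Minimum VRAM (GB) per model family, using the figures quoted on this page.
# SDXL is listed as 8-12 GB; the lower bound is used here (an assumption).
MIN_VRAM_GB = {
    "SD 1.5": 4,
    "SDXL": 8,
    "SD3": 16,
    "Flux.1": 16,
}


def models_that_fit(vram_gb: float) -> list[str]:
    """Return the model families whose quoted minimum VRAM is <= vram_gb."""
    return [name for name, need in MIN_VRAM_GB.items() if vram_gb >= need]


def pick_device() -> str:
    """Pick a PyTorch device string: CUDA GPU first, then Apple MPS, then CPU."""
    import torch  # imported lazily so models_that_fit works without PyTorch

    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"
```

For example, `models_that_fit(6)` returns only SD 1.5, while a 24GB card clears every threshold in the table.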

What is the difference between SD 1.5, SDXL, and SD3?

SD 1.5 (512×512 base resolution) is the most widely supported and has the largest community model library. SDXL generates 1024×1024 natively with significantly better quality. Stable Diffusion 3 (2024) moves to a Multimodal Diffusion Transformer (MMDiT) architecture with multiple text encoders, which improves prompt following and detail. Most community content targets SD 1.5 and SDXL.
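
To make the resolution difference concrete, here is a small sketch of the latent tensor shape the model actually denoises. It assumes the commonly documented autoencoder settings for SD 1.5 and SDXL (8x spatial downsampling, 4 latent channels); this page does not state those values, and SD3 uses a different latent layout, so treat the numbers as illustrative.

```python
VAE_SCALE = 8        # assumed spatial downsampling factor (SD 1.5 / SDXL)
LATENT_CHANNELS = 4  # assumed latent channel count (SD 1.5 / SDXL only)


def latent_shape(width: int, height: int) -> tuple[int, int, int]:
    """Return (channels, latent_height, latent_width) for an image size."""
    if width % VAE_SCALE or height % VAE_SCALE:
        raise ValueError("width and height must be multiples of 8")
    return (LATENT_CHANNELS, height // VAE_SCALE, width // VAE_SCALE)


def latent_elements(width: int, height: int) -> int:
    """Number of values in the latent tensor for one image."""
    c, h, w = latent_shape(width, height)
    return c * h * w
```

Under these assumptions a 1024×1024 SDXL image has four times as many latent values as a 512×512 SD 1.5 image, which is one reason its memory needs are higher.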

What are LoRAs and ControlNet?

LoRAs (Low-Rank Adaptation weights) are small sets of fine-tuned weights that shift the model's output toward a style, character, or subject, and they can be trained with modest GPU resources. ControlNet adds spatial conditioning (e.g., pose, depth, edge maps) to guide composition. Both are widely used community extensions available on HuggingFace and Civitai.
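
The reason LoRA files are small can be shown with a minimal sketch of the low-rank update itself: a weight matrix W is left frozen and a product of two thin matrices is added, W' = W + (alpha / rank) · B·A. The code below uses plain Python lists for clarity; the layer size in the usage note is a hypothetical example, not a figure from any specific Stable Diffusion checkpoint.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable values in a LoRA pair: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def apply_lora(W, B, A, alpha: float, rank: int):
    """Return W + (alpha / rank) * (B @ A) using nested lists.

    W: d_out x d_in base weights (left unchanged)
    B: d_out x rank, A: rank x d_in (the low-rank adapter)
    """
    scale = alpha / rank
    d_out, d_in = len(W), len(W[0])
    return [
        [
            W[i][j] + scale * sum(B[i][r] * A[r][j] for r in range(rank))
            for j in range(d_in)
        ]
        for i in range(d_out)
    ]
```

For a hypothetical 768×768 layer, a rank-8 adapter stores `lora_param_count(768, 768, 8)` = 12,288 values instead of the 589,824 in the full matrix, a 48x reduction.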

Stable Diffusion | db.fyi

Why it matters

One of the few major AI image models you can run entirely locally: no API costs and no data leaving your machine.
Massive community ecosystem: tens of thousands of fine-tuned models on Civitai and HuggingFace for any style, subject, or aesthetic.
ControlNet, LoRA, and IP-Adapter enable a level of compositional and stylistic control unavailable in closed models.
Free to use — no subscription, no per-image fee — just hardware and electricity.

Key capabilities

Text-to-image generation: Generate images from text prompts locally or via API, with full resolution control.
Image-to-image (img2img): Use a reference image as a starting point; guide the output with a text prompt.
Inpainting and outpainting: Fill masked regions or extend image borders with AI-generated content.
ControlNet: Use depth maps, pose skeletons, edge maps, or scribbles to precisely control image composition.
LoRA fine-tuning: Apply small, downloadable model adapters to shift style, introduce a character, or match a specific aesthetic.
DreamBooth: Fine-tune the entire model on 10–30 reference images to generate new images of a specific subject or face.
AUTOMATIC1111 / ComfyUI: The two dominant local UIs — AUTOMATIC1111 for ease of use; ComfyUI for node-based workflow automation.
Civitai & HuggingFace: Community hubs with 100,000+ free models, LoRAs, and embeddings.
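
As one possible illustration of the text-to-image capability listed above, the sketch below uses the Hugging Face `diffusers` library, which this page does not name; it is an assumption that the library and PyTorch are installed. The `model_id` argument is left to the caller (a Hub repository name or a local checkpoint directory), and `base_resolution` and `generate` are names invented for this example. The default resolutions are the base sizes quoted on this page.

```python
# Base resolutions quoted on this page for each model family.
BASE_RESOLUTION = {"sd15": (512, 512), "sdxl": (1024, 1024)}


def base_resolution(family: str) -> tuple[int, int]:
    """Return (width, height) for a supported model family key."""
    if family not in BASE_RESOLUTION:
        raise ValueError(f"unknown model family: {family!r}")
    return BASE_RESOLUTION[family]


def generate(model_id: str, prompt: str, family: str = "sd15",
             device: str = "cuda", steps: int = 30, guidance: float = 7.5):
    """Run one text-to-image generation and return a PIL image."""
    # Heavy imports are kept local so base_resolution works without them.
    import torch
    from diffusers import AutoPipelineForText2Image

    width, height = base_resolution(family)
    # Half precision roughly halves VRAM use on CUDA; fall back to float32 elsewhere.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)
    result = pipe(
        prompt=prompt,
        width=width,
        height=height,
        num_inference_steps=steps,
        guidance_scale=guidance,
    )
    return result.images[0]
```

A call such as `generate(my_model_id, "a lighthouse at dusk").save("out.png")` would write the result to disk, where `my_model_id` stands in for whichever checkpoint you have downloaded. The step count and guidance scale shown are common starting points rather than recommended settings.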

Technical notes

Models: SD 1.5 (512px base), SDXL (1024px base), SD3, and Flux.1 (a next-gen model from Black Forest Labs rather than a Stable Diffusion release); all can run locally
VRAM requirements: 4GB for SD 1.5; 8–12GB for SDXL; 16GB+ for SD3 / Flux.1
Local UIs: AUTOMATIC1111 (140K+ GitHub stars), ComfyUI (60K+ stars), Forge, InvokeAI
Cloud APIs: Stability AI API, Replicate, Getimg.ai for pay-per-image without local hardware
License: Varies by model. SD 1.5 uses CreativeML Open RAIL-M; SDXL uses CreativeML Open RAIL++-M; check each model's license before use
Founded: Stable Diffusion released publicly by Stability AI in August 2022

Ideal for

Creatives who want full control, privacy, and zero ongoing costs for high-volume image generation.
Developers and researchers building AI image applications where proprietary model costs are prohibitive.
Artists fine-tuning models on their own style or creating character-consistent image sets with DreamBooth/LoRA.

Not ideal for

Users without a dedicated GPU who need instant results — cloud-based alternatives like DALL-E or Midjourney are simpler.
Non-technical users uncomfortable with command-line setup or Python environments.
Commercial projects requiring legally clear training data provenance — check each model's license carefully.
