Why it matters
- Python-native syntax (`@app.function(gpu="A100")`) eliminates Docker, Kubernetes, and infrastructure configuration — the simplest possible GPU cloud interface.
- Scale-to-zero by default means LLM inference endpoints cost nothing when idle — critical for early-stage products with unpredictable traffic.
- Hot reloading during development (`modal serve`) makes iterating on ML code fast — changes deploy in seconds, not minutes.
- First-class web endpoint support (`@asgi_app()`) turns any FastAPI app into a serverless, GPU-backed API with one command.
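The points above can be sketched in a minimal app definition. This is an illustrative sketch, not a complete example: the app name and the function body are placeholders, and it assumes the `modal` package is installed and an account is configured.

```python
import modal

app = modal.App("llm-inference")  # hypothetical app name

@app.function(gpu="A100")
def generate(prompt: str) -> str:
    # Runs in Modal's cloud on an A100; the container scales to zero when idle.
    return prompt.upper()  # placeholder for real model inference

@app.local_entrypoint()
def main():
    # .remote() executes the function in the cloud rather than locally.
    print(generate.remote("hello"))
```

Running `modal run app.py` executes `main()` locally while `generate` runs remotely; `modal serve app.py` gives the hot-reloading loop described above.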
Key capabilities
- GPU functions: `@app.function(gpu=...)` with `gpu` set to `"T4"`, `"A10G"`, `"A100"`, or `"H100"` — any Python function on any GPU class.
- Auto-scaling: Scale from 0 to N instances based on request volume; scale back to zero when idle.
- Container management: Automatic container building from Python requirements — no Dockerfile needed.
- Web endpoints: Deploy FastAPI or any ASGI app as a serverless web endpoint with `@asgi_app()`.
- Scheduled jobs: Cron-style scheduling with `@app.function(schedule=modal.Period(hours=1))`.
- Parallel jobs: Map a function across thousands of inputs in parallel with `Function.map()`.
- Persistent volumes: Mount NFS volumes for model weights and dataset caching.
- Secrets: Secure secrets management for API keys and credentials.
- CLI: `modal run`, `modal serve`, and `modal deploy` for development, serving, and production.
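Several of these capabilities compose in one file. The sketch below is illustrative — the app name, function bodies, and endpoint path are assumptions — and it requires the `modal` package plus a configured account to deploy.

```python
import modal

app = modal.App("capability-demo")  # hypothetical app name
web_image = modal.Image.debian_slim().pip_install("fastapi[standard]")

# Scheduled job: runs once an hour with no cron infrastructure to manage.
@app.function(schedule=modal.Period(hours=1))
def hourly_job():
    print("refreshing cache")

# Parallel fan-out: each input can run in its own auto-scaled container.
@app.function()
def tokenize(text: str) -> int:
    return len(text.split())

# Serverless web endpoint: any ASGI app, deployed with `modal deploy`.
@app.function(image=web_image)
@modal.asgi_app()
def web():
    from fastapi import FastAPI

    api = FastAPI()

    @api.get("/health")
    def health():
        return {"ok": True}

    return api

@app.local_entrypoint()
def main():
    # .map() streams results back as the parallel inputs complete.
    print(list(tokenize.map(["one", "one two", "one two three"])))
```

`modal deploy` publishes the schedule and the web endpoint; `modal run` drives the parallel map from the local entrypoint.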
Technical notes
- Language: Python (primary); REST API for other languages
- GPUs: T4, A10G, A100 (40/80GB), H100
- Pricing: Free $30/mo credits; T4 ~$0.59/hr, A10G ~$1.10/hr, A100 ~$3.72/hr, H100 ~$7.20/hr
- Containers: Custom Python environments built from `modal.Image`; supports pip, conda, and Docker
- Secrets: `modal.Secret` for environment variables; integrations with AWS, GCP, Cloudflare
- Founded: 2021; New York; raised $67M (Redpoint, Andreessen Horowitz)
- Team: Ex-Stripe, ex-Google, MIT engineers
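The `modal.Image` and `modal.Secret` notes above can be sketched together. In this illustrative fragment, the app name, the Secret name `huggingface`, and the `HF_TOKEN` key are all assumptions — a real Secret is created in the Modal dashboard and its key/value pairs are injected as environment variables at runtime.

```python
import modal

# Container spec defined in Python: no Dockerfile required.
image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "torch", "transformers"
)

app = modal.App("weights-demo", image=image)  # hypothetical app name

# "huggingface" is an assumed Secret name configured in the dashboard.
@app.function(secrets=[modal.Secret.from_name("huggingface")])
def fetch_weights():
    import os

    token = os.environ.get("HF_TOKEN")  # key name is an assumption
    print("token present:", token is not None)
```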
Ideal for
- ML engineers who want GPU access for LLM inference, diffusion models, or training without managing Kubernetes or Docker.
- Python developers building AI-powered APIs or batch processing pipelines on serverless infrastructure.
- Startups and researchers who need powerful GPUs for experiments but don't want to pay for idle VMs.
Not ideal for
- Non-Python workloads — Modal is built around Python; Go, Rust, or Node.js backend work needs another solution.
- Sustained high-throughput inference at scale — RunPod or dedicated GPU instances may be cheaper at constant load.
- Teams that need fine-grained GPU memory sharing across concurrent users — the territory of dedicated serving stacks such as LoRAX or vLLM.