LoRAX

Multi-LoRA inference server — serve hundreds of fine-tuned adapters on a single GPU

LLM Frameworks · Free

LoRAX (LoRA eXchange) is an open-source LLM serving framework that runs hundreds of fine-tuned LoRA adapters simultaneously on a single GPU — sharing base model weights while dynamically loading adapters per request. Built by the Predibase team, LoRAX makes serving many custom models cost-effective by eliminating the need for separate GPU instances per fine-tuned model.

Key specs
2,800 GitHub stars (as of 2026-03-27)

Integrations

None listed.

Built on

None listed.