Why it matters
- True self-hosted option with complete data privacy — code never leaves the network, critical for defense, finance, and healthcare customers.
- In-house Refact-1.6B model posts strong HumanEval scores for its size and is small enough for fast completion on consumer GPUs such as the RTX 3080.
- VS Code and JetBrains coverage reaches most of the developer population, unlike the many competitors that support VS Code only.
- Open source server (github.com/smallcloudai/refact) means self-hosted teams can audit and customize the backend.
Key capabilities
- Inline completion: Context-aware code completion in VS Code and JetBrains IDEs.
- AI chat: Conversational assistant for code explanation, debugging, and generation.
- Refactoring: Suggests improvements, extracts functions, and simplifies code.
- Self-hosted server: Docker-based server running open-source models on private GPU infrastructure.
- Model selection: Cloud (GPT-4, Claude 3, Refact-1.6B) or self-hosted (Code Llama, StarCoder2, WizardCoder).
- Custom fine-tuning: Enterprise option for fine-tuning on an internal codebase (self-hosted).
- Codebase indexing: Indexes the repository so completions draw on project-wide context.
- Team management: User access control, usage analytics, and permission settings.
Technical notes
- IDE plugins: VS Code, JetBrains (IntelliJ, PyCharm, GoLand, etc.)
- Self-hosted: Docker; NVIDIA GPU (8GB+ VRAM recommended); github.com/smallcloudai/refact
- Cloud models: GPT-4, Claude 3, Refact-1.6B
- Self-hosted models: Code Llama (7B/13B/34B), StarCoder2, WizardCoder
- Pricing: Free (cloud, limited); Pro ~$10/user/mo; Enterprise (self-hosted, custom)
- Founded: 2022; Tallinn, Estonia
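As a rough way to relate the 8GB+ VRAM recommendation to the model sizes listed above, weight memory can be approximated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope estimate only: the names `weight_gb` and `fits` and the precision table are illustrative assumptions, not part of Refact, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Ignores KV cache, activations, and runtime overhead, so real usage is higher.
# All names here are hypothetical helpers, not part of any Refact API.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts (in billions) for model sizes named in this entry.
MODELS_B = {
    "Refact-1.6B": 1.6,
    "Code Llama 7B": 7.0,
    "Code Llama 13B": 13.0,
    "Code Llama 34B": 34.0,
}


def weight_gb(params_billion: float, precision: str = "fp16") -> float:
    """Return approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion: float, vram_gb: float, precision: str = "fp16") -> bool:
    """True if the weights alone fit in vram_gb (necessary, not sufficient)."""
    return weight_gb(params_billion, precision) <= vram_gb


if __name__ == "__main__":
    for name, size in MODELS_B.items():
        row = ", ".join(
            f"{p}: {weight_gb(size, p):.1f} GB" for p in BYTES_PER_PARAM
        )
        print(f"{name:<15} {row}")
```

Under these assumptions, a 1.6B model at 16-bit precision needs about 3.2 GB for weights and fits comfortably in 8 GB, while a 7B model at 16-bit needs about 14 GB and would not. The quantized rows are illustrative only; this entry does not state which precisions the self-hosted server uses.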
Ideal for
- Engineering organizations with strict data privacy requirements that prevent using cloud-based coding AIs.
- Financial services, healthcare, defense, and government teams who need on-premise AI coding assistance.
- Teams with existing GPU infrastructure who want to put it to work for developer productivity.
Not ideal for
- Teams without GPU infrastructure for self-hosted deployment — cloud tier is competitive but not class-leading.
- Teams that want maximum AI coding quality and have no data privacy constraints; Cursor or GitHub Copilot have stronger cloud-side capabilities.
- Solo developers — the self-hosted value proposition doesn't apply at individual scale.
See also
- Tabnine — Competitor with strong on-premise deployment for enterprise teams.
- Codeium for Teams — Cloud-based team coding AI with SSO; less focused on self-hosting.
- GitHub Copilot — Industry leader in cloud-based AI coding; strongest quality, no self-host option.