BYOLLM

Bring Your Own LLM. No lock-in.

Bodega One doesn't bundle a model. You connect yours (local or cloud) and switch between them without restarting. 10+ provider presets included.

Join the Waitlist Setup guide

What does Bring Your Own LLM mean?

BYOLLM (Bring Your Own LLM) means the AI application doesn't bundle a specific model. You connect the LLM you want to use. In practice: Bodega One ships with 10+ provider presets. You configure your model (local or cloud), and the app uses it. Switch models in seconds. Your API keys go directly to the provider. Bodega One never touches your usage or trains on your data.

Every other major AI coding tool ships with a bundled model. That's a business decision, not a technical one. It lets them charge you for inference, control what models you access, and build a moat around the model relationship.

BYOLLM inverts that. You pick the model. You own the API relationship. We just provide the IDE, the agent, and the tooling. Read the full post on why BYOLLM matters.

Run models on your own hardware

Local providers run entirely on your machine. Zero data leaves your hardware. Fully compatible with air-gap mode.

Ollama

Run any GGUF model locally. The most popular local LLM runtime.

LM Studio

GUI-based local model runner. Easy model downloads and management.

vLLM

High-throughput inference server for production-grade local deployments.

llama.cpp

Raw llama.cpp server. Maximum control, minimal overhead.

LocalAI

OpenAI-compatible local API. Drop-in replacement for cloud endpoints.

KoboldCpp

llama.cpp-based with extra sampling options. Popular in the local AI community.

GPT4All

Beginner-friendly local model runner. Good for first-time local AI setup.

MLX

Apple Silicon-optimized inference. Native Metal performance on M-series chips.

Jan

Desktop app for local models with a clean UI. Similar to LM Studio.

Or connect a cloud API

When you need top-tier reasoning on a hard problem, use a cloud provider. Your API key goes directly to the provider. We're not a middleman.

OpenAI

GPT-4o, GPT-4 Turbo, and o-series models via API key.

Anthropic

Claude Sonnet 4.6 and Claude Opus 4.6 via API key. Strong reasoning, long context.

Groq

Extremely fast inference on Llama and Mixtral models via cloud.

Together AI

Wide model selection. Good for evaluating open-source models via API.

OpenRouter

Single API key for 200+ models. Useful for model comparison workflows.

Custom endpoint

Any OpenAI-compatible API. Point Bodega One at your own server or self-hosted inference stack.

Switching takes seconds

Select your provider

Open the provider panel and pick from 10+ presets. Each preset ships with the right API format, base URL, and model list pre-filled.

Add your key or local server

For cloud providers, paste your API key. It goes directly to the provider. For local providers, point at your running server.

Switch takes effect immediately

No restart. The next message uses the new provider. Per-session overrides let you swap models without changing your default.

Full setup guide for each provider: Documentation · Ollama guide · LM Studio guide · DeepSeek guide

Common
questions

What does BYOLLM mean?+
BYOLLM stands for "Bring Your Own LLM." It means the AI application doesn't bundle a specific model. Instead, you connect whichever LLM you want to use. In Bodega One, you can wire up a local model running on your machine (Ollama, LM Studio, llama.cpp) or a cloud API (OpenAI, Anthropic, Groq) and switch between them at any time without restarting the app.
Why does BYOLLM matter?+
Most AI coding tools lock you to their bundled model. When they raise prices, change the model, or drop features, you don't have a choice. BYOLLM inverts that: you own the model relationship. Use a free local model for everyday work. Switch to Claude when you need top-tier reasoning on a hard problem. Your API keys go directly to the provider. Bodega One never touches your usage.
Which local LLMs work best with Bodega One?+
For coding tasks, Qwen3.6-27B (16-24 GB VRAM) is the current gold standard at 77.2% on SWE-bench Verified. For 6-8 GB cards, Qwen3.5-9B Q4_K_M is solid. For 12-16 GB, Gemma 4 26B MoE punches above its weight. For Apple Silicon, Qwen3.6-27B MLX on 64+ GB unified memory runs well. Bodega One auto-detects your hardware on launch and recommends a model tier based on your VRAM.
Can I use different models for different tasks?+
Yes. Bodega One supports per-session model override. You can set a different model for a single conversation without changing your default provider. Use Ollama for everyday tasks, then override to Claude for a specific complex problem, then go back to your local default.
Does switching models require a restart?+
No. Provider switching takes effect immediately in Bodega One. Change your provider or model mid-session and the next message uses the new selection.
What's the difference between local and cloud providers in Bodega One?+
Local providers (Ollama, LM Studio, llama.cpp, etc.) run entirely on your machine. Zero data leaves your hardware. Cloud providers (OpenAI, Anthropic, Groq) send your prompts to external APIs. Both work in Bodega One. If you need full data privacy, use a local provider, or enable air-gap mode to block all outbound connections at the OS level.

Related resources

Local LLM Rankings

Browse supported models by benchmark

Running DeepSeek Locally

Deploy DeepSeek R1 and V3 on your machine with full privacy

LM Studio + Bodega One Setup

Run LM Studio as a local OpenAI-compatible server

VRAM Calculator

Check if your GPU can run a specific model

Your model. Your data. Your IDE.

Beta is live now. Pay once. No subscription, no model lock-in. Join the waitlist for full launch.

Join the Waitlist (it's free)See Pricing