
How to run DeepSeek locally with Bodega One

Bodega One · 7 min read
Quick answer

To run DeepSeek locally with Bodega One: pull the model in Ollama (ollama pull deepseek-r1:14b), then in Bodega One go to Settings → Providers → Ollama. DeepSeek-R1 14B runs well on 12GB+ VRAM. See all 15+ supported providers.

DeepSeek attracted attention in early 2025 with models that matched frontier performance at a fraction of the training cost. DeepSeek-R1 in particular, a reasoning model trained with reinforcement learning, showed that the compute gap between US and Chinese AI labs was smaller than most had assumed.

For developers running local AI, DeepSeek's models are compelling for a specific reason: they are open-weight, quantization-friendly, and capable on coding and reasoning tasks. Here's how to run them with Bodega One.

Which DeepSeek model should you use?

DeepSeek has released several model families. For coding work:

  • DeepSeek-R1: A reasoning model. Slower (it “thinks” before answering), but noticeably stronger on complex tasks: debugging, architecture decisions, multi-step code generation. Available in 1.5B, 7B, 8B, 14B, 32B, and 70B parameter sizes.
  • DeepSeek-V3: A general-purpose model. Faster than R1, strong on code. V3 ships only as the full 671B-parameter MoE, which runs well only on multi-GPU setups; on consumer hardware, the distilled R1 models (7B, 8B, 14B) are the practical choice.
  • DeepSeek-Coder-V2: An earlier coding-specific model. Still solid, but DeepSeek-R1 and V3 have largely superseded it for general coding tasks.

Which size to run by VRAM

  • 8GB VRAM: DeepSeek-R1 7B or 8B (Q4_K_M). Functional, good for everyday tasks.
  • 12GB VRAM: DeepSeek-R1 14B (Q4_K_M). Noticeably stronger reasoning.
  • 16-24GB VRAM: DeepSeek-R1 32B (Q4_K_M, ~20GB, at the top of this range), strong on complex code.
  • 48GB+ VRAM: DeepSeek-R1 70B, approaches frontier performance locally.
  • Apple Silicon 16GB: DeepSeek-R1 7B or 8B MLX, good balance of speed and quality.

For a full hardware reference, see the GPU guide for local AI.
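If your GPU isn't on the list above, you can ballpark the requirement yourself. This is a rough sketch, not a measurement: it assumes Q4_K_M averages about 4.8 bits per weight and pads ~20% for KV cache and runtime overhead, so treat the numbers as a sizing guide only.

```python
# Rough VRAM estimate for a quantized model. Rule-of-thumb numbers:
# Q4_K_M averages roughly 4.8 bits per weight; pad ~20% for the KV
# cache and runtime overhead.

def estimate_vram_gb(params_billion: float,
                     bits_per_weight: float = 4.8,
                     overhead: float = 1.2) -> float:
    """Approximate VRAM (GB) needed to run the quantized model."""
    weight_gb = params_billion * bits_per_weight / 8
    return round(weight_gb * overhead, 1)

for size in (7, 14, 32, 70):
    print(f"deepseek-r1:{size}b -> ~{estimate_vram_gb(size)} GB")
```

The estimates line up with the tiers above: ~10GB for the 14B (fits in 12GB), low-20s for the 32B.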

Option 1: Run via Ollama (recommended)

Ollama has DeepSeek-R1 in its model library. Pull a specific size:

  • ollama pull deepseek-r1:7b (7B parameter model)
  • ollama pull deepseek-r1:14b (14B parameter model)
  • ollama pull deepseek-r1:32b (32B parameter model, needs ~20GB+ VRAM)

Ollama handles quantization automatically. The default pull gives you Q4_K_M, which is a good balance of quality and size.

Once the model is pulled and Ollama is running, connect Bodega One: Settings → Providers → Ollama. The model will appear in the model selector.
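Before pointing Bodega One at Ollama, it can be worth confirming the pulled model actually responds. A minimal sketch using Ollama's standard `/api/chat` REST endpoint, assuming the default address `http://localhost:11434`:

```python
# Minimal smoke test against a local Ollama server. Assumes Ollama's
# default address; /api/chat is part of Ollama's standard REST API.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"

def build_request(model: str, prompt: str) -> dict:
    """Build the JSON body Ollama's /api/chat endpoint expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete JSON reply instead of chunks
    }

def ask(model: str, prompt: str) -> str:
    body = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

if __name__ == "__main__":
    print(ask("deepseek-r1:14b", "Write a one-line Python hello world."))
```

If this returns an answer, Bodega One's Ollama provider will see the same model.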

Option 2: Run via LM Studio

LM Studio's model browser includes DeepSeek-R1 in multiple sizes. Search for “DeepSeek-R1” in the model browser, pick your size, and download. Load it and start the local server. Then connect Bodega One to LM Studio at http://localhost:1234/v1.
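A quick way to confirm LM Studio's server is up and serving the model you loaded is its OpenAI-compatible `/v1/models` endpoint. A sketch, assuming the default address `http://localhost:1234/v1`:

```python
# Check which models LM Studio's local server is exposing, via its
# OpenAI-compatible /v1/models endpoint (default: localhost:1234).
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"

def model_ids(models_json: dict) -> list:
    """Extract model ids from an OpenAI-style /v1/models response."""
    return [m["id"] for m in models_json.get("data", [])]

def list_served_models(base_url: str = BASE_URL) -> list:
    with urllib.request.urlopen(f"{base_url}/models") as resp:
        return model_ids(json.loads(resp.read()))

if __name__ == "__main__":
    print(list_served_models())
```

The id printed here is the model name Bodega One will show in its selector.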

For the full LM Studio setup guide, see LM Studio + Bodega One setup.

A note on thinking tokens

DeepSeek-R1 is a reasoning model. It generates a chain of thought before giving its final answer. This shows up in responses as a <think>...</think> block before the actual answer. Some Ollama versions strip this automatically; others pass it through.

For the agentic coding loop in Bodega One, this usually isn't a problem. The agent extracts the final answer from the response. But if you see thinking tokens in unexpected places in the UI, it's worth checking whether your Ollama version handles the R1 reasoning format correctly.
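If you are wiring R1 into your own tooling rather than Bodega One, a tolerant way to handle this is to split off the `<think>...</think>` block when it is present and pass the text through unchanged otherwise. A minimal sketch:

```python
# Separate DeepSeek-R1's chain of thought from its final answer.
# If no <think> block is present (some Ollama versions strip it),
# the response passes through untouched.
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(response: str) -> tuple:
    """Return (thinking, answer); thinking is '' if no think block."""
    match = THINK_RE.search(response)
    if not match:
        return "", response.strip()
    answer = THINK_RE.sub("", response, count=1)
    return match.group(1).strip(), answer.strip()

raw = "<think>User wants a greeting.</think>\nprint('hello')"
thinking, answer = split_thinking(raw)
print(answer)  # -> print('hello')
```

Keeping the thinking text around (rather than discarding it) is handy for debugging why the model chose a particular answer.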

Performance expectations

DeepSeek-R1 14B is a strong all-round model for coding. On a 12GB VRAM machine with Ollama, expect token generation in the 15-30 tokens/second range depending on GPU. That's fast enough for interactive chat and agent loops without feeling slow.

For comparison: Qwen2.5-Coder-32B at the same quality level requires ~22GB VRAM. If you have less than 16GB VRAM and want strong coding performance, DeepSeek-R1 14B is worth trying first.
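Rather than trusting ballpark numbers, you can measure your own throughput. Ollama's non-streaming replies include timing metadata: `eval_count` is the number of generated tokens and `eval_duration` is in nanoseconds, both standard fields in `/api/chat` and `/api/generate` responses.

```python
# Generation throughput from Ollama response metadata.
# eval_count: generated tokens; eval_duration: nanoseconds.

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Tokens/second for the generation phase of an Ollama reply."""
    return round(eval_count / (eval_duration_ns / 1e9), 1)

# Example figures: 420 tokens in 21 seconds -> 20.0 tok/s, squarely
# in the 15-30 tok/s range quoted above for a 12GB GPU.
print(tokens_per_second(420, 21_000_000_000))  # -> 20.0
```

If your measured rate falls well below that range, the model is likely spilling out of VRAM into system RAM; try a smaller size or tighter quantization.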

See the full BYOLLM provider list for all local and cloud options supported in Bodega One. If you want to try cloud DeepSeek (via API) for comparison, the custom provider preset supports any OpenAI-compatible endpoint.

Ready to own your tools?

Beta opens May 2026. Complete 14 days and earn a $30 promo code.