Models & providers

Bodega Mixture

Bodega Mixture is an optional Mixture-of-Agents engine. When you send a message as a Mixture turn, several "reference" models answer in parallel, then one "aggregator" model reads all of their drafts and writes the single final reply. It is off by default and is activated per message from the model picker.

What Mixture does

A normal turn asks one model. A Mixture turn asks several and then synthesizes them:

References run in parallel. Each reference model gets the conversation text only - no tools - and drafts its own answer.
The aggregator synthesizes. One aggregator model reads every reference draft and writes the final response. The aggregator is the only model that uses tools and produces what you see.
Optional QEL gate. On creation tasks, the synthesized result can be run through QEL verification before it is returned.

The idea is breadth, then judgment: diverse drafts surface more of the solution space, and a single capable aggregator distills them into one coherent answer. Mixture settings are global-only: all mixture.* settings apply app-wide, not per-project.
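
The fan-out-then-synthesize flow described above can be sketched in a few lines. This is a minimal illustration, not Bodega's actual implementation: `mixture_turn`, the `call_model` callback, and the wording of the synthesis prompt are all hypothetical, and tool use and the QEL gate are left out.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

# Hypothetical callback: takes a model id and a prompt, returns the model's text.
ModelCall = Callable[[str, str], Awaitable[str]]


async def mixture_turn(
    prompt: str,
    references: Sequence[str],
    aggregator: str,
    call_model: ModelCall,
) -> str:
    # Step 1: every reference drafts its own answer, all in parallel.
    drafts = await asyncio.gather(*(call_model(ref, prompt) for ref in references))

    # Step 2: the aggregator reads every draft and writes the single final reply.
    numbered = "\n\n".join(
        f"Draft {i} ({ref}):\n{draft}"
        for i, (ref, draft) in enumerate(zip(references, drafts), start=1)
    )
    synthesis_prompt = (
        f"{prompt}\n\nReference drafts:\n{numbered}\n\nWrite one final answer."
    )
    return await call_model(aggregator, synthesis_prompt)
```

Passing the model call in as a callback keeps the sketch independent of any particular provider client; a real caller would supply a function that talks to its configured providers.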

Turn it on (Settings → Models → Mixture)

Open Settings → Models → Mixture and set it up once:

Enable Mixture.
Add reference models - pick from the dropdown of your configured providers (it lists every model across the providers you have set up, grouped by provider). Add at least two; with fewer than two reachable references, a Mixture turn quietly falls back to a normal single-model turn.
Choose an aggregator model from the same dropdown. This is the model that writes your final answer, so pick your strongest one.
(Optional) turn on the QEL gate to verify synthesized creation-task output.

Then, in any chat, open the model picker and choose "Mixture" to run that message as a mixture. It is per-message: the picker resets after each send, so you opt into a mixture turn each time. While it runs, a progress card shows the references fanning out and the aggregator synthesizing.

Settings reference

Setting	What it does
Enable Mixture	Master switch. When off, "Mixture" is hidden in the model picker.
Reference models	The models queried in parallel (need ≥ 2). Each drafts an answer with no tools.
Aggregator model	The model that reads all drafts and writes the final reply (owns tools).
Reference temperature	Sampling temperature for the reference calls. Higher = more diverse drafts (default 0.6).
Aggregator temperature	Temperature for the synthesis call. Lower = more focused (default 0.4).
Max tokens per call	Output budget for each reference call and the aggregator call.
QEL gate	Require the synthesized result to pass QEL before returning. Slower, higher-quality on creation tasks.

Reference and aggregator entries are stored as providerId:modelName (for example anthropic:claude-opus-4-8 or ollama:qwen3); the dropdowns produce that for you from your configured providers.
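
If you ever need to read these entries outside the app, a small parser is enough. The sketch below is an assumption about how the format could be handled, not the app's own code: `ModelRef` and `parse_model_ref` are hypothetical names, and splitting on the first colon only is a guess made so that model names which themselves contain a colon stay intact.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    provider_id: str
    model_name: str


def parse_model_ref(entry: str) -> ModelRef:
    # Split on the first colon only (assumption): everything after it is the model name.
    provider_id, sep, model_name = entry.partition(":")
    if not sep or not provider_id or not model_name:
        raise ValueError(f"expected providerId:modelName, got {entry!r}")
    return ModelRef(provider_id, model_name)
```

For example, `parse_model_ref("ollama:qwen3")` yields provider `ollama` and model `qwen3`, while an entry with no colon raises `ValueError`.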

Cost, privacy, and practical notes

Cost. Local reference models run on your own hardware with no per-call charge, so a mixture with all-local references costs roughly one cloud call (the aggregator, if you point it at a cloud model). In cloud-only Mixture-of-Agents, by contrast, cost multiplies by the number of references.
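
As a rough back-of-the-envelope check, you can count how many calls in a mixture go to a cloud provider. The sketch below is illustrative only: `LOCAL_PROVIDERS` and `billed_calls` are hypothetical, the set of local provider ids is assumed from the ollama: and llamacpp: prefixes used on this page, and it counts calls rather than tokens or any QEL gate work.

```python
# Assumption: these provider ids run on your own hardware; everything else is cloud.
LOCAL_PROVIDERS = frozenset({"ollama", "llamacpp"})


def is_cloud(entry: str) -> bool:
    # Entries look like providerId:modelName; the provider id decides local vs cloud.
    return entry.partition(":")[0] not in LOCAL_PROVIDERS


def billed_calls(references: list[str], aggregator: str) -> int:
    # One call per cloud reference, plus one more if the aggregator is a cloud model.
    return sum(is_cloud(ref) for ref in references) + int(is_cloud(aggregator))
```

With two ollama: references and a cloud aggregator this returns 1; with three cloud references and a cloud aggregator it returns 4.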

Air-gap. Under air-gap, cloud references are dropped and only local references run; if that leaves fewer than two, the turn falls back to a normal single-model turn.
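
The air-gap rule and the two-reference minimum combine into a simple decision. The following is a minimal sketch under stated assumptions, not the app's logic: `usable_references` and `turn_mode` are hypothetical names, the local provider ids are assumed from the ollama: and llamacpp: prefixes on this page, and reachability checks are ignored.

```python
# Assumption: provider ids treated as local; all other providers count as cloud.
LOCAL_PROVIDERS = frozenset({"ollama", "llamacpp"})


def usable_references(references: list[str], air_gapped: bool) -> list[str]:
    # Under air-gap, cloud references are dropped and only local ones remain.
    if not air_gapped:
        return list(references)
    return [ref for ref in references if ref.partition(":")[0] in LOCAL_PROVIDERS]


def turn_mode(references: list[str], air_gapped: bool) -> str:
    # Fewer than two usable references means a normal single-model turn.
    return "mixture" if len(usable_references(references, air_gapped)) >= 2 else "single"
```

For instance, one cloud reference plus one ollama: reference gives "mixture" normally but "single" under air-gap, because only one reference survives the filter.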

Running multiple local models. References run as separate model calls. On a single llama.cpp server only one model is resident at a time, so two llamacpp: references force it to swap models between calls, which is slow. Smoother options are several Ollama models (Ollama can hold more than one in memory) or a mix of local and cloud references. With only one local model and no cloud access, there is no real mixture to run; add a second runnable model to see the benefit.

Pick a strong aggregator. The aggregator writes the answer and runs the tools, so its quality sets the ceiling. Reference diversity helps most when the aggregator is capable enough to judge between drafts.

This page mirrors the in-app docs hub for app version 1.0.0-beta.31.6.