Chat Mode

Chat Mode (Ctrl+1) is a session-based conversational interface backed by the same agentic loop as Code Mode. It keeps persistent memory across sessions, supports file attachments and web research, and routes messages to the right model tier automatically.

The input bar

The chat composer is a two-row layout:

Top row (inside the input box): + (AttachDropdown) → textarea → Stop / Send

Bottom floating row: Auto/Fast/Smart/Code tier picker → Research pill → Web pill → Boost pill → Reasoning control (brain icon - per-message effort picker; shown for reasoning-capable models) → Fast Mode toggle (bolt icon - skips extended thinking on Claude; shown for Claude models) → Model selector → Mic button (only when voice input is enabled) → context budget button

Note: in the main chat composer, the + menu does not show Research or Web toggles - those are exclusively in the bottom floating row. The + menu contains Attach, Context, and Context Scope only.

All controls in the bottom row are hidden or disabled while the agent is generating a response.

Auto model routing

By default, Bodega classifies each message before sending it and picks the appropriate model tier:

Tier     When it fires
Fast     Greetings, short lookups, single-line questions
Smart    General reasoning, multi-step explanations
Code     Programming tasks, debugging, file edits
Image    Image generation requests (falls back gracefully if no image model is configured)

The tier picker pill in the input bar shows the active mode. In Auto mode it shows a hint - for example Auto →Smart - as you type, so you know where the message is headed before you send.

The frontend classifier (routingPatterns.ts) runs in real time for the hint. The backend (MessageRoutingService.ts) runs the authoritative classification and determines which model actually handles the request.
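
As a rough illustration of how a real-time hint classifier of this kind could work, here is a minimal sketch. The patterns, the 60-character cutoff, and the names classifyHint and routingHint are assumptions made for this example; they are not taken from routingPatterns.ts, and the backend may classify the same message differently.

```typescript
// Hypothetical sketch: patterns, thresholds, and names are illustrative only.
type Tier = "Fast" | "Smart" | "Code" | "Image";

const IMAGE_PATTERN = /\b(draw|generate|create)\b.*\b(image|picture|logo|illustration)\b/i;
const CODE_PATTERN = /\b(debug|refactor|stack trace|compile|bug|regex|typescript|python)\b/i;
const GREETING_PATTERN = /^(hi|hello|hey|thanks|thank you)\b/i;
const SHORT_MESSAGE_CHARS = 60; // assumed cutoff for a "short lookup"

function classifyHint(message: string): Tier {
  const text = message.trim();
  // Check the most specific intents first.
  if (IMAGE_PATTERN.test(text)) return "Image";
  if (CODE_PATTERN.test(text)) return "Code";
  // Greetings and short single-line messages go to the fast tier.
  const singleLine = text.indexOf("\n") === -1;
  if (GREETING_PATTERN.test(text) || (singleLine && text.length <= SHORT_MESSAGE_CHARS)) {
    return "Fast";
  }
  // Everything else is treated as general reasoning.
  return "Smart";
}

// Builds the label shown in the tier picker while in Auto mode.
function routingHint(message: string): string {
  return "Auto →" + classifyHint(message);
}
```

Because a classifier like this only inspects the text being typed, it is cheap enough to run on every keystroke, which is why it suits a hint rather than the final routing decision.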

Your routing mode choice persists across sessions (chat.routing_mode setting). The models behind each tier are configured at Settings → Models (llm.fast_model, llm.smart_model, llm.code_model).

Single-model setups

When only one model is active - for example, a single llama.cpp or LM Studio instance - the Auto/Fast/Smart/Code tier picker is hidden entirely. All tiers resolve to the same model, so the picker would be meaningless. The Research and Web pills stay visible because those are independent of which model handles the request.
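
The visibility rule can be pictured as a check on whether the tiers resolve to more than one distinct model. The sketch below assumes that is how "only one model is active" is determined; TierModels and shouldShowTierPicker are hypothetical names, with fields loosely mirroring llm.fast_model, llm.smart_model, and llm.code_model.

```typescript
// Hypothetical sketch: assumes the picker is hidden when every tier maps to the same model ID.
interface TierModels {
  fast: string;  // mirrors llm.fast_model
  smart: string; // mirrors llm.smart_model
  code: string;  // mirrors llm.code_model
}

function shouldShowTierPicker(models: TierModels): boolean {
  const distinct = new Set<string>([models.fast, models.smart, models.code]);
  return distinct.size > 1;
}
```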

The + menu

Click the + button at the left of the input bar to open the AttachDropdown. It has three sections:

Attach

  • Add files or photos - opens the system file picker
  • Take a screenshot - minimizes the app briefly, captures the full desktop, attaches the image

Context

  • Add to project folder - links a filesystem folder to the current session. A remove button appears when a folder is already linked.
  • Add knowledge from URL - opens the KnowledgeDialog to fetch a page or paste text into the knowledge base

Context Scope (only meaningful when a project folder is linked)

  • Auto - repo map + recent files
  • Current File - only the open editor file
  • Selection - only selected code
  • Codebase - full repo map

Close with Escape or by clicking outside.

Research mode

Research mode runs a dedicated synthesis phase before the agent responds. It plans 3–5 search queries from your message, runs them in parallel via DuckDuckGo, ranks and de-duplicates the results, then produces a cited summary - all before the normal LLM response begins.

You see live progress between your message and the reply: Planning → Searching → Synthesizing → Complete.
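
The pipeline shape described above (plan, search in parallel, rank and de-duplicate, then report completion) can be sketched as follows. This is an illustration under stated assumptions, not Bodega's implementation: the SearchResult shape, the scoring field, and the injected planQueries and search functions are hypothetical, and the synthesis step is reduced to ranking because the cited summary itself comes from a model.

```typescript
// Hypothetical sketch of the research stages; all names and shapes are assumptions.
type Stage = "Planning" | "Searching" | "Synthesizing" | "Complete";

interface SearchResult {
  url: string;
  title: string;
  score: number; // assumed relevance score, higher is better
}

// Keep the best-scored copy of each URL, then sort by score, highest first.
function rankAndDedupe(results: SearchResult[]): SearchResult[] {
  const best = new Map<string, SearchResult>();
  for (const result of results) {
    const key = result.url.replace(/\/+$/, ""); // treat a trailing slash as the same page
    const seen = best.get(key);
    if (!seen || result.score > seen.score) best.set(key, result);
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}

async function runResearch(
  message: string,
  planQueries: (message: string) => string[],
  search: (query: string) => Promise<SearchResult[]>,
  onStage: (stage: Stage) => void,
): Promise<SearchResult[]> {
  onStage("Planning");
  const queries = planQueries(message).slice(0, 5); // cap at five queries
  onStage("Searching");
  const batches = await Promise.all(queries.map((q) => search(q))); // run in parallel
  onStage("Synthesizing");
  const merged: SearchResult[] = [];
  for (const batch of batches) merged.push(...batch);
  const ranked = rankAndDedupe(merged);
  onStage("Complete");
  return ranked;
}
```

Injecting the planner and the search function keeps the sketch independent of any particular search provider and makes the ranking step easy to test on its own.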

Toggle it via the Research pill in the bottom row of the input bar (active state: purple-tinted with a border). Research mode is blocked when air-gap is on.

Research mode vs. web search: Research runs a pre-loop multi-query synthesis. Web search lets the agent decide mid-response whether to call web_search or web_fetch. Use Research when you want comprehensive coverage with citations; use Web search when you want the agent to look things up on its own judgment.

Web search

When Web search is on, the agent can call the web_search and web_fetch tools during its normal response. Unlike Research mode, there's no separate synthesis phase - the agent decides when and whether to search.

Toggle via the Web pill in the bottom row of the input bar.

Web search resets to off when the app restarts - it is not persisted to disk.

The auto-intent router can also enable web search automatically for a single turn when it detects recency-dependent patterns ("latest version", "current price") even if the toggle is off. It only promotes OFF→ON, never overrides a setting you've made.

Cloud Boost

The Boost pill routes the current message through a configured cloud provider (default: OpenRouter) when local model quality is insufficient for the task.

  1. Configure it at Settings → Cloud Boost - enter your API key.
  2. The Boost pill appears in the input bar between the textarea and the model selector.
  3. Click it to activate for the next send (it turns solid purple with a lightning icon).
  4. It resets to off after each send.

Boost is hidden when no cloud provider is configured or when air-gap mode is on. The QEL system can also activate Boost automatically when a verification fails and hardware headroom is low (BoostEscalationDecider).
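
The manual Boost behavior (visible only when a provider is configured and air-gap is off, armed for one send, then reset) can be modeled as a small one-shot toggle. The sketch below is illustrative only: BoostEnv, isBoostVisible, and BoostToggle are hypothetical names, and the automatic escalation path through BoostEscalationDecider is not modeled.

```typescript
// Hypothetical sketch of a one-shot toggle; names are not from the app's source.
interface BoostEnv {
  cloudProviderConfigured: boolean;
  airGap: boolean;
}

function isBoostVisible(env: BoostEnv): boolean {
  return env.cloudProviderConfigured && !env.airGap;
}

class BoostToggle {
  private armed = false;
  private readonly env: BoostEnv;

  constructor(env: BoostEnv) {
    this.env = env;
  }

  // Clicking the pill arms Boost for the next send, if the pill is available at all.
  arm(): void {
    if (isBoostVisible(this.env)) this.armed = true;
  }

  // Called on send: reports whether this message uses Boost, then resets to off.
  consume(): boolean {
    const useBoost = this.armed && isBoostVisible(this.env);
    this.armed = false;
    return useBoost;
  }
}
```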

Attaching files

Three ways to attach files:

  1. File picker: Click + → Add files or photos
  2. Drag and drop: Drag files onto the chat window - a drop overlay appears when files enter the window
  3. Paste image: Ctrl+V pastes an image from the clipboard directly into the attachment area

Attached files appear as chips above the input. Click the × on a chip to remove it before sending.

File contents are injected as inline text. Images are base64-encoded and require a vision-capable model for interpretation. Watch the context budget meter on the right side of the input bar - large attachments fill the context window quickly.
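
To get a feel for why large attachments matter, it helps to estimate how many characters each one adds to a request. The sketch below is a back-of-the-envelope aid, not the app's accounting: Attachment, AttachmentPlan, and planAttachment are hypothetical, and it assumes text files cost about one character per byte. The Base64 arithmetic (4 output characters per 3 input bytes, rounded up) is standard.

```typescript
// Hypothetical sketch: estimates payload size per attachment; not the app's real accounting.
interface Attachment {
  name: string;
  mimeType: string;
  sizeBytes: number;
}

interface AttachmentPlan {
  name: string;
  encoding: "inline-text" | "base64-image";
  payloadChars: number; // approximate characters added to the request
  needsVisionModel: boolean;
}

// Base64 emits 4 characters for every 3 input bytes, padding the final group.
function base64Length(sizeBytes: number): number {
  return 4 * Math.ceil(sizeBytes / 3);
}

function planAttachment(file: Attachment): AttachmentPlan {
  const isImage = file.mimeType.indexOf("image/") === 0;
  return {
    name: file.name,
    encoding: isImage ? "base64-image" : "inline-text",
    // Assumes roughly one character per byte for text files (true for ASCII).
    payloadChars: isImage ? base64Length(file.sizeBytes) : file.sizeBytes,
    needsVisionModel: isImage,
  };
}
```

For example, a 300,000-byte PNG becomes about 400,000 Base64 characters, a third larger than the file on disk.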

Voice input

Voice input transcribes microphone audio to text. It's off by default - the mic button only appears once you enable it.

  1. Go to Settings → Profile → Voice Input and toggle it on (stt.enabled).
  2. The mic button appears in the input bar.
  3. Click it to start recording. The button turns red with a pulsing ring that tracks mic amplitude.
  4. Click again to stop. Recording also stops automatically after 2 minutes.
  5. Transcribed text is appended to whatever is already in the input field.

Provider options (stt.provider):

  • ollama (default) - hits Ollama's /v1/audio/transcriptions endpoint (Ollama 0.5+) and falls back to /api/generate with base64 audio on older versions
  • openai - uses OpenAI Whisper (whisper-1); blocked in air-gap mode

For a dedicated faster-whisper-server, point stt.base_url at it (for example http://localhost:8000). The model defaults to large-v3 for Ollama and whisper-1 for OpenAI; override with stt.model.
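
The defaulting rules above can be summarized in a short sketch. The SttSettings shape and the function names are hypothetical stand-ins for the stt.enabled, stt.provider, stt.base_url, and stt.model settings; only the default model names and the air-gap restriction on the openai provider come from this page.

```typescript
// Hypothetical sketch of how the stt.* settings might resolve; field names are assumptions.
type SttProvider = "ollama" | "openai";

interface SttSettings {
  enabled: boolean;      // stt.enabled
  provider: SttProvider; // stt.provider
  baseUrl?: string;      // stt.base_url
  model?: string;        // stt.model
}

// An explicit stt.model wins; otherwise fall back to the provider default.
function resolveSttModel(settings: SttSettings): string {
  if (settings.model) return settings.model;
  return settings.provider === "openai" ? "whisper-1" : "large-v3";
}

// The openai provider is blocked in air-gap mode; voice input is off unless enabled.
function isSttAvailable(settings: SttSettings, airGap: boolean): boolean {
  if (!settings.enabled) return false;
  return !(settings.provider === "openai" && airGap);
}

// Example: pointing at a dedicated faster-whisper-server.
// Keeping the default provider here is an assumption of this sketch.
const dedicatedServer: SttSettings = {
  enabled: true,
  provider: "ollama",
  baseUrl: "http://localhost:8000",
};
```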

Persistent memory

The agent automatically saves facts from conversations using the save_memory tool and retrieves relevant ones before each response via query_memory. You don't configure this - it runs on every message.

Memories persist indefinitely in SQLite, scoped to your user account. Relevance is scored by combining BM25 with recency and access frequency. The context assembly layer keeps memory injection within 5–10% of the context window budget.
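
One way to picture a combined relevance score and a capped memory budget is the sketch below. Everything specific in it is an assumption made for illustration: the weights, the 30-day half-life, the four-characters-per-token estimate, and the names relevance and selectWithinBudget. The actual scoring formula is not documented here.

```typescript
// Hypothetical sketch: weights, half-life, and token estimate are illustrative assumptions.
interface MemoryCandidate {
  text: string;
  bm25: number;           // lexical match score from the search index
  lastAccessedMs: number; // epoch milliseconds
  accessCount: number;
}

const WEIGHTS = { bm25: 0.6, recency: 0.25, frequency: 0.15 };
const HALF_LIFE_DAYS = 30;
const MS_PER_DAY = 86400000;

function relevance(m: MemoryCandidate, nowMs: number, maxBm25: number): number {
  const lexical = maxBm25 > 0 ? m.bm25 / maxBm25 : 0; // normalize to 0..1
  const ageDays = Math.max(0, nowMs - m.lastAccessedMs) / MS_PER_DAY;
  const recency = Math.pow(0.5, ageDays / HALF_LIFE_DAYS); // halves every 30 days
  const frequency = 1 - 1 / (1 + Math.log(1 + m.accessCount)); // saturates below 1
  return WEIGHTS.bm25 * lexical + WEIGHTS.recency * recency + WEIGHTS.frequency * frequency;
}

// Pick the highest-scoring memories that fit in a fraction of the context window.
function selectWithinBudget(
  candidates: MemoryCandidate[],
  nowMs: number,
  contextWindowTokens: number,
  budgetFraction = 0.1,
): MemoryCandidate[] {
  const budget = Math.floor(contextWindowTokens * budgetFraction);
  const maxBm25 = Math.max(0, ...candidates.map((c) => c.bm25));
  const ranked = candidates
    .slice()
    .sort((a, b) => relevance(b, nowMs, maxBm25) - relevance(a, nowMs, maxBm25));
  const picked: MemoryCandidate[] = [];
  let used = 0;
  for (const m of ranked) {
    const cost = Math.ceil(m.text.length / 4); // rough token estimate
    if (used + cost > budget) continue;
    picked.push(m);
    used += cost;
  }
  return picked;
}
```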

Useful interactions:

  • "What do you remember about me?" - triggers an explicit query_memory call
  • "Forget [fact]" - the agent removes it from memory
  • View and delete memories at Settings → Knowledge

The Knowledge Base is separate from Memory. Memory = facts the agent extracted from conversations. Knowledge = content you added manually (via + → Add knowledge from URL, or Settings → Knowledge). The agent queries both, but via different tools (query_memory vs. query_knowledge).

Prompt templates

Prompt templates define how the agent behaves - its purpose, tone, tool usage policy, and output formatting. You can set separate defaults for Chat and Code mode; Bodega switches automatically when you change modes.

  1. Go to Settings → Prompts.
  2. The top of the section shows two Default Selector widgets: Chat Default and Code Default.
  3. Below is a template list with tabs: All / Built-in / Custom.
  4. Click + New to create a template. Built-in templates are read-only.
  5. Each template has five content fields: Purpose, Tone & Style, Tool Usage Policy, Safety & Permissions, Output Formatting.
  6. Click the lightning bolt icon on a template card to set it as the Chat or Code default.

Templates layer on top of the base system prompt. The lower-level persona.system_prompt override (Settings → Profile → System Prompt) predates the templates system and still works if you need full control.

The agent's display name is persona.name (default: Bodega). Change it at Settings → Profile.

Follow-up queue

You can type and send a message while the agent is still responding. The new message goes into a queue instead of interrupting.

  • A badge appears next to the stop button showing how many messages are queued (e.g. 2 queued).
  • The input placeholder changes to "Send a follow-up..." and the send button tooltip changes to "Add to queue".
  • Queued messages are injected between tool iterations, not mid-sentence.

Hitting Stop cancels the current response and clears all queued messages.
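
The queueing behavior can be sketched as a small class that the agent loop drains between tool iterations. This is an illustration only: FollowUpQueue and its methods are hypothetical, and draining every queued message at once (rather than one per iteration) is an assumption of this sketch.

```typescript
// Hypothetical sketch of a follow-up queue; names and drain policy are assumptions.
class FollowUpQueue {
  private pending: string[] = [];

  // Called when the user sends a message while the agent is responding.
  enqueue(message: string): number {
    this.pending.push(message);
    return this.pending.length;
  }

  // Text for the badge next to the stop button; empty when nothing is queued.
  badge(): string {
    return this.pending.length > 0 ? this.pending.length + " queued" : "";
  }

  // Called by the agent loop between tool iterations, never mid-sentence.
  drain(): string[] {
    const ready = this.pending;
    this.pending = [];
    return ready;
  }

  // Stop cancels the current response and discards everything queued.
  stop(): void {
    this.pending = [];
  }
}
```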

Context budget meter

The meter on the right side of the input bar shows how much of the model's context window is in use. It updates automatically after each message (the data comes from the streaming backend via debug frames).

The meter stays empty on a fresh session until the first message is sent. When it is close to full, use /compact to summarize older history and free space.

Auto-intent detection

The backend's ChatIntentRouter analyzes each message and can automatically promote feature toggles from off to on for a single request:

  • Recency-dependent query ("latest version", "current price") → enables web search for that turn
  • Comprehensive analysis request → enables research mode for that turn
  • Complex reasoning request → promotes extended thinking for that turn

It only promotes off→on. It never overrides a toggle you've explicitly set. The frontend shows routing hints in real time (e.g. Auto →Smart) but the backend is the authoritative router.
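
The off-to-on rule can be expressed as a logical OR between the user's toggles and what the router detects, applied to a per-turn copy. The sketch below is a simplified illustration: the patterns and the names Toggles and promoteForTurn are hypothetical, not ChatIntentRouter's actual logic, and restrictions such as air-gap mode are left out.

```typescript
// Hypothetical sketch: patterns and names are illustrative, not the real router.
interface Toggles {
  webSearch: boolean;
  research: boolean;
  extendedThinking: boolean;
}

const RECENCY_PATTERN = /\b(latest|current|today|right now|this week)\b/i;
const COMPREHENSIVE_PATTERN = /\b(comprehensive|in-depth|thorough|deep dive)\b/i;
const COMPLEX_PATTERN = /\b(prove|derive|step by step|trade-?offs?)\b/i;

// Returns the effective toggles for one request. The user's own settings are
// never mutated, and a toggle that is already on is never switched off.
function promoteForTurn(message: string, user: Toggles): Toggles {
  return {
    webSearch: user.webSearch || RECENCY_PATTERN.test(message),
    research: user.research || COMPREHENSIVE_PATTERN.test(message),
    extendedThinking: user.extendedThinking || COMPLEX_PATTERN.test(message),
  };
}
```

Using OR makes the one-directional guarantee structural: no detected pattern can turn a feature off, and the promotion disappears on the next turn because only a copy is changed.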

Slash commands

Type / in the input to open the command menu. Use the arrow keys or Tab to navigate, Enter to select, and Escape to dismiss.

Command               What it does
/export               Save the current conversation as a Markdown file. Opens a save dialog. Chat mode only - shows an info toast in Code mode.
/compact              Summarize older history to free context space. POSTs to the backend and refreshes the message list on success.
/clear                Clears the input field. Does not delete conversation history.
/mode ask|plan|act    Sets the permission mode for Code mode panels. Has no effect in Chat mode.
/preview [url]        Opens the Preview tab. Without a URL, shows the port picker (3000 / 5173 / 8080 / 4200). With a URL, jumps straight to it.
/map                  Opens the Bodega Map (file-dependency graph for the linked project). Shows an info toast if the map is disabled in settings.

Skill triggers (e.g. /commit, /review) also appear in the menu when configured. They pass through to the backend as messages rather than being handled client-side.
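
The split between client-side commands and pass-through skill triggers can be sketched as a small parser. This is a hypothetical illustration, not the app's code: ParsedCommand and parseSlashCommand are invented names, and the list of configured skill triggers is passed in as a plain array.

```typescript
// Hypothetical sketch of slash-command parsing; names and shapes are assumptions.
type PermissionMode = "ask" | "plan" | "act";

type ParsedCommand =
  | { kind: "export" }
  | { kind: "compact" }
  | { kind: "clear" }
  | { kind: "map" }
  | { kind: "mode"; mode: PermissionMode }
  | { kind: "preview"; url?: string }
  | { kind: "skill"; raw: string } // forwarded to the backend as a message
  | { kind: "none" };

function parseSlashCommand(input: string, skillTriggers: string[]): ParsedCommand {
  const text = input.trim();
  if (text.charAt(0) !== "/") return { kind: "none" };
  const parts = text.slice(1).split(/\s+/);
  const name = parts[0] ?? "";
  const arg = parts[1];
  switch (name) {
    case "export": return { kind: "export" };
    case "compact": return { kind: "compact" };
    case "clear": return { kind: "clear" };
    case "map": return { kind: "map" };
    case "mode":
      if (arg === "ask" || arg === "plan" || arg === "act") return { kind: "mode", mode: arg };
      return { kind: "none" };
    case "preview":
      // Without a URL the UI would fall back to the port picker.
      return arg ? { kind: "preview", url: arg } : { kind: "preview" };
    default:
      // Configured skill triggers are not handled client-side.
      if (skillTriggers.indexOf(name) !== -1) return { kind: "skill", raw: text };
      return { kind: "none" };
  }
}
```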

Exported files include user, assistant, and system message text only. Tool call results, thinking blocks, and attached file content are not included.
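
The export filter can be illustrated with a short sketch that keeps only user, assistant, and system text. The ChatMessage shape, the exportToMarkdown name, and the heading layout of the output are assumptions for this example; the real file layout may differ.

```typescript
// Hypothetical sketch: message shape and Markdown layout are assumptions.
type Role = "user" | "assistant" | "system" | "tool";

interface ChatMessage {
  role: Role;
  text: string;
  thinking?: string; // never exported
  attachments?: { name: string; content: string }[]; // never exported
}

function exportToMarkdown(messages: ChatMessage[]): string {
  const sections = messages
    .filter((m) => m.role !== "tool") // drop tool call results
    .map((m) => {
      const heading = m.role.charAt(0).toUpperCase() + m.role.slice(1);
      // Only the message text is written; thinking and attachments are skipped.
      return "## " + heading + "\n\n" + m.text.trim();
    });
  return sections.join("\n\n") + "\n";
}
```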

Keyboard shortcuts

Keys                  Action
Ctrl+1                Switch to Chat Mode
Enter                 Send message (or add to queue if agent is responding)
Shift+Enter           New line in the input
Escape                Dismiss slash command menu or close + dropdown
/export               Export conversation to Markdown
/compact              Summarize old history to free context space
/clear                Clear the input field
/mode ask|plan|act    Set permission mode (Code mode panels only)
/preview [url]        Open Preview tab
/map                  Open Bodega Map

This page mirrors the in-app docs hub for app version 1.0.0-beta.26.1.