
Tools & Skills

Bodega has 28 built-in tools that run automatically during agentic tasks, 12 skills you trigger with slash commands, and a set of client-side commands that control the app itself. This page covers what each one does, when it fires, and what to watch out for.

How tools and skills differ

Tools are functions the agent calls on its own during a response: reading a file, running a shell command, doing a web search. Each tool call appears as a card in the message stream showing what ran and what came back. In Ask mode, every tool call pauses for your approval before executing. The one exception is the auto-approve read-only tools setting (Settings → Agent, beta.28, off by default): when it is on, pure reads (search, grep, glob, symbol lookup, Map/memory/knowledge/session queries) skip the prompt, while writes, web, and shell calls still ask. Shell can never auto-approve in Ask mode.

Skills (/commit, /debug, etc.) are structured workflows you trigger explicitly. They load a pre-written policy into the agent's context that shapes how it approaches the task - which tools it's allowed to use, what order to do things in, and when to stop and ask you. You invoke them with a slash command in any AI panel.

Client-side slash commands (/help, /clear, /compact, /mode, /preview, /map, /export) are handled entirely in the app - nothing goes to the LLM.

File and code tools

Tool	What it does	Key limits
`file_system`	Read, write, append, delete, list, mkdir, rename, check existence. Sandboxed to your project workspace. Large files support pagination via `offset`/`length` params and a `nextOffset` return field.	100MB size limit. Blocked extensions: `.exe`, `.dll`, `.sh`, `.bat`, `.ps1`. The agent must read a file before writing it.
`grep`	Regex search across file contents using ripgrep. Returns matching lines with paths and line numbers.	50 results max, 1MB buffer, 25s timeout. Scoped to workspace.
`glob`	Finds files matching glob patterns, sorted by modification time.	100 results max. Patterns capped at 200 chars, max 5 `**` segments (ReDoS protection).
`str_replace`	Surgical find-and-replace inside a file. Whitespace must match exactly. Use `replace_all` for multiple occurrences.	Code Mode only - blocked in Chat Mode. Exact string match required; no fuzzy matching.
`find_symbol`	Looks up where a function, class, or type is defined. Returns file paths and line numbers. Exact matches rank first.	Index is built when your project opens - very large projects may have a brief delay before it's ready.
`code_search`	Semantic code search supporting symbol names, class definitions, and text patterns with file-type filtering.	500-char query limit. Max 100 results. Shell metacharacters are rejected.
`diff_file`	Shows the git diff for a specific file. Supports staged-only via an optional parameter.	10s timeout. Requires the project to be a git repo.
`run_tests`	Runs your test suite. Auto-detects Vitest, Jest, pytest, or `go test` from the project structure.	120s timeout, 2MB output buffer. Only available in plan, act, code, or debug modes - blocked in Ask mode.
`get_diagnostics`	Pulls the current type errors for a TypeScript/JavaScript file from the bundled language server. Use it before editing already-broken code, instead of running the whole compiler through the shell. Read-only, auto-approves.	Project must be open. Non-TS/JS files return a redirect note. Same bundled server as editor diagnostics - air-gap safe.
`dispatch_scout`	Sends a read-only sub-agent to investigate the codebase and report back a short digest, so the exploration (greps, reads, symbol lookups) doesn't fill the main conversation's context. Good for "how does X work?" or "trace the Z flow."	One scout at a time, 90s cap. The scout can't edit, run commands, or dispatch its own scout. Runs on your primary model.
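
The `glob` limits in the table can be sketched as a small pre-check. This is a minimal illustration, not Bodega's implementation: the function name is hypothetical, and counting a `**` segment as a path component that is exactly `**` is an assumption.

```python
MAX_PATTERN_CHARS = 200     # pattern length cap from the table
MAX_GLOBSTAR_SEGMENTS = 5   # maximum number of `**` segments from the table


def is_valid_glob_pattern(pattern: str) -> bool:
    """Return True if the pattern stays within both documented limits."""
    if len(pattern) > MAX_PATTERN_CHARS:
        return False
    # Assumption: a `**` segment is a path component that is exactly `**`.
    globstars = sum(1 for segment in pattern.split("/") if segment == "**")
    return globstars <= MAX_GLOBSTAR_SEGMENTS
```

Rejecting oversized or deeply recursive patterns before matching is a common way to bound the work a pattern can trigger, which is the ReDoS concern the table mentions.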

Shell tool

The shell tool runs commands in your project directory: git operations, package managers, test runners, compilers, linters - anything you'd run in a terminal.

Every command is classified into one of three tiers before it runs:

SAFE (auto-approve in Act mode): ls, cat, git status, and similar read-only commands.
MODERATE (brief confirmation): npm install, git commit, and similar.
DANGEROUS (full approval dialog): rm, chmod, git push, curl, and similar.

In Ask mode, shell commands always show an approval card regardless of tier - there is no auto-approve timeout.
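
A rough sketch of how tiering like this could work, using only the example commands listed above. The prefix tables, the function names, and the rule that unknown commands fall into the strictest tier are all assumptions made for illustration. Only Ask and Act modes are modeled.

```python
import shlex

# Illustrative tables seeded from the examples above; the real lists are longer.
SAFE = {("ls",), ("cat",), ("git", "status")}
MODERATE = {("npm", "install"), ("git", "commit")}
DANGEROUS = {("rm",), ("chmod",), ("git", "push"), ("curl",)}


def classify(command: str) -> str:
    """Map a command line to a tier by matching its leading words."""
    words = tuple(shlex.split(command))
    for tier, table in (("DANGEROUS", DANGEROUS), ("MODERATE", MODERATE), ("SAFE", SAFE)):
        if any(words[: len(prefix)] == prefix for prefix in table):
            return tier
    return "DANGEROUS"  # assumption: unknown commands get the strictest tier


def approval_for(command: str, mode: str) -> str:
    """Describe the approval step for a command in Ask or Act mode."""
    if mode == "ask":
        return "approval card"  # Ask mode always prompts, regardless of tier
    if mode != "act":
        raise ValueError("only ask and act are modeled in this sketch")
    return {
        "SAFE": "auto-approve",
        "MODERATE": "brief confirmation",
        "DANGEROUS": "full approval dialog",
    }[classify(command)]
```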

After every execution, output is scanned for 13 credential patterns, including SSH private keys, AWS AKIA* keys, GitHub ghp_* PATs, OpenAI sk-* keys, JWTs, and high-entropy base64 strings over 40 chars.
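
To make the idea concrete, here is a minimal sketch of an output scanner covering five of the pattern families named above. The regular expressions are approximations written for this example, not Bodega's actual patterns, and the high-entropy base64 check is left out.

```python
import re

# Approximate patterns for illustration; the real scanner may differ.
CREDENTIAL_PATTERNS = {
    "ssh_private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def scan(output: str) -> list:
    """Return the names of every pattern found in the command output."""
    return [name for name, pattern in CREDENTIAL_PATTERNS.items() if pattern.search(output)]
```

For example, `scan("export KEY=AKIAIOSFODNN7EXAMPLE")` flags the AWS pattern, while ordinary build output returns an empty list.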

Air-gap mode blocks all network-touching shell commands: curl, wget, ssh, git clone/push/pull, npm install, pip install, docker pull.

Web tools

Both web tools are blocked entirely in air-gap mode.

Tool	What it does	Key limits
`web_search`	Searches the web via DuckDuckGo. No API key needed. Returns up to 8 results with title, URL, and snippet.	20s timeout.
`web_fetch`	Fetches a URL, strips HTML for readability, returns up to 500KB of text. Used to read docs, articles, and API references.	SSRF protection blocks all private IPs (`127.x`, `10.x`, `192.168.x`, `172.16–31.x`, `169.254.x`, localhost, `0.0.0.0`). HTTP/HTTPS only. 30s timeout.
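
The SSRF rule for `web_fetch` can be sketched with Python's standard `ipaddress` module. This is an illustration under stated assumptions: the function name is hypothetical, only the IPv4 ranges listed in the table are checked, and hostnames are not resolved (a production check would resolve the name and test every returned address).

```python
import ipaddress
from urllib.parse import urlparse

# The private and special ranges named in the table, as CIDR blocks.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8",
        "10.0.0.0/8",
        "192.168.0.0/16",
        "172.16.0.0/12",   # 172.16.x through 172.31.x
        "169.254.0.0/16",
        "0.0.0.0/32",
    )
]


def is_fetch_allowed(url: str) -> bool:
    """Return True if the URL passes the scheme and private-address checks."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if host is None or host == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a hostname, not an IP literal; DNS resolution is out of scope here
    return not any(address in network for network in BLOCKED_NETWORKS)
```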

Memory and knowledge tools

Tool	What it does	Notes
`save_memory`	Stores a key-value fact in persistent memory, scoped to your user account. Facts survive across all sessions and are injected into future context automatically.	Rate-limited per session to prevent memory overload. Say "Remember that..." to trigger this directly.
`query_memory`	Searches your persistent memory by keyword across session, shared, and project scopes. Returns up to 10 results, deduplicated (shared scope preferred over project over session).	3s timeout.
`query_knowledge`	Searches your Knowledge Base using a 5-tier strategy: semantic embedding search, FTS5 full-text, LIKE search, full message scan, empty fallback.	Semantic tier requires an embeddings model configured and indexed (Settings → Models → Codebase Embeddings).
`scratchpad`	In-memory notepad for planning multi-step tasks. The agent writes notes, checks them, and clears them as it works.	Session-scoped only - not saved to disk or database.
`query_map`	Asks a natural-language question about your project's codebase and returns a grounded answer with source file citations. Uses semantic search over the codebase index. Never throws if the index hasn't been built yet - it tells you to build the index first.	Requires an embeddings model (Settings → Models → Codebase Embeddings) and the project to be open. 65s timeout. Air-gap: local provider only.
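
The `query_memory` deduplication rule (shared preferred over project, project over session, at most 10 results) might look like the following. The record shape with `key`, `scope`, and `value` fields is a hypothetical one chosen for the example.

```python
# Lower rank wins when the same key appears in more than one scope.
SCOPE_RANK = {"shared": 0, "project": 1, "session": 2}
MAX_RESULTS = 10


def dedupe_hits(hits: list) -> list:
    """Keep one hit per key, preferring shared over project over session."""
    best = {}
    for hit in hits:
        kept = best.get(hit["key"])
        if kept is None or SCOPE_RANK[hit["scope"]] < SCOPE_RANK[kept["scope"]]:
            best[hit["key"]] = hit
    return list(best.values())[:MAX_RESULTS]
```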

Session and coordination tools

Tool	What it does
`query_session`	Searches the current session's message history - lets the agent refer back to earlier in a long conversation.
`link_session`	Creates parent-child, fork, or merge relationships between sessions. Used for cross-session coordination and Fleet Parallel worktree tracking.
`todo_write`	Creates and manages a session-scoped TODO list for multi-step tasks. Every third tool call, the agent gets a reminder of open items to prevent context drift.
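
The reminder cadence described for `todo_write` can be illustrated with a small counter. The class and method names are hypothetical, and the reminder text is invented for the example.

```python
class TodoList:
    """Session-scoped TODO list that surfaces open items every third tool call."""

    REMIND_EVERY = 3

    def __init__(self):
        self.items = {}      # item text -> done flag
        self.tool_calls = 0

    def add(self, text: str) -> None:
        self.items[text] = False

    def complete(self, text: str) -> None:
        self.items[text] = True

    def on_tool_call(self):
        """Count a tool call; return a reminder string on every third call."""
        self.tool_calls += 1
        open_items = [text for text, done in self.items.items() if not done]
        if self.tool_calls % self.REMIND_EVERY == 0 and open_items:
            return "Open TODOs: " + "; ".join(open_items)
        return None
```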

Vision and preview tools

These two tools work together when Bodega is interacting with a live web app in the Preview tab.

preview_interaction lets the agent drive the Preview tab (an embedded browser). Five actions:

screenshot - captures a PNG and returns an img_XXXXXXXX handle
navigate - loads a localhost URL (only localhost accepted)
click - clicks an element by CSS selector (approval-gated by default)
getDom - returns outerHTML for a selector (output capped at 8,000 chars)
getConsoleErrors - returns sanitized JS console errors

vision_query takes an image_id from a prior screenshot and a plain-English question (max 500 chars), sends both to your configured Vision Language Model, and returns a text answer. This is how a text-only loop driver (e.g., Claude) can "see" the screen.

To use vision features: configure a VLM in Settings → Models → Vision Model, and have the Preview tab open with a dev server running.
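
The limits on these two tools can be sketched as simple guards. Everything here is illustrative: the function names are hypothetical, treating "localhost only" as a hostname that is literally `localhost` is an assumption, and `navigate` is assumed to receive a full URL with a scheme.

```python
from urllib.parse import urlparse

DOM_OUTPUT_CAP = 8000   # getDom output cap, in characters
QUESTION_CAP = 500      # vision_query question cap, in characters


def is_allowed_preview_url(url: str) -> bool:
    """Accept only http(s) URLs whose host is localhost."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and parts.hostname == "localhost"


def cap_dom(outer_html: str) -> str:
    """Truncate getDom output to the documented cap."""
    return outer_html[:DOM_OUTPUT_CAP]


def build_vision_query(image_id: str, question: str) -> dict:
    """Pair a screenshot handle with a question, enforcing the length limit."""
    if not image_id.startswith("img_"):
        raise ValueError("image_id must be a handle returned by screenshot")
    if len(question) > QUESTION_CAP:
        raise ValueError("question exceeds 500 characters")
    return {"image_id": image_id, "question": question}
```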

Document and research tools

Tool	What it does	Notes
`create_document`	Generates a structured Markdown document artifact from a chat turn. Triggered when the agent classifies your request as document intent.	Chat Mode only. 50,000-char output cap.
`deep_research`	Multi-step parallel web research. Runs multiple DuckDuckGo queries in parallel, fetches pages, and synthesizes a structured answer with citations. Progress shown in real time: Planning → Searching → Synthesizing.	Blocked in air-gap mode. Max 10 queries per turn, 45s total timeout. Enable via the Research toggle in the `+` menu.
`convert_to_markdown`	Converts HTML, CSV, or JSON content to clean Markdown. HTML becomes headings/links/lists/tables; CSV becomes a table; JSON becomes a fenced code block.	100KB input / 50KB output limits, 5s timeout.
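
As an illustration of the CSV case in `convert_to_markdown`, the sketch below turns CSV text into a Markdown table using Python's standard `csv` module. It is not the tool's code: the function name is hypothetical, the first row is assumed to be the header, and the size and time limits are not modeled.

```python
import csv
import io


def csv_to_markdown(text: str) -> str:
    """Convert CSV text to a Markdown table, treating the first row as the header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header, *body = rows

    def format_row(cells):
        # Pad short rows and drop extra cells so every row matches the header width.
        cells = (list(cells) + [""] * len(header))[: len(header)]
        return "| " + " | ".join(cell.replace("|", "\\|") for cell in cells) + " |"

    lines = [format_row(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines.extend(format_row(row) for row in body)
    return "\n".join(lines)
```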

Authoring a skill with /learn

The learn_skill tool turns "I keep doing this same workflow" into a reusable skill. Point it at a source - a folder inside your workspace, or a URL - and it has the model draft a spec-conformant skill YAML, validates the draft, and returns it as a preview for you to review.

This first version stops at the preview: it reads, drafts, and validates, but it does not write the skill or change your skill registry yet - the save/approve/reload steps come later. So you can use it today to see exactly what a skill for a given source would look like before anything is committed.

What it needs:

source - directory (a path inside the workspace) or url.
path_or_url - the folder path or the http(s) URL to learn from.
skill_name - lowercase letters, numbers, hyphens, or underscores.
skill_description - a one-line summary of what the skill does and when to use it.
triggers (optional) - comma-separated trigger phrases; defaults to /<skill_name>.

Safety: directory sources go through the workspace sandbox (no escaping your project), URL sources go through the SSRF-protected fetcher, and the source is truncated before authoring. Under air-gap, URL sources are refused with a clear message - directory sources still work. If the model produces invalid YAML, you get an error rather than a malformed skill.
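
The argument rules above can be sketched as a validation step. This is a hypothetical helper, not the tool's actual code: the function name, the returned dictionary shape, and the `air_gap` parameter are assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Lowercase letters, numbers, hyphens, or underscores.
SKILL_NAME_RE = re.compile(r"[a-z0-9_-]+")


def normalize_learn_args(source, path_or_url, skill_name, skill_description,
                         triggers=None, air_gap=False):
    """Validate learn_skill arguments and fill in the default trigger."""
    if source not in ("directory", "url"):
        raise ValueError("source must be 'directory' or 'url'")
    if source == "url":
        if air_gap:
            raise ValueError("URL sources are refused under air-gap")
        if urlparse(path_or_url).scheme not in ("http", "https"):
            raise ValueError("path_or_url must be an http(s) URL")
    if not SKILL_NAME_RE.fullmatch(skill_name):
        raise ValueError("skill_name allows lowercase letters, numbers, hyphens, underscores")
    if not skill_description.strip() or "\n" in skill_description:
        raise ValueError("skill_description must be a one-line summary")
    if triggers:
        trigger_list = [t.strip() for t in triggers.split(",") if t.strip()]
    else:
        trigger_list = ["/" + skill_name]  # documented default: /<skill_name>
    return {
        "source": source,
        "path_or_url": path_or_url,
        "skill_name": skill_name,
        "skill_description": skill_description,
        "triggers": trigger_list,
    }
```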

Hallucination auto-correction

When the LLM calls a tool name that doesn't exist, Bodega auto-corrects it against a large alias table (40+ mappings) before the call fails. The correction is logged in the tool call card so you can see what happened.

Examples of what gets redirected:

bash, exec, run, terminal → shell
read_file, write_file, list_dir → file_system
search, rg, ripgrep, grep_search → grep
find_files, list_files → glob
string_replace, edit_file → str_replace
browse, fetch_url, navigate → web_fetch
remember, store_memory → save_memory
search_knowledge, recall → query_knowledge
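
A minimal sketch of alias resolution built from the example mappings above. The table holds only the listed examples, the registry is a small subset of the real tool list, and checking registered names before aliases is an assumption.

```python
# Example aliases from the list above; the real table has 40+ mappings.
ALIASES = {
    "bash": "shell", "exec": "shell", "run": "shell", "terminal": "shell",
    "read_file": "file_system", "write_file": "file_system", "list_dir": "file_system",
    "search": "grep", "rg": "grep", "ripgrep": "grep", "grep_search": "grep",
    "find_files": "glob", "list_files": "glob",
    "string_replace": "str_replace", "edit_file": "str_replace",
    "browse": "web_fetch", "fetch_url": "web_fetch", "navigate": "web_fetch",
    "remember": "save_memory", "store_memory": "save_memory",
    "search_knowledge": "query_knowledge", "recall": "query_knowledge",
}

# Subset of registered tool names, for illustration only.
REGISTERED = {
    "shell", "file_system", "grep", "glob", "str_replace",
    "web_fetch", "save_memory", "query_knowledge",
}


def resolve_tool(name: str):
    """Return (tool_name, correction_note); the note is None when no fix was needed."""
    if name in REGISTERED:  # assumption: exact registered names are never rewritten
        return name, None
    target = ALIASES.get(name)
    if target is None:
        raise KeyError("unknown tool: " + name)
    return target, "corrected " + name + " -> " + target
```

The returned note stands in for the correction that is logged on the tool call card.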

One known quirk: the alias code_search in the correction map redirects to grep, not to the code_search tool. If the agent calls code_search by name, it gets grep behavior. The actual code_search tool is only triggered when the LLM uses its exact registered name.

Skills - structured slash command workflows

Skills load a policy into the agent that controls how it approaches the task. Type / in any AI panel to see autocomplete.

Skill	What it does	Mode
`/commit`	Stage changes, generate a conventional commit message (`type(scope): description`), and commit. At least 4 shell calls: `git status`, `git diff`, `git add`, `git commit`.	Code Mode only (needs shell)
`/debug`	Hypothesis-driven bug investigation: read the code, form a hypothesis, test it, apply a minimal fix, verify.	Any panel
`/docs`	Write JSDoc or docstring comments for every export, class, and public method in a specified file.	Any panel
`/explain`	Explain what code does: purpose, data flow, key patterns, dependencies. Reads the file and traces imports.	Any panel
`/generate`	Generate a new file from a plain-English description. Asks clarifying questions if the request is vague, then matches existing project patterns and verifies with `tsc --noEmit`.	Code Mode (needs file writes + shell)
`/perf`	Identify performance hotspots - O(n²) loops, missing memoization, N+1 queries. Presents findings ranked by impact. Does not apply changes automatically.	Any panel
`/refactor`	Structural refactoring with plan-first approval. Reads the file and its dependents, presents the plan, waits for your go-ahead, then applies and verifies.	Code Mode
`/review`	Code quality review producing Critical / Warnings / Suggestions sections, ending with a merge-safety assessment. Without a file argument, reviews the most recently changed file (`git diff --name-only HEAD~1`).	Any panel
`/security`	Static security audit: credential patterns, OWASP top 10 vulnerabilities, unsafe patterns (`eval`, unsanitized `exec`, path traversal). Produces Critical / Warnings / Informational report. Does not modify the file unless you ask.	Any panel
`/test`	Generate tests for a file or function: happy path, error path, and edge cases. Runs the tests and iterates on failures before reporting. Test files stay under 400 lines.	Code Mode (needs shell to run tests)
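
The commit message shape used by `/commit`, `type(scope): description`, can be checked with a short regular expression. This sketch assumes the scope is always present, as in the format shown in the table; the function name is hypothetical.

```python
import re

# type(scope): description, with a lowercase type and a non-empty scope.
COMMIT_SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<description>\S.*)$"
)


def parse_commit_subject(subject: str):
    """Return the parts of a conventional commit subject, or None if it doesn't match."""
    match = COMMIT_SUBJECT_RE.match(subject)
    return match.groupdict() if match else None
```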

Client-side slash commands

These are handled entirely in the app - nothing goes to the LLM.

Command	What it does	Available in
`/clear`	Clears the current input field. Does not clear conversation history.	Any panel
`/help`	Lists the available slash commands in a toast.	Any panel
`/compact`	Summarizes older conversation history to free up context window space. A toast confirms and shows how many messages were summarized.	Any panel (requires an open session)
`/mode ask` / `/mode plan` / `/mode act`	Sets the Agent panel permission mode.	Agent panel (Code Mode)
`/export`	Exports the current chat conversation as a Markdown file. Opens a native save dialog pre-filled with the session title.	Chat Mode only
`/preview`	Opens the Preview tab. Bare `/preview` opens a port picker (common ports: 3000, 5173, 8080, 4200) or re-opens the last URL. `/preview localhost:5173` jumps directly.	Any panel (Code Mode)
`/map`	Opens the Bodega Map codebase visualization panel. Requires Dockview layout - if the map doesn't open, check Settings → Layout.	Any panel (Code Mode)

Claude Fast Mode

Claude Fast Mode skips extended thinking on Claude models to get faster replies without switching to a smaller model. Enable it with the Fast toggle in the message composer (next to the reasoning control, shown for Claude models).

The priority order for reasoning control:

Per-message reasoning pill in the composer (always wins)
Claude Fast Mode toggle
Global reasoning effort default (Settings → Models)

Fast Mode only affects Claude models - other providers are unaffected. If you've set a per-message reasoning level in the composer, Fast Mode is ignored for that message.
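
The priority order reads naturally as a small resolution function. The function name, the parameter names, and the `"off"` level used to represent skipped extended thinking are hypothetical labels chosen for this sketch.

```python
from typing import Optional


def effective_reasoning(per_message: Optional[str], fast_mode: bool,
                        is_claude: bool, global_default: str) -> str:
    """Resolve the reasoning level using the documented priority order."""
    if per_message is not None:
        return per_message      # 1. the per-message pill always wins
    if fast_mode and is_claude:
        return "off"            # 2. Fast Mode skips extended thinking (Claude only)
    return global_default       # 3. fall back to the global default
```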

Custom Agents

Custom agents let you define named profiles with a tailored system prompt, an optional pinned model, a tool allowlist, and an iteration cap.

To create a custom agent:

Go to Settings → Custom Agents
Fill in: Name (required, max 100 chars), System Prompt (required, max 6,000 chars), and optionally Description, a pinned Model, Tool Allowlist, Read-only filesystem, and Max Iterations (1–50)
Leave the tool allowlist empty to allow all tools. Add specific tools to restrict what the agent can call.
Save.

To use a custom agent:

Open Code Mode and find the Agent panel
In the Agent panel header, click the profile picker dropdown (robot icon, shows "Default" when none selected). The picker is hidden if you have no custom agents.
Select a custom agent. The next message you send applies that profile.
Switch back to Default to remove the profile.

If a selected agent is deleted, the picker falls back to Default automatically. Tool allowlists are validated against the live tool registry - invalid tool names are rejected at create time.
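
The field limits and allowlist behavior described above can be sketched as follows. The `CustomAgent` dataclass and both helper functions are hypothetical; only the limits (name up to 100 chars, system prompt up to 6,000 chars, 1 to 50 iterations, empty allowlist means all tools) come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CustomAgent:
    name: str
    system_prompt: str
    tool_allowlist: List[str] = field(default_factory=list)
    max_iterations: Optional[int] = None


def validate_agent(agent: CustomAgent, registry: set) -> list:
    """Return a list of validation errors; an empty list means the profile is valid."""
    errors = []
    if not 1 <= len(agent.name) <= 100:
        errors.append("name is required, max 100 chars")
    if not 1 <= len(agent.system_prompt) <= 6000:
        errors.append("system prompt is required, max 6,000 chars")
    if agent.max_iterations is not None and not 1 <= agent.max_iterations <= 50:
        errors.append("max iterations must be between 1 and 50")
    unknown = [tool for tool in agent.tool_allowlist if tool not in registry]
    if unknown:
        errors.append("unknown tools: " + ", ".join(unknown))
    return errors


def allowed_tools(agent: CustomAgent, registry: set) -> set:
    """An empty allowlist allows every registered tool."""
    return set(agent.tool_allowlist) if agent.tool_allowlist else set(registry)
```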

Permission mode (ask/plan/act) and sandbox rules are global settings, not agent-derived.

ACP Agent Server - external agents as Fleet members

The ACP (Agent Client Protocol) server lets external coding agents run as Fleet members inside Bodega. Supported agents: Gemini CLI, Claude Code, Codex, and Cursor.

Configure at Settings → ACP Agents. What each one needs:

Gemini CLI - GEMINI_API_KEY and @google/gemini-cli installed globally
Claude Code - ANTHROPIC_API_KEY and @zed-industries/claude-code-acp installed
Codex - OPENAI_API_KEY and @zed-industries/codex-acp installed
Cursor - cursor-agent CLI installed and cursor-agent login run once (subscription auth, no API key)

Once configured, ACP agents appear as Fleet session options. Communication is newline-delimited JSON-RPC (NDJSON-RPC) over stdin/stdout: Bodega spawns the agent as a child process and routes prompts through the ACP session protocol.
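
Newline-delimited framing of JSON-RPC messages over a pipe can be sketched like this. The helper names are hypothetical, and the method name in the usage note is a placeholder, not a real ACP method.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so each message stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_stream(buffer: bytes):
    """Split a byte buffer into complete messages plus any trailing partial line."""
    *complete, rest = buffer.split(b"\n")
    return [json.loads(line) for line in complete if line.strip()], rest
```

A reader would append each chunk from the child process's stdout to the leftover bytes and call `decode_stream` again, so a message split across two reads is parsed only once it is complete. For example, `encode_message({"jsonrpc": "2.0", "id": 1, "method": "example/prompt"})` produces one line ending in a newline.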

What ACP agents can and can't do:

File system and shell access routes through Bodega's own tools - sandbox and air-gap rules still apply
ACP agents do not go through QEL verification - they show an "external - not QEL-verified" badge
Blocked entirely in air-gap mode - enabling air-gap kills any running ACP subprocesses

Managed llama.cpp embedding server

When llama.cpp is your embeddings provider, Bodega can manage the embedding server process for you. The embedding server is separate from the chat server (port 8080) and runs on port 8081 by default.

To set it up:

Install an embedding-capable GGUF first: go to Models → llama.cpp → Discover and download a model such as nomic-embed-text or a bge model
Go to Settings → Knowledge → Search & Embeddings (or Settings → Models → Codebase Embeddings)
Set provider to llama.cpp
Enable the "Let Bodega manage the embedding server" toggle
In the GGUF dropdown, select an installed model (the dropdown only shows models from your llama.cpp library)
Optionally set a custom port (default: 8081)

The embedding server starts with the --embedding flag and is fully independent of the chat server. If you'd rather manage the process yourself, leave the toggle off and type the path manually.
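
If you manage the process yourself, the launch command might be assembled along these lines. This is a sketch under assumptions: the `llama-server` binary name and the `-m` and `--port` options are taken from llama.cpp's server CLI rather than from this page, the model path is a placeholder, and the exact arguments Bodega passes are not documented here.

```python
def embedding_server_argv(gguf_path: str, port: int = 8081,
                          binary: str = "llama-server") -> list:
    """Build an argv list for a llama.cpp server running in embedding mode."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    # --embedding is the flag this page mentions; -m and --port are assumed
    # from llama.cpp's server command line.
    return [binary, "-m", gguf_path, "--embedding", "--port", str(port)]
```

The resulting list can be passed to `subprocess.Popen` to start the server on the default embedding port.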

Local code review in the Git panel

The Review button in the Git panel runs an AI code review against your current git diff - the "review before I commit" case. It uses your configured LLM provider, so it works fully locally.

To use it:

Make changes to your project files (staged, unstaged, or both)
Open the Git panel in Code Mode (Activity Bar → git icon)
Click Review (next to the Generate commit message button)
The review result appears inline as a markdown block in the Git panel

If the working tree is clean, the review falls back to comparing the branch against its base (the PR-scope view). Large diffs are clipped and a truncation note is appended.

Delta-only re-review (beta.28): files unchanged since your last review of the project are skipped - the result notes how many, with a Review everything link for a full pass.
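
One way to picture delta-only re-review is a content-hash comparison against the previous review. How Bodega actually detects unchanged files is not stated on this page, so the hashing approach, the function names, and the data shapes below are all assumptions.

```python
import hashlib


def digest(content: str) -> str:
    """Hash file content so it can be compared against the last review."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def split_for_review(files: dict, last_reviewed: dict):
    """Split files into (to_review, skipped) based on stored content digests.

    files maps path -> current content; last_reviewed maps path -> digest
    recorded at the previous review.
    """
    to_review, skipped = [], []
    for path, content in sorted(files.items()):
        if last_reviewed.get(path) == digest(content):
            skipped.append(path)
        else:
            to_review.append(path)
    return to_review, skipped
```

The length of the skipped list corresponds to the count the review result reports, and a full pass amounts to calling the function with an empty `last_reviewed`.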

This is separate from the /review skill. The /review skill reads source files and runs a code quality analysis. The Git panel Review button reads the git diff and focuses on what changed.

Keyboard shortcuts

Command	Action
`/commit`	Stage, write a commit message, and commit
`/debug`	Hypothesis-driven bug investigation
`/decompose`	Turn an objective into a persistent goal with verifiable tasks
`/docs`	Generate JSDoc/docstring documentation for a file
`/explain`	Explain what code does
`/generate`	Generate a new file from a description
`/onboard`	Tour an unfamiliar repo and save the findings as project knowledge
`/perf`	Analyze code for performance issues
`/refactor`	Refactor with plan-first approval
`/review`	Code quality review with merge-safety assessment
`/security`	Static security audit
`/test`	Generate and run tests for a file
`/clear`	Clear the current input field
`/compact`	Summarize older conversation history
`/mode ask`	Set Agent panel to Ask mode
`/mode plan`	Set Agent panel to Plan mode
`/mode act`	Set Agent panel to Act mode
`/export`	Export current chat as a Markdown file (Chat Mode)
`/preview`	Open the Preview tab browser
`/map`	Open the Bodega Map codebase visualization

This page mirrors the in-app docs hub for app version 1.0.0-beta.32.1.