Tools & Skills
Bodega has 24 built-in tools that run automatically during agentic tasks, 10 skills you trigger with slash commands, and a set of client-side commands that control the app itself. This page covers what each one does, when it fires, and what to watch out for.
How tools and skills differ
Tools are functions the agent calls on its own during a response - reading a file, running a shell command, doing a web search. You see each tool call as a card in the message stream showing what ran and what came back. In Ask mode, writes and shell commands pause for your approval before executing.
Skills (/commit, /debug, etc.) are structured workflows you trigger explicitly. They load a pre-written policy into the agent's context that shapes how it approaches the task - which tools it's allowed to use, what order to do things in, and when to stop and ask you. You invoke them with a slash command in any AI panel.
Client-side slash commands (/help, /clear, /compact, /mode, /preview, /map, /export) are handled entirely in the app - nothing goes to the LLM.
File and code tools
| Tool | What it does | Key limits |
|---|---|---|
| file_system | Read, write, append, delete, list, mkdir, rename, check existence. Sandboxed to your project workspace. Large files support pagination via offset/length params and a nextOffset return field. | 100MB size limit. Blocked extensions: .exe, .dll, .sh, .bat, .ps1. The agent must read a file before writing it. |
| grep | Regex search across file contents using ripgrep. Returns matching lines with paths and line numbers. | 50 results max, 1MB buffer, 25s timeout. Scoped to workspace. |
| glob | Finds files matching glob patterns, sorted by modification time. | 100 results max. Patterns capped at 200 chars, max 5 ** segments (ReDoS protection). |
| str_replace | Surgical find-and-replace inside a file. Whitespace must match exactly. Use replace_all for multiple occurrences. | Code Mode only - blocked in Chat Mode. Exact string match required; no fuzzy matching. |
| find_symbol | Looks up where a function, class, or type is defined. Returns file paths and line numbers. Exact matches rank first. | Index is built when your project opens - very large projects may have a brief delay before it's ready. |
| code_search | Semantic code search supporting symbol names, class definitions, and text patterns with file-type filtering. | 500-char query limit. Max 100 results. Shell metacharacters are rejected. |
| diff_file | Shows the git diff for a specific file. Supports staged-only via an optional parameter. | 10s timeout. Requires the project to be a git repo. |
| run_tests | Runs your test suite. Auto-detects Vitest, Jest, pytest, or go test from the project structure. | 120s timeout, 2MB output buffer. Only available in plan, act, code, or debug modes - blocked in Ask mode. |
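The offset/length/nextOffset pagination that file_system offers for large files can be pictured as a simple loop. This is a minimal sketch, not Bodega's implementation: `read_chunk` is a hypothetical in-memory stand-in for the tool call, and the idea that nextOffset is empty once the file is exhausted is an assumption.

```python
from typing import List, Optional, TypedDict


class Chunk(TypedDict):
    content: str
    nextOffset: Optional[int]  # assumed: None once the file is exhausted


def read_chunk(data: str, offset: int, length: int) -> Chunk:
    """Hypothetical stand-in for one paginated file_system read."""
    end = min(offset + length, len(data))
    return {"content": data[offset:end], "nextOffset": end if end < len(data) else None}


def read_all(data: str, length: int = 4) -> str:
    """Follow nextOffset until the stand-in reports no further pages."""
    parts: List[str] = []
    offset: Optional[int] = 0
    while offset is not None:
        chunk = read_chunk(data, offset, length)
        parts.append(chunk["content"])
        offset = chunk["nextOffset"]
    return "".join(parts)
```

The caller never computes offsets itself; it only passes back whatever nextOffset the previous page returned.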
Shell tool
The shell tool runs commands in your project directory: git operations, package managers, test runners, compilers, linters - anything you'd run in a terminal.
Every command is classified into one of three tiers before it runs:
- SAFE (auto-approve in Act mode): ls, cat, git status, and similar read-only commands.
- MODERATE (brief confirmation): npm install, git commit, and similar.
- DANGEROUS (full approval dialog): rm, chmod, git push, curl, and similar.
In Ask mode, shell commands always show an approval card regardless of tier - there is no auto-approve timeout.
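As a rough illustration of the tiering described above, the sketch below classifies a command by its leading words. It only knows the example commands named in the tier list. The rule table, the fallback to DANGEROUS for unknown commands, and the function names are assumptions rather than Bodega's actual classifier.

```python
import shlex
from enum import Enum


class Tier(Enum):
    SAFE = "safe"
    MODERATE = "moderate"
    DANGEROUS = "dangerous"


# Only the example commands from the tier list; the full rule set is not documented here.
RULES = [
    (Tier.DANGEROUS, {"rm", "chmod", "git push", "curl"}),
    (Tier.MODERATE, {"npm install", "git commit"}),
    (Tier.SAFE, {"ls", "cat", "git status"}),
]


def classify(command: str) -> Tier:
    tokens = shlex.split(command)
    # Try two-word keys such as "git status" first, then the bare program name.
    for key in (" ".join(tokens[:2]), " ".join(tokens[:1])):
        for tier, names in RULES:
            if key in names:
                return tier
    # Assumption: anything unrecognized gets the strictest tier.
    return Tier.DANGEROUS


def auto_approved(tier: Tier, mode: str) -> bool:
    """Only SAFE commands in Act mode skip the approval step."""
    return mode == "act" and tier is Tier.SAFE
```

A prefix match like this is deliberately naive: it does not look inside pipelines or subshells, which a real classifier would need to handle.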
After every execution, output is scanned for 13 credential patterns, including SSH private keys, AWS AKIA* keys, GitHub ghp_* PATs, OpenAI sk-* keys, JWTs, and high-entropy base64 strings over 40 chars.
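A scan of this kind is usually a set of regular expressions run over the captured output. The sketch below covers five of the families named above using commonly published formats. The exact expressions Bodega uses are not documented here, so treat these as assumptions. The high-entropy base64 check is left out.

```python
import re
from typing import List

# Illustrative patterns only; not Bodega's actual rule set.
PATTERNS = {
    "ssh_private_key": re.compile(r"-----BEGIN (?:OPENSSH|RSA|DSA|EC) PRIVATE KEY-----"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def scan(output: str) -> List[str]:
    """Return the names of every pattern family found in the command output."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(output)]
```

Returning pattern names rather than the matched text keeps the secret itself out of any log or UI that reports the finding.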
Air-gap mode blocks all network-touching shell commands: curl, wget, ssh, git clone/push/pull, npm install, pip install, docker pull.
Web tools
Both web tools are blocked entirely in air-gap mode.
| Tool | What it does | Key limits |
|---|---|---|
| web_search | Searches the web via DuckDuckGo. No API key needed. Returns up to 8 results with title, URL, and snippet. | 20s timeout. |
| web_fetch | Fetches a URL, strips HTML for readability, returns up to 500KB of text. Used to read docs, articles, and API references. | SSRF protection blocks all private IPs (127.x, 10.x, 192.168.x, 172.16–31.x, 169.254.x, localhost, 0.0.0.0). HTTP/HTTPS only. 30s timeout. |
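The SSRF rules listed for web_fetch map closely onto Python's standard `ipaddress` module. The following is a minimal sketch of such a check under that assumption. The function name is hypothetical, and a production check would also resolve hostnames and re-validate the resulting addresses, which this sketch does not do.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked(url: str) -> bool:
    """Return True if the URL should be refused under the rules described above."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return True  # HTTP/HTTPS only
    host = parsed.hostname
    if not host or host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name, not a literal IP; this sketch does not resolve it.
        return False
    return addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_unspecified
```

For example, `is_blocked("http://192.168.1.5:3000/")` is true while `is_blocked("https://example.com/docs")` is false.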
Memory and knowledge tools
| Tool | What it does | Notes |
|---|---|---|
| save_memory | Stores a key-value fact in persistent memory, scoped to your user account. Facts survive across all sessions and are injected into future context automatically. | Rate-limited per session to prevent memory overload. Say "Remember that..." to trigger this directly. |
| query_memory | Searches your persistent memory by keyword across session, shared, and project scopes. Returns up to 10 results, deduplicated (shared scope preferred over project over session). | 3s timeout. |
| query_knowledge | Searches your Knowledge Base using a 5-tier strategy: semantic embedding search, FTS5 full-text, LIKE search, full message scan, empty fallback. | Semantic tier requires an embeddings model configured and indexed (Settings → Models → Codebase Embeddings). |
| scratchpad | In-memory notepad for planning multi-step tasks. The agent writes notes, checks them, and clears them as it works. | Session-scoped only - not saved to disk or database. |
| query_map | Asks a natural-language question about your project's codebase and returns a grounded answer with source file citations. Uses semantic search over the codebase index. Never throws if the index hasn't been built yet - it tells you to build the index first. | Requires an embeddings model (Settings → Models → Codebase Embeddings) and the project to be open. 65s timeout. Air-gap: local provider only. |
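The scope preference that query_memory applies when deduplicating (shared over project over session) can be sketched as a single pass over the hits. The `Fact` shape and the choice to deduplicate on the fact's key are assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple


class Fact(NamedTuple):
    key: str
    value: str
    scope: str  # "shared", "project", or "session"


# Lower number wins: shared is preferred over project, project over session.
PRECEDENCE = {"shared": 0, "project": 1, "session": 2}


def dedupe(facts: List[Fact], limit: int = 10) -> List[Fact]:
    """Keep one fact per key, choosing the most preferred scope, capped at `limit`."""
    best: Dict[str, Fact] = {}
    for fact in facts:
        current = best.get(fact.key)
        if current is None or PRECEDENCE[fact.scope] < PRECEDENCE[current.scope]:
            best[fact.key] = fact
    return list(best.values())[:limit]
```

With this rule, a session-scoped "editor" fact is replaced by a shared-scope "editor" fact when both match the query.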
Session and coordination tools
| Tool | What it does |
|---|---|
| query_session | Searches the current session's message history - lets the agent refer back to earlier in a long conversation. |
| link_session | Creates parent-child, fork, or merge relationships between sessions. Used for cross-session coordination and Fleet Parallel worktree tracking. |
| todo_write | Creates and manages a session-scoped TODO list for multi-step tasks. Every third tool call, the agent gets a reminder of open items to prevent context drift. Session-scoped only. |
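The "every third tool call" reminder for todo_write amounts to a counter checked after each call. The class below is a hypothetical sketch of that cadence. The reminder wording, and the choice to stay silent when no items are open, are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TodoTracker:
    open_items: List[str] = field(default_factory=list)
    tool_calls: int = 0

    def record_tool_call(self) -> Optional[str]:
        """Count a tool call and return a reminder on every third one."""
        self.tool_calls += 1
        if self.tool_calls % 3 == 0 and self.open_items:
            return "Open TODOs: " + "; ".join(self.open_items)
        return None
```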
Vision and preview tools
These two tools work together when Bodega is interacting with a live web app in the Preview tab.
preview_interaction lets the agent drive the Preview tab (an embedded browser). Five actions:
- screenshot - captures a PNG and returns an img_XXXXXXXX handle
- navigate - loads a localhost URL (only localhost accepted)
- click - clicks an element by CSS selector (approval-gated by default)
- getDom - returns outerHTML for a selector (output capped at 8,000 chars)
- getConsoleErrors - returns sanitized JS console errors
vision_query takes an image_id from a prior screenshot and a plain-English question (max 500 chars), sends both to your configured Vision Language Model, and returns a text answer. This is how a text-only loop driver (e.g., Claude) can "see" the screen.
To use vision features: configure a VLM in Settings → Models → Vision Model, and have the Preview tab open with a dev server running.
Document and research tools
| Tool | What it does | Notes |
|---|---|---|
| create_document | Generates a structured Markdown document artifact from a chat turn. Triggered when the agent classifies your request as document intent. | Chat Mode only. 50,000-char output cap. |
| deep_research | Multi-step parallel web research. Runs multiple DuckDuckGo queries in parallel, fetches pages, and synthesizes a structured answer with citations. Progress shown in real time: Planning → Searching → Synthesizing. | Blocked in air-gap mode. Max 10 queries per turn, 45s total timeout. Enable via the Research toggle in the + menu. |
| convert_to_markdown | Converts HTML, CSV, or JSON content to clean Markdown. HTML becomes headings/links/lists/tables; CSV becomes a table; JSON becomes a fenced code block. | 100KB input / 50KB output limits, 5s timeout. |
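To make the "CSV becomes a table" behavior of convert_to_markdown concrete, here is a small sketch using Python's standard `csv` module. It shows the general technique only: the function name is hypothetical, and it assumes the first row is the header and that all rows have the same number of cells.

```python
import csv
import io
from typing import List


def csv_to_markdown(text: str) -> str:
    """Render CSV text as a Markdown table, treating the first row as the header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""

    def line(cells: List[str]) -> str:
        # Escape pipes so cell content cannot break the table structure.
        return "| " + " | ".join(cell.replace("|", "\\|") for cell in cells) + " |"

    header, *body = rows
    lines = [line(header), "|" + "---|" * len(header)]
    lines.extend(line(row) for row in body)
    return "\n".join(lines)
```

Using `csv.reader` rather than splitting on commas keeps quoted fields such as `"a,b"` in a single cell.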
Hallucination auto-correction
When the LLM calls a tool name that doesn't exist, Bodega auto-corrects it against a large alias table (40+ mappings) before the call fails. The correction is logged in the tool call card so you can see what happened.
Examples of what gets redirected:
- bash, exec, run, terminal → shell
- read_file, write_file, list_dir → file_system
- search, rg, ripgrep, grep_search → grep
- find_files, list_files → glob
- string_replace, edit_file → str_replace
- browse, fetch_url, navigate → web_fetch
- remember, store_memory → save_memory
- search_knowledge, recall → query_knowledge
One known quirk: the alias code_search in the correction map redirects to grep, not to the code_search tool. If the agent calls code_search by name, it gets grep behavior. The actual code_search tool is only triggered when the LLM uses its exact registered name.
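Conceptually, the correction step is a dictionary lookup that runs before a tool call is rejected. The sketch below uses only the aliases listed above. The `resolve` function, the reduced registry, and the order of the checks are assumptions, and the code_search quirk is not modeled.

```python
from typing import Tuple

# Subset of registered tool names, limited to the redirect targets listed above.
REGISTERED = {
    "shell", "file_system", "grep", "glob",
    "str_replace", "web_fetch", "save_memory", "query_knowledge",
}

ALIASES = {
    "bash": "shell", "exec": "shell", "run": "shell", "terminal": "shell",
    "read_file": "file_system", "write_file": "file_system", "list_dir": "file_system",
    "search": "grep", "rg": "grep", "ripgrep": "grep", "grep_search": "grep",
    "find_files": "glob", "list_files": "glob",
    "string_replace": "str_replace", "edit_file": "str_replace",
    "browse": "web_fetch", "fetch_url": "web_fetch", "navigate": "web_fetch",
    "remember": "save_memory", "store_memory": "save_memory",
    "search_knowledge": "query_knowledge", "recall": "query_knowledge",
}


def resolve(name: str) -> Tuple[str, bool]:
    """Return (tool_name, was_corrected); raise KeyError if nothing matches."""
    if name in REGISTERED:
        return name, False
    if name in ALIASES:
        return ALIASES[name], True
    raise KeyError(f"unknown tool: {name}")
```

The boolean in the result is what a UI would use to show that a correction happened, as the tool call card does.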
Skills - structured slash command workflows
Skills load a policy into the agent that controls how it approaches the task. Type / in any AI panel to see autocomplete.
| Skill | What it does | Mode |
|---|---|---|
| /commit | Stage changes, generate a conventional commit message (type(scope): description), and commit. At least 4 shell calls: git status, git diff, git add, git commit. | Code Mode only (needs shell) |
| /debug | Hypothesis-driven bug investigation: read the code, form a hypothesis, test it, apply a minimal fix, verify. | Any panel |
| /docs | Write JSDoc or docstring comments for every export, class, and public method in a specified file. | Any panel |
| /explain | Explain what code does: purpose, data flow, key patterns, dependencies. Reads the file and traces imports. | Any panel |
| /generate | Generate a new file from a plain-English description. Asks clarifying questions if the request is vague, then matches existing project patterns and verifies with tsc --noEmit. | Code Mode (needs file writes + shell) |
| /perf | Identify performance hotspots - O(n²) loops, missing memoization, N+1 queries. Presents findings ranked by impact. Does not apply changes automatically. | Any panel |
| /refactor | Structural refactoring with plan-first approval. Reads the file and its dependents, presents the plan, waits for your go-ahead, then applies and verifies. | Code Mode |
| /review | Code quality review producing Critical / Warnings / Suggestions sections, ending with a merge-safety assessment. Without a file argument, reviews the most recently changed file (git diff --name-only HEAD~1). | Any panel |
| /security | Static security audit: credential patterns, OWASP top 10 vulnerabilities, unsafe patterns (eval, unsanitized exec, path traversal). Produces Critical / Warnings / Informational report. Does not modify the file unless you ask. | Any panel |
| /test | Generate tests for a file or function: happy path, error path, and edge cases. Runs the tests and iterates on failures before reporting. Test files stay under 400 lines. | Code Mode (needs shell to run tests) |
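The type(scope): description format that /commit targets can be checked with a short regular expression. This is an illustrative sketch only: it treats the scope as optional, as the Conventional Commits convention does, and it ignores the optional breaking-change marker. How the skill itself validates messages is not documented here.

```python
import re
from typing import Dict, Optional

HEADER = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()\s]+)\))?: (?P<description>\S.*)$"
)


def parse_header(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a commit header into type, scope, and description, or return None."""
    match = HEADER.match(line)
    return match.groupdict() if match else None
```

For example, `parse_header("fix(parser): handle empty input")` yields type `fix`, scope `parser`, and the description text, while a free-form line such as "updated stuff" returns None.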
Client-side slash commands
These are handled entirely in the app - nothing goes to the LLM.
| Command | What it does | Available in |
|---|---|---|
| /clear | Clears the current input field. Does not clear conversation history. | Any panel |
| /help | Lists the available slash commands in a toast. | Any panel |
| /compact | Summarizes older conversation history to free up context window space. A toast confirms and shows how many messages were summarized. | Any panel (requires an open session) |
| /mode ask / /mode plan / /mode act | Sets the Agent panel permission mode. | Agent panel (Code Mode) |
| /export | Exports the current chat conversation as a Markdown file. Opens a native save dialog pre-filled with the session title. | Chat Mode only |
| /preview | Opens the Preview tab. Bare /preview opens a port picker (common ports: 3000, 5173, 8080, 4200) or re-opens the last URL. /preview localhost:5173 jumps directly. | Any panel (Code Mode) |
| /map | Opens the Bodega Map codebase visualization panel. Requires Dockview layout - if the map doesn't open, check Settings → Layout. | Any panel (Code Mode) |
Claude Fast Mode
Claude Fast Mode skips extended thinking on Claude models to get faster replies without switching to a smaller model. Enable it with the Fast toggle in the message composer (next to the reasoning control, shown for Claude models).
The priority order for reasoning control:
- Per-message reasoning pill in the composer (always wins)
- Claude Fast Mode toggle
- Global reasoning effort default (Settings → Models)
Fast Mode only affects Claude models - other providers are unaffected. If you've set a per-message reasoning level in the composer, Fast Mode is ignored for that message.
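The precedence above reduces to a three-step fallthrough. In this sketch the function name, the parameter names, and the "off" label for skipping extended thinking are all hypothetical. Only the ordering comes from the rules described here.

```python
from typing import Optional


def effective_reasoning(
    per_message: Optional[str],
    fast_mode: bool,
    is_claude: bool,
    global_default: str,
) -> str:
    """Resolve the reasoning level for one message using the documented priority."""
    if per_message is not None:
        return per_message  # the per-message pill always wins
    if fast_mode and is_claude:
        return "off"  # hypothetical label for "skip extended thinking"
    return global_default
```

Note that the Fast Mode branch is guarded by the model check, so the toggle has no effect on other providers.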
Custom Agents
Custom agents let you define named profiles with a tailored system prompt, an optional pinned model, a tool allowlist, and an iteration cap.
To create a custom agent:
- Go to Settings → Custom Agents
- Fill in: Name (required, max 100 chars), System Prompt (required, max 6,000 chars), and optionally Description, a pinned Model, Tool Allowlist, Read-only filesystem, and Max Iterations (1–50)
- Leave the tool allowlist empty to allow all tools. Add specific tools to restrict what the agent can call.
- Save.
To use a custom agent:
- Open Code Mode and find the Agent panel
- In the Agent panel header, click the profile picker dropdown (robot icon, shows "Default" when none selected). The picker is hidden if you have no custom agents.
- Select a custom agent. The next message you send applies that profile.
- Switch back to Default to remove the profile.
If a selected agent is deleted, the picker falls back to Default automatically. Tool allowlists are validated against the live tool registry - invalid tool names are rejected at create time.
Permission mode (ask/plan/act) and sandbox rules are global settings, not agent-derived.
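The field limits and allowlist check described above could be expressed as a small validator. This is a sketch under assumptions: the `AgentProfile` shape, the error strings, and the `validate` function are invented for illustration. Only the limits (100 chars, 6,000 chars, 1 to 50 iterations, allowlist checked against the registry) come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class AgentProfile:
    name: str
    system_prompt: str
    tool_allowlist: List[str] = field(default_factory=list)  # empty means all tools
    max_iterations: Optional[int] = None


def validate(profile: AgentProfile, registry: Set[str]) -> List[str]:
    """Return a list of validation errors; an empty list means the profile is valid."""
    errors: List[str] = []
    if not profile.name or len(profile.name) > 100:
        errors.append("name is required (max 100 chars)")
    if not profile.system_prompt or len(profile.system_prompt) > 6000:
        errors.append("system prompt is required (max 6,000 chars)")
    if profile.max_iterations is not None and not 1 <= profile.max_iterations <= 50:
        errors.append("max iterations must be between 1 and 50")
    unknown = [tool for tool in profile.tool_allowlist if tool not in registry]
    if unknown:
        errors.append("unknown tools: " + ", ".join(unknown))
    return errors
```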
ACP Agent Server - external agents as Fleet members
The ACP (Agent Client Protocol) server lets external coding agents run as Fleet members inside Bodega. Supported agents: Gemini CLI, Claude Code, Codex, and Cursor.
Configure at Settings → ACP Agents. What each one needs:
- Gemini CLI - GEMINI_API_KEY and @google/gemini-cli installed globally
- Claude Code - ANTHROPIC_API_KEY and @zed-industries/claude-code-acp installed
- Codex - OPENAI_API_KEY and @zed-industries/codex-acp installed
- Cursor - cursor-agent CLI installed and cursor-agent login run once (subscription auth, no API key)
Once configured, ACP agents appear as Fleet session options. Communication is NDJSON-RPC over stdin/stdout - Bodega spawns the agent as a child process and routes prompts through the ACP session protocol.
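Newline-delimited JSON framing means each message is one JSON object on its own line, which is what makes it workable over a child process's stdin/stdout. The sketch below shows only the framing. The JSON-RPC 2.0 envelope and the `example/prompt` method name in the usage note are assumptions, not the documented ACP message set.

```python
import json
from typing import Iterator


def encode(message: dict) -> bytes:
    """Serialize one message as a single line; json.dumps escapes embedded newlines."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_stream(data: bytes) -> Iterator[dict]:
    """Yield one parsed message per non-empty line."""
    for line in data.splitlines():
        if line.strip():
            yield json.loads(line)
```

A request such as `{"jsonrpc": "2.0", "id": 1, "method": "example/prompt", "params": {"text": "line1\nline2"}}` still occupies exactly one line on the wire, because the newline inside the string is escaped during serialization.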
What ACP agents can and can't do:
- File system and shell access routes through Bodega's own tools - sandbox and air-gap rules still apply
- ACP agents do not go through QEL verification - they show an "external - not QEL-verified" badge
- Blocked entirely in air-gap mode - enabling air-gap kills any running ACP subprocesses
Managed llama.cpp embedding server
When using llama.cpp as your embeddings provider, Bodega can manage the embedding server process automatically - separate from the chat server (port 8080), running on port 8081 by default.
To set it up:
- Install an embedding-capable GGUF first: go to Models → llama.cpp → Discover and download a model such as nomic-embed-text or a bge model
- Go to Settings → Knowledge → Search & Embeddings (or Settings → Models → Codebase Embeddings)
- Set provider to llama.cpp
- Enable the "Let Bodega manage the embedding server" toggle
- In the GGUF dropdown, select an installed model (the dropdown only shows models from your llama.cpp library)
- Optionally set a custom port (default: 8081)
The embedding server starts with the --embedding flag and is fully independent of the chat server. If you'd rather manage the process yourself, leave the toggle off and type the path manually.
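If you run the server yourself, the launch command looks roughly like the one this sketch builds. The `--embedding` flag and the 8081 default come from this page. The `llama-server` binary name and the `-m` and `--port` options are standard llama.cpp server options, but the exact command line Bodega uses is not documented here. The helper function and its refusal of port 8080 are assumptions.

```python
from typing import List


def embedding_server_argv(
    model_path: str, port: int = 8081, binary: str = "llama-server"
) -> List[str]:
    """Build an argv list for a standalone llama.cpp embedding server."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    if port == 8080:
        # Assumption: keep clear of the chat server's port.
        raise ValueError("port 8080 is used by the chat server")
    return [binary, "-m", model_path, "--embedding", "--port", str(port)]
```

The returned list can be passed directly to `subprocess.Popen`, which avoids shell quoting problems with model paths that contain spaces.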
Local code review in the Git panel
The Review button in the Git panel runs an AI code review against your current git diff - the "review before I commit" case. It uses your configured LLM provider, so it works fully locally.
To use it:
- Make changes to your project files (staged, unstaged, or both)
- Open the Git panel in Code Mode (Activity Bar → git icon)
- Click Review (next to the Generate commit message button)
- The review result appears inline as a markdown block in the Git panel
If the working tree is clean, the review falls back to comparing the branch against its base (the PR-scope view). Large diffs are clipped and a truncation note is appended.
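Clipping a large diff and appending a note can be done at a line boundary so the model never sees a half-written line. This is a minimal sketch: the 20,000-character limit, the note wording, and the function name are assumptions, since the actual threshold is not stated here.

```python
def clip_diff(diff: str, max_chars: int = 20_000) -> str:
    """Return the diff unchanged if it fits, else cut at the last full line and add a note."""
    if len(diff) <= max_chars:
        return diff
    cut = diff.rfind("\n", 0, max_chars)
    kept = diff[: cut if cut != -1 else max_chars]
    omitted = len(diff) - len(kept)
    return f"{kept}\n[diff truncated: {omitted} characters omitted]"
```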
This is separate from the /review skill. The /review skill reads source files and runs a code quality analysis. The Git panel Review button reads the git diff and focuses on what changed.
Keyboard shortcuts
| Command | Action |
|---|---|
| /commit | Stage, write a commit message, and commit |
| /debug | Hypothesis-driven bug investigation |
| /docs | Generate JSDoc/docstring documentation for a file |
| /explain | Explain what code does |
| /generate | Generate a new file from a description |
| /perf | Analyze code for performance issues |
| /refactor | Refactor with plan-first approval |
| /review | Code quality review with merge-safety assessment |
| /security | Static security audit |
| /test | Generate and run tests for a file |
| /clear | Clear the current input field |
| /compact | Summarize older conversation history |
| /mode ask | Set Agent panel to Ask mode |
| /mode plan | Set Agent panel to Plan mode |
| /mode act | Set Agent panel to Act mode |
| /export | Export current chat as a Markdown file (Chat Mode) |
| /preview | Open the Preview tab browser |
| /map | Open the Bodega Map codebase visualization |
This page mirrors the in-app docs hub for app version 1.0.0-beta.26.1.