AI Panels & the Agentic Loop
Code Mode's right sidebar has four AI panels - Agent, Research, Debug, and Advisor. Each runs a separate conversation with its own model, tool set, and iteration budget, all powered by the same agentic loop under the hood.
The four panels at a glance
| Panel | Shortcut | What it does | Tools available | Max iterations | Writes files? |
|---|---|---|---|---|---|
| Agent | Ctrl+L | Full coding agent - reads, writes, runs shell, searches the web, plans tasks | All 24 built-in tools + connected MCP tools | 25 (dynamic) | Yes |
| Research | Ctrl+Shift+R | Web search and codebase queries; pins findings to the knowledge base | web_search, web_fetch, query_knowledge, query_memory | 3 | No |
| Debug | Ctrl+Shift+E | Diagnoses errors; parses stack traces, reads git blame, traces call paths | file_system (read), grep, glob, shell (read), code_search, query_memory | 5 | No |
| Advisor | Ctrl+Shift+A | Architecture advice and decision review; reads your codebase | file_system (read), grep, code_search, query_knowledge, query_memory | 15 | No |
Permission modes (Act / Ask / Plan) apply only to the Agent panel. Research, Debug, and Advisor are always read-only regardless of the mode shown in the header.
Assigning models to panels
Each panel can run a different LLM. To assign models:
- Go to Settings → Models → My Models.
- Set the Agent, Research, Debug, and Advisor roles to whichever models you want.
- Leave a role blank to fall back to the Agent (code) model - the panel header shows a ↩ indicator when it's using the fallback.
The panel header displays the active model name (truncated at the first colon). Managed GGUF models show their filename (SmolLM2-1.7B-Instruct.gguf) rather than their internal ID.
Settings keys:
- Agent → llm.code_model
- Research → llm.research_model
- Debug → llm.debug_model
- Advisor → llm.advisor_model
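The fallback and header-label rules above can be sketched roughly as follows. This is an illustration only: the function names, the role names, and the flat settings object are assumptions, not the app's actual code. Only the four settings keys come from this page.

```typescript
// Hypothetical sketch of per-panel model resolution. Only the four
// settings keys are taken from the docs; everything else is assumed.
type PanelRole = "agent" | "research" | "debug" | "advisor";

const MODEL_SETTING_KEY: Record<PanelRole, string> = {
  agent: "llm.code_model",
  research: "llm.research_model",
  debug: "llm.debug_model",
  advisor: "llm.advisor_model",
};

type ResolvedModel = { model: string; usedFallback: boolean };

function resolvePanelModel(
  settings: Record<string, string | undefined>,
  role: PanelRole,
): ResolvedModel {
  const own = settings[MODEL_SETTING_KEY[role]];
  if (own !== undefined && own.trim() !== "") {
    return { model: own, usedFallback: false };
  }
  // Blank role: fall back to the Agent (code) model.
  const fallback = settings[MODEL_SETTING_KEY.agent] ?? "";
  return { model: fallback, usedFallback: role !== "agent" };
}

// The header shows the model name truncated at the first colon.
function headerLabel(model: string): string {
  const colon = model.indexOf(":");
  return colon === -1 ? model : model.slice(0, colon);
}
```

With a setup where only llm.code_model is set, every other panel would resolve to that model with usedFallback set to true, which is when the ↩ indicator would appear.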
Agent panel - full coding agent
The Agent panel is where actual coding happens. It has access to all 24 built-in tools plus any MCP tools you've connected, runs QEL (Quality Enforcement Layer) verification in full mode, and supports permission gating so you can review writes before they land.
What you'll see while it runs:
- Tool call cards - every tool invocation appears as an expandable card showing tool name, status, args, result summary, and duration in milliseconds.
- FileChangeCards - file writes show filename, status badge (New / Modified / Error), line delta (+N / -M), and buttons to open the file or view a diff.
- ThinkingIndicator - shows the current status (Preparing…, Reading files…, etc.) and iteration number.
- TodoPanel - appears when the agent uses todo_write to self-plan a multi-step task.
- Queued messages - messages typed while the agent is running are queued and injected between iterations. The queue appears at the bottom of the panel with cancel buttons per message.
- Small-model warning - a banner appears when the active model is below 13B parameters.
The iteration budget is dynamic: max(10, min(profileMax × 1.5, fileCount × 3 + 2)), up to the hard cap of 25.
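To make the formula concrete, here is a small sketch. The function name is hypothetical, and rounding profileMax × 1.5 down to a whole number is an assumption, since the page does not say how fractional values are handled.

```typescript
// Sketch of the dynamic budget: max(10, min(profileMax * 1.5, fileCount * 3 + 2)),
// then clamped to the hard cap of 25.
const ITERATION_FLOOR = 10;
const ITERATION_HARD_CAP = 25;

function agentIterationBudget(profileMax: number, fileCount: number): number {
  const scaledProfile = Math.floor(profileMax * 1.5); // rounding down is an assumption
  const fileBased = fileCount * 3 + 2;
  const dynamic = Math.max(ITERATION_FLOOR, Math.min(scaledProfile, fileBased));
  return Math.min(dynamic, ITERATION_HARD_CAP);
}
```

For example, with profileMax = 20: a 1-file task gets the floor of 10, a 4-file task gets 14 (4 × 3 + 2), and a 10-file task hits the hard cap of 25.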
Permission modes - Act, Ask, Plan
The colored dot in the PanelSidebar header shows the current mode. Change it with the pill selector at the bottom of the Agent panel input.
Act (green dot) - every tool runs automatically. No approval prompts.
Ask (yellow dot) - file writes and shell commands pause for approval before running. Reads and searches proceed automatically. A card appears with Accept / Reject buttons; Enter accepts, Esc rejects. Tool approvals are auto-rejected if you disconnect.
Plan (purple dot) - the agent generates a plan first and stops. The PlanApprovalCard slides up from the bottom showing the proposed files with CREATE / MODIFY / DELETE / READ badges. Enter approves the whole plan and execution starts; Esc rejects. If the model fails to produce a valid plan after 3 attempts, the loop automatically falls back to per-tool approval rather than blocking. Plan approvals persist to the database and survive disconnects - the pending plan is restored when you reload the session.
The /mode slash command also toggles modes. Mode is saved per session.
The code mode is a legacy alias for Ask behavior - if you see it, it behaves identically to Ask.
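The per-tool gating rules can be summarized in a short sketch. The type and function names are hypothetical, and the grouping of tools into four kinds is a simplification made for illustration.

```typescript
// Hypothetical sketch of permission gating as described above.
type PermissionMode = "act" | "ask" | "plan" | "code";
type ToolKind = "read" | "search" | "write" | "shell";

// "code" is a legacy alias that behaves like Ask.
function normalizeMode(mode: PermissionMode): "act" | "ask" | "plan" {
  return mode === "code" ? "ask" : mode;
}

// Ask mode pauses file writes and shell commands; reads and searches proceed.
// Act never prompts. Plan gates on the whole plan, not per tool (see below).
function needsToolApproval(mode: PermissionMode, kind: ToolKind): boolean {
  if (normalizeMode(mode) !== "ask") return false;
  return kind === "write" || kind === "shell";
}

// Plan mode: after 3 attempts without a valid plan, fall back to per-tool approval.
function planModeGate(failedPlanAttempts: number): "plan-approval" | "per-tool-approval" {
  return failedPlanAttempts >= 3 ? "per-tool-approval" : "plan-approval";
}
```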
Using the Agent panel
- Press Ctrl+L to focus the Agent panel.
- Type a task and press Enter. The agent starts streaming immediately.
- To attach context, click + Current File or + Selection above the input to attach the active editor file or your highlighted selection.
- To capture a screenshot, click the camera icon.
- To queue a follow-up while the agent is running, type in the input and click the queue-send button (separate from the main Send).
- To stop, click the Stop button. This also clears the message queue.
- Approval prompts (Ask / Plan mode) pause execution - use Accept / Reject buttons or Enter / Esc.
Research panel
The Research panel runs a read-only agent with web access. It tracks citations and adds source attribution to its responses. It runs for up to 3 iterations.
In air-gap mode (Settings → General → Air-Gap), web_search and web_fetch are removed from the tool list. The panel relabels itself Codebase Research and only has access to query_knowledge and query_memory.
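As a rough illustration of the air-gap behavior, the sketch below filters the web tools out of a panel's tool list and picks the panel label. The function names are assumptions; the tool names and the Codebase Research label come from this page.

```typescript
// Hypothetical sketch: strip web tools when air-gap mode is on.
const WEB_TOOLS: ReadonlySet<string> = new Set(["web_search", "web_fetch"]);

function toolsForPanel(tools: readonly string[], airGap: boolean): string[] {
  return airGap ? tools.filter((t) => !WEB_TOOLS.has(t)) : [...tools];
}

function researchPanelLabel(airGap: boolean): string {
  return airGap ? "Codebase Research" : "Research";
}

const researchTools = ["web_search", "web_fetch", "query_knowledge", "query_memory"];
const airGapped = toolsForPanel(researchTools, true);
// airGapped now holds only query_knowledge and query_memory.
```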
Pin to knowledge - after the agent responds, click Pin in the sub-header to save the response permanently to the knowledge base. The Agent panel will see it on subsequent messages.
Implement → Agent - click this button in the sub-header to hand off the research findings to the Agent panel. The panel switches automatically. The agent picks up the handoff context when you send your next message - clicking the button alone doesn't inject anything yet.
Debug panel
The Debug panel detects stack traces you paste into the input and shows a structured error badge in the sub-header with the error type and file/line location. Detection is client-side and pattern-matched against JavaScript (at function (file:line)), Python (File ... line N), and generic Error / Exception / panic / FATAL signatures.
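A minimal sketch of this kind of client-side pattern matching is shown below. The regular expressions and the function name are illustrative guesses at the three signature families, not the panel's actual patterns.

```typescript
// Hypothetical detector for pasted stack traces. The patterns are assumptions.
type DetectedError = { errorType: string; file?: string; line?: number };

// JavaScript frame: "at fn (file:line:col)" or "at file:line:col"
const JS_FRAME = /^\s*at .*?\(?([^()\s]+?):(\d+)(?::\d+)?\)?\s*$/m;
// Python frame: File "path", line N
const PY_FRAME = /File "([^"]+)", line (\d+)/;
// Generic signatures: anything ending in Error / Exception, or panic / FATAL
const GENERIC_SIGNATURE = /\b(\w*(?:Error|Exception)|panic|FATAL)\b/;

function detectStackTrace(text: string): DetectedError | null {
  const typeMatch = GENERIC_SIGNATURE.exec(text);
  // Simplification: use the first frame found, Python format checked first.
  const frame = PY_FRAME.exec(text) ?? JS_FRAME.exec(text);
  if (!typeMatch && !frame) return null;
  return {
    errorType: typeMatch?.[1] ?? "Error",
    file: frame?.[1],
    line: frame ? Number(frame[2]) : undefined,
  };
}
```

Taking the first frame is a shortcut; a real detector might prefer the innermost frame, which sits last in Python tracebacks.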
The panel runs up to 5 iterations and has read access to the file system, grep, glob, code_search, and shell for running diagnostic commands (checking logs, process state, etc.). Destructive shell commands are still blocked.
Fix it → Agent - after diagnosis, click this button to hand off the diagnosis and error context to the Agent panel. The Agent panel picks it up on your next message.
Advisor panel
The Advisor panel runs with an Architecture Advisor persona injected into its system prompt. It analyzes code structure, reviews design decisions, and flags patterns and anti-patterns. It can read your codebase via file_system, grep, and code_search, but cannot write files and cannot do live web research (no web_search or web_fetch).
The Decisions toggle in the sub-header opens the DecisionLog viewer - a scrollable history of past advisor responses stored per session.
Iteration limit note: The panel's sub-header badge shows 2 iter, but this is a hardcoded UI bug. The actual backend limit (from PanelConfigs.ts) is 15 iterations.
Cross-panel handoffs
Panels are designed to chain together. Handoffs move context from Research or Debug into the Agent panel.
Research → Agent:
- Ask your research question in the Research panel.
- After the response, click Implement → Agent in the sub-header.
- The panel switches to Agent. Type your implementation task - the research findings are injected automatically into that message's context.
Debug → Agent:
- Paste the stack trace and get a diagnosis.
- Click Fix it → Agent in the sub-header.
- Type the fix task in the Agent panel - the debug context is injected.
Handoffs are tracked in the Context Inspector as breadcrumbs with consumed / pending status. A handoff is consumed exactly once - the Agent sees it on the first message after the handoff, then it's gone.
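The consume-once behavior can be pictured with a small tracker. The class, its method names, and the data shape are hypothetical; only the pending / consumed statuses and the two handoff sources come from this page.

```typescript
// Hypothetical sketch of consume-once handoff tracking.
type Handoff = {
  from: "research" | "debug";
  context: string;
  status: "pending" | "consumed";
};

class HandoffTracker {
  private handoffs: Handoff[] = [];

  // Called when the user clicks Implement → Agent or Fix it → Agent.
  add(from: Handoff["from"], context: string): void {
    this.handoffs.push({ from, context, status: "pending" });
  }

  // Called when the next Agent message is sent: returns the pending
  // context and marks it consumed so it is never injected twice.
  consumePending(): string[] {
    const pending = this.handoffs.filter((h) => h.status === "pending");
    for (const h of pending) h.status = "consumed";
    return pending.map((h) => h.context);
  }

  // What a breadcrumb view could list.
  breadcrumbs(): readonly Handoff[] {
    return this.handoffs;
  }
}
```

Clicking the handoff button corresponds to add(); nothing reaches the Agent until consumePending() runs on the next message.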
The Advisor → Agent path has no formal handoff button. Ask the advisor for a design decision, then describe the implementation task in the Agent panel yourself.
The agentic loop pipeline
Every message sent to any panel runs through the same pipeline server-side. Entry point: POST /api/chat/complete → AgenticChatService.processMessageStream(). The frontend receives results over an SSE stream.
Stage 1 - Contract extraction (<5ms, no LLM call): parses your message into a machine-checkable contract - what files should exist, what patterns, what language/framework. Confidence can be low, medium, or high.
Stage 2 - Context assembly: assembles system prompt + dynamic context (memory, project rules, repo map, conversation history) within the token budget. Compaction triggers automatically at 75% of the context window (85% for MoE models).
Stage 3 - RuntimeLayer.classify(): determines the execution lane, iteration cap, tool allowlist, and QEL mode from the panel config and message classification.
Stage 4 - Main loop (up to maxIterations): LLM call → parse tool calls → execute tools → inject results → continue or exit. Each iteration emits tool call cards to the frontend.
Stage 5 - QEL verification: pre-execution gates (Level 0) plus three verification levels (Levels 1–3); see the QEL section below.
Stage 6 - Post-loop: save messages, emit telemetry, consolidate learnings.
You can stop the loop at any time with the Stop button.
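Stage 4 is the core of the pipeline, so a stripped-down sketch may help. Everything here is illustrative: the types, the dependency names, and the synchronous stubs are assumptions (real model and tool calls would be asynchronous), and none of it reflects the actual AgenticChatService internals.

```typescript
// Hypothetical sketch of the Stage 4 loop:
// LLM call → parse tool calls → execute tools → inject results → continue or exit.
type ToolCall = { tool: string; args: Record<string, unknown> };
type LlmReply = { text: string; toolCalls: ToolCall[] };
type LoopDeps = {
  callModel: (history: string[]) => LlmReply; // synchronous stub for brevity
  runTool: (call: ToolCall) => string;
  shouldStop: () => boolean; // e.g. the Stop button was pressed
};
type LoopResult = { finished: boolean; iterations: number; history: string[] };

function runAgenticLoop(userMessage: string, maxIterations: number, deps: LoopDeps): LoopResult {
  const history: string[] = [userMessage];
  let iterations = 0;
  while (iterations < maxIterations && !deps.shouldStop()) {
    iterations++;
    const reply = deps.callModel(history);
    history.push(reply.text);
    // No tool calls means the model considers the task done.
    if (reply.toolCalls.length === 0) {
      return { finished: true, iterations, history };
    }
    // Execute each tool and inject its result for the next iteration.
    for (const call of reply.toolCalls) {
      history.push(`[${call.tool}] ${deps.runTool(call)}`);
    }
  }
  // Iteration cap reached or the user stopped the loop.
  return { finished: false, iterations, history };
}
```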
Clarification interviews
On creation tasks where the contract confidence is low and the system detects two or more information gaps (missing language, framework, target files, underspecified functionality), the loop pauses before the first LLM call and shows a ClarificationCard with 2–3 targeted questions.
Click Submit to answer and refine the contract - this typically upgrades confidence to medium or high. Click Skip to proceed with the original low-confidence contract.
Clarification fires only in the Agent panel, only on creation tasks, and only before the first iteration - it does not interrupt a loop already in progress.
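Put together, the trigger conditions look roughly like the predicate below. The input shape and function name are assumptions made for illustration.

```typescript
// Hypothetical sketch of when a clarification interview would fire.
type Confidence = "low" | "medium" | "high";
type ClarifyInput = {
  panel: "agent" | "research" | "debug" | "advisor";
  isCreationTask: boolean;
  confidence: Confidence;
  gaps: string[];            // e.g. ["language", "framework"]
  iterationsCompleted: number;
};

function shouldAskForClarification(input: ClarifyInput): boolean {
  return (
    input.panel === "agent" &&
    input.isCreationTask &&
    input.confidence === "low" &&
    input.gaps.length >= 2 &&
    input.iterationsCompleted === 0 // only before the first iteration
  );
}
```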
QEL - Quality Enforcement Layer
QEL runs automatically on every file write in the Agent panel. You don't configure it - you see its output in iteration cards and the VerificationReportCard that appears in the chat feed for creation tasks.
Level 0 - Pre-execution gates (every write): 9-gate pipeline that runs before a file write hits disk. Includes a permission mode firewall, contract guard (filename must match a declared deliverable), forbidden path check (node_modules, .env, .git), duplicate detection, shell redirect scan, and a fast TypeScript syntax validation via the compiler API (~2ms) that blocks broken JS/TS before it's written.
Level 1 - Mid-loop incremental verification (per file write): scores the written file against the contract (patterns 60% + framework consistency 25% + content completeness 15%). A score below 70 triggers a repair nudge.
Level 2 - Micro-proof gates (every 2nd write): runs the language toolchain:
- TypeScript: npx tsc --noEmit --skipLibCheck
- Python: python -m compileall
- Go: go vet ./...
- Rust: cargo check
- Java: javac
- C#: dotnet build
If the toolchain isn't installed, this gate is skipped silently with no score penalty.
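The Level 2 selection logic might look something like this sketch. The commands are the ones listed above; the language keys, the function name, and the isInstalled callback are hypothetical.

```typescript
// Hypothetical sketch of picking a micro-proof command for a write.
const PROOF_COMMANDS = new Map<string, string>([
  ["typescript", "npx tsc --noEmit --skipLibCheck"],
  ["python", "python -m compileall"],
  ["go", "go vet ./..."],
  ["rust", "cargo check"],
  ["java", "javac"],
  ["csharp", "dotnet build"],
]);

function proofGateFor(
  language: string,
  writeCount: number,
  isInstalled: (binary: string) => boolean,
): string | null {
  if (writeCount % 2 !== 0) return null; // gate runs on every 2nd write only
  const command = PROOF_COMMANDS.get(language);
  if (command === undefined) return null;
  const binary = command.split(" ")[0] ?? command;
  // Missing toolchain: skip silently, with no score penalty.
  return isInstalled(binary) ? command : null;
}
```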
Level 3 - Post-loop full verification: 0–100 score. Pass threshold is 80 for creation tasks, 50 for others. Scoring: file existence (5pts), patterns (35pts), structural integrity multiplier, framework consistency (15pts), completeness (15pts), proof gates (30pts).
Repair flow: up to 3 repair nudges targeting specific missing files and patterns. You see Repairing... in the iteration progress. After 3 failed repairs, the loop ends with a detailed failure report.
QEL mode by panel: Agent = full. Debug = structural (stub detection only, no proof gates). Research and Advisor = none.
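The Level 1 weighting and the Level 3 pass thresholds can be worked through in a few lines. This is a sketch under assumptions: component scores are taken to be on a 0–100 scale, the pass threshold is treated as inclusive, and the Level 3 structural integrity multiplier is left out because its details are not given here.

```typescript
// Hypothetical sketch of QEL score arithmetic.
type Level1Parts = { patterns: number; framework: number; completeness: number }; // each 0..100

// Level 1: patterns 60% + framework consistency 25% + content completeness 15%.
function level1Score(parts: Level1Parts): number {
  return (60 * parts.patterns + 25 * parts.framework + 15 * parts.completeness) / 100;
}

// A Level 1 score below 70 triggers a repair nudge.
function needsRepairNudge(score: number): boolean {
  return score < 70;
}

// Level 3: pass threshold is 80 for creation tasks, 50 for others.
function level3Passes(score: number, isCreationTask: boolean): boolean {
  return score >= (isCreationTask ? 80 : 50);
}
```

For instance, a file with half the expected patterns but full framework consistency and completeness scores 30 + 25 + 15 = 70, which just avoids a repair nudge.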
The 24 built-in tools
File tools
| Tool | What it does |
|---|---|
| file_system | Read, write, list, delete, mkdir, check existence - sandboxed to ./workspace |
| str_replace | Surgical find-and-replace within a file |
| grep | Regex search via ripgrep |
| glob | File pattern matching |
| code_search | Full-text search with ripgrep (sanitizes shell metacharacters) |
| find_symbol | Looks up where a named symbol is defined across the codebase |
| diff_file | git diff within the workspace sandbox |
| run_tests | Auto-detects test runner, runs tests, injection prevention enabled |
| shell | Hardened shell - credential-scans output, blocks destructive commands |
| web_fetch | HTTP fetch, SSRF-protected (private IPs blocked) |
| web_search | DuckDuckGo search |
| save_memory | Persist a fact to the memory database |
Knowledge and session tools
| Tool | What it does |
|---|---|
| query_knowledge | FTS5 search of the knowledge base |
| query_memory | Search the memory store |
| query_session | Search session history |
| link_session | Create a relationship between two sessions |
| query_map | Semantic search over the codebase embedding index (same as Ask the Map UI) |
Planning and utility
| Tool | What it does |
|---|---|
| todo_write | Agent self-planning TODO list (displays in TodoPanel) |
| scratchpad | Temporary computation workspace |
| convert_to_markdown | HTML / CSV / JSON → Markdown |
| create_document | Structured document creation |
| deep_research | Multi-step parallel research orchestration |
Plus dynamic MCP tools via connected MCP servers.
Tool alias correction: the agent recognizes 44 known tool-name aliases and remaps each one to the real tool before the call can fail. For example, read_file → file_system, bash → shell, browser → web_fetch. You'll see the correction in the tool call card.
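Conceptually, alias correction is a lookup that runs before dispatch. The sketch below includes only the three aliases named above (the full list of 44 is not reproduced here), and the function name and return shape are assumptions.

```typescript
// Hypothetical sketch of alias correction. Only three of the aliases are known
// from this page; the real table is larger.
const TOOL_ALIASES = new Map<string, string>([
  ["read_file", "file_system"],
  ["bash", "shell"],
  ["browser", "web_fetch"],
]);

type CorrectedTool = { tool: string; corrected: boolean };

function correctToolName(requested: string, knownTools: ReadonlySet<string>): CorrectedTool {
  if (knownTools.has(requested)) return { tool: requested, corrected: false };
  const target = TOOL_ALIASES.get(requested);
  return target !== undefined
    ? { tool: target, corrected: true }
    : { tool: requested, corrected: false }; // unknown name: left for the caller to reject
}
```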
In air-gap mode, web_search and web_fetch are removed from the tool list before the LLM call - the model never sees them.
Context Inspector
The Context Inspector shows exactly what context is being sent to the LLM.
In Code Mode: click the (i) button in the PanelSidebar header. The inspector slides up from the bottom of the panel.
In Chat Mode: click the context budget ring/bar in the chat input area.
What it shows:
- Budget meter - tokens used / context window total. Color: purple (<60%), yellow (60–80%), orange (80–95%), red (95%+).
- ContextBreakdownBar - proportional bar showing system prompt, memory, project rules, conversation, tools, attachments, and repo map.
- Section list - expandable sections for each context segment. The Repo Map section loads on demand.
- Handoff breadcrumbs - cross-panel handoffs with consumed / pending status.
- Compact button - available when the session has more than 2,000 tokens used. Click to summarize older messages via LLM (350–700 token summary) and free up space. A Compact Now button also appears when usage exceeds 80%.
- Memory remove buttons - trash icon on each memory row to delete that entry from the memory store.
- Repo Map re-scan - triggers a fresh PageRank-based symbol ranking.
The inspector polls every 2 seconds during active streaming to show live context changes.
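The budget thresholds on this page can be collected into a short sketch. The function names are hypothetical, and treating each color band's lower bound as inclusive is an assumption, since the ranges as written overlap at 80% and 95%. The last function reflects the automatic compaction trigger from the pipeline's context assembly stage (75% of the context window, 85% for MoE models).

```typescript
// Hypothetical sketch of the context budget thresholds.
type MeterColor = "purple" | "yellow" | "orange" | "red";

function meterColor(tokensUsed: number, contextWindow: number): MeterColor {
  const pct = (tokensUsed / contextWindow) * 100;
  if (pct >= 95) return "red";
  if (pct >= 80) return "orange"; // lower bounds inclusive: an assumption
  if (pct >= 60) return "yellow";
  return "purple";
}

// Compact is offered above 2,000 tokens used; Compact Now above 80% usage.
function compactControls(tokensUsed: number, contextWindow: number) {
  return {
    compact: tokensUsed > 2000,
    compactNow: tokensUsed / contextWindow > 0.8,
  };
}

// Automatic compaction: 75% of the window, or 85% for MoE models.
function autoCompactionDue(tokensUsed: number, contextWindow: number, isMoE: boolean): boolean {
  return tokensUsed / contextWindow >= (isMoE ? 0.85 : 0.75);
}
```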
Note: in Code Mode, the inspector always shows the Agent panel's session context. The Research, Debug, and Advisor panels don't have their own inspector views.
PanelStatusPill - watching a background panel
If you switch away from a panel while it's still running, a PanelStatusPill overlay appears at the bottom of the sidebar showing the panel name and current iteration. Click it to jump back to the running panel.
This is useful when you're reading a file in the editor while the Agent is working - you can monitor progress without switching focus.
Background Sessions
The Agent panel has a Run in Background button that detaches the session so it runs without holding the UI. You can navigate away - or close the panel - and the agent continues server-side.
When a background session reaches a terminal state (ready-to-apply, error, or awaiting-approval), a badge appears on the FleetTopBarIndicator in the top bar and you receive a toast and OS notification.
This is separate from the Agent panel's normal in-UI streaming - background sessions communicate via the /ws WebSocket channel, not the chat-stream SSE.
When the agent hits its iteration cap
If the Agent reaches 25 iterations without completing the task, it sends a partial result with a summary of what's left to do. You can send a follow-up to continue - the agent picks up from the summary.
If you're consistently hitting the cap on a task, it usually means the task is too broad. Break it into smaller requests or use Plan mode so you can review the scope before execution starts.
Advisor panel - known UI bug
The Advisor panel's sub-header shows a 2 iter badge. This is wrong - the actual backend limit is 15 iterations (set in PanelConfigs.ts). The badge is hardcoded and hasn't been updated. Expect up to 15 iterations when using the Advisor panel.
Keyboard shortcuts
| Keys | Action |
|---|---|
| Ctrl+L | Focus Agent panel |
| Ctrl+Shift+R | Focus Research panel |
| Ctrl+Shift+E | Focus Debug panel |
| Ctrl+Shift+A | Focus Advisor panel |
| Enter | Accept tool approval or approve plan (in Ask/Plan mode) |
| Esc | Reject tool approval or reject plan |
This page mirrors the in-app docs hub for app version 1.0.0-beta.26.1.