air-gap · privacy · security

Air-gapped AI development for regulated industries

Bodega One · 7 min read
Quick answer

Air-gapped AI development means running AI models on isolated hardware with no network access. Bodega One's air-gap mode enforces this with 9 layers: tool filtering, shell blocking, auto-updater blocking, and more. Nothing leaves the machine.

Finance teams can't send client data to a cloud AI provider. Healthcare developers can't route patient records through an external API. Defense contractors are often prohibited from using any cloud service for certain work categories. Legal teams worry about privilege.

The question “can we use AI for development?” has a different answer for these teams than it does for a startup building a consumer app. The answer is yes, but not with most of the tools those startups use.

What regulated industries actually need

The requirements vary by sector, but the common thread is data residency and egress control:

  • Finance (SOC 2, PCI DSS, FCA): Source code is often treated as confidential material. Any tool that sends code to an external service is a compliance risk. This includes cloud AI coding tools, even those with “enterprise privacy” modes.
  • Healthcare (HIPAA, GDPR): If code touches or processes PHI, sending it to an external AI provider requires a BAA and a full audit trail. Most AI coding tools don't offer either.
  • Defense and government: Often requires air-gap by policy, not just preference. The network simply isn't there.
  • Legal: Attorney-client privilege concerns with any cloud-based code analysis tool that might touch case-related code or systems.

What air-gapped AI development looks like in practice

The setup has three components: a local model, a local model server, and an AI development tool that can enforce network isolation.

Local model: A quantized LLM running on on-premises hardware. For coding tasks, Qwen2.5-Coder-32B on a 24GB VRAM machine is competitive with GPT-4o on standard benchmarks. For teams without high-VRAM GPUs, Qwen3-14B on 12GB is strong for most everyday tasks.
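A rough back-of-the-envelope check shows why these VRAM figures work out. The standard approximation (weights only, ignoring KV cache and activation overhead) is parameter count × bits per weight ÷ 8:

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate VRAM needed for model weights alone, in GB.

    This ignores the KV cache and activations, so real usage is higher.
    """
    return params_billion * bits_per_weight / 8

# A 32B model at 4-bit quantization: 16 GB of weights,
# leaving headroom on a 24 GB card for KV cache and context.
print(weight_footprint_gb(32, 4))

# A 14B model at 4-bit: 7 GB, comfortable on a 12 GB GPU.
print(weight_footprint_gb(14, 4))
```

This is an estimate, not a spec: actual memory use depends on the quantization format, context length, and the serving stack.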

Local model server: Ollama or vLLM on Linux (common for server deployments), or LM Studio on a developer's machine. All three expose an OpenAI-compatible API on localhost.
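Because these servers speak the same wire format, a client only needs the endpoint URL and a standard chat-completions payload. A minimal sketch (the port is Ollama's common default, 11434; vLLM typically uses 8000, and the model name must match whatever the server has loaded):

```python
import json

# Assumption: a default Ollama install on the same machine.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def chat_payload(model: str, prompt: str) -> str:
    """Build an OpenAI-compatible chat-completions request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    })

body = chat_payload("qwen2.5-coder:32b", "Write a unit test for parse_date().")
# POST `body` to OLLAMA_URL with any HTTP client; the request
# never leaves localhost, so no external network is involved.
```

The same payload works unchanged against vLLM or LM Studio; only the base URL differs.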

AI development tool: This is where most tools fall short. Many tools that claim to support local models still make outbound connections for telemetry, updates, account sync, or cloud-assisted features. A tool built for regulated environments needs to enforce isolation at a systems level, not just in settings.

Bodega One's air-gap enforcement

Bodega One's air-gap mode isn't a single toggle that disables the cloud LLM selector. It stacks 9 separate enforcement layers that cover every outbound path:

  1. Tool filtering: web_fetch and web_search are removed from the agent's tool list
  2. Pre-execution guard: checks every tool call before execution
  3. Shell command blocking: curl, wget, and other network commands are blocked in terminal
  4. Context assembly guard: prevents web content from being injected into context
  5. Auto-updater blocking: no update checks or downloads
  6. UI feedback: persistent indicator showing air-gap is active
  7. Cloud STT blocking: speech-to-text uses only local providers
  8. System prompt filtering: strips any instructions referencing external URLs
  9. Git IPC blocking: remote git operations are blocked
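To make the first and third layers concrete, here is a minimal sketch of how tool filtering and shell command blocking could work. The function names and block lists are illustrative, not Bodega One's actual implementation:

```python
# Hypothetical block lists for illustration only.
BLOCKED_TOOLS = {"web_fetch", "web_search"}
BLOCKED_COMMANDS = {"curl", "wget", "nc", "ssh", "scp"}

def filter_tools(tools: list[dict]) -> list[dict]:
    """Layer 1: remove network-capable tools from the agent's tool list."""
    return [t for t in tools if t["name"] not in BLOCKED_TOOLS]

def guard_shell(command: str) -> bool:
    """Layer 3: reject shell commands whose first token is a network binary.

    Returns True if the command is allowed to run.
    """
    tokens = command.strip().split()
    first_word = tokens[0] if tokens else ""
    return first_word not in BLOCKED_COMMANDS
```

A first-token check like this is bypassable (pipes, subshells, aliases), which is exactly why a real enforcement stack needs the other layers — the pre-execution guard and context assembly guard catch what a single string check misses.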

The result is a verified zero-egress environment. The agent can still write code, run local tests, read files, and use all local tools. It just cannot send anything outside the machine.

Hardware for regulated environments

Teams setting up air-gapped AI development need hardware that can run capable models. The economics have shifted dramatically in 2025-2026:

  • Individual developer workstation: A machine with a 24GB VRAM GPU (RTX 3090, RTX 4090, or similar) running Qwen2.5-Coder-32B is the current practical gold standard for local coding AI. One-time hardware cost, no ongoing API fees.
  • Shared inference server: A single high-VRAM server (2×A100 80GB, for example) running vLLM can serve a team of developers on an internal network that never touches the public internet. Each developer uses Bodega One pointed at the internal vLLM endpoint.
  • Apple Silicon for Mac shops: The M3 Ultra with 192GB unified memory runs large models surprisingly well and ships in a standard Mac Studio form factor.
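For the shared-server option, pointing each developer's client at the internal endpoint can be as simple as one setting. A sketch using an environment variable (`BODEGA_BASE_URL` is a hypothetical name for illustration, not a documented Bodega One setting):

```python
import os

def inference_base_url() -> str:
    """Resolve the model endpoint: shared internal server if configured,
    otherwise the local vLLM/Ollama-style default on this machine.

    BODEGA_BASE_URL is a hypothetical variable name used for illustration.
    """
    return os.environ.get("BODEGA_BASE_URL", "http://localhost:8000/v1")

# Developer workstation: no variable set, defaults to localhost.
# Internal network: export BODEGA_BASE_URL=http://ml-server.corp.internal:8000/v1
```

Either way, the endpoint is on hardware the team controls; the isolation guarantee comes from the network architecture, not from the client.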

The capability question

The concern regulated teams most often raise: “are local models actually good enough?”

For most enterprise development work in 2026, yes. Qwen2.5-Coder-32B scores within a few percentage points of GPT-4o on HumanEval and SWE-bench, and it handles standard CRUD work, API integration, test writing, refactoring, and documentation well. Quality isn't the blocker anymore. The blocker is usually setup complexity and organizational buy-in.

For a more detailed breakdown of model performance by VRAM tier, see the GPU guide. For a deeper look at the air-gap enforcement architecture, see the air-gap mode technical post.

Ready to own your tools?

Beta opens May 2026. Complete 14 days and earn a $30 promo code.