
Augment Code Is Sunsetting Completions. Here's What to Do Next.

Bodega One · 9 min read
Quick answer

Augment Code is removing inline completions (Next Edit + Completions) from Indie, Standard, and Legacy plans on March 31, 2026. Enterprise is unaffected. Augment isn't shutting down. They're pivoting to Intent, a multi-agent orchestration product. If you relied on their completions, your options are: stay with Augment for Intent, switch to Cursor (from $20/mo), or go local-first with Bodega One ($79 one-time). This post walks through each option honestly.

If you use Augment Code on an Indie ($20/mo), Standard ($60/mo), or Legacy plan, you just got the email. On March 31, 2026, inline completions are gone. Next Edit and Completions, the features many developers signed up for, will stop working on those tiers.

This isn't speculation. Augment announced it directly. Enterprise customers keep everything. Everyone else loses the completion features.

Let's be clear about what's not happening: Augment Code is not shutting down. The company has $252M in funding and $20M in revenue. They have 156 employees and they're actively building their next product. This is a strategic pivot, not a collapse.

But if you're an individual developer or a small team that relied on those completions for daily work, the distinction between “shutting down” and “removing the feature you actually use” is academic. You need a plan before April 1.

Why Augment is doing this

Augment is going all-in on Intent, their multi-agent orchestration product. The thesis: the future of AI coding isn't autocomplete. It's autonomous agents that plan, execute, and verify multi-step tasks across your codebase.

It's not a bad thesis. The industry is clearly moving in that direction. But the execution has friction:

  • Intent is macOS only. There's no Windows timeline. There's no Linux roadmap. If you work on either platform, Intent isn't an option today.
  • Credit-based pricing replaced flat pricing in October 2025. Power users have reported 2-3x cost increases under the new system. Predictable billing is gone.
  • The completion sunset removes the most-used daily feature from the plans most individual developers are on.

Give Augment credit where it's due: their Context Engine is genuinely impressive. It indexes 100K+ files, they have 100+ native tools, and their MCP support is solid. For enterprise teams with big monorepos, it's a real differentiator. The question is whether that matters to you if you can't use their completions.

What your options actually are

You have three realistic paths. Each one has real tradeoffs.

Option 1: Stay with Augment and use Intent

If you're on macOS and the agent-first workflow appeals to you, Intent could be the right move. Augment's deep context indexing is their strongest feature, and Intent builds on top of it.

The downsides:

  • macOS only. No Windows, no Linux.
  • Credit-based pricing means your costs will vary month to month.
  • You're giving up the completion workflow you already had dialed in.
  • You're betting on a new product that's still early.

Option 2: Switch to Cursor

Cursor is the most obvious alternative. It's a VS Code fork with strong AI features, including tab completions. $20/month for Pro. Huge community. Active development.

The downsides:

  • It's a subscription. $20/month is $240/year, $720 over three years.
  • Your code goes to their servers for every completion and chat message.
  • You're locked to their model choices (though they offer some flexibility).
  • You're back in the same position: dependent on a company that can change pricing, models, or features at any time.

Option 3: Go local-first

Run your own models. Own the tooling. No subscription. No code leaving your machine. This is the path we built Bodega One for.

The downsides (being honest):

  • You need a GPU. Local inference requires hardware. A 12GB+ VRAM card is the sweet spot.
  • Local models aren't as strong as frontier cloud models. The gap is narrowing fast, but it exists.
  • Initial setup takes 10-15 minutes (Ollama install + model pull + connect to IDE).

The case for local completions

Inline completions are the most latency-sensitive AI feature you use. Every millisecond matters. Every extra keystroke of delay breaks your flow.

This is where local models have a structural advantage. Fill-in-the-Middle (FIM) models running on your own GPU have:

  • Zero network latency. No round-trip to a server. The completion is generated on the same machine where you're typing.
  • Zero cost per keystroke. Cloud completions cost money per token. Local completions cost electricity. After the hardware, the marginal cost is effectively zero.
  • Zero data exposure. Your code never leaves your machine. No server logs. No training data pipelines. No questions about who sees what.
  • Zero dependency on someone else's business decisions. Nobody can sunset your local model.

The recommended FIM model for local completions right now is qwen2.5-coder. At the 7B size with Q4 quantization, it runs comfortably on 8GB VRAM and produces completions that are genuinely useful for day-to-day work. Not perfect. But fast, free, and private.
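To make "fill-in-the-middle" concrete: a FIM model doesn't just continue text, it completes the gap between the code before and after your cursor. Here's a minimal sketch of how an editor assembles that prompt, using the FIM control tokens Qwen publishes for its coder models. Your IDE does this for you under the hood; the function name here is illustrative, not any particular tool's API.

```python
# Sketch: building a fill-in-the-middle (FIM) prompt for qwen2.5-coder.
# The three control tokens are the ones Qwen documents for its coder models.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the cursor in Qwen's FIM control tokens.

    The model generates the text that belongs between prefix and suffix.
    """
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

# Everything before the cursor, and everything after it:
before_cursor = "def add(a, b):\n    return "
after_cursor = "\n"

print(build_fim_prompt(before_cursor, after_cursor))
```

Because the suffix is part of the prompt, the model sees what comes *after* your cursor too, which is why FIM completions fit mid-file edits far better than plain left-to-right continuation.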

For a full breakdown of which model to run at each VRAM tier, see our Local LLM Guide.

How Bodega One fits

Bodega One is a local-first AI desktop IDE. Full Monaco editor, AI chat, and an autonomous coding agent. It runs on Windows, macOS, and Linux.

Here's how the features map to what you had with Augment:

| Feature | Augment Code | Bodega One |
| --- | --- | --- |
| Inline completions | Removed from Indie/Standard | FIM via local models |
| AI chat | Yes (cloud) | Yes (local or cloud, your choice) |
| Autonomous agent | Intent (macOS only) | Built-in, all platforms |
| Code verification | Not specified | QEL: 3-level automated verification |
| Model support | Augment's models | 10+ providers (Ollama, LM Studio, vLLM, OpenAI, Groq, etc.) |
| Platforms | VS Code extension + Intent (macOS) | Windows, macOS, Linux |
| Privacy | Cloud-processed | Air-gap mode with 9 enforcement layers |
| Pricing | $20-60/mo (credit-based) | $79 one-time (Personal) / $109 one-time (Pro) |

The big difference: Bring Your Own LLM (BYOLLM). You bring whatever LLM you want. Run Ollama locally for zero-cost completions. Use OpenAI when you need frontier reasoning. Switch between them per-message. You're not locked to any single provider.

For the full feature-by-feature comparison, see Bodega One as an Augment Code alternative.

Pricing: what you'd actually pay

Let's do the math over three years, since that's when the difference becomes impossible to ignore.

| Tool | Year 1 | Year 2 (cumulative) | 3-year total |
| --- | --- | --- | --- |
| Augment Indie | $240 | $480 | $720+ |
| Cursor Pro | $240 | $480 | $720 |
| Cursor Pro+ | $720 | $1,440 | $2,160 |
| Bodega One Personal | $79 | $79 | $79 |

That's $79 total vs. $720. Not per year. Total. See the full breakdown in our 3-year cost analysis.
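The arithmetic behind that table is simple enough to check yourself: a flat monthly subscription compounds linearly, a one-time purchase doesn't. The prices below are the ones quoted in this post.

```python
# Three-year cost of a flat monthly subscription vs. a one-time purchase.
# Prices are the ones quoted in this post.

def subscription_total(monthly: float, years: int) -> float:
    """Cumulative cost of a flat monthly subscription over `years` years."""
    return monthly * 12 * years

cursor_pro = subscription_total(20, 3)   # $720 over three years
bodega_one = 79                          # one-time purchase, no recurring cost

print(f"Cursor Pro: ${cursor_pro:.0f}, Bodega One: ${bodega_one}, "
      f"difference: ${cursor_pro - bodega_one:.0f}")
```

Note this assumes the subscription price stays flat for three years; the credit-based pricing changes described earlier show that's not guaranteed.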

Getting started with local completions

Here's the fastest path to get running with local completions:

  1. Install Ollama - One command. Works on Windows, macOS, and Linux. Visit ollama.com and follow the install instructions for your OS.
  2. Pull a FIM model -
    ollama pull qwen2.5-coder:7b
    This is the recommended model for inline completions. 7B parameters, Q4 quantization, runs on 8GB VRAM.
  3. Pull a larger model for chat and agent tasks - If your hardware supports it:
    ollama pull qwen2.5-coder:14b
    14B is the sweet spot for 12GB VRAM cards.
  4. Connect to Bodega One - In settings, select Ollama as your provider. The connection is automatic (localhost:11434).

For the full walkthrough with troubleshooting and model recommendations by VRAM tier, see Setting up Ollama with Bodega One.

If you prefer LM Studio (especially on Mac), we have a dedicated guide: LM Studio + Bodega One setup guide.

Hardware quick reference

You don't need a 4090. Here's what actually works at each tier:

  • 8GB VRAM (RTX 3060, 4060) - Qwen2.5-Coder 7B Q4. Good completions, good chat.
  • 12GB VRAM (RTX 3060 12GB, 4070) - Qwen3-14B Q4. Strong completions, strong agent work.
  • 16-24GB VRAM (RTX 4070 Ti Super, 4090) - Qwen2.5-Coder 32B Q4. This is the gold tier. Completions rival cloud quality.
  • Apple Silicon 16GB+ - Qwen3-8B MLX Q4. Unified memory means the whole model fits without a discrete GPU.
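If your card isn't on that list, a back-of-envelope estimate gets you close: Q4 quantization stores roughly 4 bits (0.5 bytes) per parameter, plus headroom for the KV cache and runtime overhead. The 1.2x overhead multiplier below is our rough assumption, not a vendor figure; real usage varies with context length.

```python
# Back-of-envelope VRAM estimate for a Q4-quantized model: ~0.5 bytes per
# parameter for weights, times a rough 1.2x multiplier for KV cache and
# runtime overhead (our assumption, not a vendor figure).

def q4_vram_gb(params_billion: float, overhead: float = 1.2) -> float:
    """Approximate VRAM (GB) needed to run a Q4 model of the given size."""
    weights_gb = params_billion * 0.5  # 4 bits ~= 0.5 bytes per parameter
    return round(weights_gb * overhead, 1)

for size in (7, 14, 32):
    print(f"{size}B Q4 -> ~{q4_vram_gb(size)} GB VRAM")
```

Those estimates line up with the tiers above: 7B fits an 8GB card, 14B wants 12GB, and 32B needs the 20GB+ range.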

For the complete GPU guide, see Which GPU do you actually need for local AI?


March 31 is a week away

If you're an Augment Code user on an Indie, Standard, or Legacy plan, your completions stop working in days. Not months. Days.

You can wait and see what Intent looks like (if you're on macOS). You can switch to Cursor and start a new subscription. Or you can try local-first and stop depending on someone else's product roadmap.

Bodega One launches beta in May 2026 with full launch on July 6. Join the waitlist to get early access and a $30 promo code for completing the beta.

Join the Bodega One waitlist | Read the docs | See pricing

Ready to own your tools?

Beta opens May 2026. Complete 14 days and earn a $30 promo code.