Augment Code is removing inline completions (Next Edit + Completions) from Indie, Standard, and Legacy plans on March 31, 2026. Enterprise is unaffected. Augment isn't shutting down. They're pivoting to Intent, a multi-agent orchestration product. If you relied on their completions, your options are: stay with Augment for Intent, switch to Cursor (from $20/mo), or go local-first with Bodega One ($79 one-time). This post walks through each option honestly.
If you use Augment Code on an Indie ($20/mo), Standard ($60/mo), or Legacy plan, you just got the email. On March 31, 2026, inline completions are gone. Next Edit and Completions, the features many developers signed up for, will stop working on those tiers.
This isn't speculation. Augment announced it directly. Enterprise customers keep everything. Everyone else loses the completion features.
Let's be clear about what's not happening: Augment Code is not shutting down. The company has $252M in funding and $20M in revenue. They have 156 employees and they're actively building their next product. This is a strategic pivot, not a collapse.
But if you're an individual developer or a small team that relied on those completions for daily work, the distinction between “shutting down” and “removing the feature you actually use” is academic. You need a plan before April 1.
Why Augment is doing this
Augment is going all-in on Intent, their multi-agent orchestration product. The thesis: the future of AI coding isn't autocomplete. It's autonomous agents that plan, execute, and verify multi-step tasks across your codebase.
It's not a bad thesis. The industry is clearly moving in that direction. But the execution has friction:
- Intent is macOS only. There's no Windows timeline. There's no Linux roadmap. If you work on either platform, Intent isn't an option today.
- Credit-based pricing replaced flat pricing in October 2025. Power users have reported 2-3x cost increases under the new system. Predictable billing is gone.
- The completion sunset removes the most-used daily feature from the plans most individual developers are on.
Give Augment credit where it's due: their Context Engine is genuinely impressive. It indexes 100K+ files, they have 100+ native tools, and their MCP support is solid. For enterprise teams with big monorepos, it's a real differentiator. The question is whether that matters to you if you can't use their completions.
What your options actually are
You have three realistic paths. Each one has real tradeoffs.
Option 1: Stay with Augment and use Intent
If you're on macOS and the agent-first workflow appeals to you, Intent could be the right move. Augment's deep context indexing is their strongest feature, and Intent builds on top of it.
The downsides:
- macOS only. No Windows, no Linux.
- Credit-based pricing means your costs will vary month to month.
- You're giving up the completion workflow you already had dialed in.
- You're betting on a new product that's still early.
Option 2: Switch to Cursor
Cursor is the most obvious alternative. It's a VS Code fork with strong AI features, including tab completions. $20/month for Pro. Huge community. Active development.
The downsides:
- It's a subscription. $20/month is $240/year, $720 over three years.
- Your code goes to their servers for every completion and chat message.
- You're locked to their model choices (though they offer some flexibility).
- You're back in the same position: dependent on a company that can change pricing, models, or features at any time.
Option 3: Go local-first
Run your own models. Own the tooling. No subscription. No code leaving your machine. This is the path we built Bodega One for.
The downsides (being honest):
- You need a GPU. Local inference requires hardware. A 12GB+ VRAM card is the sweet spot.
- Local models aren't as strong as frontier cloud models. The gap is narrowing fast, but it exists.
- Initial setup takes 10-15 minutes (Ollama install + model pull + connect to IDE).
The case for local completions
Inline completions are the most latency-sensitive AI feature you use. Every millisecond matters. Every extra keystroke of delay breaks your flow.
This is where local models have a structural advantage. Fill-in-the-Middle (FIM) models running on your own GPU have:
- Zero network latency. No round-trip to a server. The completion is generated on the same machine where you're typing.
- Zero cost per keystroke. Cloud completions cost money per token. Local completions cost electricity. After the hardware, the marginal cost is effectively zero.
- Zero data exposure. Your code never leaves your machine. No server logs. No training data pipelines. No questions about who sees what.
- Zero dependency on someone else's business decisions. Nobody can sunset your local model.
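To make the mechanics concrete, here's a minimal sketch of a local FIM completion request in Python. It assumes Ollama is running at its default local endpoint (localhost:11434) with qwen2.5-coder:7b already pulled; Ollama's `/api/generate` endpoint accepts a `suffix` field for fill-in-the-middle with FIM-capable models. The parameter values are illustrative, not tuned recommendations.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_fim_request(prefix: str, suffix: str, model: str = "qwen2.5-coder:7b") -> dict:
    """Build a fill-in-the-middle request: the model generates the code
    that belongs between `prefix` (before the cursor) and `suffix` (after it)."""
    return {
        "model": model,
        "prompt": prefix,   # text before the cursor
        "suffix": suffix,   # text after the cursor
        "stream": False,
        "options": {"num_predict": 64, "temperature": 0.2},
    }


def complete(prefix: str, suffix: str) -> str:
    """Send the FIM request to the local Ollama server and return the completion."""
    payload = build_fim_request(prefix, suffix)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `complete("def add(a, b):\n    return ", "\n\nprint(add(2, 3))")` would ask the model to fill in the body between those two fragments, entirely on your own machine.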
The recommended FIM model for local completions right now is qwen2.5-coder. At the 7B size with Q4 quantization, it runs comfortably on 8GB VRAM and produces completions that are genuinely useful for day-to-day work. Not perfect. But fast, free, and private.
For a full breakdown of which model to run at each VRAM tier, see our Local LLM Guide.
How Bodega One fits
Bodega One is a local-first AI desktop IDE. Full Monaco editor, AI chat, and an autonomous coding agent. It runs on Windows, macOS, and Linux.
Here's how the features map to what you had with Augment:
| Feature | Augment Code | Bodega One |
|---|---|---|
| Inline completions | Removed from Indie/Standard | FIM via local models |
| AI chat | Yes (cloud) | Yes (local or cloud, your choice) |
| Autonomous agent | Intent (macOS only) | Built-in, all platforms |
| Code verification | Not specified | QEL - 3-level automated verification |
| Model support | Augment's models | 10+ providers (Ollama, LM Studio, vLLM, OpenAI, Groq, etc.) |
| Platforms | VS Code extension + Intent (macOS) | Windows, macOS, Linux |
| Privacy | Cloud-processed | Air-gap mode with 9 enforcement layers |
| Pricing | $20-60/mo (credit-based) | $79 one-time (Personal) / $109 one-time (Pro) |
The big difference: Bring Your Own LLM (BYOLLM). You bring whatever LLM you want. Run Ollama locally for zero-cost completions. Use OpenAI when you need frontier reasoning. Switch between them per-message. You're not locked to any single provider.
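To illustrate what per-message switching looks like in practice (this is an illustrative sketch, not Bodega One's actual API): both Ollama and OpenAI expose OpenAI-compatible endpoints, so routing a message to a different provider can be as simple as swapping a base URL and model name. The keyword heuristic below is purely hypothetical.

```python
# Hypothetical per-message provider routing. Ollama serves an
# OpenAI-compatible API at /v1, so local and cloud providers can
# share one request format and be swapped per message.
PROVIDERS = {
    "ollama": {"base_url": "http://localhost:11434/v1", "model": "qwen2.5-coder:7b"},
    "openai": {"base_url": "https://api.openai.com/v1", "model": "gpt-4o"},
}


def route(message: str) -> dict:
    """Keep quick edits local; send heavier reasoning to a frontier model.
    The trigger words here are a toy heuristic for illustration."""
    needs_frontier = any(k in message.lower() for k in ("architecture", "refactor plan", "design"))
    name = "openai" if needs_frontier else "ollama"
    return {"provider": name, **PROVIDERS[name]}
```

A "fix this off-by-one" message stays on localhost at zero marginal cost; a "propose an architecture" message goes to the cloud only when you decide it's worth it.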
For the full feature-by-feature comparison, see Bodega One as an Augment Code alternative.
Pricing: what you'd actually pay
Let's do the math over three years, since that's when the difference becomes impossible to ignore.
| Tool | After 1 year | After 2 years | After 3 years |
|---|---|---|---|
| Augment Indie | $240 | $480 | $720+ |
| Cursor Pro | $240 | $480 | $720 |
| Cursor Pro+ | $720 | $1,440 | $2,160 |
| Bodega One Personal | $79 | $79 | $79 |
That's $79 total vs. $720. Not per year. Total. See the full breakdown in our 3-year cost analysis.
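The table's arithmetic is easy to verify yourself: a subscription compounds every year, a one-time purchase doesn't.

```python
# Cumulative spend at the end of each year, matching the table above.
def cumulative_costs(monthly: float, one_time: float = 0.0, years: int = 3) -> list[float]:
    """Total paid by the end of year 1, 2, ... years."""
    return [one_time + monthly * 12 * y for y in range(1, years + 1)]


cursor_pro = cumulative_costs(monthly=20)                    # $20/mo subscription
cursor_pro_plus = cumulative_costs(monthly=60)               # $60/mo subscription
bodega_personal = cumulative_costs(monthly=0, one_time=79)   # one-time purchase
```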
Getting started with local completions
Here's the fastest path to get running with local completions:
1. Install Ollama. One command, and it works on Windows, macOS, and Linux. Visit ollama.com and follow the install instructions for your OS.
2. Pull a FIM model: `ollama pull qwen2.5-coder:7b`. This is the recommended model for inline completions: 7B parameters, Q4 quantization, runs on 8GB VRAM.
3. For chat and agent tasks, pull a larger model if your hardware supports it: `ollama pull qwen2.5-coder:14b`. 14B is the sweet spot for 12GB VRAM cards.
4. Connect to Bodega One. In settings, select Ollama as your provider. The connection is automatic (localhost:11434).
For the full walkthrough with troubleshooting and model recommendations by VRAM tier, see Setting up Ollama with Bodega One.
If you prefer LM Studio (especially on Mac), we have a dedicated guide: LM Studio + Bodega One setup guide.
Hardware quick reference
You don't need a 4090. Here's what actually works at each tier:
- 8GB VRAM (RTX 3060, 4060) - Qwen2.5-Coder 7B Q4. Good completions, good chat.
- 12GB VRAM (RTX 3060 12GB, 4070) - Qwen3-14B Q4. Strong completions, strong agent work.
- 16-24GB VRAM (RTX 4070 Ti Super, 4090) - Qwen2.5-Coder 32B Q4. This is the gold tier. Completions rival cloud quality.
- Apple Silicon 16GB+ - Qwen3-8B MLX Q4. Unified memory means the whole model fits without a discrete GPU.
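A rough rule of thumb behind these tiers (an approximation, not a vendor spec): a Q4-quantized model needs about 0.5 bytes per parameter for weights, plus a GB or two of headroom for the KV cache and runtime buffers at modest context lengths.

```python
# Back-of-envelope VRAM estimate for a Q4-quantized model.
# 0.5 bytes/parameter and 1.5 GB overhead are rough assumptions,
# not measured figures; real usage varies with context length.
def estimated_vram_gb(params_billions: float,
                      bytes_per_param: float = 0.5,
                      overhead_gb: float = 1.5) -> float:
    return params_billions * bytes_per_param + overhead_gb


for size in (7, 14, 32):
    print(f"{size}B Q4 needs roughly {estimated_vram_gb(size):.1f} GB VRAM")
```

Under these assumptions a 7B model lands around 5 GB (comfortable on 8GB cards), 14B around 8.5 GB (fits 12GB), and 32B around 17.5 GB (the 16-24GB tier), which lines up with the tiers above.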
For the complete GPU guide, see Which GPU do you actually need for local AI?
March 31 is a week away
If you're an Augment Code user on an Indie, Standard, or Legacy plan, your completions stop working in days. Not months. Days.
You can wait and see what Intent looks like (if you're on macOS). You can switch to Cursor and start a new subscription. Or you can try local-first and stop depending on someone else's product roadmap.
Bodega One launches beta in May 2026 with full launch on July 6. Join the waitlist to get early access and a $30 promo code for completing the beta.
Ready to own your tools?
Beta opens May 2026. Complete 14 days and earn a $30 promo code.