Augment Code is removing inline completions (Next Edit + Completions) from Indie, Standard, and Legacy plans on March 31, 2026. Enterprise is unaffected. Augment isn't shutting down. They're pivoting to Intent, a multi-agent orchestration product. If you relied on their completions, your options are: stay with Augment for Intent, switch to Cursor (from $20/mo), or go local-first with Bodega One Code (free for personal use, $39 one-time for commercial). This post walks through each option honestly.
Update (June 19, 2026): The March 31 completion sunset has since happened, and Augment's pricing moved further. The Indie ($20/mo) and Standard ($60/mo) individual plans referenced below are now retired. Augment's cheapest plan today is Business at $100/mo flat (up to 50 seats, with $100 of usage included), plus a custom Enterprise tier. The three options below still hold; only Augment's pricing changed, and it went up.
If you use Augment Code on an Indie ($20/mo), Standard ($60/mo), or Legacy plan, you just got the email. On March 31, 2026, inline completions are gone. Next Edit and Completions, the features many developers signed up for, will stop working on those tiers.
This isn't speculation. Augment announced it directly. Enterprise customers keep everything. Everyone else loses the completion features.
Let's be clear about what's not happening: Augment Code is not shutting down. The company has $252M in funding and $20M in revenue. They have 156 employees and they're actively building their next product. This is a strategic pivot, not a collapse.
But if you're an individual developer or a small team that relied on those completions for daily work, the distinction between “shutting down” and “removing the feature you actually use” is academic. You need a plan before April 1.
Why Augment is doing this
Augment is going all-in on Intent, their multi-agent orchestration product. The thesis: the future of AI coding isn't autocomplete. It's autonomous agents that plan, execute, and verify multi-step tasks across your codebase.
It's not a bad thesis. The industry is clearly moving in that direction. But the execution has friction:
- Intent is macOS only. There's no Windows timeline. There's no Linux roadmap. If you work on either platform, Intent isn't an option today.
- Credit-based pricing replaced flat pricing in October 2025. Power users have reported 2-3x cost increases under the new system. Predictable billing is gone.
- The completion sunset removes the most-used daily feature from the plans most individual developers are on.
Give Augment credit where it's due: their Context Engine is genuinely impressive. It indexes 100K+ files, they have 100+ native tools, and their MCP support is solid. For enterprise teams with big monorepos, it's a real differentiator. The question is whether that matters to you if you can't use their completions.
What your options actually are
You have three realistic paths. Each one has real tradeoffs.
Option 1: Stay with Augment and use Intent
If you're on macOS and the agent-first workflow appeals to you, Intent could be the right move. Augment's deep context indexing is their strongest feature, and Intent builds on top of it.
The downsides:
- macOS only. No Windows, no Linux.
- Credit-based pricing means your costs will vary month to month.
- You're giving up the completion workflow you already had dialed in.
- You're betting on a new product that's still early.
Option 2: Switch to Cursor
Cursor is the most obvious alternative. It's a VS Code fork with strong AI features, including tab completions. $20/month for Pro. Huge community. Active development.
The downsides:
- It's a subscription. $20/month is $240/year, $720 over three years.
- Your code goes to their servers for every completion and chat message.
- You're locked to their model choices (though they offer some flexibility).
- You're back in the same position: dependent on a company that can change pricing, models, or features at any time.
Option 3: Go local-first
Run your own models. Own the tooling. No subscription. No code leaving your machine. This is the path we built Bodega One Code for.
The downsides (being honest):
- You need a GPU. Local inference requires hardware. A 12GB+ VRAM card is the sweet spot.
- Local models aren't as strong as frontier cloud models. The gap is narrowing fast, but it exists.
- Initial setup takes 10-15 minutes (Ollama install + model pull + connect to IDE).
The case for local completions
Inline completions are the most latency-sensitive AI feature you use. Every millisecond matters. A suggestion that arrives a beat after you've kept typing breaks your flow.
This is where local models have a structural advantage. Fill-in-the-Middle (FIM) models running on your own GPU have:
- Zero network latency. No round-trip to a server. The completion is generated on the same machine where you're typing.
- Zero cost per keystroke. Cloud completions cost money per token. Local completions cost electricity. After the hardware, the marginal cost is effectively zero.
- Zero data exposure. Your code never leaves your machine. No server logs. No training data pipelines. No questions about who sees what.
- Zero dependency on someone else's business decisions. Nobody can sunset your local model.
The recommended FIM model for local completions right now is qwen2.5-coder. At the 7B size with Q4 quantization, it runs comfortably on 8GB VRAM and produces completions that are genuinely useful for day-to-day work. Not perfect. But fast, free, and private.
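To make "fill-in-the-middle" concrete, here is a minimal sketch of what a FIM request to a local Ollama server can look like. It assumes Ollama is running on its default port and that the model is already pulled. The function names and the `options` values are illustrative, not taken from any particular editor integration.

```python
import json
import urllib.request

# Ollama's default local endpoint for text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_fim_payload(prefix: str, suffix: str, model: str = "qwen2.5-coder:7b") -> dict:
    """Build a fill-in-the-middle request body.

    The model is asked to generate the text that belongs between
    `prefix` (code before the cursor) and `suffix` (code after it).
    """
    return {
        "model": model,
        "prompt": prefix,   # code before the cursor
        "suffix": suffix,   # code after the cursor
        "stream": False,    # return one JSON object instead of a token stream
        # Illustrative values, not tuned recommendations.
        "options": {"num_predict": 64, "temperature": 0.2},
    }


def complete(prefix: str, suffix: str) -> str:
    """Send the request to a running Ollama server and return the generated middle."""
    body = json.dumps(build_fim_payload(prefix, suffix)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `complete("def add(a, b):\n    result = ", "\n    return result\n")` should return a short expression to fill the gap. The request never leaves localhost, which is the source of the latency and privacy properties listed above.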
For a full breakdown of which model to run at each VRAM tier, see our Local LLM Guide.
How Bodega One Code fits
Bodega One Code is a local-first AI desktop IDE. Full Monaco editor, AI chat, and an autonomous coding agent. It runs on Windows, macOS, and Linux.
Here's how the features map to what you had with Augment:
| Feature | Augment Code | Bodega One Code |
|---|---|---|
| Inline completions | Removed from Indie/Standard | FIM via local models |
| AI chat | Yes (cloud) | Yes (local or cloud, your choice) |
| Autonomous agent | Intent (macOS only) | Built-in, all platforms |
| Code verification | Not specified | QEL - 3-level automated verification |
| Model support | Augment's models | 10+ providers (Ollama, LM Studio, vLLM, OpenAI, Groq, etc.) |
| Platforms | VS Code extension + Intent (macOS) | Windows, macOS, Linux |
| Privacy | Cloud-processed | Air-gap mode with 9 enforcement layers |
| Pricing | $100/mo Business (Indie/Standard retired) | Free (Personal) / $39 one-time (Pro) |
The big difference: Bring Your Own LLM (BYOLLM). You bring whatever LLM you want. Run Ollama locally for zero-cost completions. Use OpenAI when you need frontier reasoning. Switch between them per-message. You're not locked to any single provider.
For the full feature-by-feature comparison, see Bodega One Code as an Augment Code alternative.
Pricing: what you'd actually pay
Let's do the math over three years, since that's when the difference becomes impossible to ignore.
| Tool | Total after year 1 | Total after year 2 | Total after year 3 |
|---|---|---|---|
| Augment Business (cheapest plan now) | $1,200 | $2,400 | $3,600 |
| Cursor Individual | $240 | $480 | $720 |
| Cursor Individual (Pro+ option) | $720 | $1,440 | $2,160 |
| Bodega One Code Personal (free) | $0 | $0 | $0 |
| Bodega One Code Pro (commercial) | $39 | $39 | $39 |
That's $0-$39 total vs. $720 to $3,600. Not per year. Total. See the full breakdown in our 3-year cost analysis.
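If you want to rerun the numbers with your own plan, the table is cumulative spend: a one-time fee plus twelve monthly payments per year. A small sketch follows. The $60/mo figure for the Pro+ row is inferred from the table's $720 first-year total, and the helper name is ours.

```python
def cumulative_cost(monthly: float = 0.0, one_time: float = 0.0, years: int = 3) -> list[float]:
    """Return total spend at the end of each year.

    Total after year n = one-time fee + monthly price * 12 * n.
    """
    return [one_time + monthly * 12 * year for year in range(1, years + 1)]


plans = {
    "Augment Business": cumulative_cost(monthly=100),
    "Cursor Individual": cumulative_cost(monthly=20),
    "Cursor Individual (Pro+ option)": cumulative_cost(monthly=60),  # inferred from $720/year
    "Bodega One Code Personal": cumulative_cost(),
    "Bodega One Code Pro": cumulative_cost(one_time=39),
}

for name, totals in plans.items():
    print(f"{name}: " + ", ".join(f"${t:,.0f}" for t in totals))
```

This leaves out hardware and electricity for the local option, which the table also leaves out.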
Getting started with local completions
Here's the fastest path to get running with local completions:
- Install Ollama - One command. Works on Windows, macOS, and Linux. Visit ollama.com and follow the install instructions for your OS.
- Pull a FIM model - `ollama pull qwen2.5-coder:7b`. This is the recommended model for inline completions. 7B parameters, Q4 quantization, runs on 8GB VRAM.
- For chat and agent tasks, pull a larger model if your hardware supports it: `ollama pull qwen2.5-coder:14b`. 14B is the sweet spot for 12GB VRAM cards.
- Connect to Bodega One Code - In settings, select Ollama as your provider. The connection is automatic (localhost:11434).
For the full walkthrough with troubleshooting and model recommendations by VRAM tier, see Setting up Ollama with Bodega One Code.
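Before opening the IDE, it can help to confirm that Ollama is reachable and that the models you pulled are actually there. The sketch below queries Ollama's model-listing endpoint on the default port. The helper names are hypothetical, and it assumes the server is running locally.

```python
import json
import urllib.request

# Ollama endpoint that lists models already pulled to this machine.
TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_response: dict) -> list[str]:
    """Extract model names from the JSON returned by /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


def missing_models(tags_response: dict, wanted: list[str]) -> list[str]:
    """Return the wanted models that have not been pulled yet, in the order given."""
    have = set(model_names(tags_response))
    return [name for name in wanted if name not in have]


def fetch_tags(url: str = TAGS_URL) -> dict:
    """Query the local Ollama server; raises URLError if it is not running."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read())
```

Calling `missing_models(fetch_tags(), ["qwen2.5-coder:7b", "qwen2.5-coder:14b"])` returns an empty list when both pulls succeeded. A connection error usually means the Ollama service is not running.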
If you prefer LM Studio (especially on Mac), we have a dedicated guide: LM Studio + Bodega One Code setup guide.
Hardware quick reference
You don't need a 4090. Here's what actually works at each tier:
- 8GB VRAM (RTX 3060, 4060) - Qwen2.5-Coder 7B Q4. Good completions, good chat.
- 12GB VRAM (RTX 3060 12GB, 4070) - Qwen3-14B Q4. Strong completions, strong agent work.
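- 16-24GB VRAM (RTX 4070 Ti Super, 4090) - Qwen2.5-Coder 32B Q4. This is the gold tier. Completions rival cloud quality. The Q4 weights are roughly 20GB, so the model fits entirely on a 24GB card; on 16GB, expect some layers to be offloaded to the CPU and completions to slow down accordingly.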
- Apple Silicon 16GB+ - Qwen3-8B MLX Q4. Unified memory means the whole model fits without discrete GPU.
For the complete GPU guide, see Which GPU do you actually need for local AI?
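The discrete-GPU tiers above reduce to a simple lookup. This is a rough sketch that only encodes the list as written. The function name is ours, and real headroom also depends on context length and whatever else is using the card.

```python
from typing import Optional

# Tiers copied from the list above: (minimum VRAM in GB, suggested model).
# Ordered from largest to smallest so the first match is the best fit.
DISCRETE_GPU_TIERS = [
    (16, "Qwen2.5-Coder 32B Q4"),
    (12, "Qwen3-14B Q4"),
    (8, "Qwen2.5-Coder 7B Q4"),
]


def suggest_model(vram_gb: float) -> Optional[str]:
    """Return the suggested model for a discrete GPU, or None below the 8GB tier."""
    for min_vram, model in DISCRETE_GPU_TIERS:
        if vram_gb >= min_vram:
            return model
    return None


for vram in (6, 8, 12, 24):
    print(vram, "GB ->", suggest_model(vram))
```

Apple Silicon is left out on purpose, since unified memory is shared with the rest of the system and does not map cleanly onto a VRAM number.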
The completions are already gone
March 31, 2026 has passed. If you were on an Augment Indie, Standard, or Legacy plan, your inline completions stopped working then, and those individual tiers have since been retired entirely. The decision you were putting off is now in front of you.
You can move to Augment's Business plan at $100/mo, if a team plan fits. You can switch to Cursor and start a new subscription. Or you can go local-first and stop depending on someone else's product roadmap.
Bodega One Code beta is free and open to everyone. Full launch coming later this year. Download free to try it.
Common questions
- When did Augment Code remove completions, and do the old plans still exist?
- Augment Code removed inline completions from its Indie, Standard, and Legacy plans on March 31, 2026. Enterprise kept everything. Those individual plans have since been retired entirely; Augment's cheapest plan is now Business at $100/mo flat, with a custom Enterprise tier above it. The company pivoted to Intent, its multi-agent orchestration product.
- What are the main disadvantages of switching to Cursor from Augment?
- Cursor costs $20 per month ($240 annually), and your code goes to their servers for every completion and chat message. You are limited to the models they offer and back in subscription dependency, which means Cursor could change pricing or features at any time, exactly like Augment just did.
- Can I run AI completions locally without a subscription?
- Yes. Local FIM models like qwen2.5-coder 7B run on 8GB VRAM with zero network latency, zero per-keystroke cost, and zero data exposure. After hardware purchase, completions cost only electricity. Nobody can sunset your local model.
- How much does Bodega One Code cost over three years compared to Cursor?
- Bodega One Code Personal is free. Bodega One Code Pro is $39 one-time for commercial use. Cursor Individual (formerly Cursor Pro) costs $20 monthly, which is $720 over three years. So the three-year comparison is $0 to $39 against $720.
Written by the Bodega One team. We build Bodega One Code, the local-first AI IDE, and we write here about local models, AI costs, and what we learn shipping it. More about the team and why we build local-first on the about page.