01 Frontier · market-leading
Claude Code Frontier
What. Anthropic's flagship coding agent. CLI + IDE extensions + cloud sessions.
Models. Claude (Sonnet, Opus, "Mythos" preview); multi-provider via gateway in v2.x.
Standout. Plan mode; Skills/Hooks/Subagents trio; 25 lifecycle hooks; parallel git-worktree agent teams; 1M-token Opus context at flat rate; mobile remote control.
Pricing. Pro $20/mo · Max $100/mo · top $200/mo · Enterprise seat ~$125/user/mo.
Leads at. Token efficiency (~5.5× fewer tokens than Cursor in independent tests), extensibility, 1M context.
Lags at. No native IDE; slower than Cursor for inline edits.
Cursor Frontier
What. VS Code fork; "agent workbench" since Cursor 2.0.
Standout. Tab autocomplete (best in class), Composer, Background Agents on cloud VMs, BugBot, scheduled/event-triggered automations, codebase indexing.
Pricing. Pro $20 · Pro+ $60 · Ultra $200/mo. Teams $40/user/mo. Seat + per-token credits.
Leads at. IDE-first ergonomics, tab completions, fast inline edits.
Lags at. Token economics (188K vs 33K on equivalent task); usable context (200K nominal, 70–120K in practice).
Codex CLI / Codex IDE Frontier
What. OpenAI's terminal coding agent + IDE companion.
Standout. Tight integration with OpenAI Agents SDK + Responses API; strong on Apps SDK / Connectors; built-in eval harness.
Pricing. Bundled with ChatGPT Plus/Team/Business + per-token overages.
Lags at. Plugin/extensibility less mature than Claude Code's Skills/Hooks.
Sourcegraph Amp Frontier
What. CLI agent on top of Sourcegraph's code graph (VS Code extension killed March 2026).
Standout. Oracle subagent (codebase analysis), Librarian subagent (external libs); 1M context; multi-repo via Sourcegraph graph; AGENT.md config; shared team threads.
Leads at. Monorepo / multi-repo context — best in class for large codebases.
Lags at. Terminal-only since March 2026.
Cognition Devin + Windsurf Frontier
What. Post-July 2025 acquisition: Devin (autonomous cloud) + Windsurf (IDE/Cascade). Combined enterprise ARR +30% in 7 weeks post-acquisition.
Standout. "Devin Sessions" — async cloud agent runs for hours; tight ticket → PR loop; Windsurf's Cascade for in-IDE multi-step flows.
Pricing. Devin per-task / per-ACU; Windsurf seat + token.
Leads at. Autonomous long-running cloud tasks; ticket-to-PR pipelines.
Lags at. Variable autonomy quality; independent benchmarks find lower SWE-bench Verified than top open agents on equivalent models.
02 Tier 2 · strong, niche-specific
Aider OSS
Terminal pair programmer; local repo + tight git integration.
Standout. Aider leaderboard — one of the most-watched independent benchmarks; excellent diff-driven workflow.
Best fit. Senior devs who prefer terminal + git; cost-sensitive teams routing via OpenRouter / LiteLLM.
OpenHands OSS
Open platform for cloud coding agents (formerly OpenDevin). Series A $18.8M.
Adoption. AMD, Apple, Google, Amazon, Netflix, NVIDIA (per company claims).
Performance. 72% SWE-bench Verified with Claude 4 — top tier among open agents.
Best fit. Self-hosting requirement; full control of the sandbox.
SWE-agent (Princeton / Stanford) OSS
Research framework — NeurIPS 2024; defined the Agent-Computer Interface (ACI).
Best fit. Research, paper-writing, agent-architecture experimentation. Not a product.
GitHub Copilot Workspace · agent mode
GitHub-native ticket → spec → plan → implementation loop.
Best fit. Heavy GitHub shops; PR-native flows.
Lags at. Less extensible than Claude Code / Cursor; gated to GH ecosystem.
Continue.dev OSS
Open-source IDE plugin (VS Code + JetBrains); user-controlled context and providers.
Best fit. Teams wanting a defensible OSS option with full provider control. Pairs with LiteLLM gateway.
Cline (formerly Claude Dev) OSS
VS Code extension; transparent autonomous agent with explicit human approvals.
Best fit. Devs wanting autonomy with full visibility; security-sensitive teams.
Zed AI
Native Rust editor with agentic features; Hashimoto-aligned "harness engineering" ethos.
Published position. Zed "Agentic Engineering" — the middle-path framing, three-phase model, and "stochastic tool mastery." Worth recommending the position even to clients who don't use Zed. → our take.
Best fit. Performance-sensitive devs; existing Zed users; teams that want a thoughtful vendor stance, not just a product.
JetBrains AI Assistant · Junie
JetBrains' native agent across IDEA, PyCharm, GoLand, etc. Junie = the autonomous piece.
Best fit. JetBrains-heavy shops; enterprise compliance where switching IDEs isn't viable.
Replit Agent
Cloud IDE + agent — full-stack from idea to deployed app.
Best fit. Greenfield / prototyping / non-engineers; "ship in a day" scenarios.
Plandex OSS
Terminal agent focused on long-running, multi-file planning + diffs.
Best fit. Migrations and large refactors.
Augment Code
Enterprise focus; "Intent" platform — developer-in-the-loop autonomy.
Best fit. Regulated enterprises wanting an autonomy slider.
03 Recommended defaults by client profile
| Client profile | Recommended core | Notes |
|---|---|---|
| OSS startup, cost-sensitive | Aider + LiteLLM / OpenRouter | Multi-provider, escape from raw API price |
| Regulated enterprise · JetBrains shop | JetBrains AI / Junie + self-hosted gateway | Compliance + IDE familiarity |
| Modern web · TS shop | Claude Code + Cursor in parallel | Claude Code for autonomy, Cursor for inline edits |
| Monorepo · large codebase | Sourcegraph Amp + Claude Code | Amp code graph + Claude Code extensibility |
| Ticket-driven product team | Devin / Copilot agent + Claude Code | Pair autonomous cloud with hands-on |
| Self-hosting requirement | OpenHands or Continue.dev + own gateway | Full control, no SaaS dependency |
| Mobile · non-engineer shop | Replit Agent | Lowest-friction path to running software |
04 Provider independence & cost levers
- Gateways. LiteLLM, OpenRouter, Portkey, Bedrock, Vertex, Azure OpenAI, or in-house proxy. Standardize on OpenAI-compatible API so harness can swap.
- Prefer seat-based pricing where workloads are bursty — much lower TCO than raw API pricing for heavy users.
- Cache aggressively. Anthropic prompt cache (5-min TTL), provider-side request cache, embedding cache.
- Batch off-peak. Anthropic Batch API and similar give ~50% off for non-real-time work.
- Self-host for bulk, non-sensitive workloads. Open-weights coding models can dramatically cut spend.