Patterns & Frameworks — Agentic Coding

Source: research/patterns.md · 23 patterns in 6 categories

Foundations

Building Effective Agents

Anthropic · Dec 2024

Task → Workflow → Agent ladder. Five workflow patterns. Start simple; add complexity only when it earns its keep.

Yan's 7-pattern map

Eugene Yan

The 2×2 — data ↔ user × defensive ↔ offensive — that organizes the whole applied-LLM surface.

Interwoven workflow

Zed · Nathan Sobo

Middle path between utopia and skepticism. Three-phase model: deterministic → stochastic → interwoven.

Harness > model

Mitchell Hashimoto

Once frontier models are within 10% of each other, the harness — loop, tools, memory, sandbox — is the differentiator.

Context

Context engineering

Karpathy · Lance Martin

Write / Select / Compress / Isolate — Martin's taxonomy for filling the context window with the right tokens.

Context rot

Drew Breunig

Four failure modes: Poisoning, Distraction, Confusion, Clash. Fixes: Pruning, Summarization, Offloading.
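As a rough illustration of the Pruning fix named above, the sketch below drops the oldest unpinned messages until a history fits a token budget. Everything in it is an assumption made for the example: the Message type, the pinned flag, the prune helper, and the four-characters-per-token estimate, which stands in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str             # "system", "user", "assistant", or "tool"
    text: str
    pinned: bool = False  # hypothetical flag: pinned messages are never pruned


def rough_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: assume about 4 characters per token.
    return max(1, len(text) // 4)


def prune(history: list[Message], budget: int) -> list[Message]:
    """Drop the oldest unpinned messages until the history fits the budget."""
    kept = list(history)
    total = sum(rough_tokens(m.text) for m in kept)
    i = 0
    while total > budget and i < len(kept):
        if kept[i].pinned:
            i += 1  # skip over pinned messages
            continue
        total -= rough_tokens(kept[i].text)
        del kept[i]  # the next candidate slides into index i
    return kept


history = [
    Message("system", "rules" * 4, pinned=True),
    Message("tool", "x" * 400),      # stale, bulky tool output
    Message("user", "fix the bug"),
]
pruned = prune(history, budget=20)
```

Here the bulky tool output is the first unpinned message, so it is the one removed. Summarization or Offloading would replace the deleted text with a short summary or a file reference instead of discarding it outright.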

Caching · thinking · compute

Anthropic

Prompt cache (highest-ROI quick win), extended thinking, computer use. Distinct from Yan's semantic caching.

Process

Plan mode

Boris Cherny · Claude Code

Plan = design document. Never let an agent write code until the plan is approved.

Spec-driven coding

Cursor · Kiro · Amp · Devin

Humans write structured specs/intent docs; agents fill them in. Spec becomes the eval.

Coding-agent loops

ReAct · Reflexion

Edit → test → read output → decide. Critic loops add verification when tests are weak.
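The edit → test → read output → decide cycle can be sketched as a small loop. This is a minimal sketch under stated assumptions, not any particular tool's implementation: agent_loop, toy_editor, and toy_tests are hypothetical names, the "model" is a hard-coded string replacement, and the tests run through exec only to keep the example self-contained (a real harness would run them in a sandboxed subprocess).

```python
from typing import Callable

TestResult = tuple[bool, str]  # (passed, captured output)


def agent_loop(
    source: str,
    propose_edit: Callable[[str, str], str],
    run_tests: Callable[[str], TestResult],
    max_iters: int = 5,
) -> tuple[str, bool]:
    passed, output = run_tests(source)         # read the starting state
    for _ in range(max_iters):
        if passed:                             # decide: stop once tests are green
            break
        source = propose_edit(source, output)  # edit, guided by the last output
        passed, output = run_tests(source)     # test, then read the new output
    return source, passed


def toy_tests(source: str) -> TestResult:
    namespace: dict = {}
    try:
        exec(source, namespace)  # toy only; not safe for untrusted code
        assert namespace["add"](2, 3) == 5
    except Exception as exc:
        return False, repr(exc)
    return True, "ok"


def toy_editor(source: str, test_output: str) -> str:
    # Stand-in for a model call: applies one fixed patch.
    return source.replace("a - b", "a + b")


buggy = "def add(a, b):\n    return a - b\n"
fixed, ok = agent_loop(buggy, toy_editor, toy_tests)
```

The iteration cap matters as much as the loop: without it, an agent that cannot make the tests pass keeps spending tokens indefinitely.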

Evals & error analysis

Hamel Husain · Eugene Yan

Spend 60–80% of dev time on error analysis. Match metric to task. LLM-as-judge with four bias mitigations.

SWE-bench Verified

Princeton · OpenAI

Frontier scores + the harness-delta finding (scaffolding moves the same model 5–15 points on SWE-bench Pro).

Feedback flywheel

Eugene Yan

Capture explicit + implicit signals. They become the next eval set and the next fine-tune corpus.

Architecture

Skills · Hooks · Subagents

Anthropic · Claude Code

Capability bundles (auto-loaded), deterministic lifecycle guards, isolated-context specialists.

Multi-agent debate

Anthropic vs Cognition

+90% on internal evals (Anthropic) vs fragile-by-default (Cognition). Resolution: reads fan out, writes single-threaded.

Reads fan out

Cognition · follow-up

Read with many, write with one. Best one-line guidance for clients designing their first multi-agent system.
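One way to picture "read with many, write with one" is a thread pool of read-only workers feeding a single writer. The sketch below assumes hypothetical helpers (read_worker, fan_out_then_write) and uses an in-memory dict in place of a repository; the workers stand in for isolated-context subagents and only return findings.

```python
from concurrent.futures import ThreadPoolExecutor


def read_worker(path: str, files: dict[str, str]) -> str:
    """Hypothetical read-only subagent: inspects one file, mutates nothing."""
    text = files[path]
    return f"{path}: {len(text.splitlines())} lines"


def fan_out_then_write(files: dict[str, str]) -> str:
    paths = sorted(files)
    # Reads fan out: each worker sees one file and shares no mutable state.
    with ThreadPoolExecutor(max_workers=4) as pool:
        findings = list(pool.map(lambda p: read_worker(p, files), paths))
    # Writes stay single-threaded: one place merges findings into one result,
    # so there are no conflicting edits to reconcile.
    return "\n".join(findings)


report = fan_out_then_write({"b.py": "x\ny", "a.py": "z"})
```

Because pool.map returns results in input order, the merged report is deterministic even though the reads run concurrently.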

Model Context Protocol

Anthropic

Typed, auditable agent-to-product interface. ~78% enterprise adoption, ~9,400 public servers.

Verifier / critic

All major review agents

Pair a generator with a separate model whose job is to find faults. Used by CodeRabbit, Greptile, Graphite.
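The generator-plus-critic pairing can be sketched as two callables and a bounded loop. This is an illustrative sketch, not how CodeRabbit, Greptile, or Graphite work internally: generate_with_critic, toy_generate, and toy_critic are hypothetical, and both "models" are hard-coded stand-ins.

```python
from typing import Callable


def generate_with_critic(
    generate: Callable[[str, list[str]], str],
    critique: Callable[[str, str], list[str]],
    task: str,
    max_rounds: int = 3,
) -> tuple[str, bool]:
    """Alternate generator and critic until the critic reports no faults."""
    draft = ""
    faults: list[str] = []
    for _ in range(max_rounds):
        draft = generate(task, faults)  # generator sees the critic's last faults
        faults = critique(task, draft)  # critic's only job is to find problems
        if not faults:
            return draft, True
    return draft, False                 # rounds exhausted with faults remaining


def toy_generate(task: str, faults: list[str]) -> str:
    if "missing docstring" in faults:
        return 'def double(x):\n    """Return twice x."""\n    return 2 * x\n'
    return "def double(x):\n    return 2 * x\n"


def toy_critic(task: str, draft: str) -> list[str]:
    return [] if '"""' in draft else ["missing docstring"]


draft, accepted = generate_with_critic(toy_generate, toy_critic, "write double()")
```

Keeping the critic separate from the generator is the point of the pattern: a model asked to review its own draft in the same context tends to agree with itself.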

Discipline

Caveats

What we treat with skepticism

House position

Autonomous-everything pitches. Long-horizon swarms without parallelization. Vector-RAG-as-answer for code.

Patterns we apply to client work