Once frontier models are within ~10% of each other on SWE-bench, the differentiator is the harness: loop, tool surface, memory, sandbox, validation. Build tooling that agents can invoke to check their own work; treat each agent mistake as a chance to bake in a guardrail.
Mitchell Hashimoto · Anthropic Skills/Hooks/Subagents · see also People · Hashimoto
Core platform-engineering pitch: invest in the harness once, swap models freely. Aligns with the "avoid provider lock-in" plank in Cross-cutting.
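A minimal sketch of the idea, assuming a harness reduced to just a loop plus validation (no tools, memory, or sandbox). Every name here (`Harness`, `Model`, `Guardrail`, the stub models, and both checks) is hypothetical and not taken from any real framework. The model is a swappable dependency, and each guardrail is a plain function that could be added after an observed agent mistake.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Protocol


class Model(Protocol):
    """Hypothetical interface: anything that turns a prompt into an output."""
    def complete(self, prompt: str) -> str: ...


# A guardrail returns an error message on failure, or None when the output passes.
Guardrail = Callable[[str], Optional[str]]


@dataclass
class Harness:
    model: Model                      # swappable: the harness never names a provider
    guardrails: List[Guardrail] = field(default_factory=list)
    max_attempts: int = 3

    def check(self, output: str) -> List[str]:
        """Self-check entry point: run every guardrail and collect failures."""
        failures = []
        for guardrail in self.guardrails:
            message = guardrail(output)
            if message is not None:
                failures.append(message)
        return failures

    def run(self, task: str) -> str:
        prompt = task
        for _ in range(self.max_attempts):
            output = self.model.complete(prompt)
            failures = self.check(output)
            if not failures:
                return output
            # Feed failures back so the next attempt can correct them.
            prompt = task + "\n\nPrevious attempt failed checks:\n" + "\n".join(failures)
        raise RuntimeError(f"no passing output after {self.max_attempts} attempts")


def compiles(output: str) -> Optional[str]:
    """Guardrail: the output must parse as Python source."""
    try:
        compile(output, "<agent-output>", "exec")
    except SyntaxError as exc:
        return f"syntax error: {exc.msg}"
    return None


def ends_with_newline(output: str) -> Optional[str]:
    """Guardrail of the kind added after a mistake: file must end with a newline."""
    return None if output.endswith("\n") else "file must end with a newline"


class StubModelA:
    """Stand-in for one provider; omits the trailing newline on its first try."""
    def __init__(self) -> None:
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return "print('hi')" if self.calls == 1 else "print('hi')\n"


class StubModelB:
    """Stand-in for a second provider, used to show the swap."""
    def complete(self, prompt: str) -> str:
        return "print('hello')\n"


if __name__ == "__main__":
    harness = Harness(model=StubModelA(), guardrails=[compiles, ends_with_newline])
    print(repr(harness.run("write a hello script")))
    harness.model = StubModelB()      # same guardrails, different model
    print(repr(harness.run("write a hello script")))
```

In this sketch the guardrail list is the part that accumulates value over time: swapping `StubModelA` for `StubModelB` changes one field, while every check written so far keeps applying.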